Hello Rohan,

Sorry for the slow response. The first approach is just fine. For the second approach, we could use the sequential layer, which is basically a feedforward network that also exposes the layer interface, so it can be nested inside another network. Anyway, as I said, I think the first approach might have some advantages.
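To make the two options concrete, here is a rough sketch of the difference in plain C++. It deliberately does not use the mlpack API: `Dense`, `Relu`, `ActorCriticOutput`, `SeparateActorCritic` and `SharedActorCritic` are hypothetical names made up for this illustration, the constant weight initialization is only there to keep the example deterministic, and no training or PPO loss is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense layer for illustration only (not an mlpack class):
// y = W x + b, with every weight set to the same constant.
struct Dense
{
  std::size_t inSize, outSize;
  std::vector<double> weights, bias;

  Dense(std::size_t in, std::size_t out, double init) :
      inSize(in), outSize(out), weights(in * out, init), bias(out, 0.0) { }

  std::vector<double> Forward(const std::vector<double>& x) const
  {
    assert(x.size() == inSize);
    std::vector<double> y(bias);
    for (std::size_t o = 0; o < outSize; ++o)
      for (std::size_t i = 0; i < inSize; ++i)
        y[o] += weights[o * inSize + i] * x[i];
    return y;
  }

  std::size_t NumParams() const { return weights.size() + bias.size(); }
};

// Element-wise ReLU.
inline std::vector<double> Relu(std::vector<double> v)
{
  for (double& e : v)
    if (e < 0.0) e = 0.0;
  return v;
}

struct ActorCriticOutput
{
  std::vector<double> logits;  // One unnormalized score per action.
  double value;                // State-value estimate.
};

// Approach 1: two independent networks, no shared parameters.
struct SeparateActorCritic
{
  Dense policyHidden, policyOut, valueHidden, valueOut;

  SeparateActorCritic(std::size_t obs, std::size_t hidden, std::size_t actions) :
      policyHidden(obs, hidden, 0.1), policyOut(hidden, actions, 0.1),
      valueHidden(obs, hidden, 0.1), valueOut(hidden, 1, 0.1) { }

  ActorCriticOutput Forward(const std::vector<double>& x) const
  {
    return { policyOut.Forward(Relu(policyHidden.Forward(x))),
             valueOut.Forward(Relu(valueHidden.Forward(x)))[0] };
  }

  std::size_t NumParams() const
  {
    return policyHidden.NumParams() + policyOut.NumParams() +
        valueHidden.NumParams() + valueOut.NumParams();
  }
};

// Approach 2: one shared trunk feeding a policy head and a value head.
struct SharedActorCritic
{
  Dense trunk, policyHead, valueHead;

  SharedActorCritic(std::size_t obs, std::size_t hidden, std::size_t actions) :
      trunk(obs, hidden, 0.1), policyHead(hidden, actions, 0.1),
      valueHead(hidden, 1, 0.1) { }

  ActorCriticOutput Forward(const std::vector<double>& x) const
  {
    // The hidden features are computed once and reused by both heads.
    const std::vector<double> h = Relu(trunk.Forward(x));
    return { policyHead.Forward(h), valueHead.Forward(h)[0] };
  }

  std::size_t NumParams() const
  {
    return trunk.NumParams() + policyHead.NumParams() + valueHead.NumParams();
  }
};
```

With 4 observations, 8 hidden units and 2 actions, the separate version holds 107 parameters and the shared version 67. In the shared version the policy and value gradients both flow into the trunk, which is the main thing to weigh when choosing between the two.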
Thanks,
Marcus

> On 2. Apr 2019, at 16:24, Rohan Raj <[email protected]> wrote:
>
> Dear Ryan and Marcus,
>
> Please answer my doubts sent in my previous mail as soon as you get time.
>
> Thanks,
>
> Rohan Raj
> Indian Institute of Technology Guwahati
> Assam , India
> Phone : +91 8723990557
>
> On Wed, 27 Mar 2019 at 10:10, Rohan Raj <[email protected]> wrote:
> Hello all,
>
> Apologies for the delay in reply. I have started writing the proposal for the
> coming GSOC year. I sincerely wanted to know a few things from the authors.
> For the PPO Reinforcement Learning algorithm, we can either have 2 different
> neural networks for policy and value estimation or club these into a single
> model with different outputs (as openai baselines or deepmind). The first
> option is approachable in MlPack. However, I am confused with the second
> approach. I feel that the following lines
> (https://github.com/mlpack/mlpack/blob/2635297c8793396e57469bc731451fbe18bed656/src/mlpack/methods/ann/layer/add_merge.hpp#L127-L128)
> might be helpful for the purpose, however, I am not completely sure.
>
> Could you please let me know how we can achieve the parameter sharing in
> mlpack?
>
> Thanks,
>
> Rohan Raj
> Indian Institute of Technology Guwahati
> Assam , India
> Phone : +91 8723990557
>
> On Mon, 11 Mar 2019 at 01:11, Ryan Curtin <[email protected]> wrote:
> On Fri, Mar 08, 2019 at 04:31:55AM +0530, Rohan Raj wrote:
> > Hello Ryan, Marcus and fellow contributors of MLPACK,
> >
> > I am Rohan Raj (Github : mirraaj) <https://github.com/mirraaj>,
> > undergraduate student from Indian Institute of Technology (IIT) Guwahati. I
> > am writing this email to you to express my interests in becoming a part of
> > *MLPACK* for the coming *Google Summer of Codes 2019.*
> >
> > I sincerely congratulate Mlpack for being accepted as a mentor organization
> > for the coming Google Summer of Codes 2019. I am interested in
> > reinforcement learning project for the coming year. In particular, I plan
> > to implement Rainbow and PPO for the coming coding season.
> >
> > My tentative schedule is present below,
> >
> > Week 1-6 : Implement different Rainbow DQN functions
> >
> > Week 6-10 : PPO Algorithm
> >
> > Week 11-12 Bug fixing and final submission.
> >
> > I believe it is really important to test any function/feature added to the
> > mlpack codebase. I have been working on RL and Mlpack for quite a long time
> > and I personally think it is difficult to reproduce result sometimes. It is
> > also a time taking procedure to stabilize statistical test results on
> > mlpack codebase. Hence I would like to go ahead with 2 algorithms so that I
> > get proper time to test the algorithms on different environments.
> >
> > Please let me know your valuable inputs to this short proposal. I will
> > definitely add the details of the project in my actual proposal.
>
> Hi Rohan,
>
> Thanks for the congratulations and we're happy to have you involved.
> Although I am not a reinforcement learning expert and I won't be the
> mentor for that project, I will at least say that two weeks set aside
> for 'bug fixing' is a bit vague---it's definitely hard to predict when
> you'll have bugs, but as you prepare your proposal I'd encourage you to
> spend a bit of time thinking about how you will write the tests to catch
> all potential bugs you might have during implementation.
>
> You're right that testing is a very important part, so often when I am
> reviewing proposals, I look for a lot of detail about how the proposed
> algorithm will be implemented and things of this nature.
>
> I hope this is helpful. :)
>
> Thanks!
>
> Ryan
>
> --
> Ryan Curtin | "None of your mailman friends can hear you."
> [email protected] | - Alpha
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
