Dear Marcus,

Could you please let me know whether I should also try to implement the
second method for my GSoC proposal? The ideas page (
https://github.com/mlpack/mlpack/wiki/SummerOfCodeIdeas#reinforcement-learning)
doesn't make clear whether parameter sharing is an objective of the project.

Thanks,
Rohan Raj
Indian Institute of Technology Guwahati
Assam, India
Phone : +91 8723990557

On Tue, 2 Apr 2019 at 20:21, Marcus Edel <[email protected]> wrote:

> Hello Rohan,
>
> Sorry for the slow response. The first approach is just fine; for the
> second approach, we could use the sequential layer, which is basically a
> feedforward network but exposes the layer interface. Anyway, as I said, I
> think the first approach might have some advantages.
>
> Thanks,
> Marcus
>
> On 2. Apr 2019, at 16:24, Rohan Raj <[email protected]> wrote:
>
> Dear Ryan and Marcus,
>
> Please answer the doubts I sent in my previous mail as soon as you get time.
>
> Thanks,
>
> Rohan Raj
> Indian Institute of Technology Guwahati
> Assam, India
> Phone : +91 8723990557
>
> On Wed, 27 Mar 2019 at 10:10, Rohan Raj <[email protected]> wrote:
>
>> Hello all,
>>
>> Apologies for the delay in replying. I have started writing the proposal
>> for the coming GSoC year. I sincerely wanted to know a few things from the
>> authors. For the PPO reinforcement learning algorithm, we can either have
>> 2 different neural networks for policy and value estimation or club these
>> into a single model with different outputs (as in OpenAI Baselines or
>> DeepMind). The first option is approachable in mlpack. However, I am
>> confused about the second approach. I feel that the following lines (
>> https://github.com/mlpack/mlpack/blob/2635297c8793396e57469bc731451fbe18bed656/src/mlpack/methods/ann/layer/add_merge.hpp#L127-L128)
>> might be helpful for the purpose; however, I am not completely sure.
>>
>> Could you please let me know how we can achieve parameter sharing in
>> mlpack?
>>
>> Thanks,
>>
>> Rohan Raj
>> Indian Institute of Technology Guwahati
>> Assam, India
>> Phone : +91 8723990557
>>
>> On Mon, 11 Mar 2019 at 01:11, Ryan Curtin <[email protected]> wrote:
>>
>>> On Fri, Mar 08, 2019 at 04:31:55AM +0530, Rohan Raj wrote:
>>> > Hello Ryan, Marcus and fellow contributors of MLPACK,
>>> >
>>> > I am Rohan Raj (GitHub : mirraaj) <https://github.com/mirraaj>, an
>>> > undergraduate student from Indian Institute of Technology (IIT)
>>> > Guwahati. I am writing this email to you to express my interest in
>>> > becoming a part of *MLPACK* for the coming *Google Summer of Code 2019.*
>>> >
>>> > I sincerely congratulate mlpack for being accepted as a mentor
>>> > organization for the coming Google Summer of Code 2019. I am interested
>>> > in the reinforcement learning project for the coming year. In
>>> > particular, I plan to implement Rainbow and PPO for the coming coding
>>> > season.
>>> >
>>> > My tentative schedule is present below,
>>> >
>>> > Week 1-6 : Implement different Rainbow DQN functions
>>> >
>>> > Week 6-10 : PPO Algorithm
>>> >
>>> > Week 11-12 : Bug fixing and final submission.
>>> >
>>> > I believe it is really important to test any function/feature added to
>>> > the mlpack codebase. I have been working on RL and mlpack for quite a
>>> > long time and I personally think it is sometimes difficult to reproduce
>>> > results. It is also a time-consuming procedure to stabilize statistical
>>> > test results on the mlpack codebase. Hence I would like to go ahead with
>>> > 2 algorithms so that I get proper time to test the algorithms on
>>> > different environments.
>>> >
>>> > Please let me know your valuable inputs to this short proposal. I will
>>> > definitely add the details of the project in my actual proposal.
>>>
>>> Hi Rohan,
>>>
>>> Thanks for the congratulations and we're happy to have you involved.
>>> Although I am not a reinforcement learning expert and I won't be the
>>> mentor for that project, I will at least say that two weeks set aside
>>> for 'bug fixing' is a bit vague---it's definitely hard to predict when
>>> you'll have bugs, but as you prepare your proposal I'd encourage you to
>>> spend a bit of time thinking about how you will write the tests to catch
>>> all potential bugs you might have during implementation.
>>>
>>> You're right that testing is a very important part, so often when I am
>>> reviewing proposals, I look for a lot of detail about how the proposed
>>> algorithm will be implemented and things of this nature.
>>>
>>> I hope this is helpful. :)
>>>
>>> Thanks!
>>>
>>> Ryan
>>>
>>> --
>>> Ryan Curtin    | "None of your mailman friends can hear you."
>>> [email protected] |   - Alpha
>>>
>>
>
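
For readers following the thread, the two PPO designs discussed above (two independent networks versus a single model with a shared body and separate policy and value outputs) can be sketched roughly as below. This is only an illustration under stated assumptions: it deliberately does not use the mlpack API, and every name in it (Dense, SeparateActorCritic, SharedActorCritic, NumParams, Forward) is hypothetical. It shows only the forward pass and how the parameter counts differ, not training or the PPO loss.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical, mlpack-free dense layer used only for illustration.
struct Dense
{
  std::size_t inSize, outSize;
  std::vector<double> weights;  // Row-major, outSize x inSize.
  std::vector<double> bias;

  Dense(std::size_t in, std::size_t out, double init = 0.1)
      : inSize(in), outSize(out), weights(in * out, init), bias(out, 0.0) {}

  std::size_t NumParams() const { return weights.size() + bias.size(); }

  // Affine map, optionally followed by tanh.
  std::vector<double> Forward(const std::vector<double>& x, bool squash) const
  {
    assert(x.size() == inSize);
    std::vector<double> y(outSize);
    for (std::size_t o = 0; o < outSize; ++o)
    {
      double sum = bias[o];
      for (std::size_t i = 0; i < inSize; ++i)
        sum += weights[o * inSize + i] * x[i];
      y[o] = squash ? std::tanh(sum) : sum;
    }
    return y;
  }
};

// First approach: policy and value networks share no parameters.
struct SeparateActorCritic
{
  Dense policyHidden, policyOut, valueHidden, valueOut;

  SeparateActorCritic(std::size_t obs, std::size_t hidden, std::size_t actions)
      : policyHidden(obs, hidden), policyOut(hidden, actions),
        valueHidden(obs, hidden), valueOut(hidden, 1) {}

  std::size_t NumParams() const
  {
    return policyHidden.NumParams() + policyOut.NumParams() +
           valueHidden.NumParams() + valueOut.NumParams();
  }
};

// Second approach: one shared trunk feeds both a policy head and a value head.
struct SharedActorCritic
{
  Dense trunk, policyHead, valueHead;

  SharedActorCritic(std::size_t obs, std::size_t hidden, std::size_t actions)
      : trunk(obs, hidden), policyHead(hidden, actions), valueHead(hidden, 1) {}

  std::size_t NumParams() const
  {
    return trunk.NumParams() + policyHead.NumParams() + valueHead.NumParams();
  }

  // The trunk features are computed once and reused by both heads.
  void Forward(const std::vector<double>& obs,
               std::vector<double>& logits,
               double& value) const
  {
    const std::vector<double> features = trunk.Forward(obs, true);
    logits = policyHead.Forward(features, false);
    value = valueHead.Forward(features, false)[0];
  }
};
```

With 4 observation inputs, 8 hidden units and 2 actions, the separate design holds 107 parameters and the shared design 67, because the hidden layer is stored once. In the shared design, gradients from both the policy and value losses would flow into the same trunk, which is the coupling the thread refers to as parameter sharing.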
_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
