Re: [mlpack] Reinforcement Learning GSOC

Marcus Edel Thu, 01 Mar 2018 02:05:49 -0800

Hello Sahith,

I like the idea; since OpenAI abandoned the leaderboard, this could be a
great opportunity. I'm a fan of giving users the opportunity to test the
methods without much hassle, so one idea is to provide a web interface
that exposes a minimal set of settings, something like:


www.mlpack.org/docs/mlpack-git/doxygen/optimizertutorial.html

Let me know what you think. There are a bunch of interesting features that we
could look into, but we should make sure each is tangible and useful.

Thanks,
Marcus

> On 28. Feb 2018, at 23:03, Sahith D <[email protected]> wrote:
> 
> A playground-type project sounds like a great idea. We could start by using 
> the Q-Learning method already present in the mlpack repository and applying 
> it to environments in gym as a sort of tutorial. We could then move on to 
> more complex methods like Double Q-Learning and Monte Carlo Tree Search (just 
> suggestions) to get started, so that more people are encouraged to try their 
> hand at solving the environments in more creative ways using C++, as the 
> Python community is already pretty strong. If we could build something of a 
> leaderboard similar to what OpenAI gym already has, it could foster a 
> creative community of people who want to try more RL. Does this sound good, 
> or can it be improved upon?
> 
> Thanks,
> Sahith.
> 
> On Wed, Feb 28, 2018 at 3:50 PM Marcus Edel <[email protected]> wrote:
> Hello Sahith,
> 
>> 1. We could implement all the fundamental RL algorithms like those over here:
>> https://github.com/dennybritz/reinforcement-learning . This repository
>> contains nearly all the algorithms that are useful for RL according to David
>> Silver's RL course. They're all currently in Python so it could just be a
>> matter of porting them over to use mlpack.
> 
> I don't think implementing all the methods is something we should pursue over
> the summer; writing the method itself and coming up with some meaningful tests
> takes time. Also, in my opinion, instead of implementing all methods, we should
> pick methods that make sense in a specific context and make them as fast and
> easy to use as possible.
> 
>> 2. We could implement fewer algorithms but work more on solving the OpenAI gym
>> environments using them. This would require tighter integration of the gym
>> wrapper that you have already written. If enough environments can be solved
>> then this could become a viable C++ library for comparing RL algorithms in the
>> future.
> 
> I like the idea; this could be a great way to present the RL infrastructure to
> a wider audience, in the form of a playground.
> 
> Let me know what you think.
> 
> Thanks,
> Marcus
> 
>> On 27. Feb 2018, at 23:01, Sahith D <[email protected]> wrote:
>> 
>> Hi Marcus,
>> Sorry for not updating you earlier as I had some exams that I needed to 
>> finish first.
>> I've been working on the policy gradient method in this repository: 
>> https://github.com/SND96/mlpack-rl
>> I also had some ideas on what this project could be about.
>> 
>> 1. We could implement all the fundamental RL algorithms like those over here: 
>> https://github.com/dennybritz/reinforcement-learning . This repository 
>> contains nearly all the algorithms that are useful for RL according to David 
>> Silver's RL course. They're all currently in Python so it could just be a 
>> matter of porting them over to use mlpack. 
>> 2. We could implement fewer algorithms but work more on solving the OpenAI 
>> gym environments using them. This would require tighter integration of the 
>> gym wrapper that you have already written. If enough environments can be 
>> solved then this could become a viable C++ library for comparing RL 
>> algorithms in the future.
>> 
>> Right now I'm working on solving one of the environments in gym using a 
>> Deep Q-Learning approach similar to what is already in the mlpack library 
>> from last year's GSoC. It's taking a bit longer than I hoped, as I'm still 
>> familiarizing myself with some of the server calls being made and how to 
>> properly get information about the environments. I would appreciate your 
>> thoughts on the ideas that I have and anything else that you had in mind.
>> 
>> Thanks!
>> Sahith
>> 
>> On Fri, Feb 23, 2018 at 1:50 PM Sahith D <[email protected]> wrote:
>> Hi Marcus,
>> I've been having difficulties compiling mlpack which has stalled my 
>> progress. I've opened an issue on the same and appreciate any help.
>> 
>> On Thu, Feb 22, 2018 at 10:09 AM Sahith D <[email protected]> wrote:
>> Hey Marcus,
>> No problem with the slow response as I was familiarizing myself better with 
>> the codebase and the methods present in the meantime. I'll start working on 
>> what you mentioned. I'll notify you when I finish.
>> 
>> Thanks!
>> 
>> On Thu, Feb 22, 2018 at 4:56 AM Marcus Edel <[email protected]> wrote:
>> Hello Sahith,
>> 
>> thanks for getting in touch and sorry for the slow response.
>> 
>> > My name is Sahith. I've been working on Reinforcement Learning for the 
>> > past year and am interested in coding with mlpack on the RL project for 
>> > this summer. I've been going through the codebase and have managed to get 
>> > the Open AI gym api up and running on my computer. Is there any other 
>> > specific task I can do while I get to know more of the codebase?
>> 
>> Great that you got it all working. Another good entry point is to write a
>> simple RL method; one simple method that comes to mind is the Policy Gradients
>> method. Another idea is to write an example for solving a Gym environment with
>> the existing codebase, something in the vein of the Kaggle Digit Recognizer
>> Eugene wrote
>> (https://github.com/mlpack/models/tree/master/Kaggle/DigitRecognizer).
>> 
>> Let me know if I should clarify anything.
>> 
>> Thanks,
>> Marcus
>> 
>> > On 19. Feb 2018, at 20:41, Sahith D <[email protected]> wrote:
>> >
>> > Hello Marcus,
>> > My name is Sahith. I've been working on Reinforcement Learning for the 
>> > past year and am interested in coding with mlpack on the RL project for 
>> > this summer. I've been going through the codebase and have managed to get 
>> > the Open AI gym api up and running on my computer. Is there any other 
>> > specific task I can do while I get to know more of the codebase?
>> > Thanks!
>> 
>

_______________________________________________
mlpack mailing list
[email protected]
http://knife.lugatgt.org/cgi-bin/mailman/listinfo/mlpack
