Hello Sahith,

I like the idea, and since OpenAI abandoned the leaderboard, this could be a great opportunity. I'm a fan of letting users test the methods without much hassle, so one idea is to provide a web interface that exposes a minimal set of settings, something like:
www.mlpack.org/docs/mlpack-git/doxygen/optimizertutorial.html

Let me know what you think. There are a bunch of interesting features that we
could look into, but we should make sure each is tangible and useful.

Thanks,
Marcus

> On 28. Feb 2018, at 23:03, Sahith D <[email protected]> wrote:
>
> A playground type project sounds like a great idea. We could start with using
> the current Q-Learning method already present in the mlpack repository and
> then apply it to environments in gym as a sort of tutorial. We could then
> move onto more complex methods like Double Q-Learning and Monte Carlo Tree
> Search (just suggestions) just to get started so that more people will get
> encouraged to try their hand at solving the environments in more creative
> ways using C++, as the Python community is already pretty strong. If we could
> build something of a leaderboard similar to what OpenAI gym already has then
> it could foster a creative community of people who want to try more RL. Does
> this sound good or can it be improved upon?
>
> Thanks,
> Sahith.
>
> On Wed, Feb 28, 2018 at 3:50 PM Marcus Edel <[email protected]> wrote:
> Hello Sahith,
>
>> 1. We could implement all the fundamental RL algorithms like those over here
>> https://github.com/dennybritz/reinforcement-learning . This repository
>> contains nearly all the algorithms that are useful for RL according to
>> David Silver's RL course. They're all currently in Python so it could just
>> be a matter of porting them over to use mlpack.
>
> I don't think implementing all the methods is something we should pursue over
> the summer; writing the method itself and coming up with some meaningful tests
> takes time. Also, in my opinion, instead of implementing all methods, we should
> pick methods that make sense in a specific context and make them as fast and
> easy to use as possible.
>
>> 2. We could implement fewer algorithms but work more on solving the OpenAI
>> gym environments using them. This would require tighter integration of the
>> gym wrapper that you have already written. If enough environments can be
>> solved then this could become a viable C++ library for comparing RL
>> algorithms in the future.
>
> I like the idea, this could be a great way to present the RL infrastructure
> to a wider audience, in the form of a playground.
>
> Let me know what you think.
>
> Thanks,
> Marcus
>
>> On 27. Feb 2018, at 23:01, Sahith D <[email protected]> wrote:
>>
>> Hi Marcus,
>> Sorry for not updating you earlier as I had some exams that I needed to
>> finish first.
>> I've been working on the policy gradient in this repository, which you
>> can see over here https://github.com/SND96/mlpack-rl
>> I also had some ideas on what this project could be about.
>>
>> 1. We could implement all the fundamental RL algorithms like those over here
>> https://github.com/dennybritz/reinforcement-learning . This repository
>> contains nearly all the algorithms that are useful for RL according to David
>> Silver's RL course. They're all currently in Python so it could just be a
>> matter of porting them over to use mlpack.
>> 2. We could implement fewer algorithms but work more on solving the OpenAI
>> gym environments using them. This would require tighter integration of the
>> gym wrapper that you have already written. If enough environments can be
>> solved then this could become a viable C++ library for comparing RL
>> algorithms in the future.
>>
>> Right now I'm working on solving one of the environments in gym using a
>> Deep Q-Learning approach similar to what is already there in the mlpack
>> library from last year's GSoC. It's taking a bit longer than I hoped as I'm
>> still familiarizing myself with some of the server calls being made and how
>> to properly get information about the environments. Would appreciate your
>> thoughts on the ideas that I have and anything else that you had in mind.
>>
>> Thanks!
>> Sahith
>>
>> On Fri, Feb 23, 2018 at 1:50 PM Sahith D <[email protected]> wrote:
>> Hi Marcus,
>> I've been having difficulties compiling mlpack, which has stalled my
>> progress. I've opened an issue on the same and appreciate any help.
>>
>> On Thu, Feb 22, 2018 at 10:09 AM Sahith D <[email protected]> wrote:
>> Hey Marcus,
>> No problem with the slow response as I was familiarizing myself better with
>> the codebase and the methods present in the meantime. I'll start working on
>> what you mentioned. I'll notify you when I finish.
>>
>> Thanks!
>>
>> On Thu, Feb 22, 2018 at 4:56 AM Marcus Edel <[email protected]> wrote:
>> Hello Sahith,
>>
>> thanks for getting in touch and sorry for the slow response.
>>
>> > My name is Sahith. I've been working on Reinforcement Learning for the
>> > past year and am interested in coding with mlpack on the RL project for
>> > this summer. I've been going through the codebase and have managed to get
>> > the Open AI gym api up and running on my computer. Is there any other
>> > specific task I can do while I get to know more of the codebase?
>>
>> Great that you got it all working. Another good entry point is to write a
>> simple RL method; one simple method that comes to mind is the Policy
>> Gradients method. Another idea is to write an example for solving a GYM
>> environment with the existing codebase, something in the vein of the Kaggle
>> Digit Recognizer Eugene wrote
>> (https://github.com/mlpack/models/tree/master/Kaggle/DigitRecognizer).
>>
>> Let me know if I should clarify anything.
>>
>> Thanks,
>> Marcus
>>
>> > On 19. Feb 2018, at 20:41, Sahith D <[email protected]> wrote:
>> >
>> > Hello Marcus,
>> > My name is Sahith. I've been working on Reinforcement Learning for the
>> > past year and am interested in coding with mlpack on the RL project for
>> > this summer. I've been going through the codebase and have managed to get
>> > the Open AI gym api up and running on my computer. Is there any other
>> > specific task I can do while I get to know more of the codebase?
>> > Thanks!
