Hi everyone, I'm going to be applying for GSoC this summer, and my proposal centers on extending the Reinforcement Learning module of mlpack.
In summary, what I propose to do is:

1) Rainbow DQN: Three of its six components are already implemented in mlpack, leaving Dueling, Distributional, and Noisy DQN. Since these are extensions to the standard DQN, which is already implemented, I estimate that implementing these three variants, and then Rainbow itself, along with documentation and tests, should take around 6 weeks. (A rough sketch of the dueling aggregation follows below.)

2) Actor-Critic models and algorithms: If I'm not mistaken, mlpack does not currently implement Actor-Critic models, though there is an open issue for it (#2262). If no one has implemented it by then, I propose laying the foundation by implementing the basic architecture and, if time permits, adding one of the more state-of-the-art algorithms for training it (e.g. A2C). On the other hand, if it has been implemented by then, I will stick to implementing extensions to the Actor-Critic model (I'm currently deliberating between A2C, Soft Actor-Critic, and Optimistic Actor-Critic) in the remaining 6 weeks. (A sketch of the core A2C update is also below.)
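To make item 1 concrete, here is a minimal sketch of the dueling aggregation, the core of the Dueling DQN extension: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a'). It uses plain std::vector rather than mlpack's layer API, and the function name DuelingAggregate is just illustrative; the real implementation would presumably live in a layer usable from mlpack's FFN networks.

#include <cstddef>
#include <numeric>
#include <vector>

// Combine the value stream V(s) and advantage stream A(s, .) into
// Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a'), the identifiable
// aggregation used by Dueling DQN. (Illustrative sketch, not mlpack API.)
std::vector<double> DuelingAggregate(const double value,
                                     const std::vector<double>& advantages)
{
  const double meanAdv =
      std::accumulate(advantages.begin(), advantages.end(), 0.0) /
      static_cast<double>(advantages.size());

  std::vector<double> q(advantages.size());
  for (std::size_t i = 0; i < advantages.size(); ++i)
    q[i] = value + advantages[i] - meanAdv;
  return q;
}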
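Similarly, for item 2, here is a minimal sketch of the quantities a basic A2C update is built from, assuming a one-step advantage estimate. Everything here (the struct and function names, and the choice of A2C over the other candidates) is illustrative rather than an mlpack interface.

// The one-step advantage and the resulting actor/critic losses for a
// single transition. (Illustrative sketch, not mlpack API.)
struct A2CLosses
{
  double actorLoss;   // -log pi(a|s) * advantage (policy-gradient term)
  double criticLoss;  // squared TD error for the value function
};

// logProb = log pi(a|s) from the actor; value/nextValue come from the critic.
A2CLosses OneStepA2C(const double logProb, const double reward,
                     const double value, const double nextValue,
                     const double gamma, const bool terminal)
{
  // TD target and one-step advantage: A = r + gamma * V(s') - V(s).
  const double target = reward + (terminal ? 0.0 : gamma * nextValue);
  const double advantage = target - value;

  // The actor ascends log pi(a|s) * A (with the advantage treated as a
  // constant), while the critic regresses V(s) toward the TD target.
  return { -logProb * advantage, advantage * advantage };
}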
I'd like to know whether the timeline I've proposed is realistic, whether my assumptions are correct, and whether the algorithms mentioned are relevant to mlpack. Thanks for your time!

Yours sincerely,
Sriram S K
