Hello, Mentors,
I am a second-year postgraduate student majoring in CS. I have contributed
several PRs to mlpack, mainly for the decision tree / decision stump code.
I was glad to see that implementing SOTA RL algorithms is listed as a project
for GSoC 2019, and I am very interested in contributing RL algorithm code to
mlpack. I am familiar with the Actor-Critic family of algorithms, such as A3C,
DDPG, and PPO. Of course, to implement them I will need to study them in more
depth.
Here are my thoughts on this subject. First, I will build an Actor-Critic
framework so that various algorithms can be integrated into it conveniently
later. Then I will implement ACKTR and TRPO/PPO. Including May, there are
around 4 months to finish this task. My proposal is as follows:
0. Weeks -6 to -4 (April): I am familiar with the framework and logic of
mlpack. However, I should still spend time reading the implementation in
methods/reinforcement_learning and in zoq/gym_tcp_api
<https://github.com/zoq/gym_tcp_api>. I think it is better to first understand
the system design done by earlier contributors.
1. Weeks -3 to 0 (May): Review these methods and refer to other
implementations based on PyTorch to summarize the common structure of
Actor-Critic algorithms. A comprehensive report should be presented, together
with a draft design for a scalable reinforcement learning module.
2. Weeks 1 to 4 (project begins): Finish the Actor-Critic framework and make
it compatible with the existing code, or refactor the existing code so that it
can accommodate future algorithms.
3. Weeks 5 to 8: Finish PPO and ACKTR, which are SOTA methods. Some simple
tests should be written to verify code correctness.
4. Weeks 9 to 12: Comprehensive unit tests should be written, and the
corresponding documentation and tutorials should be posted. The results should
be comparable to other implementations based on PyTorch. The API documentation
should be clear, and a tutorial containing an RL example or demo should also
be posted.
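
To make the PPO part of the plan a bit more concrete, here is a minimal sketch
of the clipped surrogate objective that PPO maximizes. It only assumes plain
std::vector inputs. ClippedSurrogate and its parameter names are hypothetical
and chosen for illustration; they are not existing mlpack API, and a real
implementation would probably work on Armadillo matrices instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): mean PPO clipped surrogate.
//   ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
//   advantages[i] = estimated advantage of action a_i in state s_i
//   epsilon       = clipping range (0.2 is only an assumed default here)
double ClippedSurrogate(const std::vector<double>& ratios,
                        const std::vector<double>& advantages,
                        const double epsilon = 0.2)
{
  assert(ratios.size() == advantages.size());
  assert(!ratios.empty());

  double sum = 0.0;
  for (std::size_t i = 0; i < ratios.size(); ++i)
  {
    // Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
    const double clipped =
        std::min(std::max(ratios[i], 1.0 - epsilon), 1.0 + epsilon);

    // Take the more pessimistic of the unclipped and clipped terms.
    sum += std::min(ratios[i] * advantages[i], clipped * advantages[i]);
  }

  // Average over the batch of samples.
  return sum / static_cast<double>(ratios.size());
}
```

In the planned framework, an objective like this would ideally be one
swappable piece, so that ACKTR or TRPO could reuse the same actor and critic
code with a different update rule.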
What do you think of it? Can you give me some suggestions?