[GitHub] yifeim commented on issue #10563: Suboptimal performance implementing PPO with Adam Optimizer

GitBox Wed, 26 Sep 2018 23:53:12 -0700

yifeim commented on issue #10563: Suboptimal performance implementing PPO with 
Adam Optimizer
URL: 
https://github.com/apache/incubator-mxnet/issues/10563#issuecomment-424978630
 
 
   The PPO paper primarily depended on SGD and used Adam only as an alternative 
for better performance. Given the online nature of the problem, I would be 
surprised if using SGD made a fundamental difference.
   
   Also, while the KL term stabilizes the objective, PPO may be too 
conservative if there is no explicit exploration. Weight divergence is 
expected: an optimal policy is typically deterministic, i.e. its action 
probabilities saturate, which pushes the weights to keep growing.
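   
   A minimal sketch of the saturation argument, assuming a two-armed bandit 
with a softmax policy and exact gradient ascent on expected reward. This is not 
the setup from the issue and has no clipping or KL term; the function names and 
the reward values are made up for illustration. It only shows that driving a 
softmax policy toward a deterministic choice makes the logit gap grow at every 
step.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(rewards, steps, lr=0.5):
    """Exact gradient ascent on expected reward for a softmax policy.

    Hypothetical toy problem: one state, one logit per action.
    Returns the final action probabilities and the logit gap per step.
    """
    logits = [0.0] * len(rewards)
    gaps = []
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        # d E[r] / d logit_i = p_i * (r_i - E[r])
        logits = [z + lr * p * (r - expected)
                  for z, p, r in zip(logits, probs, rewards)]
        gaps.append(max(logits) - min(logits))
    return softmax(logits), gaps


probs, gaps = train_bandit(rewards=[1.0, 0.0], steps=2000)
print("final probabilities:", probs)
print("logit gap after 1000 and 2000 steps:", gaps[999], gaps[-1])
```

   The gap never stops growing because the gradient is small but nonzero for 
any finite logits, so growing weights by themselves do not indicate a bug.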
   
   There were some reproducibility discussions around PPO and TRPO. You may 
want to try a few more seeds on the original baseline as well.
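   
   One way to run the seed check, as a rough sketch: `run_trial` below is a 
hypothetical stand-in that returns a random number, not a real PPO run, and 
should be replaced by whatever trains the baseline and returns its final score. 
The point is only to report the spread across seeds rather than a single run.

```python
import random
import statistics


def run_trial(seed):
    """Hypothetical placeholder for one training run.

    Replace the body with the real training call; here it just draws a
    seeded random score so the sketch is runnable.
    """
    rng = random.Random(seed)
    return 100.0 + rng.gauss(0.0, 15.0)


def summarize(seeds):
    # Run every seed and report mean, sample std, min and max of the scores.
    scores = [run_trial(s) for s in seeds]
    return (statistics.mean(scores), statistics.stdev(scores),
            min(scores), max(scores))


mean, std, lo, hi = summarize(range(5))
print("mean=%.1f std=%.1f min=%.1f max=%.1f" % (mean, std, lo, hi))
```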
   
   My 2 cents.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
