The way I debugged the implementation is similar to the code I posted above. I ran the OpenAI baselines code alongside my implementation of PPO, made sure both were initialized identically, and stepped through them, comparing the weights and gradients at each update.
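A minimal sketch of that lockstep comparison, assuming each implementation's parameters have been dumped into a dict of NumPy arrays (the dict layout and names here are hypothetical, just for illustration):

```python
import numpy as np

def compare_params(ref_params, test_params, atol=1e-6):
    """Compare two dicts of parameter arrays (reference vs. your own
    implementation) and return a list of mismatches."""
    mismatches = []
    for name in ref_params:
        ref, test = ref_params[name], test_params[name]
        if ref.shape != test.shape:
            mismatches.append((name, "shape mismatch"))
        elif not np.allclose(ref, test, atol=atol):
            # Record the largest absolute difference for this tensor
            diff = float(np.max(np.abs(ref - test)))
            mismatches.append((name, diff))
    return mismatches

# Hypothetical usage: call after every optimizer step; the first
# tensor that diverges points at where the two implementations differ.
ref = {"policy/w": np.ones((2, 2)), "value/w": np.ones((2, 1))}
mine = {"policy/w": np.ones((2, 2)), "value/w": np.full((2, 1), 0.9)}
print(compare_params(ref, mine))
```

Running the same check on the gradients (not just the weights) usually catches a bug one step earlier, before it propagates into the parameters.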
I immediately found that my value function was incorrect. Also, double-check your initialization to begin with; PPO can be very sensitive to the weight initialization. Hope this helps!

[ Full content available at: https://github.com/apache/incubator-mxnet/issues/10563 ]
