[GitHub] sxjscience commented on issue #7942: Adam optimizer consistent with paper

2017-09-25 Thread git
sxjscience commented on issue #7942: Adam optimizer consistent with paper URL: https://github.com/apache/incubator-mxnet/pull/7942#issuecomment-331819232 @formath I see. I've checked the different versions of the Adam paper again and found the rho in the v2 and v3 versions:

[GitHub] sxjscience commented on issue #7942: Adam optimizer consistent with paper

2017-09-23 Thread git
sxjscience commented on issue #7942: Adam optimizer consistent with paper URL: https://github.com/apache/incubator-mxnet/pull/7942#issuecomment-331651717 @formath I feel that `rho` has the effect of gradually transforming the gradient estimator from biased to unbiased, which may have some

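The comment above is about how the estimator goes from biased to unbiased as training progresses. For reference, here is a minimal NumPy sketch of the bias-corrected Adam update as written in Algorithm 1 of the published Kingma & Ba paper; the function name `adam_step` and its defaults are illustrative only and are not MXNet's actual implementation or the code in this PR.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with the bias correction from the published paper.

    m and v start at zero, so for small t the raw moving averages are biased
    toward zero; dividing by (1 - beta**t) removes that bias, and the
    correction factor approaches 1 as t grows.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # biased first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # biased second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

Note that the correction factors 1/(1 - beta1**t) and 1/(1 - beta2**t) are large only for the first few steps (t small) and decay toward 1, which is why the discussion centers on how the estimator transitions from biased to unbiased early in training.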