@StephanieYuan The original version of SVRG can be time-consuming because it computes the full gradient over the whole dataset at the beginning of each epoch. Could you also include the cheaper version in your implementation? See https://arxiv.org/abs/1511.01942
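
To make the cost concrete, here is a rough sketch in plain Python (not MXNet code) of where the full-gradient pass sits in standard SVRG. All names (`svrg`, `grad_i`, `mean_grad`, `snapshot_size`) are hypothetical, and the least-squares loss is only a stand-in. The `snapshot_size` option is an assumption on my side: it shows one way to make the snapshot cheaper by averaging over a random subset, and it is not necessarily the exact scheme from the linked paper.

```python
import random


def grad_i(w, x, y):
    # Gradient of 0.5 * (w.x - y)^2 with respect to w for a single sample.
    err = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [err * xj for xj in x]


def mean_grad(w, data, indices):
    # Average of the per-sample gradients over the given indices.
    g = [0.0] * len(w)
    for i in indices:
        x, y = data[i]
        for j, gj in enumerate(grad_i(w, x, y)):
            g[j] += gj / len(indices)
    return g


def svrg(data, dim, lr=0.05, epochs=30, inner_steps=None,
         snapshot_size=None, seed=0):
    rng = random.Random(seed)
    n = len(data)
    inner_steps = inner_steps or n
    w = [0.0] * dim
    for _ in range(epochs):
        w_snap = list(w)
        if snapshot_size is None:
            # Standard SVRG: one full pass over all n samples per epoch.
            # This is the expensive step.
            idx = range(n)
        else:
            # Hypothetical cheaper variant (assumption): estimate the
            # snapshot gradient from a random subset of the data.
            idx = rng.sample(range(n), snapshot_size)
        mu = mean_grad(w_snap, data, idx)
        for _ in range(inner_steps):
            x, y = data[rng.randrange(n)]
            g_w = grad_i(w, x, y)          # gradient at current weights
            g_s = grad_i(w_snap, x, y)     # gradient at snapshot weights
            # Variance-reduced step: g_w - g_s + mu.
            w = [wj - lr * (a - b + m)
                 for wj, a, b, m in zip(w, g_w, g_s, mu)]
    return w
```

With `snapshot_size=None` every epoch costs n extra gradient evaluations before any update is made; with a subset the snapshot gradient is only an estimate, so the update is no longer unbiased and the step size may need more care.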
[ Full content available at: https://github.com/apache/incubator-mxnet/pull/12376 ] This message was relayed via gitbox.apache.org for [email protected]
