Hi, sorry this is a bit late, but for anyone coming to this thread in the future, you can have a look at Jan Schlüter's gist https://gist.github.com/f0k/f3190ebba6c53887d598d03119ca2066#file-wgan_mnist-py-L283-L285 . There are many different ways to do decay, but most of them are only a small edit to Jan's code; a sketch of the general pattern is below.
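In case it helps, here is a minimal sketch of the shared-variable pattern (my own illustration, not the exact code from the gist; it assumes Theano with Lasagne): keep the learning rate in a theano.shared variable, pass it to the update rule, and call set_value() between epochs so the compiled training function never needs to be rebuilt.

import theano
import lasagne

initial_lr = 0.1
# Shared scalar so its value can change between epochs without recompiling
lr = theano.shared(lasagne.utils.floatX(initial_lr))

# Build your network, loss and params as usual, then pass `lr` to the updates:
# updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=lr)
# train_fn = theano.function([x, y], loss, updates=updates)

decay_epochs = {200, 210, 220}   # step decay by 0.1 at these epochs
num_epochs = 230
for epoch in range(num_epochs):
    # ... loop over minibatches and call train_fn(...) here ...
    if epoch + 1 in decay_epochs:
        lr.set_value(lasagne.utils.floatX(lr.get_value() * 0.1))

An exponential or linear decay is just a different formula inside the loop; the only thing that matters is that the learning rate lives in a shared variable (or is an input to the compiled function) so it can change over time.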
Ramana

On Wednesday, May 31, 2017 at 11:14:09 PM UTC+5:30, Alexander Botev wrote:
>
> It depends on what kind of performance you are measuring and what
> optimizer you are using. Is it training or validation/test performance,
> and are you using any adaptive method (RMSProp, Adam, etc.)?
>
> On Sunday, 28 May 2017 03:03:42 UTC+1, Ji Qiujia wrote:
>>
>> Recently I have been doing MNIST image classification with a ResNet, and
>> I found something strange, or at least interesting. First, although it is
>> usually said that we should do early stopping, I found it is always better
>> to run more epochs at the initial learning rate, which I set to 0.1 or
>> 0.01, and then scale the learning rate down quickly. For example, my
>> schedule starts at 0.1 and is scaled down by a factor of 0.1 at the 200th,
>> 210th, and 220th epochs, with a batch size of 64 and 230 epochs in total.
>> I also found that the last downscaling of the learning rate usually
>> degrades performance. Am I doing anything wrong? You are welcome to share
>> your parameter-tuning experience.
>>
