Hi, 
Sorry it's a bit late, but for anyone coming to this thread in the future, 
you can have a look at Jan Schlüter's gist 
https://gist.github.com/f0k/f3190ebba6c53887d598d03119ca2066#file-wgan_mnist-py-L283-L285 . 
There are many different ways to do decay, but each of them is just a small 
edit to Jan's code.
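For example, here is a minimal sketch of the usual pattern (assuming a 
Lasagne-style training loop; the shared variable "eta", the update rule, and 
the schedule below are only illustrative, not Jan's exact code):

    # Keep the learning rate in a Theano shared variable so it can be
    # changed between epochs without recompiling the training function.
    import theano
    import lasagne

    initial_eta = 0.1  # assumed starting learning rate
    eta = theano.shared(lasagne.utils.floatX(initial_eta))

    # Pass the shared variable (not a plain float) to the update rule, e.g.
    # updates = lasagne.updates.momentum(loss, params, learning_rate=eta)

    num_epochs = 230
    decay_epochs = (200, 210, 220)  # scale the rate by 0.1 at these epochs
    for epoch in range(num_epochs):
        # ... run one epoch of training here ...
        if epoch + 1 in decay_epochs:
            eta.set_value(lasagne.utils.floatX(eta.get_value() * 0.1))

You can just as easily anneal smoothly instead (e.g. linearly towards zero 
over the second half of training) by computing the new value from the epoch 
number and calling eta.set_value() in the same place.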

Ramana

On Wednesday, May 31, 2017 at 11:14:09 PM UTC+5:30, Alexander Botev wrote:
>
> It depends on what kind of performance you are measuring and what 
> optimizer you are using. 
> Is it training or validation/test performance, and are you using an 
> adaptive method (RMSProp, Adam, etc.)?
>
> On Sunday, 28 May 2017 03:03:42 UTC+1, Ji Qiujia wrote:
>>
>> Recently I have been doing MNIST image classification using a ResNet, and 
>> I found something strange, or at least interesting. First, though it's 
>> usually said that we should do early stopping, I found it's always better 
>> to run more epochs at the initial learning rate, which I set to 0.1 or 
>> 0.01, and then scale the learning rate down quickly. For example, my 
>> learning rate strategy is to begin with 0.1 and scale it down by 0.1 at 
>> the 200th, 210th, and 220th epochs, with a batch size of 64 and 230 epochs 
>> in total. I also found that the last downscaling of the learning rate 
>> usually degrades performance. Am I doing anything wrong? You are welcome 
>> to share your parameter-tuning experience.  
>>
>
