Recently I have been doing MNIST image classification with ResNet, and I found 
something strange, or at least interesting. First, although it's usually said that 
we should do early stopping, I found it's always better to run more epochs at 
the initial learning rate (which I set to 0.1 or 0.01) and then scale the 
learning rate down quickly. For example, my learning rate strategy is to begin 
at 0.1 and scale it down by a factor of 0.1 at the 200th, 210th, and 220th 
epochs, with a batch size of 64 and 230 epochs in total. I also found that the 
last downscaling of the learning rate usually degrades performance. Am I doing 
anything wrong? You are welcome to share your parameter-tuning experience.
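For reference, here is a minimal sketch in plain Python of the step schedule 
described above; the function name and the milestone/factor defaults simply 
restate my setup and are not from any library:

    def step_lr(epoch, base_lr=0.1, milestones=(200, 210, 220), factor=0.1):
        """Return the learning rate for a given epoch.

        Starts at base_lr and multiplies by `factor` each time a
        milestone epoch has been reached (200th, 210th, 220th here).
        """
        lr = base_lr
        for m in milestones:
            if epoch >= m:
                lr *= factor
        return lr

    # With 230 epochs total: epochs 0-199 train at 0.1,
    # 200-209 at 0.01, 210-219 at 0.001, 220-229 at 0.0001.
    for epoch in range(230):
        lr = step_lr(epoch)
        # ... set the shared learning-rate variable and train one epoch ...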
