KeyKy opened a new issue #7448: out of memory when training ImageNet with a .rec file.
URL: https://github.com/apache/incubator-mxnet/issues/7448

## Environment info

- Operating System: Ubuntu 16.04
- Compiler: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
- Package used (Python/R/Scala/Julia): Python
- MXNet commit hash: 8ad3c8a7a98dfa6bd6f5065cf9c3688f2414c3d4
- Python version and distribution: Python 2.7.12

## Error Message

Memory usage at the beginning of training is 5.7%–7.9%. After a while (3–4 hours) it reaches 27%, and eventually training runs out of memory entirely (I was woken up by the alarm message).

## Steps to reproduce

(Or, if you are running standard examples, the commands that lead to the error.)

1. `cd examples/image-classification && python train_imagenet.py --network my_net --gpus 0,1,2,3 --num-epochs 100 --lr 0.01 --lr-step-epochs 30,60,80,110 --batch-size 256 --top-k 5 --data-train /data_shared/datasets/ILSVRC2015/rec/train_480_q100.rec --data-val /data_shared/datasets/ILSVRC2015/rec/val_480_q100.rec --rgb-mean 123.68,116.779,103.939 --data-nthreads 4 --model-prefix ./my_net`

## What have you tried to solve it?

1. Set `prefetch_buffer = 1`, but the accuracy of my model dropped to 20%; with `prefetch_buffer` set back to 2, 4, or 8, the accuracy is correct again.
2. CPU memory still increases continually even with `prefetch_buffer = 1`.
3. Found some similar issues:
   - https://github.com/apache/incubator-mxnet/issues/1411
   - https://github.com/apache/incubator-mxnet/issues/3183
   - https://github.com/apache/incubator-mxnet/issues/2969
   - https://github.com/apache/incubator-mxnet/issues/2111
   - https://github.com/apache/incubator-mxnet/issues/2099
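To make the memory growth easier to track than watching a percentage, one could log the process's peak resident set size at regular intervals during training. A minimal standard-library sketch (the `log_memory` helper and the batch-loop placement are hypothetical, not part of `train_imagenet.py`; `resource` is POSIX-only and `ru_maxrss` is reported in kilobytes on Linux):

```python
import resource

def rss_mb():
    # Peak resident set size of this process.
    # On Linux, ru_maxrss is in kilobytes; convert to megabytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

def log_memory(batch_idx, every=100):
    # Print peak RSS every `every` batches; call this inside the
    # training loop to see whether memory grows monotonically.
    if batch_idx % every == 0:
        print("batch %d: peak RSS %.1f MB" % (batch_idx, rss_mb()))
```

Calling `log_memory(i)` once per batch would show whether the growth correlates with the number of batches consumed from the `.rec` iterator, which would help narrow the leak down to the data pipeline versus the training code.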