[GitHub] [incubator-mxnet] wy3406 opened a new issue #19498: SyncBN causes the memory to gradually increase with iteration

GitBox Sun, 08 Nov 2020 22:40:24 -0800


wy3406 opened a new issue #19498:
URL: https://github.com/apache/incubator-mxnet/issues/19498



   ## Description
   
   - I have a few issues/questions regarding SyncBN:
   When I train a custom image segmentation model with regular BN, GPU memory 
usage is normal. But after I replaced BN with SyncBN, GPU memory usage 
gradually increased with each iteration until it occupied the entire GPU 
memory, and then training got stuck. I also tried a smaller batch size than 
the one I used with BN, but it still ends up taking all of the GPU memory.
   Note that there is no warning when I use SyncBN.
   Is there something I have missed?
   
   - Environment: Python 3.6.9; TITAN RTX × 8; CUDA 10.1
   
   - Framework: mxnet-cu101-1.7.0 and gluoncv-0.8.0
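
A minimal sketch of how the growth described above could be measured, so that numbers can be attached to the report. It is not taken from the original training script. The helper names `used_gpu_mib` and `grows_steadily`, and the 1 MiB threshold, are hypothetical; the only MXNet call used is `mx.context.gpu_memory_info`, which returns the free and total bytes for one device.

```python
from typing import Sequence


def used_gpu_mib(device_id: int = 0) -> float:
    """Return the memory currently in use on one GPU, in MiB."""
    # Imported lazily so grows_steadily() below can be used without MXNet.
    import mxnet as mx

    free_bytes, total_bytes = mx.context.gpu_memory_info(device_id)
    return (total_bytes - free_bytes) / 2**20


def grows_steadily(samples: Sequence[float], min_step_mib: float = 1.0) -> bool:
    """Return True if each sample exceeds the previous one by min_step_mib or more.

    Hypothetical heuristic: fewer than three samples is treated as
    "not enough evidence" and returns False.
    """
    if len(samples) < 3:
        return False
    return all(b - a >= min_step_mib for a, b in zip(samples, samples[1:]))
```

One way to use it, assuming a standard training loop: call `mx.nd.waitall()` every few hundred iterations so pending work has finished, append `used_gpu_mib(0)` to a list, and compare what `grows_steadily` reports for the BN run and for the SyncBN run.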
   
   ### Error Message
   (Paste the complete error message. Please also include stack trace by 
setting environment variable `DMLC_LOG_STACK_TRACE_DEPTH=100` before running 
your script.)
   
   ## To Reproduce
   (If you developed your own code, please provide a short script that 
reproduces the error. For existing examples, please provide link.)
   
   ### Steps to reproduce
   (Paste the commands you ran that produced the error.)
   
   1.
   2.
   
   ## What have you tried to solve it?
   
   1.
   2.
   
   ## Environment
   
   ***We recommend using our script for collecting the diagnostic information 
with the following command***
   `curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python3`
   
   <details>
   <summary>Environment Information</summary>
   
   ```
   # Paste the diagnose.py command output here
   ```
   
   </details>
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



