NanYan1119 commented on issue #10696: [MXNET-366]Extend MXNet Distributed 
Training by AllReduce
URL: https://github.com/apache/incubator-mxnet/pull/10696#issuecomment-419826207
 
 
   I have a problem when testing the allreduce version.
   1) I cloned the master branch from 
https://github.com/threeleafzerg/incubator-mxnet.
   2) I compiled it with "make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 
USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1 USE_PROFILER=1 USE_DIST_KVSTORE=1 
USE_ALLREDUCE_DIST_KVSTORE=1 MPI_ROOT=/usr/local/openmpi". 
   3) But when I run the script 
~/incubator-mxnet/tests/nightly/dist_allreduce_sync_kvstore.py, it ends with an 
error:
   
   pure virtual method called
   terminate called without an active exception
   *** Process received signal ***
   Signal: Aborted (6)
   Signal code:  (-6)
   
   It seems the error occurs while the kvstore is being destroyed. Can anyone 
help me? @threeleafzerg 
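
   To check whether the abort really happens while the kvstore is being torn 
down, the script could be rerun with Python's built-in faulthandler enabled, 
which dumps the Python traceback when a fatal signal such as SIGABRT arrives. 
The sketch below is only an illustration: the helper names describe_exit and 
run_with_faulthandler are hypothetical, and it is an assumption that the 
traceback stays readable when the process is launched under MPI.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        sig = signal.Signals(-returncode)
        return f"killed by {sig.name} ({sig.value})"
    return f"exited with status {returncode}"


def run_with_faulthandler(script_args):
    """Run the interpreter with faulthandler enabled and report how it ended.

    script_args is the script path plus its arguments (hypothetical helper,
    not part of MXNet).
    """
    # -X faulthandler makes CPython print the active Python traceback to
    # stderr when the process receives a fatal signal such as SIGABRT.
    cmd = [sys.executable, "-X", "faulthandler", *script_args]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return describe_exit(result.returncode), result.stderr
```

   Calling run_with_faulthandler with the path of 
dist_allreduce_sync_kvstore.py returns the exit description and the captured 
stderr. If the traceback points at interpreter shutdown rather than at a line 
in the test body, that would be consistent with the failure being in the 
kvstore destructor.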

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
