jiajinyu opened a new issue #11462: throughput of sparse linear classification is small with small batch size
URL: https://github.com/apache/incubator-mxnet/issues/11462
 
 
   ## Description
   For small batch sizes, sparse linear classification training uses all CPU cores, but throughput is low.
   
   
   ## Environment info (Required)
   Machine used: AWS c5.9xlarge instance
   Steps to reproduce:
   1. pip2 install mxnet-mkl
   2. git clone the incubator-mxnet repository
   3. in the directory `incubator-mxnet/example/sparse/linear_classification`, run `python2 train.py --batch-size 1`
   
   We see a throughput of around 600 samples/sec. I tried setting `export OMP_NUM_THREADS` to half the number of vCPUs. After setting this, CPU usage drops (only half of the cores are used), but throughput does not change. This is even the case when I set `OMP_NUM_THREADS=1`.
   
   
   ## Question
   How should I set things up to increase the throughput of linear classification training on a single machine with multiple cores? Or does MXNet currently not optimize for this case (i.e. it does not use techniques like Hogwild!)? Thanks in advance.
   
   with @lcytzk
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
