ThomasDelteil commented on issue #11462: throughput of sparse linear classification is small with small batch size
URL: https://github.com/apache/incubator-mxnet/issues/11462#issuecomment-402871704

@jiajinyu it does seem that for small batch sizes the CPU does a lot of work but yields low throughput. I ran your example and saw all my cores in use, yet only ~400 samples/sec. However, when I increased the batch size to 128 I saw ~50k samples/sec, an almost linear scale-up:

```
2018-07-05 22:27:54,570 Epoch[0] Batch [3400] Speed: 50624.32 samples/sec nll-loss=0.402102
2018-07-05 22:27:54,819 Epoch[0] Batch [3500] Speed: 51431.02 samples/sec nll-loss=0.401584
2018-07-05 22:27:55,081 Epoch[0] Batch [3600] Speed: 48777.71 samples/sec nll-loss=0.399383
```

I am not an expert in MKL-DNN or the sparse API; maybe @eric-haibin-lin or @zheng-da can shed more light on the issue. But based on this experiment, I would say that MKL-DNN parallelizes many operations at the batch level, which means a batch of size 1 and a batch of size 128 take roughly the same amount of time. So the best way to increase throughput is to increase your batch size.
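To illustrate the point above, here is a toy cost model (not measured from MXNet; the constants below are hypothetical, chosen only so the numbers land near the ~400 and ~50k samples/sec observed in this thread): if each batch pays a roughly fixed parallelization overhead, throughput grows almost linearly with batch size until per-sample work starts to dominate.

```python
# Hypothetical constants for illustration -- not profiled from MXNet/MKL-DNN.
FIXED_OVERHEAD_S = 2.5e-3   # assumed per-batch cost (thread dispatch, sync, etc.)
PER_SAMPLE_S = 1e-6         # assumed per-sample compute cost

def throughput(batch_size):
    """Samples/sec under a constant per-batch overhead model."""
    return batch_size / (FIXED_OVERHEAD_S + batch_size * PER_SAMPLE_S)

for bs in (1, 8, 128):
    print(f"batch={bs:4d} -> {throughput(bs):8.0f} samples/sec")
```

Under these assumed constants, batch size 1 gives roughly 400 samples/sec and batch size 128 roughly 48k, matching the shape of the numbers reported above: scale-up is near-linear while the fixed overhead dominates.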
