jiajinyu edited a comment on issue #12958: Improve dot(csr, rsp) on CPU by 10x 
URL: https://github.com/apache/incubator-mxnet/pull/12958#issuecomment-438274161
 
 
   I updated the benchmark script and compared perf across different rsp and 
csr densities. The conclusion is that the perf gain diminishes as the csr matrix 
gets denser or when the rsp is very sparse. My 2 cents is that when the csr gets 
denser, the binary search doesn't matter because of the cache locality of `data_r`. 
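
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what dot(csr, rsp) computes, with the rsp row lookup done by binary search. It is not the kernel from this PR: the function name, the argument layout, and the use of `bisect` are assumptions made for illustration only.

```python
from bisect import bisect_left


def dot_csr_rsp(data, indices, indptr, rsp_rows, rsp_data, out_cols):
    """Hypothetical reference: dense result of csr (lhs) times row-sparse (rhs).

    data, indices, indptr: the usual CSR arrays of the lhs.
    rsp_rows: sorted ids of the rows that are stored in the rhs.
    rsp_data: the stored rhs rows, one list of length out_cols per id.
    """
    num_rows = len(indptr) - 1
    out = [[0.0] * out_cols for _ in range(num_rows)]
    for i in range(num_rows):
        for k in range(indptr[i], indptr[i + 1]):
            col = indices[k]
            # Binary search for this column id among the stored rsp rows.
            pos = bisect_left(rsp_rows, col)
            if pos < len(rsp_rows) and rsp_rows[pos] == col:
                row = rsp_data[pos]
                for j in range(out_cols):
                    out[i][j] += data[k] * row[j]
            # Otherwise the rhs row is all zeros and contributes nothing.
    return out
```

Each nonzero in the csr triggers one lookup, so the number of lookups grows with csr density while the cost of a single lookup grows with the number of stored rsp rows.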
   
   
   ## comparison
   
   For hash dim = 2^28 (roughly 0.27G) and csr density 0.0001, if we interpret 
each row in the csr as a training example, then we have about 26k features per 
example. The perf comparison (using an Intel Xeon E5-2650 with 
OMP_NUM_THREADS=24, batch = 128) is the following:
   
   | rsp density | old run in secs | new run in secs |
   | --- | --- | --- |
   | 0.01 | 0.3695 | 0.2037 |
   | 0.05 | 1.4380 | 0.4995 |
   | 0.15 | 4.7707 | 0.9360 |
   | 0.25 | 6.3904 | 1.5058 |
   | 0.50 | 11.4553 | 1.7401 |
   | 0.75 | 14.7599 | 2.9364 |
   
   For csr density 0.001, we have about 260k features per example, and the perf 
comparison is the following:
   
   | rsp density | old run in secs | new run in secs |
   | --- | --- | --- |
   | 0.01 | 0.6384 | 0.9936 |
   | 0.05 | 1.9556 | 2.5660 |
   | 0.15 | 4.6588 | 4.4983 |
   | 0.25 | 7.2088 | 6.0316 |
   | 0.50 | 11.1837 | 8.4714 |
   | 0.75 | 14.7599 | 8.1321 |
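
To make the two tables easier to compare, the old/new ratio can be computed per row. The sketch below only restates the timings listed above; the variable and function names are made up for this example.

```python
# Timings in seconds copied from the two tables, as (old, new) per rsp density.
timings = {
    0.0001: {  # csr density
        0.01: (0.3695, 0.2037),
        0.05: (1.4380, 0.4995),
        0.15: (4.7707, 0.9360),
        0.25: (6.3904, 1.5058),
        0.50: (11.4553, 1.7401),
        0.75: (14.7599, 2.9364),
    },
    0.001: {
        0.01: (0.6384, 0.9936),
        0.05: (1.9556, 2.5660),
        0.15: (4.6588, 4.4983),
        0.25: (7.2088, 6.0316),
        0.50: (11.1837, 8.4714),
        0.75: (14.7599, 8.1321),
    },
}


def speedups(rows):
    """Return {rsp_density: old / new}; a value above 1.0 means the new run is faster."""
    return {rsp: old / new for rsp, (old, new) in rows.items()}


for csr_density, rows in timings.items():
    for rsp, ratio in speedups(rows).items():
        print(f"csr={csr_density} rsp={rsp:.2f} speedup={ratio:.2f}x")
```

A ratio below 1.0 marks a row where the new run is slower, which happens for the two sparsest rsp settings at csr density 0.001.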
   
   The perf is worse when csr density is 1%, which means we would have 2.6M 
features per example for dim = 2^28, but I'm not sure that makes sense.
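
The features-per-example figures quoted in this comment are just the hash dimension times the csr density. A quick check, assuming every row has the same expected number of nonzeros (the function name is hypothetical):

```python
HASH_DIM = 2 ** 28  # 268,435,456 columns, roughly 0.27G


def features_per_example(csr_density, dim=HASH_DIM):
    """Expected number of nonzero features in one csr row."""
    return dim * csr_density


for density in (0.0001, 0.001, 0.01):
    print(f"csr density {density}: ~{features_per_example(density):,.0f} features per example")
```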
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
