luhenry commented on pull request #30810:
URL: https://github.com/apache/spark/pull/30810#issuecomment-748687656


   I updated the PR to depend on the package `dev.ludovic.vectorizedblas-blas` 
instead, as it makes it much easier for me to evolve the algorithms 
independently of Spark and simplifies the integration with the build system. 
Let me know if that compromise is good for you.
   
   As for the latest results, it's looking much better:
   
   ```
   [info] f2jBLAS    = com.github.fommil.netlib.F2jBLAS
   [info] vectorBLAS = dev.ludovic.blas.VectorizedBLAS
   [info] 
   [info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  45             45           1        223.7           4.5       1.0X
   [info] vector                                               26             26           3        389.4           2.6       1.7X
   [info] 
   [info] Unknown processor
   [info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  53             53           1        190.4           5.3       1.0X
   [info] vector                                               16             17           1        607.4           1.6       3.2X
   [info] 
   [info] Unknown processor
   [info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  73             73           1        137.3           7.3       1.0X
   [info] vector                                               35             36           3        282.8           3.5       2.1X
   [info] 
   [info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  36             36           1        279.5           3.6       1.0X
   [info] vector                                               21             21           0        481.4           2.1       1.7X
   [info] 
   [info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                  35             36           0          0.0       35458.6       1.0X
   [info] vector                                               23             23           0          0.0       23415.6       1.5X
   [info]
   [info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   [info] ------------------------------------------------------------------------------------------------------------------------
   [info] f2j                                                 275            276           1          0.4        2753.6       1.0X
   [info] vector                                              149            166         169          0.7        1488.5       1.8X
   ```
   
   For much more detailed performance numbers on x86 (w/ AVX-2), I'm currently 
running a JMH benchmark covering more cases. I'll link to it as soon as it 
finishes (by tomorrow morning CET).

