janniklinde commented on PR #2376:
URL: https://github.com/apache/systemds/pull/2376#issuecomment-3641594972

   Thank you for the very good PR @Biranavan-Parameswaran and thanks for 
addressing the normalization issue for negative shifts in the roll operation. I 
particularly like that you added new correctness tests to increase test 
coverage and that you provide benchmarks to verify speedups for larger / dense 
matrices.
   
   I suggest that `RollOperationThreadSafetyTest` is moved to 
`test.java.org.apache.sysds.performance.matrix` (and renamed to 
`MatrixRollPerf`). As you correctly mentioned, warmups can distort experimental 
results. Similarly, the overhead of `JUnit` is also not ideal for performance 
tests. I would consider doing a performance test in the style of 
`MatrixAggregate` to get more accurate results.
   
   It would be interesting to see the performance of larger dense matrices with 
few numbers of columns (including column vectors). My guess is that performance 
drops in that scenario because you then do many array copies on small ranges. 
If the underlying row-major `DenseBlock` is `DenseBlockFP64`, I think we could 
benefit from replacing the row-by-row copy with one (or two in the overflow 
case) large arraycopy calls (similar to the `clen == 1` case in 
`copyDenseMtx(...)`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to