phaniarnab opened a new pull request, #1676:
URL: https://github.com/apache/systemds/pull/1676
This patch improves the performance of countDistinctApprox() row/col
aggregation by replacing matrix slicing with direct operations on the
input matrix. The impact is largest in local CP execution mode, as
some simple experiments show:
(each number is the average over 3 runs)
1. row aggregation
(A) dense: 10000x1000 with sparsity=0.9
1.198s with slicing, 0.874s without slicing - a 27% improvement
(B) sparse: 10000x1000 with sparsity=0.1
0.528s with slicing, 0.512s without slicing - a 3% improvement
As expected, the larger and denser the input matrix,
the larger the performance improvement.
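
To make the row-aggregation comparison above concrete, here is a minimal
sketch of the two access patterns. It is not the SystemDS implementation:
the class and method names are hypothetical, the matrix is assumed to be a
dense row-major double[] array, and an exact HashSet count stands in for
the approximate estimator used by countDistinctApprox().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not SystemDS code.
final class RowDistinctSketch {

    // Baseline: copy each row out of the row-major array (the "slice"),
    // then count the distinct values in the copy.
    static int[] withSlicing(double[] data, int rows, int cols) {
        int[] counts = new int[rows];
        for (int r = 0; r < rows; r++) {
            double[] slice = Arrays.copyOfRange(data, r * cols, (r + 1) * cols);
            Set<Double> seen = new HashSet<>();
            for (double v : slice) {
                seen.add(v);
            }
            counts[r] = seen.size();
        }
        return counts;
    }

    // Direct variant: read the cells of each row in place, without an
    // intermediate copy, and reuse one set across rows.
    static int[] withoutSlicing(double[] data, int rows, int cols) {
        int[] counts = new int[rows];
        Set<Double> seen = new HashSet<>();
        for (int r = 0; r < rows; r++) {
            seen.clear();
            int start = r * cols;
            for (int c = 0; c < cols; c++) {
                seen.add(data[start + c]);
            }
            counts[r] = seen.size();
        }
        return counts;
    }

    public static void main(String[] args) {
        // 3x4 matrix stored row by row.
        double[] data = {1, 2, 2, 0,   5, 5, 5, 5,   0, 7, 0, 8};
        System.out.println(Arrays.toString(withSlicing(data, 3, 4)));
        System.out.println(Arrays.toString(withoutSlicing(data, 3, 4)));
    }
}
```

Both methods return the same counts; the second one avoids allocating
and filling one temporary array per row, which is the kind of overhead
that grows with the number of cells copied.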
2. col aggregation
(A) dense: 1000x10000 with sparsity=0.9
1.186s with slicing, 1.036s without slicing - a 13% improvement
(B) sparse: 1000x10000 with sparsity=0.1
1.272s with slicing, 0.647s without slicing - a 49% improvement
In this case, the sparser the input matrix, the larger the performance
improvement. This is a result of using a hash map M in the
implementation: as the RxC input matrix becomes denser, the size of
M's key set approaches C, and performance approaches that of the
baseline, which uses slicing.
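
The role of the hash map M can be sketched as follows. This is one
possible reading of the description above, not the code in this patch:
the class name, the simplified sparse-row layout (colIdx/vals), and the
choice to key M by column index and to count only non-zero values are all
assumptions made for illustration, and an exact HashSet again stands in
for the approximate estimator.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not SystemDS code.
final class ColDistinctSketch {

    // colIdx[r] and vals[r] hold the column indexes and values of the
    // non-zero cells of row r (a simplified sparse-row layout).
    // The returned map M has one entry per column that contains at
    // least one non-zero, so its key set never exceeds C entries.
    static Map<Integer, Set<Double>> collect(int[][] colIdx, double[][] vals) {
        Map<Integer, Set<Double>> m = new HashMap<>();
        for (int r = 0; r < colIdx.length; r++) {
            for (int k = 0; k < colIdx[r].length; k++) {
                m.computeIfAbsent(colIdx[r][k], c -> new HashSet<>())
                 .add(vals[r][k]);
            }
        }
        return m;
    }

    // Distinct non-zero values per column; columns absent from M stay 0.
    static int[] countDistinctNonZeros(int[][] colIdx, double[][] vals, int cols) {
        Map<Integer, Set<Double>> m = collect(colIdx, vals);
        int[] counts = new int[cols];
        for (Map.Entry<Integer, Set<Double>> e : m.entrySet()) {
            counts[e.getKey()] = e.getValue().size();
        }
        return counts;
    }

    public static void main(String[] args) {
        // 3x5 matrix with 5 non-zero cells.
        int[][] colIdx = {{0, 3}, {3}, {0, 4}};
        double[][] vals = {{1.0, 2.0}, {2.0}, {4.0, 9.0}};
        System.out.println(collect(colIdx, vals).keySet().size());
        System.out.println(Arrays.toString(countDistinctNonZeros(colIdx, vals, 5)));
    }
}
```

In this toy input only 3 of the 5 columns appear in M. With a denser
matrix every column would eventually get an entry, so the work done per
column would approach that of visiting each column slice in turn.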