BACtaki opened a new pull request, #1650:
URL: https://github.com/apache/systemds/pull/1650

   JIRA: https://issues.apache.org/jira/browse/SYSTEMDS-3390
   
   This patch improves the performance of countDistinctApprox() row/col
   aggregation by replacing matrix slicing with direct operations on the
   input matrix. The impact is greatest in CP execution mode, given the
   smaller input size (max 1000x1000). Some simple experiments demonstrate
   this:
   
   (numbers are averages over 3 runs)
   1. row aggregation
       (A) dense: 10000x1000 with sparsity=0.9
       1.198s with slicing, 0.874s without slicing - a 27% improvement
   
       (B) sparse: 10000x1000 with sparsity=0.1
       0.528s with slicing, 0.512s without slicing - a 3% improvement
   
   As expected, the larger and denser the input matrix, the larger the
   performance improvement.
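
   The difference between the two strategies for row aggregation can be sketched as follows. This is a minimal illustration only, not the code in this patch: it assumes a dense matrix held as a `double[][]`, and it swaps in exact counting with a `HashSet` for the approximate estimator that countDistinctApprox() uses. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: exact distinct counts per row, with and without slicing.
public class RowDistinctSketch {

    // Baseline: copy ("slice") each row into a fresh array before counting.
    public static int[] countWithSlicing(double[][] m) {
        int[] out = new int[m.length];
        for (int r = 0; r < m.length; r++) {
            double[] slice = Arrays.copyOfRange(m[r], 0, m[r].length);
            Set<Double> seen = new HashSet<>();
            for (double v : slice) {
                seen.add(v);
            }
            out[r] = seen.size();
        }
        return out;
    }

    // Direct: read values straight from the input, with no intermediate copy,
    // and reuse one set across rows.
    public static int[] countDirect(double[][] m) {
        int[] out = new int[m.length];
        Set<Double> seen = new HashSet<>();
        for (int r = 0; r < m.length; r++) {
            seen.clear();
            for (int c = 0; c < m[r].length; c++) {
                seen.add(m[r][c]);
            }
            out[r] = seen.size();
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = { {1, 2, 2, 0}, {3, 3, 3, 3}, {0, 5, 6, 7} };
        System.out.println(Arrays.toString(countWithSlicing(m)));
        System.out.println(Arrays.toString(countDirect(m)));
    }
}
```

   Both methods return the same counts; the direct variant simply avoids allocating and filling one temporary array per row, which is the kind of overhead that grows with the size and density of the input.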
   
   2. col aggregation
       (A) dense: 10000x1000 with sparsity=0.9
       1.186s with slicing, 1.036s without slicing - a 13% improvement
   
       (B) sparse: 10000x1000 with sparsity=0.1
       1.272s with slicing, 0.647s without slicing - a 49% improvement
   
   In this case, the sparser the input matrix, the larger the performance
   improvement. This is a result of using a hash map M in the
   implementation: as the RxC input matrix becomes denser, the size of M's
   key set approaches C, and performance approaches that of the baseline,
   which uses slicing.
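
   To illustrate why the hash map helps most on sparse input, here is a minimal sketch of column aggregation over stored non-zero entries only. It is not the code in this patch: the coordinate-style input (`colIdx`, `vals`), the class name, and the method name are assumptions made for the example, and exact counting with a `HashSet` stands in for the approximate estimator.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: exact distinct counts per column from non-zero entries.
public class ColDistinctSketch {

    // colIdx[i] is the column of the i-th stored non-zero value vals[i].
    public static int[] countDistinctPerColumn(int rows, int cols,
                                               int[] colIdx, double[] vals) {
        // M maps a column index to the distinct non-zero values seen in it.
        // Only columns that hold at least one non-zero get a key.
        Map<Integer, Set<Double>> m = new HashMap<>();
        int[] nnzPerCol = new int[cols];
        for (int i = 0; i < vals.length; i++) {
            m.computeIfAbsent(colIdx[i], k -> new HashSet<>()).add(vals[i]);
            nnzPerCol[colIdx[i]]++;
        }

        int[] out = new int[cols];
        for (int c = 0; c < cols; c++) {
            Set<Double> s = m.get(c);
            int distinct = (s == null) ? 0 : s.size();
            // A column with fewer non-zeros than rows also contains zero.
            if (nnzPerCol[c] < rows) {
                distinct++;
            }
            out[c] = distinct;
        }
        return out;
    }

    public static void main(String[] args) {
        // 3x4 matrix: column 0 holds 1, 1, 2; column 2 holds a single 5.
        int[] colIdx = {0, 0, 0, 2};
        double[] vals = {1, 1, 2, 5};
        int[] counts = countDistinctPerColumn(3, 4, colIdx, vals);
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

   The work done is proportional to the number of stored non-zeros rather than to R x C, so a sparse matrix touches few keys. As the matrix fills in, every column gains a key and the cost converges on that of visiting each column in full.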
   
   
   


