mboehm7 commented on pull request #1362: URL: https://github.com/apache/systemds/pull/1362#issuecomment-898735466
LGTM - great catch @ywcb00 and thanks for fixing this performance issue. For local CP and distributed Spark operations we actually decide based on estimated sparsity if we do a sparse hash aggregation or an accumulation into a dense matrix. Likely the federated ctable operations was implemented for correctness so far. In your experiments, however, please test both sparse ctable scenario (like creating permutation matrices) and dense scenarios (like confusion matrices with moderate number of classes). In any case, the revised change is much better, because the previous commit affected the defaults, partially without considering the value, and unintended sparse accumulation can be vastly slower due to binary search and potential shifting. But you're right reevaluated these defaults (e.g., with our revived perftest suite). This reminds me, if feel it's beneficial, please add your federated algorithm tests to this perftest so we can automatically run them before releases. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
