mboehm7 commented on pull request #1009:
URL: https://github.com/apache/systemds/pull/1009#issuecomment-671578671
LGTM - thanks for the patch @Shafaq-Siddiqi, this is a great addition. I
only made some minor modifications related to parser error handling (only
frame-frame supported), the integration as `binaryOperations` (to make it
consistent with matrix blocks), modified CP/Spark instructions to fix
systematic typos ("comparision") and avoid unnecessary code redundancy, fixed
the handling of RDD/broadcast dependencies (to avoid too eager cleanup), and
fixed the formatting of tests (position of java annotations).
In a subsequent PR, please first run an experiment to measure runtime
performance. Then, reimplement the frame block comparison operation to directly
allocate and update `BooleanArray`s instead over going through the indirection
of string arrays. This has the advantage of avoiding allocation, avoiding
unnecessary string serialization and parsing, and avoiding cache-unfriendly
operations (row vs column layout). Finally, measure the resulting performance
improvement (e.g., for 100 binary frame comparisons at script level).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]