mboehm7 commented on PR #2301: URL: https://github.com/apache/systemds/pull/2301#issuecomment-3146479980
Sorry for the delay, and thanks for getting started on this task @j143.

1. Integration: Instead of integrating this write into the VariableCPInstruction (where non-binary formats are written), I would recommend integrating the OOC write into the individual writers (with support for binary only) via a new method that is called if a MatrixObject indeed has an existing OOC stream of blocks.

2. Core Write Logic: In order to yield the same output files as a normal (single-threaded) write, I recommend not creating part files for every single block, but streaming all these blocks into a single file. Once you extend the binary write, you will see that this approach is even easier and results in files that can be processed much faster (it avoids creating too many files, which can be an issue on distributed file systems).
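To illustrate point 2, here is a minimal, self-contained sketch of draining a stream of blocks into a single output file instead of one part file per block. It deliberately does not use the SystemDS writer classes or the actual binary block format: the `BlockingQueue<double[]>` stream, the `END_OF_STREAM` marker, the length-prefixed layout, and all names are hypothetical stand-ins, so please treat it as an outline of the control flow only.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class OocSingleFileWriteSketch {
    // Hypothetical end-of-stream marker, compared by reference so that a
    // genuinely empty block (a different array instance) is still written.
    static final double[] END_OF_STREAM = new double[0];

    // Drains all blocks from the stream into ONE file and returns the
    // number of blocks written. The layout (int length, then the values)
    // is an assumption for illustration, not the SystemDS binary format.
    static int writeStreamToSingleFile(BlockingQueue<double[]> stream, Path out)
            throws IOException, InterruptedException {
        int numBlocks = 0;
        // A single writer is opened once and reused for every block.
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(out)))) {
            double[] block;
            while ((block = stream.take()) != END_OF_STREAM) {
                dos.writeInt(block.length);
                for (double v : block)
                    dos.writeDouble(v);
                numBlocks++;
            }
        }
        return numBlocks;
    }

    public static void main(String[] args) throws Exception {
        BlockingQueue<double[]> stream = new LinkedBlockingQueue<>();
        stream.put(new double[] {1, 2, 3});
        stream.put(new double[] {4, 5});
        stream.put(END_OF_STREAM);

        Path out = Files.createTempFile("ooc-sketch", ".bin");
        int n = writeStreamToSingleFile(stream, out);
        System.out.println(n + " blocks, " + Files.size(out) + " bytes");
        Files.delete(out);
    }
}
```

The point of the sketch is that the writer is created once outside the loop, so the number of output files stays independent of the number of blocks in the stream.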