mboehm7 commented on PR #2301:
URL: https://github.com/apache/systemds/pull/2301#issuecomment-3146479980

   Sorry for the delay, and thanks for getting started on this task @j143. 
   
   1. Integration: Instead of integrating this write into the 
VariableCPInstruction (where non-binary formats are written), I would recommend 
integrating the OOC write into the individual writers (with support for binary 
only) via a new method that is called if a MatrixObject indeed has an 
existing OOC stream of blocks.
   
   2. Core Write Logic: In order to yield the same output files as a normal 
(single-threaded) write, I recommend not creating part files for every single 
block, but streaming all these blocks into a single file. Once you extend the 
binary write, you will see that this approach is even easier and results in 
files that can be processed much faster (it avoids creating too many files, 
which can be an issue on distributed file systems).
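
To illustrate both points, here is a minimal, self-contained sketch of the idea. It is not SystemDS code: `OocWriteSketch`, `Block`, `END_OF_STREAM`, and the record layout are hypothetical stand-ins, and a plain `BlockingQueue` stands in for the OOC stream of blocks. The writer entry point takes the streaming path only if a stream exists, and that path appends every block to one output file instead of creating one part file per block.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch; none of these names are SystemDS APIs.
final class OocWriteSketch {
    /** One matrix block: block indexes plus dense row-major values. */
    static final class Block {
        final long rowIndex, colIndex;
        final int rows, cols;
        final double[] values;

        Block(long rowIndex, long colIndex, int rows, int cols, double[] values) {
            this.rowIndex = rowIndex;
            this.colIndex = colIndex;
            this.rows = rows;
            this.cols = cols;
            this.values = values;
        }
    }

    /** Sentinel the producer enqueues to signal the end of the stream. */
    static final Block END_OF_STREAM = new Block(-1, -1, 0, 0, new double[0]);

    /** Writer entry point: use the streaming path only if a stream exists. */
    static long write(BlockingQueue<Block> oocStream, List<Block> inMemory, Path file)
            throws IOException, InterruptedException {
        if (oocStream != null)
            return writeFromStream(oocStream, file);
        try (DataOutputStream out = open(file)) {
            for (Block b : inMemory)
                writeBlock(out, b);
        }
        return inMemory.size();
    }

    /** Drains the stream into a single file and returns the block count. */
    static long writeFromStream(BlockingQueue<Block> stream, Path file)
            throws IOException, InterruptedException {
        long count = 0;
        try (DataOutputStream out = open(file)) {
            Block b;
            // take() blocks until the producer provides the next block
            while ((b = stream.take()) != END_OF_STREAM) {
                writeBlock(out, b);
                count++;
            }
        }
        return count;
    }

    private static DataOutputStream open(Path file) throws IOException {
        return new DataOutputStream(new BufferedOutputStream(Files.newOutputStream(file)));
    }

    /** Appends one (indexes, dimensions, values) record to the open file. */
    private static void writeBlock(DataOutputStream out, Block b) throws IOException {
        out.writeLong(b.rowIndex);
        out.writeLong(b.colIndex);
        out.writeInt(b.rows);
        out.writeInt(b.cols);
        for (double v : b.values)
            out.writeDouble(v);
    }
}
```

Because both paths share `writeBlock`, the streamed output is byte-identical to the in-memory output for the same block order, which is the property the single-file recommendation is after. The real binary writer would use its own file format rather than this ad hoc record layout.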

