Baunsgaard commented on pull request #1127:
URL: https://github.com/apache/systemds/pull/1127#issuecomment-748006108


   
   >     * For dense transpose operations, we have two significant parts:
   >       allocating the dense output, and the multi-threaded transpose
   >       operation. On a box with 112 vcores, the allocation is 10x more
   >       expensive than the actual transpose operation. The conclusion would
   >       be to transpose in place wherever possible. For example, compression
   >       is injected directly after the persistent read, which makes it safe
   >       to transpose in place by default for both local and distributed
   >       compression. This approach would not just improve compression times
   >       but also eliminate the unnecessary temporary memory requirements.
   >       I leave this up to you though.
   
   I will look at this! :+1: 
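
   For reference, a minimal sketch of what an in-place transpose of a non-square dense matrix can look like. It assumes a row-major matrix stored in one flat list and uses plain cycle-following; the function name `transpose_in_place` is hypothetical, and this is not the SystemDS implementation, only an illustration of why no second dense buffer needs to be allocated.

```python
def transpose_in_place(data, rows, cols):
    """Transpose a rows x cols row-major matrix stored in the flat list `data`.

    Illustrative sketch only: single-threaded, O(1) extra memory, and not
    tuned for cache behaviour. Returns the new shape (cols, rows).
    """
    n = rows * cols
    if len(data) != n:
        raise ValueError("data length does not match rows * cols")
    last = n - 1

    def dest(i):
        # The element at flat index i = r * cols + c moves to c * rows + r,
        # which equals (i * rows) mod (n - 1) for every i except the last.
        return last if i == last else (i * rows) % last

    # Indices 0 and n - 1 never move, so only 1 .. n - 2 need to be checked.
    for start in range(1, last):
        # Process each permutation cycle exactly once, starting from its
        # smallest index; skip the cycle if a smaller index belongs to it.
        j = dest(start)
        while j > start:
            j = dest(j)
        if j != start:
            continue

        # Rotate the values along the cycle, carrying one element at a time.
        carried = data[start]
        j = dest(start)
        while j != start:
            data[j], carried = carried, data[j]
            j = dest(j)
        data[start] = carried

    return cols, rows


if __name__ == "__main__":
    m = [1, 2, 3,
         4, 5, 6]                      # 2 x 3, row-major
    shape = transpose_in_place(m, 2, 3)
    print(shape, m)
```

   The cycle-leader check trades extra index arithmetic for not keeping a visited bitmap, so the only temporary is the single carried element. Whether that trade-off beats allocating the output depends on the matrix shape and the allocator, which is what the measurement quoted above would need to confirm.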
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

