jhaberl opened a new pull request #1505:
URL: https://github.com/apache/systemds/pull/1505


   Improve linearization order of LOPs in the DAG to reduce memory consumption 
of intermediates.
   
   The existing solution is a depth-first algorithm. We modify this to execute 
LOPs that result in smaller intermediates first in order to reduce maximum 
memory consumption. This is achieved using a bottom-up approach considering 
memory estimates (OptimizerUtils.estimateSizeExactSparsity) of parent LOPs.
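
   The ordering idea can be illustrated with a minimal sketch. This is not the 
code in this PR: `Lop`, `est_bytes` and `linearize_min_first` are hypothetical 
names, and the size estimates are plain dense sizes (8 bytes per cell) rather 
than values from OptimizerUtils. The sketch emits each node after its inputs 
and visits the input with the smaller estimated output first.

```python
from dataclasses import dataclass, field


@dataclass
class Lop:
    """Hypothetical stand-in for a LOP: name, estimated output size, inputs."""
    name: str
    est_bytes: int
    inputs: list = field(default_factory=list)


def linearize_min_first(root):
    """Post-order traversal that visits smaller-output inputs first."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return  # shared sub-DAGs are emitted only once
        seen.add(id(node))
        # sorted() is stable, so inputs with equal estimates keep their order
        for child in sorted(node.inputs, key=lambda n: n.est_bytes):
            visit(child)
        order.append(node)

    visit(root)
    return order


MATRIX = 1000 * 1000 * 8  # dense 1000x1000 doubles (assumed estimate)
VECTOR = 1000 * 1 * 8     # dense 1000x1 doubles (assumed estimate)

A, b, C, d = (Lop(n, MATRIX) for n in ("A", "b", "C", "d"))
add_ab = Lop("A+b", MATRIX, [A, b])
add_cd = Lop("C+d", MATRIX, [C, d])
row_maxs = Lop("rowMaxs", VECTOR, [add_cd])
matmult = Lop("%*%", VECTOR, [add_ab, row_maxs])

order_names = [lop.name for lop in linearize_min_first(matmult)]
print(order_names)
```

   For this small DAG the sketch schedules `C`, `d`, `C+d` and `rowMaxs` before 
`A`, `b` and `A+b`, because the `rowMaxs` input of `%*%` has the smaller 
estimated output.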
   
   **Results**
   We've compared the implementations using this script:
   ```
   totalResult = matrix(0, rows=1000, cols=1)
   for (i in 1:100) {
       A = rand(rows=1000, cols=1000)
       b = rand(rows=1000, cols=1000)
       C = rand(rows=1000, cols=1000)
       d = rand(rows=1000, cols=1000)
       result = (A + b) %*% rowMaxs(C + d)
       totalResult = totalResult + result
   }
   print(toString(totalResult))
   ```
   With the existing implementation, `A + b` is executed first, so its 
1000x1000 result is held as an intermediate while `rowMaxs(C + d)` is computed. 
In our solution, `rowMaxs` is executed first, so only a 1000x1 vector is held 
as an intermediate while `A + b` is computed.
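
   The effect of the two orders on peak memory can be checked with a small 
simulation. It is a rough sketch under stated assumptions: dense storage at 8 
bytes per cell with no object headers, every operand freed right after its 
last use, and the loop-carried `totalResult` ignored. The names `OPS` and 
`peak_bytes` are made up for illustration and do not come from SystemDS.

```python
MATRIX = 1000 * 1000 * 8  # assumed dense size of a 1000x1000 matrix, in bytes
VECTOR = 1000 * 1 * 8     # assumed dense size of a 1000x1 vector, in bytes

# operator name -> (output size in bytes, input names) for one loop body
OPS = {
    "A": (MATRIX, []), "b": (MATRIX, []),
    "C": (MATRIX, []), "d": (MATRIX, []),
    "A+b": (MATRIX, ["A", "b"]),
    "C+d": (MATRIX, ["C", "d"]),
    "rowMaxs": (VECTOR, ["C+d"]),
    "%*%": (VECTOR, ["A+b", "rowMaxs"]),
}


def peak_bytes(order):
    """Return the maximum number of live bytes while executing `order`."""
    # Record the last step at which each value is read.
    last_use = {}
    for step, name in enumerate(order):
        for inp in OPS[name][1]:
            last_use[inp] = step

    live, peak = {}, 0
    for step, name in enumerate(order):
        size, inputs = OPS[name]
        live[name] = size  # inputs and output are live at the same time
        peak = max(peak, sum(live.values()))
        for inp in inputs:
            if last_use[inp] == step:
                live.pop(inp, None)  # free operands after their last use
    return peak


DEPTH_FIRST = ["A", "b", "A+b", "C", "d", "C+d", "rowMaxs", "%*%"]
MIN_FIRST = ["C", "d", "C+d", "rowMaxs", "A", "b", "A+b", "%*%"]

print(peak_bytes(DEPTH_FIRST), peak_bytes(MIN_FIRST))
```

   Under these assumptions the depth-first order peaks with four 1000x1000 
matrices live (`A+b`, `C`, `d`, `C+d`), while the min-intermediate order peaks 
with three matrices plus the `rowMaxs` vector.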
   
   For the following data, we've forced the garbage collector to run after 
every instruction.
   
   When enabling JMLC_MEM_STATISTICS we get these results:
   **Max size of live objects:**
   Depth-first (existing): 30.526 MB (5 total objects)
   Breadth-first: 38.155 MB (6 total objects)
   Min-intermediate (ours): 22.904 MB (5 total objects)
   
   This is a graph of memory consumption over time for the test script:
   
![mem_consumption](https://user-images.githubusercontent.com/95644983/149497162-bf954b3c-2309-4069-a671-90f9b5feae14.png)
   
   For a more detailed view, here is the same graph restricted to the first 45 
instructions, which cover two loop iterations.
   
![mem_consumption_parts](https://user-images.githubusercontent.com/95644983/149497177-9d0acfe8-9478-4b6b-a3ec-7c1a2aa608c8.png)
   
   



