jhaberl opened a new pull request #1505: URL: https://github.com/apache/systemds/pull/1505
Improve the linearization order of LOPs in the DAG to reduce the memory consumption of intermediates. The existing solution is a depth-first algorithm. We modify it to execute LOPs that produce smaller intermediates first, in order to reduce the maximum memory consumption. This is achieved with a bottom-up approach that considers the memory estimates (OptimizerUtils.estimateSizeExactSparsity) of parent LOPs.

**Results**

We compared the implementations using this script:

```
totalResult = matrix(0, rows=1000, cols=1)
for (i in 1:100) {
  A = rand(rows=1000, cols=1000)
  b = rand(rows=1000, cols=1000)
  C = rand(rows=1000, cols=1000)
  d = rand(rows=1000, cols=1000)
  result = (A + b) %*% rowMaxs(C + d)
  totalResult = totalResult + result
}
print(toString(totalResult))
```

In the existing implementation, `A + b` is executed first, so its 1000x1000 result is held as an intermediate while `rowMaxs(C + d)` is computed. In our solution, `rowMaxs(C + d)` is executed first, so only a 1000x1 vector is held as an intermediate while `A + b` is computed.

For the measurements below, we forced the garbage collector to run after every instruction. With JMLC_MEM_STATISTICS enabled, we get these results:

**Max size of live objects:**

- Depth-first (existing): 30.526 MB (5 total objects)
- Breadth-first: 38.155 MB (6 total objects)
- Min-intermediate (ours): 22.904 MB (5 total objects)

This is a graph of memory consumption over time for the test script:

![mem_consumption](https://user-images.githubusercontent.com/95644983/149497162-bf954b3c-2309-4069-a671-90f9b5feae14.png)

For a more detailed view, here is the same graph restricted to the first 45 instructions, which cover two loop iterations:

![mem_consumption_parts](https://user-images.githubusercontent.com/95644983/149497177-9d0acfe8-9478-4b6b-a3ec-7c1a2aa608c8.png)
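To illustrate why the order matters, here is a minimal Python sketch on a toy DAG for `(A + b) %*% rowMaxs(C + d)`. It is not the code in this PR: the names `DAG`, `linearize`, and `peak_live_cells` are hypothetical, sizes are plain dense cell counts rather than `OptimizerUtils.estimateSizeExactSparsity` values, and the "smallest first" rule simply sorts each node's inputs by their own output size, which is a simplification of the bottom-up approach described above.

```python
# Toy DAG: node -> (inputs, estimated output size in cells).
M, V = 1000 * 1000, 1000 * 1  # 1000x1000 matrix, 1000x1 vector
DAG = {
    "A": ([], M), "b": ([], M), "C": ([], M), "d": ([], M),
    "add1": (["A", "b"], M),      # A + b
    "add2": (["C", "d"], M),      # C + d
    "rmax": (["add2"], V),        # rowMaxs(C + d)
    "mm": (["add1", "rmax"], V),  # (A + b) %*% rowMaxs(C + d)
}

def linearize(dag, root, smallest_first):
    """Post-order traversal; optionally visit smaller inputs first."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        inputs, _ = dag[node]
        if smallest_first:
            # Stable sort, so equally sized inputs keep their original order.
            inputs = sorted(inputs, key=lambda n: dag[n][1])
        for inp in inputs:
            visit(inp)
        order.append(node)

    visit(root)
    return order

def peak_live_cells(dag, order):
    """Peak number of live cells if an output is freed after its last use."""
    remaining = {n: 0 for n in dag}
    for inputs, _ in dag.values():
        for inp in inputs:
            remaining[inp] += 1
    live = peak = 0
    for node in order:
        inputs, size = dag[node]
        live += size                 # allocate this node's output
        peak = max(peak, live)
        for inp in inputs:           # release inputs with no consumers left
            remaining[inp] -= 1
            if remaining[inp] == 0:
                live -= dag[inp][1]
    return peak

depth_first = linearize(DAG, "mm", smallest_first=False)
min_first = linearize(DAG, "mm", smallest_first=True)
print(depth_first, peak_live_cells(DAG, depth_first))
print(min_first, peak_live_cells(DAG, min_first))
```

In this toy model the depth-first order peaks at four live 1000x1000 matrices, while the smallest-first order peaks at three matrices plus one vector. At 8 bytes per cell that is roughly 30.5 MiB versus 22.9 MiB, which is close to the measured values above, although the model ignores `totalResult` and any sparsity.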