philipportner commented on a change in pull request #1147:
URL: https://github.com/apache/systemds/pull/1147#discussion_r559782106



##########
File path: scripts/staging/unified-memory-manager/umm.md
##########
@@ -0,0 +1,144 @@
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% end comment %}
+-->
+
+# Unified Memory Manager - Design Document
+
+| **Author(s)** | Philipp Ortner |
+:-------------- |:----------------------------------------------------|
+
+## Description
+This document describes the initial design of an Unified Memory Manager proposed
+for SystemDS.
+
+## Design
+The Unified Memory Manager, henceforth UMM, will act as a manager for heap memory
+provided by the JVM.
+
+The UMM has a certain (% of JVM heap) amount of memory which it can distribute for operands
+and a buffer pool.
+
+Operands are the compiled programs variables which extends the base class 
+`Data`, e.g., `MatrixObject` and `StringObject`.
+The buffer pool manages
+in memory representation of dirty objects that don't exist on the HDFS or 
+other persistent memory. This is currently done by the `LazyWriteBuffer`. Intermediates
+could be represented in this buffer.
+
+These two memory areas each will have a min and max amount of memory it can
+occupy, meaning that the boundary for the areas can shift dynamically depending
+on the current load.
+
+||min|max|
+| ------------- |:-------------:| -----:|
+| operations  | 50% | 70% |
+| buffer pool | 15% | 30% |
+
+The UMM will utilise the existing `CacheBlock` structure to manage the buffer
+pool area while it will use the new `OperandBlock (?)` structure to keep track of

Review comment:
   I may be using the terminology wrong; I actually meant what you described.
   
   I looked through the code again and I think I'm missing something.
   
   `MatrixObject`, `FrameObject` and `TensorObject` implement the `CacheBlock` interface.
   The `LazyWriteBuffer` itself seems to manage `ByteBuffer` instances, which in turn also hold `CacheBlock`s, at least for dense data.
   
   Could you please elaborate on a few things I'm not 100% clear about?
   1) Which existing classes do you mean when you talk about operation memory?
   2) Which classes do you mean by intermediates / dirty objects that are supposed to go into the buffer pool area?
   
   I'll try to answer them myself; maybe you can correct me.
   ad 1) As mentioned in the design document, derivatives of `Data` such as `MatrixObject`, `FrameObject` and `TensorObject`.
   ad 2) Here I'm not sure. Looking at the code, I would say these are also `CacheableData` objects, for example in the `ExecutionContext` or `EstimatorDensityMap`. These work with `MatrixBlock`s, which ultimately are also `CacheBlock`s. Only dense data, e.g., `DenseBlock` instances, do not hold a `CacheBlock`.
   
   It feels like I'm missing something: we said the second `Block` does not exist currently, but it looks like `CacheBlock` would suffice. In that case the split memory area would just be a logical differentiation.
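
   To make "just a logical differentiation" concrete, here is a minimal sketch of how I picture it: one shared budget and two logical regions that are only tracked by counters, using the min/max percentages from the table in the design document. Everything in it (`UmmSketch`, `Region`, `reserve`, `release`, `used`) is a hypothetical name for illustration, not an existing SystemDS class or API, and the admission rule is my assumption, not something the design document specifies.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: one memory budget, two logical regions tracked by counters.
class UmmSketch {
    enum Region { OPERATIONS, BUFFER_POOL }

    private final long budget;                                   // bytes managed by the UMM
    private final Map<Region, Long> used = new EnumMap<>(Region.class);
    private final Map<Region, int[]> limits = new EnumMap<>(Region.class); // {min %, max %}

    UmmSketch(long budget) {
        this.budget = budget;
        limits.put(Region.OPERATIONS, new int[] {50, 70});       // from the design doc table
        limits.put(Region.BUFFER_POOL, new int[] {15, 30});
        for (Region r : Region.values()) used.put(r, 0L);
    }

    // Assumed rule: a region may grow up to its max, as long as the other
    // region still keeps its guaranteed minimum (or whatever it already uses).
    synchronized boolean reserve(Region r, long bytes) {
        if (bytes < 0) return false;
        Region other = (r == Region.OPERATIONS) ? Region.BUFFER_POOL : Region.OPERATIONS;
        long newUsed = used.get(r) + bytes;
        long ownMax = budget * limits.get(r)[1] / 100;
        long otherHeld = Math.max(used.get(other), budget * limits.get(other)[0] / 100);
        if (newUsed > ownMax || newUsed + otherHeld > budget) return false;
        used.put(r, newUsed);
        return true;
    }

    // Give memory back to the shared budget; never drops below zero.
    synchronized void release(Region r, long bytes) {
        used.put(r, Math.max(0L, used.get(r) - bytes));
    }

    synchronized long used(Region r) {
        return used.get(r);
    }
}
```

   With a sketch like this no second `Block` type is needed: both regions could hold the same kind of object and differ only in which counter a reservation is charged to.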
   



