manupa-arm commented on a change in pull request #9649:
URL: https://github.com/apache/tvm/pull/9649#discussion_r763075100
##########
File path: include/tvm/tir/usmp/utils.h
##########
@@ -153,6 +153,45 @@ class BufferInfo : public ObjectRef {
TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(BufferInfo, ObjectRef, BufferInfoNode);
};
+/*!
+ * \brief This is a composite node that is produced by extract_buffer_info
+ * analysis pass that contains useful global information that could be useful
+ * for memory planning algorithms.
+ */
+struct BufferInfoAnalysisNode : public Object {
+ /*! \brief The BufferInfo object and its associated TIR statement */
+ Map<BufferInfo, tir::Stmt> buffer_info_stmts;
+ /*! \brief This represent maximum amount of memory being used at
+ * any point of time in the inference. This value is largely the
+ * best allocation an algorithm could achieve. Due to
Review comment:
Thanks @lhutton1!
Yes, the maximum memory used is the sum of all conflicting buffers/tensors at
any point in the lifetime of the inference. No matter how a memory allocator
performs, it cannot beat this value, because those tensors "have" to be live
at the same time. Therefore this is the value a memory planning algorithm
strives to achieve, and it corresponds to the theoretical "peak" memory usage.
That begs the question: why would an algorithm not achieve it? Because of the
complexity of the problem, it might not be able to find an offset ordering in
tractable time that meets this value. It may also be that no static set of
offsets exists that packs the tensors tightly enough to achieve it.
Does that help?
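To make the lower-bound idea concrete, here is a minimal standalone sketch.
It is not TVM code; the LiveBuffer struct, the TheoreticalPeak function, and
the time-step sweep are hypothetical names and an assumed simplification used
only for illustration. It computes the theoretical peak as the maximum, over
all time steps, of the summed sizes of buffers that are simultaneously live:

    // Minimal standalone sketch (not the TVM USMP API) of computing the
    // theoretical "peak" memory usage that bounds any memory planner.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    struct LiveBuffer {
      std::size_t size_bytes;  // size of the tensor/buffer
      int first_use;           // time step where the buffer becomes live
      int last_use;            // time step of its last access
    };

    // Sweep over time steps; at each step, sum the sizes of all buffers that
    // are live together (i.e. conflict). The maximum of these sums is the
    // value no allocator can beat, because those buffers must coexist.
    std::size_t TheoreticalPeak(const std::vector<LiveBuffer>& buffers) {
      int horizon = 0;
      for (const auto& b : buffers) horizon = std::max(horizon, b.last_use);
      std::size_t peak = 0;
      for (int t = 0; t <= horizon; ++t) {
        std::size_t live_now = 0;
        for (const auto& b : buffers) {
          if (b.first_use <= t && t <= b.last_use) live_now += b.size_bytes;
        }
        peak = std::max(peak, live_now);
      }
      return peak;
    }

    int main() {
      // Buffers A and B overlap in time; C is live alone afterwards.
      std::vector<LiveBuffer> buffers = {{1024, 0, 2}, {2048, 1, 3}, {512, 4, 5}};
      std::cout << TheoreticalPeak(buffers) << "\n";  // prints 3072 (A + B live together)
      return 0;
    }

An allocator whose footprint equals this value has hit the optimum; the
difficulty is that finding such a set of offsets is, in general, a hard
packing problem, which is why an algorithm may fall short of it.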
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]