manupa-arm commented on a change in pull request #9649:
URL: https://github.com/apache/tvm/pull/9649#discussion_r763075100
##########
File path: include/tvm/tir/usmp/utils.h
##########
@@ -153,6 +153,45 @@ class BufferInfo : public ObjectRef {
TVM_DEFINE_MUTABLE_OBJECT_REF_METHODS(BufferInfo, ObjectRef, BufferInfoNode);
};
+/*!
+ * \brief This is a composite node that is produced by extract_buffer_info
+ * analysis pass that contains useful global information that could be useful
+ * for memory planning algorithms.
+ */
+struct BufferInfoAnalysisNode : public Object {
+ /*! \brief The BufferInfo object and its associated TIR statement */
+ Map<BufferInfo, tir::Stmt> buffer_info_stmts;
+ /*! \brief This represent maximum amount of memory being used at
+ * any point of time in the inference. This value is largely the
+ * best allocation an algorithm could achieve. Due to
Review comment:
Thanks @lhutton1!
Yes, the maximum memory used is the sum of all conflicting buffers/tensors at
any point in the lifetime of the inference. No matter how a memory allocator
performs, it cannot beat this value, because those tensors "have" to be live
at the same time. Therefore this is the value a memory planning algorithm
strives to achieve, and it corresponds to the theoretical "peak" memory usage.
That begs the question: why would an algorithm not achieve it? Because of the
complexity of the problem, it might not be able to find an offset ordering in
tractable time that meets this value. It may also be that no static set of
offsets exists that packs the tensors tightly enough to achieve it.
Does that help?
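To make the lower-bound idea concrete, here is a minimal standalone sketch.
It is not TVM code; the LiveBuffer struct, the TheoreticalPeak function, and
the time-step sweep are hypothetical names and an assumed simplification used
only for illustration. It computes the theoretical peak as the maximum, over
all time steps, of the summed sizes of buffers that are simultaneously live:

    // Minimal standalone sketch (not the TVM USMP API) of computing the
    // theoretical "peak" memory usage that bounds any memory planner.
    #include <algorithm>
    #include <cstddef>
    #include <iostream>
    #include <vector>

    struct LiveBuffer {
      std::size_t size_bytes;  // size of the tensor/buffer
      int first_use;           // time step where the buffer becomes live
      int last_use;            // time step of its last access
    };

    // Sweep over time steps; at each step, sum the sizes of all buffers that
    // are live together (i.e. conflict). The maximum of these sums is the
    // value no allocator can beat, because those buffers must coexist.
    std::size_t TheoreticalPeak(const std::vector<LiveBuffer>& buffers) {
      int horizon = 0;
      for (const auto& b : buffers) horizon = std::max(horizon, b.last_use);
      std::size_t peak = 0;
      for (int t = 0; t <= horizon; ++t) {
        std::size_t live_now = 0;
        for (const auto& b : buffers) {
          if (b.first_use <= t && t <= b.last_use) live_now += b.size_bytes;
        }
        peak = std::max(peak, live_now);
      }
      return peak;
    }

    int main() {
      // Buffers A and B overlap in time; C is live alone afterwards.
      std::vector<LiveBuffer> buffers = {{1024, 0, 2}, {2048, 1, 3}, {512, 4, 5}};
      std::cout << TheoreticalPeak(buffers) << "\n";  // prints 3072 (A + B live together)
      return 0;
    }

An allocator whose footprint equals this value has hit the optimum; the
difficulty is that finding such a set of offsets is, in general, a hard
packing problem, which is why an algorithm may fall short of it.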
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]