yjshen commented on a change in pull request #1682:
URL: https://github.com/apache/arrow-datafusion/pull/1682#discussion_r792548957

##########
File path: datafusion/src/physical_plan/mod.rs
##########
@@ -51,6 +51,11 @@ pub trait RecordBatchStream: Stream<Item = ArrowResult<RecordBatch>> {
     /// Implementation of this trait should guarantee that all `RecordBatch`'s returned by this
     /// stream should have the same schema as returned from this method.
     fn schema(&self) -> SchemaRef;
+
+    /// Returns the current memory usage for this stream.
+    fn mem_used(&self) -> usize {
+        0
+    }

Review comment:
   This line adds a `mem_used` method in our essential `RecordBatchStream` trait. A baby step to tracking Non-Limited-Operators' memory usage since I think `SendableRecordBatchStream` is the fundamental entity that holds memory during execution.

   However, I didn't quite find a way to register these streams generated during `async execute` to our memory manager. I would love to hear your thoughts. If considered not appropriate, I will remove it.
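
For illustration, here is a minimal sketch of how a default `mem_used` method behaves. It is not DataFusion code: `MemTrackedStream` is a simplified, hypothetical stand-in for `RecordBatchStream` (it drops the `Stream` supertrait and `schema()`), and `PassThroughStream`, `BufferingStream`, and `total_mem_used` are assumed names invented for this example. It only shows that existing implementors keep compiling and report 0, while a stream that buffers data can override the method.

```rust
/// Hypothetical, simplified stand-in for `RecordBatchStream`.
/// The real trait also extends `Stream<Item = ArrowResult<RecordBatch>>`
/// and declares `schema()`; both are omitted here.
pub trait MemTrackedStream {
    /// Returns the current memory usage for this stream, in bytes.
    /// The default of 0 means existing implementors need no changes.
    fn mem_used(&self) -> usize {
        0
    }
}

/// Hypothetical stream that holds no significant state and
/// therefore relies on the default implementation.
pub struct PassThroughStream;

impl MemTrackedStream for PassThroughStream {}

/// Hypothetical stream that buffers data before emitting it.
#[derive(Default)]
pub struct BufferingStream {
    buffered: Vec<Vec<u8>>,
}

impl BufferingStream {
    /// Adds one buffered chunk (a stand-in for a `RecordBatch`).
    pub fn push(&mut self, chunk: Vec<u8>) {
        self.buffered.push(chunk);
    }
}

impl MemTrackedStream for BufferingStream {
    /// Reports the payload bytes currently buffered. This counts
    /// `len()` only and ignores allocator overhead and spare capacity.
    fn mem_used(&self) -> usize {
        self.buffered.iter().map(|chunk| chunk.len()).sum()
    }
}

/// Sums the reported usage of a set of streams. A memory manager
/// could do something similar, provided it can reach the streams,
/// which is the open registration question raised in the comment.
pub fn total_mem_used(streams: &[Box<dyn MemTrackedStream>]) -> usize {
    streams.iter().map(|stream| stream.mem_used()).sum()
}
```

Note that the sketch sidesteps the hard part: `total_mem_used` is handed the streams directly, whereas streams created inside `async execute` would still need some way to become visible to the memory manager.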