Re: [PR] feat: use effective memory size for memory management purpose [arrow-datafusion]

via GitHub Sun, 10 Mar 2024 00:15:46 -0800


Dandandan commented on code in PR #9481:
URL: https://github.com/apache/arrow-datafusion/pull/9481#discussion_r1518789784



##########
datafusion/common/src/utils.rs:
##########
@@ -679,12 +679,32 @@ pub fn find_indices<T: PartialEq, S: Borrow<T>>(
         .ok_or_else(|| DataFusionError::Execution("Target not found".to_string()))
 }
 
+pub trait EffectiveSize {
+    fn get_effective_memory_size(&self) -> usize;
+}
+
+impl EffectiveSize for ArrayRef {
+    fn get_effective_memory_size(&self) -> usize {
+        self.to_data().get_slice_memory_size().unwrap_or(0)
+    }
+}
+
+impl EffectiveSize for RecordBatch {
+    fn get_effective_memory_size(&self) -> usize {

Review Comment:
   So if we slice a batch and keep only the slice, this will underreport
memory usage, because the slice still keeps the original buffers alive. If we
keep both the original batch and the sliced one in different operators, this
will still overreport memory usage, because the shared buffers are counted
more than once.
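
To make the two cases concrete, here is a minimal sketch that does not use the arrow-rs API. `View`, `effective_size`, `reported`, and `actual` are hypothetical names for illustration only, and a single reference-counted byte buffer stands in for an array's buffers. It assumes a zero-copy slice shares the parent allocation, so the "effective" size only sees the bytes inside the view:

```rust
use std::rc::Rc;

/// Toy stand-in for an array: a window into a shared, reference-counted buffer.
/// Hypothetical type, not part of arrow-rs or DataFusion.
struct View {
    buffer: Rc<Vec<u8>>,
    offset: usize,
    len: usize,
}

impl View {
    /// Allocate a fresh buffer of `bytes` bytes and view all of it.
    fn new(bytes: usize) -> Self {
        View { buffer: Rc::new(vec![0u8; bytes]), offset: 0, len: bytes }
    }

    /// Zero-copy slice: shares the parent's buffer, only the window changes.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        View { buffer: Rc::clone(&self.buffer), offset: self.offset + offset, len }
    }

    /// Analogue of an "effective" size: only the bytes visible through the view.
    fn effective_size(&self) -> usize {
        self.len
    }
}

/// What per-view accounting would report: the sum of effective sizes.
fn reported(views: &[&View]) -> usize {
    views.iter().map(|v| v.effective_size()).sum()
}

/// Bytes actually kept alive, counting each shared buffer exactly once.
fn actual(views: &[&View]) -> usize {
    let mut seen: Vec<*const Vec<u8>> = Vec::new();
    let mut total = 0;
    for v in views {
        let ptr = Rc::as_ptr(&v.buffer);
        if !seen.contains(&ptr) {
            seen.push(ptr);
            total += v.buffer.len();
        }
    }
    total
}
```

With a 1000-byte buffer and a 10-byte slice, keeping only the slice gives `reported` = 10 against `actual` = 1000 (underreport), while keeping both the original and the slice gives `reported` = 1010 against `actual` = 1000 (overreport).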


