Dandandan commented on code in PR #9481:
URL: https://github.com/apache/arrow-datafusion/pull/9481#discussion_r1518789784
##########
datafusion/common/src/utils.rs:
##########
@@ -679,12 +679,32 @@ pub fn find_indices<T: PartialEq, S: Borrow<T>>(
.ok_or_else(|| DataFusionError::Execution("Target not found".to_string()))
}
+pub trait EffectiveSize {
+ fn get_effective_memory_size(&self) -> usize;
+}
+
+impl EffectiveSize for ArrayRef {
+ fn get_effective_memory_size(&self) -> usize {
+ self.to_data().get_slice_memory_size().unwrap_or(0)
+ }
+}
+
+impl EffectiveSize for RecordBatch {
+ fn get_effective_memory_size(&self) -> usize {
Review Comment:
So if we slice a batch and keep only the slice, this will underreport
memory usage, because the slice still keeps the original buffers alive. If we
keep both the original batch and the sliced one in different operators, this
will still overreport memory usage, because the shared buffers are counted
once per holder.
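
A rough sketch of the two cases, using only the standard library rather than Arrow itself. `SliceView`, `effective_size`, `retained_bytes` and `scenarios` are hypothetical names invented for this illustration; `effective_size` only stands in for the idea behind `get_slice_memory_size`, and the real Arrow accounting may differ in detail.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow array: a view into a shared buffer.
#[derive(Clone)]
struct SliceView {
    buffer: Arc<Vec<u8>>,
    offset: usize,
    len: usize,
}

impl SliceView {
    /// Allocate a fresh buffer of `bytes` bytes and view all of it.
    fn new(bytes: usize) -> Self {
        Self { buffer: Arc::new(vec![0u8; bytes]), offset: 0, len: bytes }
    }

    /// Zero-copy slice: the new view shares the same allocation.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len);
        Self {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    /// "Effective" size: only the bytes this view covers.
    fn effective_size(&self) -> usize {
        self.len
    }
}

/// Bytes actually kept alive by a set of views, counting each
/// underlying allocation exactly once.
fn retained_bytes(views: &[&SliceView]) -> usize {
    let mut seen = HashSet::new();
    views
        .iter()
        .filter(|v| seen.insert(Arc::as_ptr(&v.buffer)))
        .map(|v| v.buffer.len())
        .sum()
}

/// Returns `(reported, retained)` for the two cases in the comment:
/// first "only the slice is kept", then "original and slice are both kept".
fn scenarios() -> [(usize, usize); 2] {
    let original = SliceView::new(1024);
    let sliced = original.slice(0, 128);

    // Both views are alive, e.g. held by different operators.
    let both = (
        original.effective_size() + sliced.effective_size(),
        retained_bytes(&[&original, &sliced]),
    );

    // Only the slice survives; the full buffer is still alive behind it.
    drop(original);
    let slice_only = (sliced.effective_size(), retained_bytes(&[&sliced]));

    [slice_only, both]
}
```

In this model the slice-only case reports 128 bytes while 1024 are retained (underreporting), and the case where both are kept reports 1152 bytes while 1024 are retained (overreporting).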