alamb commented on code in PR #14823:
URL: https://github.com/apache/datafusion/pull/14823#discussion_r1970437292
##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -414,6 +419,66 @@ impl ExternalSorter {
Ok(used)
}
+ /// Reconstruct `self.in_mem_batches` to organize the payload buffers of each
+ /// `StringViewArray` in sequential order by calling `gc()` on them.
+ ///
Review Comment:
```suggestion
///
/// Note this is a workaround until <https://github.com/apache/arrow-rs/issues/7185> is
/// available
///
```
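For readers unfamiliar with what "organize the payload buffers in sequential order" means, here is a minimal std-only sketch. It is not the arrow-rs implementation or API: `View` and `compact` are hypothetical names, and the model ignores details of the real layout (for example, short strings stored inline in the view). It only illustrates the idea of copying each referenced payload into one fresh buffer in view order and rewriting the views to point at it.

```rust
/// Simplified stand-in for a string view: which buffer holds the bytes,
/// and where. Hypothetical model, not the arrow-rs layout.
#[derive(Debug, Clone, Copy, PartialEq)]
struct View {
    buffer: usize,
    offset: usize,
    len: usize,
}

/// Copy every referenced payload into one fresh buffer, in view order,
/// and return rewritten views that all point into that buffer.
fn compact(views: &[View], buffers: &[Vec<u8>]) -> (Vec<View>, Vec<u8>) {
    let mut payload = Vec::new();
    let mut new_views = Vec::with_capacity(views.len());
    for v in views {
        let bytes = &buffers[v.buffer][v.offset..v.offset + v.len];
        new_views.push(View { buffer: 0, offset: payload.len(), len: v.len });
        payload.extend_from_slice(bytes);
    }
    (new_views, payload)
}

fn main() {
    // Two source buffers with unreferenced bytes around the payloads.
    let buffers = vec![b"xxhelloxx".to_vec(), b"worldyyyy".to_vec()];
    let views = [
        View { buffer: 1, offset: 0, len: 5 }, // "world"
        View { buffer: 0, offset: 2, len: 5 }, // "hello"
    ];
    let (new_views, payload) = compact(&views, &buffers);
    // Payloads are now contiguous and in the same order as the views.
    assert_eq!(payload, b"worldhello".to_vec());
    assert_eq!(new_views[1], View { buffer: 0, offset: 5, len: 5 });
    println!("{} bytes kept", payload.len());
}
```

In this model the unreferenced bytes are dropped and the rewritten views read the new buffer front to back, which is the property the doc comment describes.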
##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -1425,7 +1478,7 @@ mod tests {
// Processing 840 KB of data using 400 KB of memory requires at least 2 spills
// It will spill roughly 18000 rows and 800 KBytes.
// We leave a little wiggle room for the actual numbers.
- assert!((2..=10).contains(&spill_count));
+ assert!((12..=18).contains(&spill_count));
Review Comment:
Should we also update the comments?
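To make the mismatch concrete, here is a rough sketch of the arithmetic behind the existing comment. It assumes a simplified model that is not DataFusion's actual memory accounting: the sorter fills memory, spills, and keeps the final chunk in memory. `min_spills` is a hypothetical helper.

```rust
/// Lower bound on spills under a simplified model (an assumption, not
/// DataFusion's accounting): data arrives in memory-sized chunks and every
/// chunk except the last one is spilled. `memory_kb` must be non-zero.
fn min_spills(data_kb: u64, memory_kb: u64) -> u64 {
    data_kb.div_ceil(memory_kb).saturating_sub(1)
}

fn main() {
    let lower_bound = min_spills(840, 400);
    // Matches the "at least 2 spills" wording in the test comment.
    assert_eq!(lower_bound, 2);
    // The old assertion accepted this lower bound ...
    assert!((2..=10).contains(&lower_bound));
    // ... while the new range starts well above it.
    assert!(!(12..=18).contains(&lower_bound));
    println!("lower bound under this model: {lower_bound} spills");
}
```

Under this model "at least 2 spills" is still true, but it no longer explains why the test now expects 12 to 18.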
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]