adriangb commented on code in PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#discussion_r3037106292
##########
datafusion/physical-plan/src/sorts/sort_preserving_merge.rs:
##########
@@ -366,7 +366,7 @@ impl ExecutionPlan for SortPreservingMergeExec {
                     .map(|partition| {
                         let stream =
                             self.input.execute(partition, Arc::clone(&context))?;
-                        Ok(spawn_buffered(stream, 1))
+                        Ok(spawn_buffered(stream, 16))
Review Comment:
Thinking about it more: if we frame this as replacing `SortExec` with
`BufferExec`, it is easy to see that it is strictly an improvement.
In addition to sorting, `SortExec` has to buffer data in memory. It can spill,
but in general it is a memory-hungry operator. If we remove it and leave
behind something that only buffers, we preserve similar execution behavior
and just drop the sort computation. So I think this is justifiable both in
theory and in practice.
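
To make the "strictly buffering" idea concrete, here is a minimal sketch
of a bounded buffer sitting between a producer and a consumer. It assumes the
second argument of `spawn_buffered` acts as a buffer capacity. It uses the
standard library's `sync_channel` rather than DataFusion's own code, and
`run_buffered` is a hypothetical name used only for illustration:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a buffered input (not DataFusion code).
/// A producer thread pushes `batches` through a bounded channel that holds
/// at most `capacity` pending items; the caller drains it in order.
fn run_buffered(batches: Vec<u32>, capacity: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<u32>(capacity);

    let producer = thread::spawn(move || {
        for batch in batches {
            // `send` blocks once `capacity` items are waiting, so memory use
            // is bounded by the buffer size; no sorting happens here.
            if tx.send(batch).is_err() {
                break; // the consumer went away
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });

    // The consumer sees items in exactly the order they were produced.
    let received: Vec<u32> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    received
}
```

With a capacity of 1 the producer can only stay one item ahead of the
consumer; with 16 it can run further ahead. In both cases the output order is
unchanged and only the amount of buffered data differs.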
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]