ajegou commented on PR #22852:
URL: https://github.com/apache/datafusion/pull/22852#issuecomment-4670382915

   While preparing this PR we noticed a second, related failure mode that this 
change doesn't address: on multi-partition workloads, when a sibling partition 
has already tightened the shared dynamic filter but a given partition's local 
heap is still empty, attempt_early_completion can't conclude completion (it 
reads the local heap's max row, which is None). That cross-partition starvation 
case is the rest of the regression on multi-partition workloads.
   
   Addressing it cleanly requires lifting the threshold check from the local 
heap to the shared TopKDynamicFilters, which is a larger design change. 
@geoffreyclaude  has a follow-up PR ready that does this and will be opened 
against apache shortly. We're keeping the two changes separate to keep each PR 
narrowly scoped — this one is correctness-safe on its own and the follow-up 
rests cleanly on top.
   
   Pinging @kosiew just in case this has an impact on your review / approval


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to