GitHub user alamb added a comment to the discussion: Best practices for memory-efficient deduplication of pre-sorted Parquet files
Yes, please. I actually did some testing today:

- https://github.com/apache/datafusion/issues/16899
- https://github.com/apache/datafusion/pull/16900

What I would expect in this case is to see an `AggregateExec` in the plan annotated with `ordering_mode=PartiallySorted([0])` (note that this is different from the "Partial" annotation):

```sql
AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])
```

Perhaps you can double-check the explain plan with `EXPLAIN FORMAT INDENT ..`, which produces a more detailed version of the plan.

Thanks for sticking with this

GitHub link: https://github.com/apache/datafusion/discussions/16776#discussioncomment-13881971
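For anyone who wants to automate the check described in the comment, here is a minimal sketch that scans plan text for `AggregateExec` lines and reports their `ordering_mode` annotation. It uses only the Python standard library and works on the plan as a plain string; the helper name `aggregate_ordering_modes` and the regular expression are illustrative assumptions, not part of DataFusion, and the pattern is only written against the annotation format shown in the comment.

```python
import re
from typing import List, Optional

# Matches the annotation as it appears in the plan line quoted in the comment,
# e.g. "ordering_mode=PartiallySorted([0])". A bare name such as
# "ordering_mode=Sorted" is also accepted (assumed format).
_ORDERING_MODE = re.compile(r"ordering_mode=(\w+(?:\(\[[\d,\s]*\]\))?)")


def aggregate_ordering_modes(plan_text: str) -> List[Optional[str]]:
    """Return one entry per AggregateExec line found in the plan text.

    Each entry is the ordering_mode annotation as a string, or None when
    the line carries no such annotation.
    """
    modes: List[Optional[str]] = []
    for line in plan_text.splitlines():
        if "AggregateExec:" not in line:
            continue  # only aggregate operators are of interest here
        match = _ORDERING_MODE.search(line)
        modes.append(match.group(1) if match else None)
    return modes


# The plan line quoted in the comment, used as sample input.
plan = (
    "AggregateExec: mode=Partial, gby=[a@0 as a, b@1 as b], "
    "aggr=[count(Int64(1))], ordering_mode=PartiallySorted([0])\n"
)
print(aggregate_ordering_modes(plan))
```

A `None` entry means the aggregate line has no `ordering_mode` annotation at all, which is the case worth investigating when the input is expected to be pre-sorted.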
