alamb commented on issue #7794: URL: https://github.com/apache/arrow-datafusion/issues/7794#issuecomment-1758159097
I got a few more plans, with https://github.com/apache/arrow-datafusion/pull/7796 added. This time I can see that the plan lost the expressions after the `EnforceSorting` pass:

```
name: initial_physical_plan
SortExec: expr=[iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
  ProjectionExec: expr=[iox::measurement@0 as iox::measurement, time@1 as time, mean@2 as mean, mean_1@3 as mean_1]
    FilterExec: iox::row@4 <= 3
      BoundedWindowAggExec: wdw=[iox::row: Ok(Field { name: "iox::row", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: CurrentRow }], mode=[Sorted]
        SortExec: expr=[iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
          UnionExec
            ProjectionExec: expr=[cpu as iox::measurement, 0 as time, AVG(cpu.usage_idle)@0 as mean, NULL as mean_1]
              AggregateExec: mode=Final, gby=[], aggr=[AVG(cpu.usage_idle)]
                AggregateExec: mode=Partial, gby=[], aggr=[AVG(cpu.usage_idle)]
                  ProjectionExec: expr=[usage_idle@3 as usage_idle]
                    DeduplicateExec: [cpu@0 ASC,host@1 ASC,time@2 ASC]
                      UnionExec
                        ParquetExec: file_groups={1 group: [[2/16/4b0df1d892becf61982458f797547abda59f63fa9f2e599c174f1006654e6f60/44b2af50-b12c-421f-a76a-f58b1f6e674c.parquet]]}, projection=[cpu, host, time, usage_idle, usage_system, __chunk_order], output_ordering=[host@1 ASC, cpu@0 ASC, time@2 ASC, __chunk_order@5 ASC]
            ProjectionExec: expr=[disk as iox::measurement, 0 as time, NULL as mean, AVG(disk.bytes_free)@0 as mean_1]
              AggregateExec: mode=Final, gby=[], aggr=[AVG(disk.bytes_free)]
                AggregateExec: mode=Partial, gby=[], aggr=[AVG(disk.bytes_free)]
                  ProjectionExec: expr=[bytes_free@0 as bytes_free]
                    DeduplicateExec: [device@2 ASC,host@3 ASC,time@4 ASC]
                      UnionExec
                        ParquetExec: file_groups={1 group: [[2/7/a4b08cab31cbfc479eaa63d93b92502e43bc53ce90d3800d71ea9aabf3c2f335/cf9cb07f-b871-4ad0-9ae0-81d67b4f8c1d.parquet]]}, projection=[bytes_free, bytes_used, device, host, time, __chunk_order], output_ordering=[host@3 ASC, device@2 ASC, time@4 ASC, __chunk_order@5 ASC]
```

After the sort preserving repartition exec the sort exprs are still here:

```
name: physical_plan after EnforceDistribution
OutputRequirementExec
  SortExec: expr=[iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
    CoalescePartitionsExec
      ProjectionExec: expr=[iox::measurement@0 as iox::measurement, time@2 as time, mean@3 as mean, mean_1@4 as mean_1]
        FilterExec: iox::row@1 <= 3
          ProjectionExec: expr=[iox::measurement@0 as iox::measurement, iox::row@4 as iox::row, time@1 as time, mean@2 as mean, mean_1@3 as mean_1]
            BoundedWindowAggExec: wdw=[iox::row: Ok(Field { name: "iox::row", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: CurrentRow }], mode=[Sorted]
sort ok->     SortPreservingRepartitionExec: partitioning=Hash([iox::measurement@0], 16), input_partitions=16, sort_exprs=iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST
                RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1
                  SortExec: expr=[iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
                    CoalescePartitionsExec
                      UnionExec
                        ProjectionExec: expr=[cpu as iox::measurement, 0 as time, AVG(cpu.usage_idle)@0 as mean, NULL as mean_1]
                          AggregateExec: mode=Final, gby=[], aggr=[AVG(cpu.usage_idle)]
                            CoalescePartitionsExec
                              AggregateExec: mode=Partial, gby=[], aggr=[AVG(cpu.usage_idle)]
                                RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1
                                  ParquetExec: file_groups={1 group: [[2/16/4b0df1d892becf61982458f797547abda59f63fa9f2e599c174f1006654e6f60/44b2af50-b12c-421f-a76a-f58b1f6e674c.parquet]]}, projection=[usage_idle]
                        ProjectionExec: expr=[disk as iox::measurement, 0 as time, NULL as mean, AVG(disk.bytes_free)@0 as mean_1]
                          AggregateExec: mode=Final, gby=[], aggr=[AVG(disk.bytes_free)]
                            CoalescePartitionsExec
                              AggregateExec: mode=Partial, gby=[], aggr=[AVG(disk.bytes_free)]
                                RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1
                                  ParquetExec: file_groups={1 group: [[2/7/a4b08cab31cbfc479eaa63d93b92502e43bc53ce90d3800d71ea9aabf3c2f335/cf9cb07f-b871-4ad0-9ae0-81d67b4f8c1d.parquet]]}, projection=[bytes_free]
```

But after "EnforceSorting" they are gone (why is there a sort above them???)

```
name: physical_plan after EnforceSorting
OutputRequirementExec
  SortPreservingMergeExec: [iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
    ProjectionExec: expr=[iox::measurement@0 as iox::measurement, time@2 as time, mean@3 as mean, mean_1@4 as mean_1]
      FilterExec: iox::row@1 <= 3
        ProjectionExec: expr=[iox::measurement@0 as iox::measurement, iox::row@4 as iox::row, time@1 as time, mean@2 as mean, mean_1@3 as mean_1]
          BoundedWindowAggExec: wdw=[iox::row: Ok(Field { name: "iox::row", data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }), frame: WindowFrame { units: Rows, start_bound: Preceding(UInt64(NULL)), end_bound: CurrentRow }], mode=[Sorted]
            SortExec: expr=[iox::measurement@0 ASC NULLS LAST,time@1 ASC NULLS LAST]
BAD ->        SortPreservingRepartitionExec: partitioning=Hash([iox::measurement@0], 16), input_partitions=16
                RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=2
                  UnionExec
                    ProjectionExec: expr=[cpu as iox::measurement, 0 as time, AVG(cpu.usage_idle)@0 as mean, NULL as mean_1]
                      AggregateExec: mode=Final, gby=[], aggr=[AVG(cpu.usage_idle)]
                        CoalescePartitionsExec
                          AggregateExec: mode=Partial, gby=[], aggr=[AVG(cpu.usage_idle)]
                            RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1
                              ParquetExec: file_groups={1 group: [[2/16/4b0df1d892becf61982458f797547abda59f63fa9f2e599c174f1006654e6f60/44b2af50-b12c-421f-a76a-f58b1f6e674c.parquet]]}, projection=[usage_idle]
                    ProjectionExec: expr=[disk as iox::measurement, 0 as time, NULL as mean, AVG(disk.bytes_free)@0 as mean_1]
                      AggregateExec: mode=Final, gby=[], aggr=[AVG(disk.bytes_free)]
                        CoalescePartitionsExec
                          AggregateExec: mode=Partial, gby=[], aggr=[AVG(disk.bytes_free)]
                            RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1
                              ParquetExec: file_groups={1 group: [[2/7/a4b08cab31cbfc479eaa63d93b92502e43bc53ce90d3800d71ea9aabf3c2f335/cf9cb07f-b871-4ad0-9ae0-81d67b4f8c1d.parquet]]}, projection=[bytes_free]
```
