asubiotto commented on issue #14991:
URL: https://github.com/apache/datafusion/issues/14991#issuecomment-2720197770

   Nowhere. What I mean is that the `AggregateExec` **should** have 
`ordering_mode=Sorted` as you share in your previous comment but it **does 
not** (and I would expect it to). Apologies if this is off-topic for the issue 
but since you were looking for examples where the sorted properties of the 
group columns are somehow lost I thought it would be good to share a practical 
example (I was searching for ordered aggregation support and came across this 
issue).
   
   Specifically `generated_id` is in sorted order and since it is the only 
group column for the aggregation I would expect `ordering_mode=Sorted` in the 
`AggregateExec`. For example, if I change the values/agg function and remove 
the unnest, this ordering is preserved:
   ```
   EXPLAIN WITH unnested AS (SELECT
       ROW_NUMBER() OVER () AS generated_id,
       value as ar
     FROM range(1,5)) SELECT generated_id, sum(ar) FROM unnested group by 
generated_id;
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                       
                                                                                
                                                                                
                                                                                
                                                                                
                                                       |
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Aggregate: groupBy=[[unnested.generated_id]], 
aggr=[[sum(unnested.ar)]]                                                       
                                                                                
                                                                                
                                                                                
                                                                    |
   |               |   SubqueryAlias: unnested                                  
                                                                                
                                                                                
                                                                                
                                                                                
                                                       |
   |               |     Projection: row_number() ROWS BETWEEN UNBOUNDED 
PRECEDING AND UNBOUNDED FOLLOWING AS generated_id, tmp_table.value AS ar        
                                                                                
                                                                                
                                                                                
                                                              |
   |               |       WindowAggr: windowExpr=[[row_number() ROWS BETWEEN 
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING]]                                   
                                                                                
                                                                                
                                                                                
                                                         |
   |               |         TableScan: tmp_table projection=[value]            
                                                                                
                                                                                
                                                                                
                                                                                
                                                       |
   | physical_plan | AggregateExec: mode=FinalPartitioned, gby=[generated_id@0 
as generated_id], aggr=[sum(unnested.ar)], ordering_mode=Sorted                 
                                                                                
                                                                                
                                                                                
                                                        |
   |               |   SortExec: expr=[generated_id@0 ASC NULLS LAST], 
preserve_partitioning=[true]                                                    
                                                                                
                                                                                
                                                                                
                                                                |
   |               |     CoalesceBatchesExec: target_batch_size=8192            
                                                                                
                                                                                
                                                                                
                                                                                
                                                       |
   |               |       RepartitionExec: partitioning=Hash([generated_id@0], 
12), input_partitions=12                                                        
                                                                                
                                                                                
                                                                                
                                                       |
   |               |         AggregateExec: mode=Partial, gby=[generated_id@0 
as generated_id], aggr=[sum(unnested.ar)], ordering_mode=Sorted                 
                                                                                
                                                                                
                                                                                
                                                         |
   |               |           RepartitionExec: 
partitioning=RoundRobinBatch(12), input_partitions=1                            
                                                                                
                                                                                
                                                                                
                                                                                
       |
   |               |             ProjectionExec: expr=[row_number() ROWS 
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING@1 as generated_id, value@0 
as ar]                                                                          
                                                                                
                                                                                
                                                               |
   |               |               BoundedWindowAggExec: wdw=[row_number() ROWS 
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Ok(Field { name: 
"row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING", 
data_type: UInt64, nullable: false, dict_id: 0, dict_is_ordered: false, 
metadata: {} }), frame: WindowFrame { units: Rows, start_bound: 
Preceding(UInt64(NULL)), end_bound: Following(UInt64(NULL)), is_causal: false 
}], mode=[Sorted] |
   |               |                 LazyMemoryExec: partitions=1, 
batch_generators=[range: start=1, end=5, batch_size=8192]                       
                                                                                
                                                                                
                                                                                
                                                                    |
   |               |                                                            
                                                                                
                                                                                
                                                                                
                                                                                
                                                       |
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2 row(s) fetched.
   Elapsed 0.009 seconds.
   ``` 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to