Re: [I] Analysis to support`SortPreservingMerge` --> `ProgressiveEval` [datafusion]

via GitHub Mon, 17 Mar 2025 11:02:26 -0700


alamb commented on issue #15191:
URL: https://github.com/apache/datafusion/issues/15191#issuecomment-2730417397


   > Incidentally this is the use case I am targeting. Anyway, this query would 
result in at least 3 partitions, two of which are overlapping in time. If we 
could generalize ProgressiveEval to have multiple partition groupings, we could 
do something like this:
   
   This is a neat idea -- basically a cascade of progressive evals to avoid 
some (but not all) merging
   
   
   > I see that one of the target use cases for ProgressiveEval is 
https://github.com/apache/datafusion/issues/6672. I am a little curious to see 
the implementation, because the way I see it, ProgressiveEval will solve some 
but not all instances of this problem.
   
   Our progressive eval implementation is here 
https://github.com/influxdata/influxdb3_core/blob/26a30bf8d6e2b6b3f1dd905c4ec27e3db6e20d5f/iox_query/src/provider/progressive_eval.rs
   
   (there is a PR to add it to Datafusion here: 
https://github.com/apache/datafusion/pull/10490)
   
   
   > Here is the issue I am worried about. When you use ListingTable you have a 
config option called target_partitions. By default it is set to the number of 
available cpu cores on the system. If the number of files exceeds 
target_partitions then it will start to merge files into the same partitions 
with no guarantees on ordering. Let me demonstrate using datafusion-cli:
   
   > If the number of files exceeds target_partitions then it will start to 
merge files into the same partitions with no guarantees on ordering. 
   
   I think it uses 
[`FileGroupPartitioner`](https://docs.rs/datafusion/latest/datafusion/datasource/physical_plan/struct.FileGroupPartitioner.html)
 that maintains the same ordering... But maybe it is a different ordering than 
you have in mind
   
   You can define an order using `WITH ORDER` like this:
   
   ```sql
   DataFusion CLI v46.0.1
   > copy (values (4, '2025-03-01')) to '/tmp/test/1.parquet';
   +-------+
   | count |
   +-------+
   | 1     |
   +-------+
   1 row(s) fetched.
   Elapsed 0.004 seconds.
   
   > copy (values (3, '2025-03-02')) to '/tmp/test/2.parquet';
   +-------+
   | count |
   +-------+
   | 1     |
   +-------+
   1 row(s) fetched.
   Elapsed 0.004 seconds.
   
   > create external table test stored as parquet location '/tmp/test' with 
order (column2 ASC);
   0 row(s) fetched.
   Elapsed 0.006 seconds.
   ```
   
   Then the sort is not needed:
   ```sql
   > explain select * from test order by column2 asc;
   
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                       
                                                                                
                                                                                
                                                                           |
   
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Sort: test.column2 ASC NULLS LAST                          
                                                                                
                                                                                
                                                                           |
   |               |   TableScan: test projection=[column1, column2]            
                                                                                
                                                                                
                                                                           |
   | physical_plan | SortPreservingMergeExec: [column2@1 ASC NULLS LAST]        
                                                                                
                                                                                
                                                                           |
   |               |   DataSourceExec: file_groups={16 groups: 
[[tmp/test/1.parquet:0..107], [tmp/test/2.parquet:0..107], 
[tmp/test/1.parquet:107..214], [tmp/test/2.parquet:107..214], 
[tmp/test/1.parquet:214..321], ...]}, projection=[column1, column2], 
output_ordering=[column2@1 ASC NULLS LAST], file_type=parquet |
   |               |                                                            
                                                                                
                                                                                
                                                                           |
   
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2 row(s) fetched.
   Elapsed 0.002 seconds.
   ```
   
   Though datafusion will put it when needed
   ```sql
   > explain select * from test order by column2 desc;
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                       
                                                                                
                                                                                
                                                                             |
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Sort: test.column2 DESC NULLS FIRST                        
                                                                                
                                                                                
                                                                             |
   |               |   TableScan: test projection=[column1, column2]            
                                                                                
                                                                                
                                                                             |
   | physical_plan | SortPreservingMergeExec: [column2@1 DESC]                  
                                                                                
                                                                                
                                                                             |
   |               |   SortExec: expr=[column2@1 DESC], 
preserve_partitioning=[true]                                                    
                                                                                
                                                                                
                     |
   |               |     DataSourceExec: file_groups={16 groups: 
[[tmp/test/1.parquet:0..107], [tmp/test/2.parquet:0..107], 
[tmp/test/1.parquet:107..214], [tmp/test/2.parquet:107..214], 
[tmp/test/1.parquet:214..321], ...]}, projection=[column1, column2], 
output_ordering=[column2@1 ASC NULLS LAST], file_type=parquet |
   |               |                                                            
                                                                                
                                                                                
                                                                             |
   
+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
   2 row(s) fetched.
   Elapsed 0.012 seconds.
   ```
   
   > I apologize if I come off as a bit overbearing 😅 but this issue is near 
and dear to my heart. Eliminating sorts has been one of the most important 
things in my team's project, and it sounds like InfluxDB has been dealing with 
the same issue.
   
   
   
   Not at all! Yes we have done a lot of this (as have @berkaysynnada 
@ozankabak and @akurmustafa at Synnada). It is a very important optimization
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Re: [I] Analysis to support`SortPreservingMerge` --> `ProgressiveEval` [datafusion]

Reply via email to