zhuqi-lucas opened a new pull request, #13942:
URL: https://github.com/apache/datafusion/pull/13942

   ## Which issue does this PR close?
   
   Closes [#13941](https://github.com/apache/datafusion/issues/13941)
   
   ## Rationale for this change
   
   
   ## What changes are included in this PR?
   
   
   
   ## Are these changes tested?
   Yes. ClickBench query 35 was run with `--explain`:
   
   ```text
   cargo run --release --bin dfbench clickbench --iterations 5 --query 35 --explain
       Finished `release` profile [optimized] target(s) in 0.18s
        Running `target/release/dfbench clickbench --iterations 5 --query 35 --explain`
   Running benchmarks with the following options: RunOpt { query: Some(35), common: CommonOpt { iterations: 5, partitions: None, batch_size: 8192, debug: false }, path: "benchmarks/data/hits.parquet", queries_path: "benchmarks/queries/clickbench/queries.sql", output_path: None, explain: true }
   Q35: SELECT "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3, COUNT(*) AS c FROM hits GROUP BY "ClientIP", "ClientIP" - 1, "ClientIP" - 2, "ClientIP" - 3 ORDER BY c DESC LIMIT 10;
   Query 35 iteration 0 took 1202.7 ms and returned 10 rows
   Query 35 iteration 1 took 1092.0 ms and returned 10 rows
   Query 35 iteration 2 took 1021.7 ms and returned 10 rows
   Query 35 iteration 3 took 1022.2 ms and returned 10 rows
   Query 35 iteration 4 took 982.8 ms and returned 10 rows
   
   +---------------+------+
   | plan_type     | plan |
   +---------------+------+
   | logical_plan  | Sort: c DESC NULLS FIRST, fetch=10 |
   |               |   Projection: hits.ClientIP, hits.ClientIP - Int64(1), hits.ClientIP - Int64(2), hits.ClientIP - Int64(3), count(*) AS c |
   |               |     Aggregate: groupBy=[[hits.ClientIP, __common_expr_1 AS hits.ClientIP - Int64(1), __common_expr_1 AS hits.ClientIP - Int64(2), __common_expr_1 AS hits.ClientIP - Int64(3)]], aggr=[[count(Int64(1)) AS count(*)]] |
   |               |       Projection: CAST(hits.ClientIP AS Int64) AS __common_expr_1, hits.ClientIP |
   |               |         TableScan: hits projection=[ClientIP] |
   | physical_plan | SortPreservingMergeExec: [c@4 DESC], fetch=10 |
   |               |   SortExec: TopK(fetch=10), expr=[c@4 DESC], preserve_partitioning=[true] |
   |               |     ProjectionExec: expr=[ClientIP@0 as ClientIP, hits.ClientIP - Int64(1)@1 as hits.ClientIP - Int64(1), hits.ClientIP - Int64(2)@2 as hits.ClientIP - Int64(2), hits.ClientIP - Int64(3)@3 as hits.ClientIP - Int64(3), count(*)@4 as c] |
   |               |       AggregateExec: mode=FinalPartitioned, gby=[ClientIP@0 as ClientIP, hits.ClientIP - Int64(1)@1 as hits.ClientIP - Int64(1), hits.ClientIP - Int64(2)@2 as hits.ClientIP - Int64(2), hits.ClientIP - Int64(3)@3 as hits.ClientIP - Int64(3)], aggr=[count(*)] |
   |               |         CoalesceBatchesExec: target_batch_size=8192 |
   |               |           RepartitionExec: partitioning=Hash([ClientIP@0, hits.ClientIP - Int64(1)@1, hits.ClientIP - Int64(2)@2, hits.ClientIP - Int64(3)@3], 14), input_partitions=14 |
   |               |             AggregateExec: mode=Partial, gby=[ClientIP@1 as ClientIP, __common_expr_1@0 - 1 as hits.ClientIP - Int64(1), __common_expr_1@0 - 2 as hits.ClientIP - Int64(2), __common_expr_1@0 - 3 as hits.ClientIP - Int64(3)], aggr=[count(*)] |
   |               |               ProjectionExec: expr=[CAST(ClientIP@0 AS Int64) as __common_expr_1, ClientIP@0 as ClientIP] |
   |               |                 ParquetExec: file_groups={14 groups: [[Users/zhuqi/arrow-datafusion/benchmarks/data/hits.parquet:0..1055712604], [Users/zhuqi/arrow-datafusion/benchmarks/data/hits.parquet:1055712604..2111425208], [Users/zhuqi/arrow-datafusion/benchmarks/data/hits.parquet:2111425208..3167137812], [Users/zhuqi/arrow-datafusion/benchmarks/data/hits.parquet:3167137812..4222850416], [Users/zhuqi/arrow-datafusion/benchmarks/data/hits.parquet:4222850416..5278563020], ...]}, projection=[ClientIP] |
   |               | |
   +---------------+------+
   ```
   
   
   ## Are there any user-facing changes?
   
   Yes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

