[ https://issues.apache.org/jira/browse/HIVE-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-16421:
-----------------------------------
    Status: Open  (was: Patch Available)

> Runtime filtering breaks user-level explain
> -------------------------------------------
>
>                 Key: HIVE-16421
>                 URL: https://issues.apache.org/jira/browse/HIVE-16421
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-16421.01.patch, HIVE-16421.02.patch
>
>
> Query:
> {noformat}
> SELECT LAG(COALESCE(t2.int_col_14, t1.int_col_80), 22)
>          OVER (ORDER BY t1.tinyint_col_52 DESC) AS int_col
> FROM table_6 t1
> INNER JOIN table_14 t2
>   ON ((t2.decimal0101_col_55) = (t1.decimal0101_col_9));
> {noformat}
> Without runtime filtering:
> {noformat}
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> |                                                                             
>                               Explain                                         
>                                                                   |
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> | Plan not optimized by CBO.                                                  
>                                                                               
>                                                                   |
> |                                                                             
>                                                                               
>                                                                   |
> | Vertex dependency in root stage                                             
>                                                                               
>                                                                   |
> | Map 1 <- Map 3 (BROADCAST_EDGE)                                             
>                                                                               
>                                                                   |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE)                                            
>                                                                               
>                                                                   |
> |                                                                             
>                                                                               
>                                                                   |
> | Stage-0                                                                     
>                                                                               
>                                                                   |
> |    Fetch Operator                                                           
>                                                                               
>                                                                   |
> |       limit:-1                                                              
>                                                                               
>                                                                   |
> |       Stage-1                                                               
>                                                                               
>                                                                   |
> |          Reducer 2                                                          
>                                                                               
>                                                                   |
> |          File Output Operator [FS_364]                                      
>                                                                               
>                                                                   |
> |             compressed:false                                                
>                                                                               
>                                                                   |
> |             Statistics:Num rows: 74781721 Data size: 299126884 Basic stats: 
> COMPLETE Column stats: COMPLETE                                               
>                                                                   |
> |             table:{"input 
> format:":"org.apache.hadoop.mapred.TextInputFormat","output 
> format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
>   |
> |             Select Operator [SEL_362]                                       
>                                                                               
>                                                                   |
> |                outputColumnNames:["_col0"]                                  
>                                                                               
>                                                                   |
> |                Statistics:Num rows: 74781721 Data size: 299126884 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                       |
> |                PTF Operator [PTF_361]                                       
>                                                                               
>                                                                   |
> |                   Function definitions:[{"Input 
> definition":{"type:":"WINDOWING"}},{"order 
> by:":"_col51(DESC)","name:":"windowingtablefunction","partition by:":"0"}]    
>                                                    |
> |                   Statistics:Num rows: 74781721 Data size: 897380652 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                    |
> |                   Select Operator [SEL_360]                                 
>                                                                               
>                                                                   |
> |                   |  outputColumnNames:["_col51","_col79","_col97"]         
>                                                                               
>                                                                   |
> |                   |  Statistics:Num rows: 74781721 Data size: 897380652 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                       |
> |                   |<-Map 1 [SIMPLE_EDGE] vectorized                         
>                                                                               
>                                                                   |
> |                      Reduce Output Operator [RS_375]                        
>                                                                               
>                                                                   |
> |                         key expressions:0 (type: int), _col51 (type: 
> tinyint)                                                                      
>                                                                          |
> |                         Map-reduce partition columns:0 (type: int)          
>                                                                               
>                                                                   |
> |                         sort order:+-                                       
>                                                                               
>                                                                   |
> |                         Statistics:Num rows: 74781721 Data size: 897380652 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                    |
> |                         value expressions:_col79 (type: int), _col97 (type: 
> int)                                                                          
>                                                                   |
> |                         Map Join Operator [MAPJOIN_374]                     
>                                                                               
>                                                                   |
> |                         |  condition map:[{"":"Inner Join 0 to 1"}]         
>                                                                               
>                                                                   |
> |                         |  HybridGraceHashJoin:true                         
>                                                                               
>                                                                   |
> |                         |  keys:{"Map 3":"decimal0101_col_55 (type: 
> decimal(1,1))","Map 1":"decimal0101_col_9 (type: decimal(1,1))"}              
>                                                                           |
> |                         |  outputColumnNames:["_col51","_col79","_col97"]   
>                                                                               
>                                                                   |
> |                         |  Statistics:Num rows: 74781721 Data size: 
> 897380652 Basic stats: COMPLETE Column stats: COMPLETE                        
>                                                                           |
> |                         |<-Map 3 [BROADCAST_EDGE] vectorized                
>                                                                               
>                                                                   |
> |                         |  Reduce Output Operator [RS_372]                  
>                                                                               
>                                                                   |
> |                         |     key expressions:decimal0101_col_55 (type: 
> decimal(1,1))                                                                 
>                                                                       |
> |                         |     Map-reduce partition 
> columns:decimal0101_col_55 (type: decimal(1,1))                               
>                                                                               
>              |
> |                         |     sort order:+                                  
>                                                                               
>                                                                   |
> |                         |     Statistics:Num rows: 26256 Data size: 2749496 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                   |
> |                         |     value expressions:int_col_14 (type: int)      
>                                                                               
>                                                                   |
> |                         |     Filter Operator [FIL_371]                     
>                                                                               
>                                                                   |
> |                         |        predicate:decimal0101_col_55 is not null 
> (type: boolean)                                                               
>                                                                     |
> |                         |        Statistics:Num rows: 26256 Data size: 
> 2749496 Basic stats: COMPLETE Column stats: COMPLETE                          
>                                                                        |
> |                         |        TableScan [TS_353]                         
>                                                                               
>                                                                   |
> |                         |           alias:t2                                
>                                                                               
>                                                                   |
> |                         |           Statistics:Num rows: 29079 Data size: 
> 117014275 Basic stats: COMPLETE Column stats: COMPLETE                        
>                                                                     |
> |                         |<-Filter Operator [FIL_373]                        
>                                                                               
>                                                                   |
> |                               predicate:decimal0101_col_9 is not null 
> (type: boolean)                                                               
>                                                                         |
> |                               Statistics:Num rows: 48419 Data size: 5233788 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                   |
> |                               TableScan [TS_352]                            
>                                                                               
>                                                                   |
> |                                  alias:t1                                   
>                                                                               
>                                                                   |
> |                                  Statistics:Num rows: 53742 Data size: 
> 200230374 Basic stats: COMPLETE Column stats: COMPLETE                        
>                                                                        |
> |                                                                             
>                                                                               
>                                                                   |
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> {noformat}
> With runtime filtering:
> {noformat}
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> |                                                                             
>                                                                     Explain   
>                                                                               
>                                                                  |
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> | STAGE DEPENDENCIES:                                                         
>                                                                               
>                                                                               
>                                                                  |
> |   Stage-1 is a root stage                                                   
>                                                                               
>                                                                               
>                                                                  |
> |   Stage-0 depends on stages: Stage-1                                        
>                                                                               
>                                                                               
>                                                                  |
> |                                                                             
>                                                                               
>                                                                               
>                                                                  |
> | STAGE PLANS:                                                                
>                                                                               
>                                                                               
>                                                                  |
> |   Stage: Stage-1                                                            
>                                                                               
>                                                                               
>                                                                  |
> |     Tez                                                                     
>                                                                               
>                                                                               
>                                                                  |
> |       DagId: hive_20170411232247_e177745a-39d0-4ae7-8ca0-871a137b36fa:1     
>                                                                               
>                                                                               
>                                                                  |
> |       Edges:                                                                
>                                                                               
>                                                                               
>                                                                  |
> |         Map 1 <- Map 3 (BROADCAST_EDGE), Reducer 4 (BROADCAST_EDGE)         
>                                                                               
>                                                                               
>                                                                  |
> |         Reducer 2 <- Map 1 (SIMPLE_EDGE)                                    
>                                                                               
>                                                                               
>                                                                  |
> |         Reducer 4 <- Map 3 (SIMPLE_EDGE)                                    
>                                                                               
>                                                                               
>                                                                  |
> |       DagName:                                                              
>                                                                               
>                                                                               
>                                                                  |
> |       Vertices:                                                             
>                                                                               
>                                                                               
>                                                                  |
> |         Map 1                                                               
>                                                                               
>                                                                               
>                                                                  |
> |             Map Operator Tree:                                              
>                                                                               
>                                                                               
>                                                                  |
> |                 TableScan                                                   
>                                                                               
>                                                                               
>                                                                  |
> |                   alias: t1                                                 
>                                                                               
>                                                                               
>                                                                  |
> |                   filterExpr: (decimal0101_col_9 is not null and 
> (decimal0101_col_9 BETWEEN DynamicValue(RS_7_t2_decimal0101_col_9_min) AND 
> DynamicValue(RS_7_t2_decimal0101_col_9_max) and 
> in_bloom_filter(decimal0101_col_9, 
> DynamicValue(RS_7_t2_decimal0101_col_9_bloom_filter)))) (type: boolean)   |
> |                   Statistics: Num rows: 53742 Data size: 5809320 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                       |
> |                   Filter Operator                                           
>                                                                               
>                                                                               
>                                                                  |
> |                     predicate: (decimal0101_col_9 is not null and 
> (decimal0101_col_9 BETWEEN DynamicValue(RS_7_t2_decimal0101_col_9_min) AND 
> DynamicValue(RS_7_t2_decimal0101_col_9_max) and 
> in_bloom_filter(decimal0101_col_9, 
> DynamicValue(RS_7_t2_decimal0101_col_9_bloom_filter)))) (type: boolean)  |
> |                     Statistics: Num rows: 48419 Data size: 5233908 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                     |
> |                     Select Operator                                         
>                                                                               
>                                                                               
>                                                                  |
> |                       expressions: decimal0101_col_9 (type: decimal(1,1)), 
> tinyint_col_52 (type: tinyint), int_col_80 (type: int)                        
>                                                                               
>                                                                   |
> |                       outputColumnNames: _col0, _col1, _col2                
>                                                                               
>                                                                               
>                                                                  |
> |                       Statistics: Num rows: 48419 Data size: 5233908 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                   |
> |                       Map Join Operator                                     
>                                                                               
>                                                                               
>                                                                  |
> |                         condition map:                                      
>                                                                               
>                                                                               
>                                                                  |
> |                              Inner Join 0 to 1                              
>                                                                               
>                                                                               
>                                                                  |
> |                         keys:                                               
>                                                                               
>                                                                               
>                                                                  |
> |                           0 _col0 (type: decimal(1,1))                      
>                                                                               
>                                                                               
>                                                                  |
> |                           1 _col1 (type: decimal(1,1))                      
>                                                                               
>                                                                               
>                                                                  |
> |                         outputColumnNames: _col1, _col2, _col3              
>                                                                               
>                                                                               
>                                                                  |
> |                         input vertices:                                     
>                                                                               
>                                                                               
>                                                                  |
> |                           1 Map 3                                           
>                                                                               
>                                                                               
>                                                                  |
> |                         Statistics: Num rows: 74781721 Data size: 897380652 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                               
>                                                                  |
> |                         Reduce Output Operator                              
>                                                                               
>                                                                               
>                                                                  |
> |                           key expressions: 0 (type: int), _col1 (type: 
> tinyint)                                                                      
>                                                                               
>                                                                       |
> |                           sort order: +-                                    
>                                                                               
>                                                                               
>                                                                  |
> |                           Map-reduce partition columns: 0 (type: int)       
>                                                                               
>                                                                               
>                                                                  |
> |                           Statistics: Num rows: 74781721 Data size: 
> 897380652 Basic stats: COMPLETE Column stats: COMPLETE                        
>                                                                               
>                                                                          |
> |                           value expressions: _col2 (type: int), _col3 
> (type: int)                                                                   
>                                                                               
>                                                                        |
> |             Execution mode: vectorized, llap                                
>                                                                               
>                                                                               
>                                                                  |
> |         Map 3                                                               
>                                                                               
>                                                                               
>                                                                  |
> |             Map Operator Tree:                                              
>                                                                               
>                                                                               
>                                                                  |
> |                 TableScan                                                   
>                                                                               
>                                                                               
>                                                                  |
> |                   alias: t2                                                 
>                                                                               
>                                                                               
>                                                                  |
> |                   filterExpr: decimal0101_col_55 is not null (type: 
> boolean)                                                                      
>                                                                               
>                                                                          |
> |                   Statistics: Num rows: 29079 Data size: 3045240 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                       |
> |                   Filter Operator                                           
>                                                                               
>                                                                               
>                                                                  |
> |                     predicate: decimal0101_col_55 is not null (type: 
> boolean)                                                                      
>                                                                               
>                                                                         |
> |                     Statistics: Num rows: 26256 Data size: 2749612 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                     |
> |                     Select Operator                                         
>                                                                               
>                                                                               
>                                                                  |
> |                       expressions: int_col_14 (type: int), 
> decimal0101_col_55 (type: decimal(1,1))                                       
>                                                                               
>                                                                               
>     |
> |                       outputColumnNames: _col0, _col1                       
>                                                                               
>                                                                               
>                                                                  |
> |                       Statistics: Num rows: 26256 Data size: 2749612 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                   |
> |                       Reduce Output Operator                                
>                                                                               
>                                                                               
>                                                                  |
> |                         key expressions: _col1 (type: decimal(1,1))         
>                                                                               
>                                                                               
>                                                                  |
> |                         sort order: +                                       
>                                                                               
>                                                                               
>                                                                  |
> |                         Map-reduce partition columns: _col1 (type: 
> decimal(1,1))                                                                 
>                                                                               
>                                                                           |
> |                         Statistics: Num rows: 26256 Data size: 2749612 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                               
>                                                                       |
> |                         value expressions: _col0 (type: int)                
>                                                                               
>                                                                               
>                                                                  |
> |                       Select Operator                                       
>                                                                               
>                                                                               
>                                                                  |
> |                         expressions: _col1 (type: decimal(1,1))             
>                                                                               
>                                                                               
>                                                                  |
> |                         outputColumnNames: _col0                            
>                                                                               
>                                                                               
>                                                                  |
> |                         Statistics: Num rows: 26256 Data size: 2749612 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                               
>                                                                       |
> |                         Group By Operator                                   
>                                                                               
>                                                                               
>                                                                  |
> |                           aggregations: min(_col0), max(_col0), 
> bloom_filter(_col0, expectedEntries=17)                                       
>                                                                               
>                                                                              |
> |                           mode: hash                                        
>                                                                               
>                                                                               
>                                                                  |
> |                           outputColumnNames: _col0, _col1, _col2            
>                                                                               
>                                                                               
>                                                                  |
> |                           Statistics: Num rows: 1 Data size: 336 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                       |
> |                           Reduce Output Operator                            
>                                                                               
>                                                                               
>                                                                  |
> |                             sort order:                                     
>                                                                               
>                                                                               
>                                                                  |
> |                             Statistics: Num rows: 1 Data size: 336 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                     |
> |                             value expressions: _col0 (type: decimal(1,1)), 
> _col1 (type: decimal(1,1)), _col2 (type: binary)                              
>                                                                               
>                                                                   |
> |             Execution mode: vectorized, llap                                
>                                                                               
>                                                                               
>                                                                  |
> |         Reducer 2                                                           
>                                                                               
>                                                                               
>                                                                  |
> |             Execution mode: llap                                            
>                                                                               
>                                                                               
>                                                                  |
> |             Reduce Operator Tree:                                           
>                                                                               
>                                                                               
>                                                                  |
> |               Select Operator                                               
>                                                                               
>                                                                               
>                                                                  |
> |                 expressions: KEY.reducesinkkey1 (type: tinyint), 
> VALUE._col1 (type: int), VALUE._col2 (type: int)                              
>                                                                               
>                                                                             |
> |                 outputColumnNames: _col1, _col2, _col3                      
>                                                                               
>                                                                               
>                                                                  |
> |                 Statistics: Num rows: 74781721 Data size: 897380652 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                    |
> |                 PTF Operator                                                
>                                                                               
>                                                                               
>                                                                  |
> |                   Function definitions:                                     
>                                                                               
>                                                                               
>                                                                  |
> |                       Input definition                                      
>                                                                               
>                                                                               
>                                                                  |
> |                         input alias: ptf_0                                  
>                                                                               
>                                                                               
>                                                                  |
> |                         output shape: _col1: tinyint, _col2: int, _col3: 
> int                                                                           
>                                                                               
>                                                                     |
> |                         type: WINDOWING                                     
>                                                                               
>                                                                               
>                                                                  |
> |                       Windowing table definition                            
>                                                                               
>                                                                               
>                                                                  |
> |                         input alias: ptf_1                                  
>                                                                               
>                                                                               
>                                                                  |
> |                         name: windowingtablefunction                        
>                                                                               
>                                                                               
>                                                                  |
> |                         order by: _col1 DESC NULLS LAST                     
>                                                                               
>                                                                               
>                                                                  |
> |                         partition by: 0                                     
>                                                                               
>                                                                               
>                                                                  |
> |                         raw input shape:                                    
>                                                                               
>                                                                               
>                                                                  |
> |                         window functions:                                   
>                                                                               
>                                                                               
>                                                                  |
> |                             window function definition                      
>                                                                               
>                                                                               
>                                                                  |
> |                               alias: LAG_window_0                           
>                                                                               
>                                                                               
>                                                                  |
> |                               arguments: COALESCE(_col3,_col2), 22          
>                                                                               
>                                                                               
>                                                                  |
> |                               name: LAG                                     
>                                                                               
>                                                                               
>                                                                  |
> |                               window function: GenericUDAFLagEvaluator      
>                                                                               
>                                                                               
>                                                                  |
> |                               window frame: PRECEDING(MAX)~FOLLOWING(MAX)   
>                                                                               
>                                                                               
>                                                                  |
> |                               isPivotResult: true                           
>                                                                               
>                                                                               
>                                                                  |
> |                   Statistics: Num rows: 74781721 Data size: 897380652 Basic 
> stats: COMPLETE Column stats: COMPLETE                                        
>                                                                               
>                                                                  |
> |                   Select Operator                                           
>                                                                               
>                                                                               
>                                                                  |
> |                     expressions: LAG_window_0 (type: int)                   
>                                                                               
>                                                                               
>                                                                  |
> |                     outputColumnNames: _col0                                
>                                                                               
>                                                                               
>                                                                  |
> |                     Statistics: Num rows: 74781721 Data size: 299126884 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                               
>                                                                      |
> |                     File Output Operator                                    
>                                                                               
>                                                                               
>                                                                  |
> |                       compressed: false                                     
>                                                                               
>                                                                               
>                                                                  |
> |                       Statistics: Num rows: 74781721 Data size: 299126884 
> Basic stats: COMPLETE Column stats: COMPLETE                                  
>                                                                               
>                                                                    |
> |                       table:                                                
>                                                                               
>                                                                               
>                                                                  |
> |                           input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat                              
>                                                                               
>                                                                               
>                        |
> |                           output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat                     
>                                                                               
>                                                                               
>                       |
> |                           serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe                            
>                                                                               
>                                                                               
>                               |
> |         Reducer 4                                                           
>                                                                               
>                                                                               
>                                                                  |
> |             Execution mode: vectorized, llap                                
>                                                                               
>                                                                               
>                                                                  |
> |             Reduce Operator Tree:                                           
>                                                                               
>                                                                               
>                                                                  |
> |               Group By Operator                                             
>                                                                               
>                                                                               
>                                                                  |
> |                 aggregations: min(VALUE._col0), max(VALUE._col1), 
> bloom_filter(VALUE._col2, expectedEntries=17)                                 
>                                                                               
>                                                                            |
> |                 mode: final                                                 
>                                                                               
>                                                                               
>                                                                  |
> |                 outputColumnNames: _col0, _col1, _col2                      
>                                                                               
>                                                                               
>                                                                  |
> |                 Statistics: Num rows: 1 Data size: 336 Basic stats: 
> COMPLETE Column stats: COMPLETE                                               
>                                                                               
>                                                                          |
> |                 Reduce Output Operator                                      
>                                                                               
>                                                                               
>                                                                  |
> |                   sort order:                                               
>                                                                               
>                                                                               
>                                                                  |
> |                   Statistics: Num rows: 1 Data size: 336 Basic stats: 
> COMPLETE Column stats: COMPLETE                                               
>                                                                               
>                                                                        |
> |                   value expressions: _col0 (type: decimal(1,1)), _col1 
> (type: decimal(1,1)), _col2 (type: binary)                                    
>                                                                               
>                                                                       |
> |                                                                             
>                                                                               
>                                                                               
>                                                                  |
> |   Stage: Stage-0                                                            
>                                                                               
>                                                                               
>                                                                  |
> |     Fetch Operator                                                          
>                                                                               
>                                                                               
>                                                                  |
> |       limit: -1                                                             
>                                                                               
>                                                                               
>                                                                  |
> |       Processor Tree:                                                       
>                                                                               
>                                                                               
>                                                                  |
> |         ListSink                                                            
>                                                                               
>                                                                               
>                                                                  |
> |                                                                             
>                                                                               
>                                                                               
>                                                                  |
> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
> 135 rows selected (2.348 seconds)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
