pan3793 opened a new pull request #35431:
URL: https://github.com/apache/spark/pull/35431


   ### What changes were proposed in this pull request?
   
   This is a backport PR of #33664 for branch-3.1.
   
   ### Why are the changes needed?
   
   We found a query in production that cost lots of time in optimize phase when 
enable DPP, the SQL pattern like
   
   ```
   select <cols...>
   from a
   left join b on a.<col> = b.<col>
   left join c on b.<col> = c.<col>
   left join d on c.<col> = d.<col>
   left join e on d.<col> = e.<col>
   left join f on e.<col> = f.<col>
   left join g on f.<col> = g.<col>
   left join h on g.<col> = h.<col>
   ...
   ```
   
   <details>
   
   <summary>Before this PR, Analyzer/Optimizer costs 15821.6 seconds</summary>
   
   ```
   === Metrics of Analyzer/Optimizer Rules ===
   Total number of runs: 24763671
   Total time: 15821.588159083 seconds
   
   Rule                                                                         
                      Effective Time / Total Time                     Effective 
Runs / Total Runs
   
   org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries         
                      6325784430499 / 14213630904736                  2047 / 
342802
   org.apache.spark.sql.catalyst.optimizer.ColumnPruning                        
                      28009746 / 267164777817                         1 / 691736
   org.apache.spark.sql.catalyst.analysis.UpdateAttributeNullability            
                      0 / 118329246545                                0 / 342804
   org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases               
                      8997600 / 71070686823                           1 / 348934
   org.apache.spark.sql.catalyst.optimizer.BooleanSimplification                
                      923149600 / 59903945393                         2036 / 
348934
   org.apache.spark.sql.catalyst.optimizer.NullPropagation                      
                      12343892 / 52895299683                          1 / 348934
   org.apache.spark.sql.execution.datasources.SchemaPruning                     
                      0 / 51638842573                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.UnwrapCastInBinaryComparison         
                      0 / 42686293006                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals                 
                      0 / 41190229736                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints          
                      1482075440 / 40899690529                        2048 / 
171401
   org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison             
                      0 / 40747831162                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions    
                      0 / 39946334062                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.FoldablePropagation                  
                      655084587 / 39426073462                         2047 / 
348934
   org.apache.spark.sql.catalyst.optimizer.PruneFilters                         
                      489780 / 34731670364                            1 / 520335
   org.apache.spark.sql.catalyst.optimizer.OptimizeJsonExprs                    
                      0 / 33533545307                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.ReplaceNullWithFalseInPredicate      
                      0 / 31910754610                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.ConstantFolding                      
                      22871375 / 31308048659                          1 / 348934
   org.apache.spark.sql.catalyst.optimizer.LikeSimplification                   
                      0 / 31276042884                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.OptimizeWindowFunctions              
                      0 / 31100888868                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator           
                      0 / 30855866688                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.SimplifyCasts                        
                      0 / 30414758547                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.OptimizeIn                           
                      0 / 30211101358                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.OptimizeUpdateFields                 
                      0 / 29986491532                                 0 / 348936
   org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions         
                      0 / 29740191900                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.SimplifyExtractValueOps              
                      0 / 25399289000                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery      
                      0 / 21381558331                                 0 / 348934
   org.apache.spark.sql.catalyst.optimizer.PushDownPredicates                   
                      5831922974 / 18327095753                        3072 / 
691737
   org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions                   
                      5007709 / 17431480903                           1 / 171401
   org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabaseAndCatalog         
                      0 / 15321596322                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime                   
                      0 / 15060720651                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.DecimalAggregates                    
                      0 / 14874103589                                 0 / 171401
   org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions         
                      50420951 / 14798400219                          1 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReplaceUpdateFieldsExpression        
                      0 / 14791439186                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects                  
                      0 / 14361213979                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReassignLambdaVariableID             
                      0 / 14258459712                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RewriteNonCorrelatedExists           
                      0 / 14094898098                                 0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators                  
                      1975195 / 12967136767                           2 / 691736
   org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates           
                      0 / 12573667353                                 0 / 171401
   org.apache.spark.sql.execution.datasources.v2.V2ScanRelationPushDown         
                      0 / 11355809540                                 0 / 171401
   org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters   
                      350911193 / 9931582633                          2048 / 
171401
   org.apache.spark.sql.catalyst.optimizer.ConstantPropagation                  
                      0 / 8268511133                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.CollapseProject                      
                      10135014 / 6764166524                           1 / 520335
   org.apache.spark.sql.catalyst.optimizer.EliminateResolvedHint                
                      0 / 6244880486                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.CombineUnions                        
                      0 / 5811583863                                  0 / 520335
   org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates            
                      28926471 / 5423624109                           1 / 171401
   org.apache.spark.sql.catalyst.optimizer.CollapseRepartition                  
                      0 / 5245332974                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.EliminateSorts                       
                      0 / 5051398253                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation               
                      0 / 4975057807                                  0 / 342802
   org.apache.spark.sql.catalyst.optimizer.NormalizeFloatingNumbers             
                      0 / 4804919393                                  0 / 171401
   org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate          
                      0 / 4408441952                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.CollapseWindow                       
                      0 / 4349408795                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.EliminateSerialization               
                      0 / 4170274724                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.TransposeWindow                      
                      0 / 4113327220                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.ReorderJoin                          
                      0 / 4107443950                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.CombineFilters                       
                      0 / 4061601886                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion           
                      0 / 4042615936                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.PushDownLeftSemiAntiJoin             
                      0 / 4024293929                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.EliminateLimits                      
                      0 / 3908219408                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery             
                      135506567 / 3895045754                          2047 / 
171401
   org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin                   
                      0 / 3875669986                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.PushLeftSemiLeftAntiThroughJoin      
                      0 / 3875663441                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.LimitPushDown                        
                      0 / 3859481346                                  0 / 348934
   org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate      
                      0 / 3404517096                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.PushExtraPredicateThroughJoin        
                      0 / 2876802099                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughNonJoin          
                      0 / 2546037832                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ExtractPythonUDFFromJoinCondition    
                      0 / 2392444200                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions    
                      0 / 2297492778                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RewriteExceptAll                     
                      0 / 2239751163                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate             
                      0 / 2189398940                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 
                      0 / 2187645769                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter              
                      0 / 2135169574                                  0 / 171401
   org.apache.spark.sql.execution.python.ExtractGroupingPythonUDFFromAggregate  
                      0 / 2058019085                                  0 / 171401
   org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases              
                      602948 / 2030110600                             1 / 171401
   org.apache.spark.sql.catalyst.optimizer.OptimizeLimitZero                    
                      0 / 2028208284                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters                  
                      0 / 1985665817                                  0 / 171401
   org.apache.spark.sql.catalyst.analysis.EliminateView                         
                      0 / 1965869470                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ObjectSerializerPruning              
                      0 / 1892132711                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.RewriteIntersectAll                  
                      0 / 1891839429                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin         
                      0 / 1873028976                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate         
                      0 / 1872377392                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin            
                      0 / 1860128741                                  0 / 171401
   org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts               
                      0 / 405470117                                   0 / 342802
   org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery                     
                      0 / 242854215                                   0 / 171401
   org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder                 
                      0 / 226833963                                   0 / 171401
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences            
                      117151873 / 123533385                           7 / 13
   
org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion  
                   102066025 / 121689146                           5 / 13
   org.apache.spark.sql.catalyst.optimizer.CombineConcats                       
                      0 / 112256648                                   0 / 348934
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts        
                      76183548 / 108546928                            4 / 13
   org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter             
                      0 / 107905167                                   0 / 348934
   org.apache.spark.sql.catalyst.optimizer.EliminateDistinct                    
                      0 / 103804760                                   0 / 171401
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions             
                      85607389 / 94162213                             8 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division                 
                      76446295 / 93129469                             5 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings           
                      53450988 / 87954605                             1 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator             
                      0 / 85268434                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion               
                      50348094 / 80404327                             4 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion         
                      39786113 / 59089899                             3 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveTimeZone                       
                      56110133 / 57619849                             10 / 13
   org.apache.spark.sql.execution.dynamicpruning.PartitionPruning               
                      22888739 / 54329716                             1 / 171401
   org.apache.spark.sql.catalyst.analysis.DecimalPrecision                      
                      0 / 49832160                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion             
                      0 / 46158859                                    0 / 13
   org.apache.spark.sql.execution.datasources.FindDataSourceTable               
                      43323608 / 44575170                             1 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion           
                      14900959 / 39466022                             1 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveBinaryArithmetic      
                      0 / 38815701                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality          
                      0 / 36470552                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations       
                      7216638 / 35530574                              1 / 13
   org.apache.spark.sql.execution.python.ExtractPythonUDFs                      
                      0 / 34271447                                    0 / 171401
   org.apache.spark.sql.catalyst.analysis.ResolveLambdaVariables                
                      0 / 32166544                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$IntegralDivision         
                      0 / 30547791                                    0 / 13
   
org.apache.spark.sql.catalyst.analysis.ResolveExpressionsWithNamePlaceholders   
                   0 / 30506602                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$StringLiteralCoercion    
                      0 / 30116261                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveHigherOrderFunctions           
                      0 / 27859446                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion              
                      0 / 24218064                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion      
                      0 / 23413444                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRandomSeed            
                      0 / 22036715                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion            
                      0 / 21957835                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder           
                      0 / 21463552                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$MapZipWithCoercion       
                      0 / 20489026                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame           
                      0 / 20481142                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates             
                      0 / 19715779                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences     
                      0 / 19693693                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TimeWindowing                         
                      0 / 19530333                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations             
                      16526728 / 18814733                             1 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery              
                      0 / 16649474                                    0 / 13
   org.apache.spark.sql.execution.aggregate.ResolveEncodersInScalaAgg           
                      0 / 15275758                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions              
                      0 / 11329936                                    0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer          
                      4281027 / 10765657                              1 / 13
   org.apache.spark.sql.catalyst.analysis.CTESubstitution                       
                      0 / 9577794                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases               
                      0 / 9217208                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns           
                      0 / 8289279                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions     
                      0 / 8192539                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ApplyCharTypePadding                  
                      0 / 7991101                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.CleanupAliases                        
                      5307130 / 7435920                               1 / 3
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes   
                      0 / 6506053                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast                
                      0 / 6446727                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance           
                      0 / 6075593                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog                 
                      0 / 4092936                                     0 / 13
   
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy
                  0 / 3430531                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables                
                      0 / 2966446                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics     
                      0 / 2864876                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF       
                      0 / 2818668                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy     
                      0 / 2816430                                     0 / 13
   org.apache.spark.sql.execution.analysis.DetectAmbiguousSelfJoin              
                      0 / 2491546                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveUnion                          
                      0 / 2477483                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOutputRelation        
                      0 / 2463866                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveEncodersInUDF         
                      0 / 2376711                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate              
                      0 / 2133804                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec                  
                      0 / 1730246                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 
                      0 / 1712259                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions    
                      0 / 1698918                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot                 
                      0 / 1618569                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin   
                      0 / 1589833                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveCatalogs                       
                      0 / 1560151                                     0 / 13
   
org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions
 0 / 1500410                                     0 / 1049
   org.apache.spark.sql.catalyst.analysis.ResolveInlineTables                   
                      0 / 1463204                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUserSpecifiedColumns  
                      0 / 1355593                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNamespace             
                      0 / 1339813                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions           
                      0 / 1338443                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveInsertInto            
                      0 / 1323062                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic      
                      0 / 1318399                                     0 / 2
   org.apache.spark.sql.execution.datasources.ResolveSQLOnFile                  
                      0 / 1302134                                     0 / 13
   org.apache.spark.sql.execution.datasources.FallBackFileSourceV2              
                      0 / 1074454                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals          
                      0 / 698629                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveJoinStrategyHints 
                      0 / 452536                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveCoalesceHints     
                      0 / 306642                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution          
                      0 / 300756                                      0 / 2
   org.apache.spark.sql.execution.datasources.PreprocessTableCreation           
                      0 / 276685                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.EliminateUnions                       
                      0 / 214857                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences                 
                      0 / 203514                                      0 / 2
   org.apache.spark.sql.execution.datasources.PreprocessTableInsertion          
                      0 / 164163                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveNoopDropTable                  
                      0 / 108345                                      0 / 2
   org.apache.spark.sql.execution.datasources.DataSourceAnalysis                
                      0 / 90574                                       0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAlterTableChanges     
                      0 / 85175                                       0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints           
                      0 / 82212                                       0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveHints$DisableHints             
                      0 / 7780                                        0 / 2
   ```
   </details>
   
   <details>
   
   <summary>After this PR, Analyzer/Optimizer costs 2.4 seconds 
seconds</summary>
   
   ```
   === Metrics of Analyzer/Optimizer Rules ===
   Total number of runs: 2140
   Total time: 2.407325128 seconds
   
   Rule                                                                         
                      Effective Time / Total Time                     Effective 
Runs / Total Runs
   
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$ImplicitTypeCasts        
                      79087019 / 116017648                            4 / 13
   
org.apache.spark.sql.catalyst.analysis.TypeCoercion$FunctionArgumentConversion  
                   88569423 / 112854377                            5 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences            
                      89163583 / 95807127                             7 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$Division                 
                      73197512 / 91715660                             5 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions             
                      73390555 / 81845702                             8 / 13
   org.apache.spark.sql.catalyst.optimizer.ColumnPruning                        
                      24099474 / 80271976                             1 / 6
   org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator             
                      0 / 79928019                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.PushDownPredicates                   
                      73617831 / 79035882                             2 / 7
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$IfCoercion               
                      52077624 / 78992522                             4 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$PromoteStrings           
                      39724587 / 73183778                             1 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveTimeZone                       
                      52807613 / 54633834                             10 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$DateTimeOperations       
                      21340706 / 53040407                             1 / 13
   org.apache.spark.sql.execution.datasources.PruneFileSourcePartitions         
                      51595369 / 51595369                             1 / 1
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$CaseWhenCoercion         
                      32842401 / 49095324                             3 / 13
   org.apache.spark.sql.catalyst.analysis.DecimalPrecision                      
                      0 / 48574229                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$InConversion             
                      0 / 43384882                                    0 / 13
   org.apache.spark.sql.execution.datasources.FindDataSourceTable               
                      40348276 / 41709977                             1 / 13
   org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints          
                      40235278 / 40235278                             1 / 1
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$ConcatCoercion           
                      13542681 / 38418852                             1 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$BooleanEquality          
                      0 / 35130781                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$IntegralDivision         
                      0 / 35129152                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$StringLiteralCoercion    
                      0 / 32116493                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.RewriteDistinctAggregates            
                      32065069 / 32065069                             1 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveBinaryArithmetic      
                      0 / 30166678                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveLambdaVariables                
                      0 / 30082254                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.ConstantFolding                      
                      23910538 / 29049259                             1 / 4
   
org.apache.spark.sql.catalyst.analysis.ResolveExpressionsWithNamePlaceholders   
                   0 / 26220500                                    0 / 13
   org.apache.spark.sql.execution.dynamicpruning.PartitionPruning               
                      25857276 / 25857276                             1 / 1
   org.apache.spark.sql.catalyst.optimizer.RemoveRedundantAliases               
                      11800935 / 25453815                             1 / 4
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$EltCoercion              
                      0 / 24882479                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveHigherOrderFunctions           
                      0 / 24773800                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.BooleanSimplification                
                      0 / 23687102                                    0 / 4
   org.apache.spark.sql.catalyst.optimizer.OptimizeUpdateFields                 
                      0 / 22496833                                    0 / 6
   org.apache.spark.sql.catalyst.analysis.UpdateAttributeNullability            
                      0 / 22086610                                    0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowOrder           
                      0 / 21776763                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$WindowFrameCoercion      
                      0 / 21668319                                    0 / 13
   org.apache.spark.sql.execution.datasources.SchemaPruning                     
                      0 / 21119650                                    0 / 1
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$StackCoercion            
                      0 / 20995370                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$MapZipWithCoercion       
                      0 / 20852224                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveWindowFrame           
                      0 / 20482582                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.NullPropagation                      
                      8449073 / 20310580                              1 / 4
   org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals                 
                      0 / 19685107                                    0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveMissingReferences     
                      0 / 18005681                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRandomSeed            
                      0 / 17873443                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates             
                      0 / 16909722                                    0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations             
                      14231820 / 16732791                             1 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubquery              
                      0 / 16280559                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.FoldablePropagation                  
                      0 / 16121492                                    0 / 4
   org.apache.spark.sql.catalyst.analysis.TimeWindowing                         
                      0 / 15969419                                    0 / 13
   org.apache.spark.sql.execution.aggregate.ResolveEncodersInScalaAgg           
                      0 / 15441100                                    0 / 13
   org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison             
                      0 / 14334553                                    0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveDeserializer          
                      5534984 / 12617237                              1 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$LookupFunctions              
                      0 / 10014043                                    0 / 2
   org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions    
                      0 / 9761592                                     0 / 4
   org.apache.spark.sql.catalyst.analysis.CTESubstitution                       
                      0 / 9747538                                     0 / 2
   org.apache.spark.sql.catalyst.optimizer.RewriteCorrelatedScalarSubquery      
                      0 / 9694808                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.UnwrapCastInBinaryComparison         
                      0 / 9665154                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.OptimizeJsonExprs                    
                      0 / 9430804                                     0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$AddMetadataColumns           
                      0 / 9322119                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAliases               
                      0 / 9113696                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.SimplifyExtractValueOps              
                      0 / 8938695                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.OptimizeIn                           
                      0 / 8921298                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.ReplaceNullWithFalseInPredicate      
                      0 / 8695066                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.RemoveDispensableExpressions         
                      0 / 8438285                                     0 / 4
   org.apache.spark.sql.catalyst.analysis.ApplyCharTypePadding                  
                      0 / 8406029                                     0 / 2
   org.apache.spark.sql.catalyst.optimizer.ReorderAssociativeOperator           
                      0 / 8196064                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.CollapseProject                      
                      6878987 / 8000632                               1 / 5
   org.apache.spark.sql.catalyst.optimizer.SimplifyCasts                        
                      0 / 7996708                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.LikeSimplification                   
                      0 / 7983621                                     0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractWindowExpressions     
                      0 / 7948113                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast                
                      0 / 7930266                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.OptimizeWindowFunctions              
                      0 / 7569548                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.PruneFilters                         
                      1298537 / 7511669                               1 / 5
   org.apache.spark.sql.execution.python.ExtractPythonUDFs                      
                      0 / 7419094                                     0 / 1
   org.apache.spark.sql.execution.datasources.v2.V2ScanRelationPushDown         
                      0 / 7129007                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNewInstance           
                      0 / 6953704                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.TypeCoercion$WidenSetOperationTypes   
                      0 / 6409068                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.RemoveNoopOperators                  
                      2948625 / 6073057                               2 / 6
   org.apache.spark.sql.catalyst.analysis.CleanupAliases                        
                      3304603 / 5512866                               1 / 3
   org.apache.spark.sql.catalyst.optimizer.ReplaceExpressions                   
                      5247113 / 5247113                               1 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveSessionCatalog                 
                      0 / 4331593                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.NormalizeFloatingNumbers             
                      0 / 3939580                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.ConstantPropagation                  
                      0 / 3872746                                     0 / 4
   org.apache.spark.sql.execution.dynamicpruning.CleanupDynamicPruningFilters   
                      3285689 / 3285689                               1 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveTables                
                      0 / 2954363                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGroupingAnalytics     
                      0 / 2772122                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.DecimalAggregates                    
                      0 / 2751645                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$HandleNullInputsForUDF       
                      0 / 2718652                                     0 / 2
   
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOrdinalInOrderByAndGroupBy
                  0 / 2682978                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggAliasInGroupBy     
                      0 / 2598472                                     0 / 13
   org.apache.spark.sql.execution.python.ExtractPythonUDFFromAggregate          
                      0 / 2542252                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveEncodersInUDF         
                      0 / 2451105                                     0 / 2
   org.apache.spark.sql.catalyst.optimizer.EliminateMapObjects                  
                      0 / 2438919                                     0 / 1
   org.apache.spark.sql.execution.analysis.DetectAmbiguousSelfJoin              
                      0 / 2428491                                     0 / 2
   
org.apache.spark.sql.catalyst.expressions.codegen.package$ExpressionCanonicalizer$CleanExpressions
 0 / 2277468                                     0 / 1049
   org.apache.spark.sql.catalyst.optimizer.EliminateAggregateFilter             
                      0 / 2273198                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.EliminateSorts                       
                      0 / 2256059                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.ReplaceUpdateFieldsExpression        
                      0 / 2227798                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveUnion                          
                      0 / 2220071                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.PullupCorrelatedPredicates           
                      0 / 2216158                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.ReassignLambdaVariableID             
                      0 / 2190345                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.RewriteNonCorrelatedExists           
                      0 / 2106499                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.RemoveRepetitionFromGroupExpressions 
                      0 / 2088289                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.CombineConcats                       
                      0 / 2073285                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.EliminateResolvedHint                
                      0 / 1922211                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveSubqueryColumnAliases 
                      0 / 1900401                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.ComputeCurrentTime                   
                      0 / 1895150                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveGenerate              
                      0 / 1863956                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.GetCurrentDatabaseAndCatalog         
                      0 / 1816019                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAggregateFunctions    
                      0 / 1787068                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNaturalAndUsingJoin   
                      0 / 1769046                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolveCatalogs                       
                      0 / 1605576                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.EliminateDistinct                    
                      0 / 1591594                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveInlineTables                   
                      0 / 1547587                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.RewritePredicateSubquery             
                      0 / 1526146                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolvePivot                 
                      0 / 1506795                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.ResolvePartitionSpec                  
                      0 / 1502020                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.PushExtraPredicateThroughJoin        
                      0 / 1442567                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveTableValuedFunctions           
                      0 / 1420037                                     0 / 13
   org.apache.spark.sql.execution.datasources.ResolveSQLOnFile                  
                      0 / 1399535                                     0 / 13
   org.apache.spark.sql.execution.python.ExtractGroupingPythonUDFFromAggregate  
                      0 / 1385780                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.Optimizer$OptimizeSubqueries         
                      0 / 1359249                                     0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveNamespace             
                      0 / 1356344                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveOutputRelation        
                      0 / 1355910                                     0 / 13
   org.apache.spark.sql.execution.datasources.FallBackFileSourceV2              
                      0 / 1325995                                     0 / 13
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUserSpecifiedColumns  
                      0 / 1325072                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.CollapseRepartition                  
                      0 / 1322557                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.PushDownLeftSemiAntiJoin             
                      0 / 1300710                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.EliminateSerialization               
                      0 / 1227217                                     0 / 4
   org.apache.spark.sql.catalyst.analysis.Analyzer$PullOutNondeterministic      
                      0 / 1226182                                     0 / 2
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveInsertInto            
                      0 / 1225119                                     0 / 13
   org.apache.spark.sql.catalyst.optimizer.PushLeftSemiLeftAntiThroughJoin      
                      0 / 1172415                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.ReorderJoin                          
                      0 / 1158346                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.PropagateEmptyRelation               
                      0 / 1156508                                     0 / 2
   org.apache.spark.sql.catalyst.optimizer.PushProjectionThroughUnion           
                      0 / 1082244                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.CombineUnions                        
                      0 / 1050254                                     0 / 5
   org.apache.spark.sql.catalyst.optimizer.CollapseWindow                       
                      0 / 1003136                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.TransposeWindow                      
                      0 / 1001447                                     0 / 4
   org.apache.spark.sql.catalyst.optimizer.ExtractPythonUDFFromJoinCondition    
                      0 / 1001440                                     0 / 1
   org.apache.spark.sql.catalyst.optimizer.LimitPushDown                        
                      0 / 974802                                      0 / 4
   org.apache.spark.sql.catalyst.optimizer.EliminateLimits                      
                      0 / 927602                                      0 / 4
   org.apache.spark.sql.catalyst.optimizer.CombineFilters                       
                      0 / 924219                                      0 / 4
   org.apache.spark.sql.catalyst.optimizer.EliminateOuterJoin                   
                      0 / 768081                                      0 / 4
   org.apache.spark.sql.catalyst.analysis.SubstituteUnresolvedOrdinals          
                      0 / 733641                                      0 / 2
   org.apache.spark.sql.catalyst.optimizer.RemoveLiteralFromGroupExpressions    
                      0 / 449017                                      0 / 1
   org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases              
                      434360 / 434360                                 1 / 1
   org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughNonJoin          
                      0 / 379178                                      0 / 1
   org.apache.spark.sql.catalyst.optimizer.InferFiltersFromGenerate             
                      0 / 295206                                      0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$WindowsSubstitution          
                      0 / 292907                                      0 / 2
   org.apache.spark.sql.catalyst.optimizer.CombineTypedFilters                  
                      0 / 256204                                      0 / 1
   org.apache.spark.sql.catalyst.optimizer.ObjectSerializerPruning              
                      0 / 249912                                      0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveCoalesceHints     
                      0 / 249513                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.EliminateUnions                       
                      0 / 240773                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveHints$ResolveJoinStrategyHints 
                      0 / 230937                                      0 / 2
   org.apache.spark.sql.catalyst.optimizer.ReplaceDeduplicateWithAggregate      
                      0 / 222074                                      0 / 1
   org.apache.spark.sql.catalyst.analysis.EliminateView                         
                      0 / 216679                                      0 / 1
   org.apache.spark.sql.catalyst.analysis.UpdateOuterReferences                 
                      0 / 137675                                      0 / 2
   org.apache.spark.sql.execution.datasources.PreprocessTableCreation           
                      0 / 124279                                      0 / 2
   org.apache.spark.sql.catalyst.analysis.ResolveNoopDropTable                  
                      0 / 111526                                      0 / 2
   org.apache.spark.sql.execution.datasources.DataSourceAnalysis                
                      0 / 100454                                      0 / 2
   org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithFilter              
                      0 / 99833                                       0 / 1
   org.apache.spark.sql.catalyst.optimizer.OptimizeLimitZero                    
                      0 / 98320                                       0 / 1
   org.apache.spark.sql.execution.datasources.PreprocessTableInsertion          
                      0 / 96634                                       0 / 2
   org.apache.spark.sql.catalyst.optimizer.RewriteExceptAll                     
                      0 / 96403                                       0 / 1
   org.apache.spark.sql.catalyst.optimizer.RewriteIntersectAll                  
                      0 / 94322                                       0 / 1
   org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveAlterTableChanges     
                      0 / 94092                                       0 / 2
   org.apache.spark.sql.catalyst.optimizer.ReplaceIntersectWithSemiJoin         
                      0 / 92092                                       0 / 1
   org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin            
                      0 / 91821                                       0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveHints$RemoveAllHints           
                      0 / 91637                                       0 / 2
   org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate         
                      0 / 90672                                       0 / 1
   org.apache.spark.sql.catalyst.optimizer.CheckCartesianProducts               
                      0 / 42624                                       0 / 2
   org.apache.spark.sql.execution.OptimizeMetadataOnlyQuery                     
                      0 / 25905                                       0 / 1
   org.apache.spark.sql.catalyst.optimizer.CostBasedJoinReorder                 
                      0 / 9020                                        0 / 1
   org.apache.spark.sql.catalyst.analysis.ResolveHints$DisableHints             
                      0 / 8111                                        0 / 2
   ```
   
   </details>
   
   The original description of SPARK-36444 did not show this improvement, but 
it do significantly improve the SQL compile performance for such cases.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Significant SQL compile performance improvement for some cases.
   
   ### How was this patch tested?
   
   Added UT.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to