[ 
https://issues.apache.org/jira/browse/FLINK-19103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-19103:
-----------------------------------
    Labels: stale-major  (was: )

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue has been marked as 
Major but is unassigned, and neither it nor its Sub-Tasks have been updated 
for 30 days. I have gone ahead and added a "stale-major" label to the issue. If 
this ticket is a Major, please either assign yourself or give an update. 
Afterwards, please remove the label or in 7 days the issue will be deprioritized.


> The PushPartitionIntoTableSourceScanRule will lead to a performance problem when 
> there are still many partitions after pruning
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-19103
>                 URL: https://issues.apache.org/jira/browse/FLINK-19103
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / Planner
>    Affects Versions: 1.10.2, 1.11.1
>            Reporter: fa zheng
>            Priority: Major
>              Labels: stale-major
>             Fix For: 1.14.0
>
>
> The PushPartitionIntoTableSourceScanRule obtains a new statistic after 
> pruning. However, it uses a for loop to get the statistics of each partition 
> and then merges them together. During this process, Flink calls the 
> metastore's interface four times per loop iteration. When many partitions 
> remain after pruning, it spends a lot of time computing the new statistic. 
>  
> {code:scala}
>     val newStatistic = {
>       val tableStats = catalogOption match {
>         case Some(catalog) =>
>           def mergePartitionStats(): TableStats = {
>             var stats: TableStats = null
>             for (p <- remainingPartitions) {
>               getPartitionStats(catalog, tableIdentifier, p) match {
>                 case Some(currStats) =>
>                   if (stats == null) {
>                     stats = currStats
>                   } else {
>                     stats = stats.merge(currStats)
>                   }
>                 case None => return null
>               }
>             }
>             stats
>           }
>           mergePartitionStats()
>         case None => null
>       }
>
>       FlinkStatistic.builder().statistic(statistic).tableStats(tableStats).build()
>     }
> {code}
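
One possible direction, sketched here only as an illustration and not as the fix adopted for this issue, is to fetch the statistics of all remaining partitions in a single catalog round trip and then merge them locally. In the sketch below, `TableStats` is a simplified stand-in that models only the row count, and `BulkStatsCatalog` with its `getPartitionStatsBulk` method is a hypothetical interface assumed for the example; neither is taken from the Flink code quoted above.

```scala
// Simplified stand-in for Flink's TableStats; only the row count is modeled.
final case class TableStats(rowCount: Long) {
  def merge(other: TableStats): TableStats = TableStats(rowCount + other.rowCount)
}

// Hypothetical catalog interface (an assumption for this sketch): a single
// call returns the statistics of every requested partition that has them.
trait BulkStatsCatalog {
  def getPartitionStatsBulk(partitions: Seq[String]): Map[String, TableStats]
}

object PartitionStatsMerge {
  // Returns None when any partition lacks statistics (the `return null` case
  // in the original loop) or when there are no remaining partitions.
  def mergePartitionStats(
      catalog: BulkStatsCatalog,
      remainingPartitions: Seq[String]): Option[TableStats] = {
    // One round trip instead of several metastore calls per partition.
    val fetched = catalog.getPartitionStatsBulk(remainingPartitions)
    val perPartition = remainingPartitions.map(fetched.get)
    if (perPartition.exists(_.isEmpty)) None
    else perPartition.flatten.reduceOption(_ merge _)
  }
}
```

With such an interface the number of metastore round trips no longer grows with the number of remaining partitions. Whether a real metastore can serve this kind of bulk request efficiently is a separate question that this sketch does not answer.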



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
