[ 
https://issues.apache.org/jira/browse/AIRFLOW-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355928#comment-15355928
 ] 

ASF subversion and git services commented on AIRFLOW-243:
---------------------------------------------------------

Commit bf28de4e601c165020669fd593964187b6246131 in incubator-airflow's branch 
refs/heads/master from [~xuanji]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=bf28de4 ]

[AIRFLOW-243] Create NamedHivePartitionSensor

Closes #1593 from zodiac/create-NamedHivePartitionSensor


> Use a more efficient Thrift call for HivePartitionSensor
> --------------------------------------------------------
>
>                 Key: AIRFLOW-243
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-243
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators
>    Affects Versions: Airflow 2.0
>            Reporter: Paul Yang
>            Assignee: Li Xuanji
>            Priority: Minor
>             Fix For: Airflow 2.0
>
>
> The {{HivePartitionSensor}} uses the `get_partitions_by_filter` Thrift call 
> that can result in some expensive SQL queries for tables that have many 
> partitions and are partitioned by multiple keys. We've seen our metastore DB 
> get hammered by these sensors resulting in service degradation for other 
> metastore users.
> The {{MetastorePartitionSensor}} is efficient, but it can result in too many 
> connections to the metastore DB.
> An alternative is to use the `get_partition_by_name` Thrift call that 
> translates into more efficient SQL queries. Because connections will be 
> pooled on the Thrift server, the DB won't get overloaded as with the 
> {{MetastorePartitionSensor}}. The semantics of the arguments will change, so 
> either a new argument needs to be introduced, or a new operator needs to be 
> created.
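
For illustration, a minimal sketch of the name-based check described above. It
assumes a fully qualified name of the form "schema.table/key1=val1/key2=val2".
The helper names, the default schema, and the FakeMetastoreClient stand-in are
hypothetical and are not taken from the commit. Only the
`get_partition_by_name` call comes from the issue text. With a real Thrift
client, the "not found" exception type would need to be passed in place of
LookupError.

```python
def parse_partition_name(qualified_name, default_schema="default"):
    """Split 'schema.table/key1=val1/...' into (schema, table, partition)."""
    table_part, sep, partition = qualified_name.partition("/")
    if not sep or not partition:
        raise ValueError(
            "Expected 'schema.table/partition', got: %r" % qualified_name
        )
    schema, dot, table = table_part.partition(".")
    if not dot:
        # No schema prefix given: fall back to the assumed default schema.
        schema, table = default_schema, table_part
    return schema, table, partition


def partition_exists(client, qualified_name, not_found_errors=(LookupError,)):
    """Return True if the named partition exists, using a single lookup by name.

    `client` is any object exposing get_partition_by_name(db, table, part).
    `not_found_errors` lists the exception types that mean "no such partition".
    """
    schema, table, partition = parse_partition_name(qualified_name)
    try:
        client.get_partition_by_name(schema, table, partition)
    except not_found_errors:
        return False
    return True


class FakeMetastoreClient(object):
    """Hypothetical in-memory stand-in for a metastore Thrift client."""

    def __init__(self, partitions):
        self._partitions = set(partitions)

    def get_partition_by_name(self, db_name, tbl_name, part_name):
        key = (db_name, tbl_name, part_name)
        if key not in self._partitions:
            raise LookupError("partition not found: %s.%s/%s" % key)
        return key


if __name__ == "__main__":
    client = FakeMetastoreClient({("db", "tbl", "ds=2016-06-29/hr=01")})
    print(partition_exists(client, "db.tbl/ds=2016-06-29/hr=01"))
    print(partition_exists(client, "db.tbl/ds=2016-06-30/hr=01"))
```

Because the caller supplies the exact partition name, the lookup does not need
a filter expression, which is the change in argument semantics the description
mentions.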



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
