[ https://issues.apache.org/jira/browse/AIRFLOW-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355928#comment-15355928 ]
ASF subversion and git services commented on AIRFLOW-243:
---------------------------------------------------------
Commit bf28de4e601c165020669fd593964187b6246131 in incubator-airflow's branch
refs/heads/master from [~xuanji]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=bf28de4 ]
[AIRFLOW-243] Create NamedHivePartitionSensor
Closes #1593 from zodiac/create-NamedHivePartitionSensor
> Use a more efficient Thrift call for HivePartitionSensor
> --------------------------------------------------------
>
> Key: AIRFLOW-243
> URL: https://issues.apache.org/jira/browse/AIRFLOW-243
> Project: Apache Airflow
> Issue Type: Improvement
> Components: operators
> Affects Versions: Airflow 2.0
> Reporter: Paul Yang
> Assignee: Li Xuanji
> Priority: Minor
> Fix For: Airflow 2.0
>
>
> The {{HivePartitionSensor}} uses the `get_partitions_by_filter` Thrift call
> that can result in some expensive SQL queries for tables that have many
> partitions and are partitioned by multiple keys. We've seen our metastore DB
> get hammered by these sensors resulting in service degradation for other
> metastore users.
> The {{MetastorePartitionSensor}} is efficient, but it can result in too many
> connections to the metastore DB.
> An alternative is to use the `get_partition_by_name` Thrift call that
> translates into more efficient SQL queries. Because connections will be
> pooled on the Thrift server, the DB won't get overloaded as with the
> {{MetastorePartitionSensor}}. The semantics of the arguments will change, so
> either a new argument needs to be introduced, or a new operator needs to be
> created.
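
To illustrate the by-name lookup proposed above, here is a minimal Python sketch. It is not the code from the commit. The "schema.table/key=value/..." name format, the helper names (parse_partition_name, partition_exists), the NoSuchPartition exception, and the in-memory FakeMetastoreClient are all assumptions made for illustration; only the `get_partition_by_name` call name comes from the issue text. A real sensor would use an actual Thrift metastore client and catch that client's own "not found" error.

```python
class NoSuchPartition(Exception):
    """Hypothetical stand-in for the metastore's 'no such object' error."""


class FakeMetastoreClient:
    """In-memory stand-in for a Thrift metastore client (illustration only)."""

    def __init__(self, partitions):
        # partitions: iterable of (db_name, tbl_name, part_name) tuples
        self._partitions = set(partitions)

    def get_partition_by_name(self, db_name, tbl_name, part_name):
        # Mirrors the shape of the Thrift call: one exact-name lookup,
        # with no filter expression for the metastore to evaluate.
        key = (db_name, tbl_name, part_name)
        if key not in self._partitions:
            raise NoSuchPartition(part_name)
        return key


def parse_partition_name(full_name, default_schema="default"):
    """Split 'schema.table/key=value/...' into (schema, table, partition)."""
    table_part, sep, partition = full_name.partition("/")
    if not sep or not partition:
        raise ValueError("expected 'schema.table/partition', got %r" % full_name)
    schema, dot, table = table_part.partition(".")
    if not dot:
        # No schema given: fall back to the assumed default schema.
        schema, table = default_schema, table_part
    return schema, table, partition


def partition_exists(client, full_name):
    """Return True if the fully named partition exists, False otherwise."""
    schema, table, partition = parse_partition_name(full_name)
    try:
        client.get_partition_by_name(schema, table, partition)
    except NoSuchPartition:
        return False
    return True
```

A sensor built this way would call partition_exists once per poke for each partition name it was given. This also shows the change in argument semantics mentioned above: the caller supplies a complete partition name instead of a filter expression.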