phanikumv commented on code in PR #28874: URL: https://github.com/apache/airflow/pull/28874#discussion_r1067780016
##########
airflow/providers/apache/hive/provider.yaml:
##########
@@ -56,6 +56,7 @@ dependencies:
   # the sasl library anyway (and there sasl library version is not relevant)
   - sasl>=0.3.1; python_version>="3.9"
   - thrift>=0.9.2
+  - impyla

Review Comment:
We wrote our own asyncio method, shown below, because impyla's `execute_async` returns immediately after submitting the query, so the caller has to poll for completion. Please check the link https://github.com/cloudera/impyla/blob/v0.16a2/impala/hiveserver2.py#L334-L338

```
async def partition_exists(self, table: str, schema: str, partition: str, polling_interval: float) -> str:
    """
    Checks for the existence of a partition in the given hive table.

    :param table: table in hive where the partition exists.
    :param schema: database where the hive table exists
    :param partition: partition to check for in given hive database and hive table.
    :param polling_interval: polling interval in seconds to sleep between checks
    """
    client = self.get_hive_client()
    cursor = client.cursor()
    query = f"show partitions {schema}.{table} partition({partition})"
    # execute_async submits the query and returns without waiting for it to finish
    cursor.execute_async(query)
    # Poll until the query completes, yielding to the event loop between checks
    while cursor.is_executing():
        await asyncio.sleep(polling_interval)
    results = cursor.fetchall()
    if len(results) == 0:
        return "failure"
    return "success"
```
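To make the polling pattern in the comment above easier to try without a Hive server, here is a minimal, self-contained sketch. `FakeCursor` is a hypothetical stand-in written for this example and is not part of impyla; it only mimics the three cursor calls used above (`execute_async`, `is_executing`, `fetchall`). The standalone `partition_exists` function takes the cursor as an argument instead of calling `self.get_hive_client()`, which is an assumption made to keep the example runnable.

```python
import asyncio


class FakeCursor:
    """Hypothetical stand-in for an impyla cursor (not part of impyla)."""

    def __init__(self, rows, polls_until_done=3):
        self._rows = rows
        self._remaining = polls_until_done
        self.query = None

    def execute_async(self, query):
        # Records the query and returns immediately, like a non-blocking submit.
        self.query = query

    def is_executing(self):
        # Reports "still running" for a fixed number of polls, then finishes.
        if self._remaining > 0:
            self._remaining -= 1
            return True
        return False

    def fetchall(self):
        return self._rows


async def partition_exists(cursor, table, schema, partition, polling_interval):
    query = f"show partitions {schema}.{table} partition({partition})"
    cursor.execute_async(query)
    while cursor.is_executing():
        # Yield to the event loop between checks instead of blocking the thread.
        await asyncio.sleep(polling_interval)
    return "success" if cursor.fetchall() else "failure"


async def main():
    # Two checks run concurrently: one finds a partition, one does not.
    found, missing = await asyncio.gather(
        partition_exists(FakeCursor([("ds=2023-01-01",)]), "t", "db", "ds='2023-01-01'", 0.01),
        partition_exists(FakeCursor([]), "t", "db", "ds='2023-01-02'", 0.01),
    )
    return found, missing


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because each check awaits `asyncio.sleep` between polls, several partition checks can share one event loop without blocking each other, which is the point of polling rather than calling a blocking `execute`.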
