[ 
https://issues.apache.org/jira/browse/AIRFLOW-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil updated AIRFLOW-4965:
---------------------------
    Summary: Handling throttling in GCP AI operators  (was: Handling throttling 
in AI Airflow operators)

> Handling throttling in GCP AI operators
> ---------------------------------------
>
>                 Key: AIRFLOW-4965
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4965
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: gcp
>    Affects Versions: 1.10.3
>            Reporter: Kamil
>            Priority: Minor
>
> Polidea develops Apache Airflow operators for the following Google Cloud AI 
> services:
>  * Cloud Translate
>  * Cloud Vision 
>  * Cloud Text-To-Speech
>  * Cloud Speech-To-Text
>  * Cloud Translate Speech 
>  * Cloud Natural Language 
>  * Cloud Video Intelligence
> Those APIs implement quota verification and throttle requests that exceed the 
> quota. Here are the relevant links describing these quotas:
> [https://cloud.google.com/translate/quotas] 
> [https://cloud.google.com/vision/quotas]
> [https://cloud.google.com/speech-to-text/quotas]
> [https://cloud.google.com/text-to-speech/quotas]
> [https://cloud.google.com/natural-language/quotas]
> [https://cloud.google.com/video-intelligence/quotas]
> There are several types of quotas and limits:
>  
> *Translate:*
>  * characters per day [403 error “Daily Limit Exceeded”]
>  * characters per 100 seconds (per project or per project/user) [403 error 
> “User Rate Limit Exceeded”] [TEMPORARY]
>  
> *Vision:*
>  * image file size
>  * requests per minute [TEMPORARY]
>  * images per feature per month 
>  
> *Text to speech:*
>  * total characters per request
>  * requests per minute [TEMPORARY]
>  * characters per minute [TEMPORARY]
>  
> *Speech to text:*
>  * limits of the content size
>  * limits of the phrases/characters per request for context
>  * requests per 60 seconds [TEMPORARY]
>  * processing per day
>  
> *Natural Language:*
>  * Text Content size
>  * Token quota and Entity mentions (ignored?)
>  * requests per 100 seconds [TEMPORARY]
>  * requests per day
>  
> *Video Intelligence:*
>  * video size
>  * requests per minute [TEMPORARY]
>  * backend time in seconds per minute [TEMPORARY]
>  
> In all Cloud AI operators we use the Python Client API. Most methods use the 
> built-in 
> [Retry|https://googleapis.github.io/google-cloud-python/latest/core/retry.html#google.api_core.retry.Retry]
>  object and its retry mechanism. The assumption is that, for methods that use 
> the mechanism, it is implemented correctly and that by default only “retriable” 
> errors are retried. Users can configure the behaviour of the Retry object - the 
> exponential back-off factor, delays, etc. In the current API, the Retry object 
> can be provided by the user who creates the DAG and uses the operator.
>  
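> The back-off behaviour such a Retry object implements can be sketched as 
> follows (parameter names mirror the google.api_core.retry.Retry conventions, 
> but this is an illustrative model, not the library's implementation):

```python
# Illustrative sketch of the exponential back-off schedule produced by a
# Retry object configured with an initial delay, a multiplier and a cap.
# Parameter names follow google.api_core.retry.Retry conventions
# (an assumption for illustration); this does not call the library itself.
def backoff_delays(initial=1.0, multiplier=2.0, maximum=60.0, attempts=6):
    """Yield the delay (in seconds) applied before each retry attempt."""
    delay = initial
    for _ in range(attempts):
        yield min(delay, maximum)
        delay *= multiplier

# With initial=1 and multiplier=2 the waits grow 1, 2, 4, ... until capped.
```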
> The APIs that use the Retry object are:
>  * *Cloud Vision Product Search*
>  * *Cloud Vision Extra*
>  * *Cloud Vision Detect*
>  * *Cloud Natural Language*
>  * *Cloud Speech*
>  * *Cloud Video Intelligence*
> The Retry mechanism provided by the Client API should be enough to handle 
> temporary bursts of requests. Users can control the exponential back-off rate 
> and adjust it to their own needs. They are also able to manually restart failed 
> jobs using standard Airflow mechanisms in case their configuration is not well 
> adjusted to their limits.
>  
> The only case in the API where Retry is not used is the Translate operator - 
> specifically the 
> [translate|https://googleapis.github.io/google-cloud-python/latest/translate/client.html#google.cloud.translate_v2.client.Client.translate]
>  API.
>  
> In the case of the Translate API, the proposal is to use a Retry decorator in 
> our own hook and perform retries only for the *“User Rate Limit Exceeded”* 
> error; all other errors (the size limit and “Daily Limit Exceeded”) should be 
> treated as non-retriable. In those cases users will be able to manually restart 
> failed jobs.
>  
> h1. Implementation
> We analyzed two solutions:
>  # extension of the built-in Retry mechanism from the google-cloud-python 
> library
>  # an external library - tenacity
>  
> The first solution seems natural, but it is problematic. Each method creates 
> its retry object by default from a configuration defined in a private 
> configuration file.
> References: 
> [https://github.com/googleapis/google-cloud-python/blob/b718d2d9bb32b0e7934ae90d57dc80c81ce0fb73/vision/google/cloud/vision_v1/gapic/image_annotator_client.py#L162-L168]
> [https://github.com/googleapis/google-cloud-python/blob/b718d2d9bb32b0e7934ae90d57dc80c81ce0fb73/vision/google/cloud/vision_v1/gapic/image_annotator_client.py#L296-L304]
> If we wanted to extend this mechanism, we would have to copy that logic into 
> our own code. The google-cloud-python library does not allow us to easily 
> change only part of the retry object's configuration.
>  
> The retry mechanism is not supported by all services (See: [Current 
> approach|https://docs.google.com/document/d/1QYwTy6r7bbLK3cmE9x1VShelHQ4B9SLHZDVjp6ja1S4/edit#heading=h.y76gaxcevnym]),
>  so there is a need to create a separate mechanism. The new mechanism based 
> on the external library will work with all services. This will provide a more 
> predictable developer experience.
>  
> The tenacity library provides a decorator-based retry mechanism. It uses a 
> wait strategy that applies exponential back-off. All hook methods that are 
> covered by quota restrictions will get the new decorator.
>  
> Sample implementation:
> {code:python}
> @tenacity.retry(
>     wait=tenacity.wait_exponential(min=1, max=100),
>     retry=retry_if_temporary_quota(),
> )
> def fetch():
>     response = client.translate(TEXT, target_language="PL")['translatedText']
>     return response
> {code}
>  
> _retry_if_temporary_quota_ is a factory function that creates a predicate to 
> check whether an exception concerns a temporary quota restriction.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
