[https://issues.apache.org/jira/browse/AIRFLOW-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
Kamil updated AIRFLOW-4965:
---------------------------
Description:
Polidea develops Apache Airflow operators for the following Google Cloud AI
services:
* Cloud Translate
* Cloud Vision
* Cloud Text-To-Speech
* Cloud Speech-To-Text
* Cloud Translate Speech
* Cloud Natural Language
* Cloud Video Intelligence
These APIs enforce quotas and throttle requests that exceed them. The relevant
documentation:
[https://cloud.google.com/translate/quotas]
[https://cloud.google.com/vision/quotas]
[https://cloud.google.com/speech-to-text/quotas]
[https://cloud.google.com/text-to-speech/quotas]
[https://cloud.google.com/natural-language/quotas]
[https://cloud.google.com/video-intelligence/quotas]
There are several types of quotas and limits:
*Translate:*
* characters per day [403 error “Daily Limit Exceeded”]
* characters per 100 seconds (per project or per project/user) [403 error
“User Rate Limit Exceeded”] [TEMPORARY]
*Vision:*
* image file size
* requests per minute [TEMPORARY]
* images per feature per month
*Text to speech:*
* total characters per request
* requests per minute [TEMPORARY]
* characters per minute [TEMPORARY]
*Speech to text:*
* limits of the content size
* limits of the phrases/characters per request for context
* requests per 60 seconds [TEMPORARY]
* processing per day
*Natural Language:*
* Text Content size
* Token quota and Entity mentions (ignored?)
* requests per 100 seconds [TEMPORARY]
* requests per day
*Video Intelligence:*
* video size
* requests per minute [TEMPORARY]
* backend time in seconds per minute [TEMPORARY]
All Cloud AI operators use the Python Client API. Most methods use the
built-in
[Retry|https://googleapis.github.io/google-cloud-python/latest/core/retry.html#google.api_core.retry.Retry]
object and its retry mechanism. The assumption is that, for methods that use
this mechanism, it is implemented correctly and only “retriable” errors are
retried by default. Users can configure the behaviour of the Retry object -
exponential back-off factor, delays, etc. In the current API, the Retry object
can be provided by the user who creates the DAG and uses the operator.
The APIs that use the Retry object are:
* *Cloud Vision Product Search*
* *Cloud Vision Extra*
* *Cloud Vision Detect*
* *Cloud Natural Language*
* *Cloud Speech*
* *Cloud Video Intelligence*
The Retry mechanism provided by the Client API should be enough to handle
temporary bursts of requests. Users can control the exponential back-off rate
and adjust it to their own needs. They can also manually restart failed jobs
using standard Airflow mechanisms if their configuration is not well matched
to their limits.
The only place where Retry is not used is the Translate operator -
specifically the
[translate|https://googleapis.github.io/google-cloud-python/latest/translate/client.html#google.cloud.translate_v2.client.Client.translate]
API.
For the Translate API, the proposal is to use a retry decorator in our own
hook and retry only on the *“User Rate Limit Exceeded”* error; all other
errors (the size limit and “Daily Limit Exceeded”) should be treated as
non-retriable. In those cases users will be able to manually restart failed
jobs.
h1. Implementation
We analyzed two solutions:
# extending the built-in Retry mechanism from the google-cloud-python library
# using an external library - tenacity
The first solution seems natural, but it is problematic: each generated client
method builds its default retry object from a private configuration file.
Reference:
[https://github.com/googleapis/google-cloud-python/blob/b718d2d9bb32b0e7934ae90d57dc80c81ce0fb73/vision/google/cloud/vision_v1/gapic/image_annotator_client.py#L162-L168]
[https://github.com/googleapis/google-cloud-python/blob/b718d2d9bb32b0e7934ae90d57dc80c81ce0fb73/vision/google/cloud/vision_v1/gapic/image_annotator_client.py#L296-L304]
If we wanted to extend this mechanism, we would have to copy its logic into
our own code. The google-cloud-python library does not let us easily change
only part of the retry object's configuration.
The retry mechanism is also not supported by all services (see: [Current
approach|https://docs.google.com/document/d/1QYwTy6r7bbLK3cmE9x1VShelHQ4B9SLHZDVjp6ja1S4/edit#heading=h.y76gaxcevnym]),
so a separate mechanism is needed. A new mechanism based on the external
library will work with all services and provide a more predictable developer
experience.
The tenacity library provides a retry mechanism based on a decorator. It uses
a wait strategy that applies exponential back-off. All hook methods that are
covered by quota restrictions will get the new decorator.
Sample implementation:
{code:python}
@tenacity.retry(
    wait=tenacity.wait_exponential(min=1, max=100),
    retry=retry_if_temporary_quota(),
)
def fetch():
    response = client.translate(TEXT, target_language="PL")['translatedText']
    return response
{code}
_retry_if_temporary_quota_ is a factory method that creates a predicate
checking whether an exception concerns a quota restriction.
> Handling throttling in GCP AI operators
> ---------------------------------------
>
> Key: AIRFLOW-4965
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4965
> Project: Apache Airflow
> Issue Type: Improvement
> Components: gcp
> Affects Versions: 1.10.3
> Reporter: Kamil
> Assignee: Kamil
> Priority: Minor
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)