jimmycfa opened a new issue #16763:
URL: https://github.com/apache/airflow/issues/16763


   **Apache Airflow version**: 2.0.2
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): NA
   
   **Environment**: MWAA and Locally
   
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release): NA
   - **Kernel** (e.g. `uname -a`): NA
   - **Install tools**: NA
   - **Others**: NA
   
   **What happened**:
   
   Calling the `SageMakerProcessingOperator` sometimes fails with 
"botocore.exceptions.ClientError: An error occurred (ThrottlingException)" due 
to excessive `ListProcessingJobs` operations.
   
   **What you expected to happen**:
   
   The job should have started without a throttling error. I believe one fix 
would be to use the `NameContains` parameter of boto3 
[list_processing_jobs](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.list_processing_jobs)
 so the hook does not have to paginate through every job, as is occurring 
[here](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/hooks/sagemaker.py#L916).
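
   A minimal sketch of the suggested fix, not the hook's actual code: the helper 
name `processing_job_name_exists` is hypothetical, and `client` is assumed to be 
a boto3 SageMaker client (for example `boto3.client("sagemaker")`). It passes 
`NameContains` so the service filters by name, then checks for an exact match 
because `NameContains` also matches longer names containing the same substring.

```python
def processing_job_name_exists(client, job_name):
    """Return True if a processing job named exactly `job_name` exists.

    `client` is assumed to be a boto3 SageMaker client. The function name
    is hypothetical and is not part of the Airflow hook.
    """
    # Let the service filter by name instead of listing every job.
    kwargs = {"NameContains": job_name, "MaxResults": 100}
    while True:
        response = client.list_processing_jobs(**kwargs)
        for summary in response.get("ProcessingJobSummaries", []):
            # NameContains is a substring filter, so confirm an exact match.
            if summary["ProcessingJobName"] == job_name:
                return True
        next_token = response.get("NextToken")
        if not next_token:
            return False
        # Only reached when more than one page of names contains job_name.
        kwargs["NextToken"] = next_token
```

   With a filter this narrow, the lookup would normally finish in a single 
`ListProcessingJobs` call no matter how many jobs the account holds.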
 
   
   **How to reproduce it**:
   
   If you keep creating SageMaker Processing jobs, you will eventually see the 
throttling error as the number of pages to list grows.
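
   As a rough illustration of why this grows with account history, the sketch 
below estimates how many `ListProcessingJobs` calls an unfiltered listing needs. 
The function name `list_calls_needed` is hypothetical, and the page size of 100 
is an assumption based on the documented upper bound for `MaxResults`; the page 
size actually in effect may be smaller, which would mean more calls.

```python
import math


def list_calls_needed(total_jobs, page_size=100):
    """Estimate the API calls needed to page through `total_jobs` jobs.

    `page_size` is an assumption; at least one call is always made,
    even when the account has no jobs.
    """
    return max(1, math.ceil(total_jobs / page_size))


for total in (50, 1_000, 25_000):
    print(total, "jobs ->", list_calls_needed(total), "calls per lookup")
```

   Every job lookup repeats that full listing, so the call rate rises with the 
number of historical jobs even if the workload itself stays the same.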
   
   **Anything else we need to know**:
   
   This appears to happen when the account already has a large number of past 
SageMaker Processing jobs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
