jimmycfa commented on issue #16763:
URL: https://github.com/apache/airflow/issues/16763#issuecomment-873010485


   For posterity's sake, we are using boto3==1.17.99. This actually appears to 
be an issue with the way the NameContains filter gets applied:
   
   NameContains is passed into list_processing_jobs, but it doesn't actually 
filter on the entire set of ProcessingJobs. It appears to filter per batch of 
100, so you still end up calling list_processing_jobs in that 
SagemakerOperator 30+ times back to back. Put another way: if I specify 
NameContains in list_processing_jobs with a job name that doesn't exist and I 
have over 3500 processing jobs, it returns an empty set of 
ProcessingJobSummaries BUT still includes a NextToken. It will do this 35 more 
times, since max results = 100 for that call, and you will likely run into 
Throttling issues.
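
   A minimal sketch of the behavior described above. StubSageMakerClient is a 
hypothetical stand-in written for illustration (it is not boto3 code); it only 
mimics what we observed, i.e. the filter being applied to each page of 100 
rather than to the full job list. find_jobs is likewise a hypothetical helper 
showing the usual NextToken loop. The NameContains, MaxResults and NextToken 
parameter names and the ProcessingJobSummaries key match the real 
list_processing_jobs call.

```python
class StubSageMakerClient:
    """Hypothetical stand-in for a SageMaker client (assumption, not boto3).

    It filters each page separately, which is the behavior we observed.
    """

    def __init__(self, job_names):
        self.job_names = job_names

    def list_processing_jobs(self, NameContains, MaxResults=100, NextToken=None):
        start = int(NextToken) if NextToken else 0
        page = self.job_names[start:start + MaxResults]
        response = {
            # The filter only sees the current page of jobs.
            "ProcessingJobSummaries": [
                {"ProcessingJobName": name} for name in page if NameContains in name
            ]
        }
        # A token is returned whenever more unfiltered jobs remain.
        if start + MaxResults < len(self.job_names):
            response["NextToken"] = str(start + MaxResults)
        return response


def find_jobs(client, name_contains):
    """Follow NextToken until it disappears; return (matches, number_of_calls)."""
    matches, calls = [], 0
    kwargs = {"NameContains": name_contains, "MaxResults": 100}
    while True:
        response = client.list_processing_jobs(**kwargs)
        calls += 1
        matches.extend(response["ProcessingJobSummaries"])
        token = response.get("NextToken")
        if token is None:
            return matches, calls
        kwargs["NextToken"] = token


# 3501 jobs stands in for "over 3500 processing jobs".
client = StubSageMakerClient([f"job-{i}" for i in range(3501)])
matches, calls = find_jobs(client, "does-not-exist")
print(len(matches), calls)  # prints: 0 36
```

   With the stub, a name that matches nothing still costs 36 calls (the first 
one plus 35 more), each returning an empty ProcessingJobSummaries list.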
   
   I believe the expected behavior of that boto3 call is that the NameContains 
filter is applied to the entire set of jobs before results are returned, 
rather than per batch, so that the first call returns an empty set for 
ProcessingJobSummaries and NO NextToken.
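
   For contrast, a rough sketch of the behavior I would expect: filter the 
whole job list first, then paginate the matches. list_jobs_filter_first is a 
hypothetical function written for illustration only. It is not how boto3 or 
the SageMaker service is implemented, and the integer-as-string token is an 
assumption.

```python
def list_jobs_filter_first(all_job_names, name_contains, max_results=100, next_token=None):
    """Hypothetical 'filter first, then paginate' listing (illustration only)."""
    # Apply the filter to the entire set of jobs before paging.
    matching = [name for name in all_job_names if name_contains in name]
    start = int(next_token) if next_token else 0
    page = matching[start:start + max_results]
    response = {"ProcessingJobSummaries": [{"ProcessingJobName": n} for n in page]}
    # A token is only returned when more *matching* jobs remain.
    if start + max_results < len(matching):
        response["NextToken"] = str(start + max_results)
    return response


all_jobs = [f"job-{i}" for i in range(3501)]
response = list_jobs_filter_first(all_jobs, "does-not-exist")
print(response)  # prints: {'ProcessingJobSummaries': []}
```

   Here a single call is enough: no matches means an empty summary list and no 
NextToken to follow.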
   
   I'm going to reopen, but this does appear to be a boto3 issue.



