LaPetiteSouris opened a new pull request, #24554: URL: https://github.com/apache/airflow/pull/24554
## What

- Add `SqsBatchSensor`, which polls SQS multiple times before returning the results.

## Why

- SQS allows at most [10 messages per batch](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-batch-api-actions.html).
- The current `SqsSensor` performs one poll per `poke`, so it retrieves at most 10 messages per execution.
- In many cases hundreds of messages are waiting in the queue while `SqsSensor` dequeues only 10 per execution, so the only way to drain the queue faster is to add more `SqsSensor` tasks. A `SqsSensor` task is a sensor that runs constantly (every minute, for example) and consumes worker execution slots.
- In other cases, a downstream task should be triggered with arguments retrieved from SQS. Example: the SQS messages contain IDs of items to be processed, and the next task is triggered with the argument `--id_list=[id1,id2,id3,...]`. With 10 messages per batch, the downstream task has to be triggered multiple times, each time with at most 10 IDs.
- We want to introduce the notion of a batch for SQS, similar to the way the [AWS Lambda integration with SQS](https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html) processes SQS messages.

Contributed thanks to the efforts of [DevDevEve](https://github.com/DevDevEve).
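To illustrate the idea of polling several times per `poke`, here is a minimal sketch of the batching loop. It is not the code in this PR: the function name `poll_sqs_batches`, the `num_batches` parameter, and the `FakeSqsClient` stand-in are all hypothetical. The only real API assumed is the boto3 SQS client's `receive_message(QueueUrl=..., MaxNumberOfMessages=..., WaitTimeSeconds=...)` call, whose response carries an optional `Messages` list.

```python
from typing import Any, Dict, List


def poll_sqs_batches(
    client: Any,
    queue_url: str,
    num_batches: int = 5,        # hypothetical parameter: polls per call
    max_messages: int = 10,      # SQS caps a single receive at 10 messages
    wait_time_seconds: int = 1,
) -> List[Dict[str, Any]]:
    """Call receive_message up to num_batches times and merge the results."""
    collected: List[Dict[str, Any]] = []
    for _ in range(num_batches):
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=max_messages,
            WaitTimeSeconds=wait_time_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            # Design choice for this sketch: stop at the first empty poll.
            break
        collected.extend(messages)
    return collected


class FakeSqsClient:
    """In-memory stand-in for an SQS client, for demonstration only."""

    def __init__(self, bodies: List[str]) -> None:
        self._queue = [
            {"MessageId": str(i), "Body": body} for i, body in enumerate(bodies)
        ]

    def receive_message(
        self, QueueUrl: str, MaxNumberOfMessages: int, WaitTimeSeconds: int
    ) -> Dict[str, Any]:
        # Hand out the next slice of messages and remove it from the queue.
        batch = self._queue[:MaxNumberOfMessages]
        self._queue = self._queue[MaxNumberOfMessages:]
        return {"Messages": batch} if batch else {}
```

With a real queue, `client` would be something like `boto3.client("sqs")`. Note that a real SQS poll can come back empty even when messages remain, so stopping at the first empty response is a simplification; a production sensor might keep polling for the full `num_batches`.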
