LaPetiteSouris commented on code in PR #24554:
URL: https://github.com/apache/airflow/pull/24554#discussion_r903454860
##########
airflow/providers/amazon/aws/sensors/sqs.py:
##########
@@ -215,3 +229,68 @@ def __init__(self, *args, **kwargs):
stacklevel=2,
)
super().__init__(*args, **kwargs)
+
+
+class SqsBatchSensor(SqsSensor):
+ """
+ Get messages from an Amazon SQS queue in batches and then delete the
retrieved messages from the queue.
+ If deletion of messages fails an AirflowException is thrown. Otherwise,
all messages
+ are pushed through XCom with the key ``messages``.
+
+ This sensor is identical to SqsSensor, except that SqsSensor performs one
and only one SQS call
+ per poke, which limits the result to a maximum of 10 messages, while
SqsBatchSensor performs multiple
+ SQS API calls per poke and combines the results into one list.
+
+ .. seealso::
+ For more information on how to use this sensor, take a look at the
guide:
+ :ref:`howto/sensor:SqsBatchSensor`
+
+ :param num_batches: The number of times the sensor will call the SQS API
to receive messages (default: 1)
+ """
+
+ def __init__(
+ self,
+ *,
+ num_batches: int = 1,
+ **kwargs,
+ ):
+ super().__init__(**kwargs)
+ self.num_batches = num_batches
+
+ def poke(self, context: 'Context'):
+ """
+ Check subscribed queue for messages and write them to xcom with the
``messages`` key.
+
+ :param context: the context object
+ :return: ``True`` if message is available or ``False``
+ """
+ sqs_conn = self.get_hook().get_conn()
+ message_batch: List[Any] = []
+ # perform multiple SQS call to retrieve messages in series
+ for _ in range(self.num_batches):
+ messages = self.poll_sqs(sqs_conn=sqs_conn)
+
+ if not len(messages):
+ continue
+
+ message_batch.extend(messages)
+
+ if self.delete_message_on_reception:
+
+ self.log.info("Deleting %d messages", len(messages))
+
+ entries = [
+ {'Id': message['MessageId'], 'ReceiptHandle':
message['ReceiptHandle']}
+ for message in messages
+ ]
+ response =
sqs_conn.delete_message_batch(QueueUrl=self.sqs_queue, Entries=entries)
+
+ if 'Successful' not in response:
+ raise AirflowException(
+ 'Delete SQS Messages failed ' + str(response) + ' for
messages ' + str(messages)
+ )
+ if not len(message_batch):
+ return False
Review Comment:
Sorry coming late to the party, and sorry for commenting on a closed issue,
however, it's worth noting that:
- Delete a message on SQS is not something can be considered to be manual.
Because after the consumption of a message, the consumer must ACK the
consumption, in AWS SQS, the ACK is the deletion.
- Indeed, you are right when noticing that more works are needed for this
sensor thing. When using the sensor extensively, we had to decouple the message
deletion logic out of the sensor. See the [PR to make message delete opt
in](https://github.com/apache/airflow/pull/21159)
Great that there is a future discussion on this behavior. Also great that
this discussion is closed and out of this current PR (otherwise I would not
know how to handle)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]