vincbeck commented on PR #49938: URL: https://github.com/apache/airflow/pull/49938#issuecomment-2844589605
> I'd also like to open a discussion about the end-user interface for `MessageQueueTrigger`. While working on the `KafkaMessageQueueProvider`, I noticed that the constructor arguments for different provider-specific trigger classes vary significantly. > > e.g. Kafka: `AwaitMessageTrigger`, Google Pub/Sub: `PubsubPullTrigger` > > https://github.com/apache/airflow/blob/c37e6eb482548d913d71c3ac023b7d74c05d5b5e/providers/apache/kafka/src/airflow/providers/apache/kafka/triggers/await_message.py#L59-L68 > > https://github.com/apache/airflow/blob/c37e6eb482548d913d71c3ac023b7d74c05d5b5e/providers/google/src/airflow/providers/google/cloud/triggers/pubsub.py#L54-L63 > > Additionally, the `queue` URI argument in the current interface is used solely to match the appropriate provider, and not necessarily tied to an Airflow Connection. This might be confusing to users. > > Should we consider replacing the `queue` argument in the `MessageQueueTrigger` constructor with something more explicit, like `type` or `provider`? > > For example: > > ```python > AVAILABLE_MESSAGE_QUEUE_PROVIDERS = Literal[ > "sqs", > "kafka", > ] > > class MessageQueueTrigger(BaseEventTrigger): > > def __init__(self, *, provider: AVAILABLE_MESSAGE_QUEUE_PROVIDERS, **kwargs: Any) -> None: > self.provider = provider > self.kwargs = kwargs > > @cached_property > def trigger(self) -> BaseEventTrigger: > # get trigger_class based on provider attribute > ``` > > cc @potiuk, @vincbeck, @vikramkoka Good question. I really wanted to avoid that and totally abstract the notion of provider to the user, that's why I came up with the parameter `queue`. But I dont think it is possible to know from a Kafka topic name or a Google pubsub project_id the identity of the provider. One solution would be to "create" a format/protocol where we could identity the provider and extract the queue identifier: - Kafka: `kafka://{topic_name}` - Google pubsub: `google://{project_id}` But this would create a new complexity specific to Airflow so I dont think this is a good idea. To be honest I am not a big fan of introducing this new parameter `provider` but I do not see other solutions. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
