vincbeck commented on PR #49938:
URL: https://github.com/apache/airflow/pull/49938#issuecomment-2844589605

   > I'd also like to open a discussion about the end-user interface for 
`MessageQueueTrigger`. While working on the `KafkaMessageQueueProvider`, I 
noticed that the constructor arguments for different provider-specific trigger 
classes vary significantly.
   > 
   > e.g. Kafka: `AwaitMessageTrigger`, Google Pub/Sub: `PubsubPullTrigger`
   > 
   > 
https://github.com/apache/airflow/blob/c37e6eb482548d913d71c3ac023b7d74c05d5b5e/providers/apache/kafka/src/airflow/providers/apache/kafka/triggers/await_message.py#L59-L68
   > 
   > 
https://github.com/apache/airflow/blob/c37e6eb482548d913d71c3ac023b7d74c05d5b5e/providers/google/src/airflow/providers/google/cloud/triggers/pubsub.py#L54-L63
   > 
   > Additionally, the `queue` URI argument in the current interface is used 
solely to match the appropriate provider, and not necessarily tied to an 
Airflow Connection. This might be confusing to users.
   > 
   > Should we consider replacing the `queue` argument in the 
`MessageQueueTrigger` constructor with something more explicit, like `type` or 
`provider`?
   > 
   > For example:
   > 
   > ```python
   > AVAILABLE_MESSAGE_QUEUE_PROVIDERS = Literal[
   >     "sqs",
   >     "kafka",
   > ]
   > 
   > class MessageQueueTrigger(BaseEventTrigger):
   > 
   >     def __init__(self, *, provider: AVAILABLE_MESSAGE_QUEUE_PROVIDERS, 
**kwargs: Any) -> None:
   >         self.provider = provider
   >         self.kwargs = kwargs
   > 
   >    @cached_property
   >     def trigger(self) -> BaseEventTrigger:
   >         # get trigger_class based on provider attribute
   > ```
   > 
   > cc @potiuk, @vincbeck, @vikramkoka
   
   Good question. I really wanted to avoid that and totally abstract the notion 
of provider to the user, that's why I came up with the parameter `queue`. But I 
dont think it is possible to know from a Kafka topic name or a Google pubsub 
project_id the identity of the provider. One solution would be to "create" a 
format/protocol where we could identity the provider and extract the queue 
identifier:
    - Kafka: `kafka://{topic_name}`
    - Google pubsub: `google://{project_id}`
   
   But this would create a new complexity specific to Airflow so I dont think 
this is a good idea. To be honest I am not a big fan of introducing this new 
parameter `provider` but I do not see other solutions. WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to