potiuk commented on code in PR #53407:
URL: https://github.com/apache/airflow/pull/53407#discussion_r2216500009


##########
providers/http/src/airflow/providers/http/hooks/http.py:
##########
@@ -172,7 +175,7 @@ def get_conn(
         :return: A configured requests.Session object.
         """
         session = Session()
-        connection = self.get_connection(self.http_conn_id)
+        connection = self.get_connection(self.get_conn_id())

Review Comment:
   The reason why we have`xxx_conn_id` is `default_args`. 
   
   The pattern is that you can specify one `aws_conn_id` in default_args and 
one `google_conn_id` and then all AWS operators in the Dag would use the former 
and all Google operators would use the latter. So it makes perfect sense to use 
it and continue using it in the future - this reason still "stands"  - so we 
should keep having that approach - i.e. generally same `xxx_conn_id` per 
provider should be `standard` and we should keep it.
   
   And `conn_attr_name` is defined so that Connection.get_hook() used in a 
number of places - for example in `test_connection` can properly pass the 
conn_id to the right `__init__` parameter when it tries to instantiate the Hook 
dynamically.
   
   And I am not sure if we want to add get_conn_id() in BaseHook. It does not 
bring a lot of value IMHO. With autocomplete in IDE, MyPy and ruff checks, i 
pretty much like the explicit using of self.xxx_conn_id rather than 
get_conn_id(). I am not sure if there are **real** benefits of having it for 
all hooks. The real benefit would be to get rid of `conn_attr_name` and finding 
the `xxx_conn_id` automaticaly - because that is where we have **some** level 
of duplication - same parameter name and "conn_attr_name" string.
   
   The only reason we used `get_conn_id()`  pattern so far was I think 
common.sql where it is `actually` needed - if you look at DbApiHook, it has 
this implementation:
   
   ```python
       def get_conn_id(self) -> str:
           return getattr(self, self.conn_name_attr)
   ```
   
   And it is needed, because all SQL operators that take DbApiHook-derived 
Hooks are using get_conn_id to actually retrieve the connection id in 
consistent way. But this is really a `DbApiHook` API rather than "BaseHook" API 
- and I am afraid moving it to BaseHook will muddy the waters and dillute that 
strong tie between `DBApiHook` and `get_conn_id()`.
   
   Also in general - we should be VERY careful about modifying `Base` classes 
from task.sdk. Those "task.sdk" classes are Public APIs of Airflow. I think 
basically **anything** we add there should go through discussion on devlist 
with justifying why we want to do it - because adding **anything** there will 
mean we will have to maintain it forever, and it will be extremely painful if 
we make some mistake there and release airflow with some changes that could 
make things difficult to maintain. 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to