potiuk commented on PR #25980:
URL: https://github.com/apache/airflow/pull/25980#issuecomment-1241415888

   I'd remove it. It might break someone's workflow (https://xkcd.com/1172/) 
but I tihnk we have not promised any "special" behaviour here. 
   
   I can imagine users who have S3 connections defined and use get_hook() to 
instantate the hook dynamically.  This has been used for example in 
GenerictTransfer indeed but also I've seen it used in custom operators to 
depend on the conn_type to return particular Hook. We use it heavily in all SQL 
operators. BUT SQL operators have DBApiHook as base and follow the same 
protocols (and we unified them even more now via common.sql provider). 
   
   S3 Has no such promisses, no common interface, if there is any similarity 
with similar (GCS/Azure storage) hooks, this is at most incidental and I think 
this is about time that we don't care about those incidental similarities. Not 
having common "protocol" makes it next to impossible to use the hooks 
"dynamically" created by get_hook() depending on the connection type (though I 
Imagine someone doing '(S3Hook) get_hook("s3_conn")` rather `than 
S3Hook(conn_id=s3_conn)`. We would break the former workflow if we remove S3 
conn_type - but as long as we  describe it in changelog (including how ot 
conver it) and bump major version of Amazon provider, I have completely no 
problem with it.
   
   Even the GenericTransfer @dstandish  mentioned is Generic-ish, not truly 
Generic. It assumes that source Hook has "get_records" method and target hook 
has "run" (if preoperator is set) and "insert_rows' method. And when you look 
at the implementations - only SQL hooks follow it, so this really "Generic SQL 
transfer" operator. 
   
   I personally think "get_hook()" method only makes sense in SQL in our case 
and it useless elsewhere (though some individual workflows might use it and 
custom Hooks can take advantage of it).
   
   I think personally that unless we have a well defined protocol to follow, 
any "generic" use of conn_type is next to useless - and in S3 case, I woudl not 
worry about deleting it.
   
   However there is one thing that I think SHOULD be handled - rather than 
deprecatng it, I'd simply convert it dynamically to AWS automatically. 
currently, when someone has it defined and opens it in UI the configuration 
(keys etc)  will be editable and saved. If we remove conn_type, the first time 
it is opened in the UI, I believe it will be converted dynamically to a 
"default" connection (not sure) and some information might be lost. So I would 
try to handle the scenario when somoene has S3 connection defined in the DB, 
opens it and it gets automatically converted to AWS one - not sure how to do it 
best though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to