jens-scheffler-bosch commented on PR #34729:
URL: https://github.com/apache/airflow/pull/34729#issuecomment-1753924155

   I was reading the AIP doc in general, and I feel that revising the PR in 
parallel makes the discussion a bit too complex. As I already added a few 
comments on the AIP, I'm adding only a few here. I would prefer to settle the 
AIP first rather than discuss code and AIP (= concepts) in parallel, which 
feels a bit inconsistent.
   
   Reading the comments, I feel more strongly that the term "mount" is a bit 
misplaced. As also commented in the AIP, it gives users the impression that 
something is really being done at OS level. Seeing the code examples by 
@uranusjr, I would really favor using the Pythonic way of context managers and 
dropping the idea of mounting (mounting is not needed in `pathlib.Path` 
either), and instead just operating on some abstract file system object. For 
me there is no real need to introduce new mounting concepts if we could use 
the long-existing connection facility in Airflow. Connections are a very good 
place to abstract backend storage endpoints away from DAG code, and you could 
easily use existing provider facilities to have a 
`with Connection(MY_S2_ENDPOINT).filespec(): ...` or so.
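
   To make the idea a bit more concrete, here is a rough sketch of what a 
connection-backed context manager could look like. Everything in it is an 
assumption for illustration: `FakeConnection`, its `root` field and the 
`filespec()` method are hypothetical and not existing Airflow API, and a local 
temporary directory stands in for a real storage backend.

```python
import tempfile
from contextlib import contextmanager
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class FakeConnection:
    """Hypothetical stand-in for a connection; NOT the real Airflow class."""

    conn_id: str
    root: Path  # local directory standing in for a backend storage endpoint

    @contextmanager
    def filespec(self) -> Iterator[Path]:
        """Yield an abstract file system handle scoped to the with-block."""
        # A real backend would open a client session here (assumption).
        try:
            yield self.root
        finally:
            # A real backend would release the session here; a local path
            # needs no cleanup.
            pass


def run_demo() -> str:
    """Write and read a file through the connection, with no mount step."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = FakeConnection(conn_id="MY_S2_ENDPOINT", root=Path(tmp))
        with conn.filespec() as fs:
            target = fs / "report.txt"
            target.write_text("hello")
            return target.read_text()
```

   The DAG code only sees the connection ID and a path-like object; which 
backend is behind it would be decided by the connection and its provider.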
   
   This is not meant to sound discouraging. I very much like the IO library 
idea as a way to reduce the XtoY operators, but I fear that adding a totally 
new FS/IO concept without integrating the existing Connection/Dataset/Provider 
core concepts will not feel like a consistent product.
   
   Therefore I'd favor first comparing the (general) options for integrating 
an FS concept, with their pros and cons (maybe with example DAG code), before 
having a code-level discussion, which is hard to follow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
