Re: [PR] WIP: Add Airflow FS [airflow]

via GitHub Fri, 06 Oct 2023 01:58:18 -0700


uranusjr commented on PR #34729:
URL: https://github.com/apache/airflow/pull/34729#issuecomment-1750230494


   I thought about the overall design some more. Aggregated comments on 
multiple things.
   
   re: `__truediv__` returning str. In `pathlib`, `Path.__truediv__` returns 
another Path instance so you can do e.g. `path / "foo" / "bar"`, which users 
would likely expect. I thought this should return another Mount object, but you 
pointed out the thing I got wrong, Mount does not represent a file endpoint, 
but more like a volume, so `__truediv__` should return something of another 
type similar to Path instead. At this point I feel we should leave that part 
out of the initial implementation and focus on getting other things right (see 
below).
   
   re: `unmount`. I’m guessing, but perhaps this is similar to @ashb’s concern 
on global state. Currently it is not at all clear when the user _can_ call 
cleanup, due to how `@task` functions are actually executed in separate 
workers, and how global variables in the DAG file work may not be obvious to 
users. I feel Airflow should either require a more explicit approach such as
   
   ```python
   m = Mount("s3://...")  # Does not register anything but simply creates an 
object in memory.
   
   @task
   def foo():
       with m:  # Actually registers the mount.
   ```
   
   or be magically smarter about this and control the mount registration only 
inside a particular task context instead. We can draw inspiration from other 
platforms, such as how Pytest uses argument names to infer fixtures (which have 
a similar context problem), or how Dagster has `@asset(deps=...)` to define the 
task needs certain resources in the context.
   
   Finally, I think this needs an AIP to document the various design decisions 
and better describe the high-level view.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] WIP: Add Airflow FS [airflow]

Reply via email to