uranusjr commented on PR #34729:
URL: https://github.com/apache/airflow/pull/34729#issuecomment-1748680743

   What I’m envisioning is something like
   
   ```python
   warehouse_mnt = afs.mount("s3://warehouse")  # Can have conn_id too, it’s orthogonal.
   output_mnt = afs.mount("file:///tmp")
   
   @dag
   def my_dag():
   
     @task
     def load_file(src):
        with afs.open(src) as f:
          f.read()
   
     load_file(warehouse_mnt / "my_data.csv")
   ```
   
   Instead of exposing the mount point to the user, we encapsulate the data
   inside the Mount object and expose a Path-like interface so the user can
   operate on it directly. You can still work with the mount point directly,
   either by passing one explicitly to `mount` or by accessing
   `mnt.mount_location` (or whatever we call it; it returns the location as a
   string) and working with that.
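
A minimal sketch of the Path-like idea, using a hypothetical `Mount` class as a stand-in for whatever `afs.mount()` would return. The class body, the `_parts` attribute, and the string form are assumptions for illustration only; nothing here is the PR's actual implementation.

```python
class Mount:
    """Hypothetical stand-in for the object returned by afs.mount()."""

    def __init__(self, mount_location: str, parts: tuple = ()) -> None:
        self.mount_location = mount_location  # the location as a plain string
        self._parts = parts  # path segments appended with "/"

    def __truediv__(self, name: str) -> "Mount":
        # Path-like joining: returns a new object, the original is unchanged.
        return Mount(self.mount_location, self._parts + (name,))

    def __str__(self) -> str:
        if not self._parts:
            return self.mount_location
        return "/".join((self.mount_location.rstrip("/"), *self._parts))


warehouse_mnt = Mount("s3://warehouse")
src = warehouse_mnt / "my_data.csv"
print(src)
```

The user only ever handles `warehouse_mnt / "..."`; the raw location stays available through `mount_location` for anyone who wants to work with it directly.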
   
   The Dataset part I have in mind is pretty simple: make the Mount object
   inherit from Dataset (or _be_ a Dataset?) so the same object can serve both
   purposes without duplicating the URL. That is not hugely useful on its own,
   but the two are really the same idea (a reference to some resource), and I
   feel they shouldn’t be two separate things.
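
To make the "Mount is a Dataset" idea concrete, here is a rough sketch under stated assumptions: `Dataset` below is a local stand-in reduced to a single `uri` field (not the real Airflow class), and `Mount` and its `/` operator are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Local stand-in for Airflow's Dataset, reduced to its URI."""

    uri: str


@dataclass(frozen=True)
class Mount(Dataset):
    """Hypothetical: a mount that is also a Dataset, sharing one URI."""

    def __truediv__(self, name: str) -> str:
        # Join a file name onto the mount's URI.
        return f"{self.uri.rstrip('/')}/{name}"


warehouse_mnt = Mount("s3://warehouse")
outlets = [warehouse_mnt]  # usable wherever a Dataset is expected
path = warehouse_mnt / "my_data.csv"
```

The URL is written once and the same object serves as both the storage handle and the Dataset reference.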

