bolkedebruin commented on PR #35612:
URL: https://github.com/apache/airflow/pull/35612#issuecomment-1811985440

   > (Edited; see below) I read the implementation and it seems the parsing 
logic in `ObjectStoragePath` is a bit fragile if the user passes in some weird 
combo like
   > 
   > ```python
   > P = ObjectStoragePath
   > 
   > P(P("s3://bucket/path"), P("file://storage/path"))
   > ```
   > 
   > In pathlib, passing in an absolute path simply overrides everything 
passed in previous arguments and gives you back that absolute path, but 
from what I can tell the current implementation cannot do that, and returns a 
somewhat nonsensical result. I’ll provide some improvements maybe after this is 
merged or at least stabilises.
   > 
   > Edited: It seems like this weirdness is inherited from UPath. I guess I’ll 
attempt to submit a fix upstream at some point.
   > 
   
   In my previous implementation I checked that the paths shared the same 
backing store and raised an exception if they did not. I think we should do the 
same here, because the behavior of `pathlib.Path` doesn't make sense in this 
case: `pathlib.Path` has the same backing store / filesystem by design.
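
Roughly what I have in mind, as a sketch only and not the code in this PR: `backing_store` and `ensure_same_store` are hypothetical names, and treating scheme plus netloc as "the backing store" is an assumption made for illustration. The first assertion shows the `pathlib` override behavior mentioned above.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from urllib.parse import urlsplit

# pathlib: a later absolute segment replaces everything before it.
# That is safe there because every segment lives on the same filesystem.
assert PurePosixPath("/bucket/path", "/storage/path") == PurePosixPath("/storage/path")


def backing_store(uri: str) -> tuple[str, str]:
    """Return (scheme, netloc) as a stand-in for the backing store (assumption)."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc


def ensure_same_store(*uris: str) -> None:
    """Hypothetical check: raise if the URIs do not all share one backing store."""
    stores = {backing_store(uri) for uri in uris}
    if len(stores) > 1:
        raise ValueError(f"paths span multiple backing stores: {sorted(stores)}")
```

With this, `ensure_same_store("s3://bucket/path", "file://storage/path")` raises `ValueError` instead of silently producing a combined path.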
   
   > Another not directly related point, I’m thinking we can probably merge 
`conn_id` into the URI itself like this: `s3://conn_id@bucket/`. This seems 
cleaner from the user’s standpoint. The conn_id value should still be parsed 
out in the implementation, but I feel the user shouldn’t be required to pass it 
in as a separate argument. We can discuss this later after this is merged.
   
   I like that, but it should indeed be a separate PR.
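
For that later discussion, a minimal sketch of how the `conn_id` could be parsed out of a URI like `s3://conn_id@bucket/`. This uses only the standard library; `split_conn_id` is a hypothetical helper, not an existing Airflow or UPath function, and the real implementation may well look different.

```python
from __future__ import annotations

from urllib.parse import urlsplit, urlunsplit


def split_conn_id(uri: str) -> tuple[str | None, str]:
    """Hypothetical helper: return (conn_id, uri_without_conn_id)."""
    parts = urlsplit(uri)
    # Anything before the last "@" in the netloc is treated as the conn_id.
    userinfo, sep, host = parts.netloc.rpartition("@")
    conn_id = userinfo if sep else None
    stripped = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return conn_id, stripped
```

A URI without an `@` part comes back unchanged with `conn_id` set to `None`, so the separate argument could stay available as a fallback.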
   
   

