bolkedebruin commented on PR #35612:
URL: https://github.com/apache/airflow/pull/35612#issuecomment-1811985440
> (Edited; see below) I read the implementation and it seems the parsing
> logic in `ObjectStoragePath` is a bit fragile if the user passes in some weird
> combo like
>
> ```python
> P = ObjectStoragePath
>
> P(P("s3://bucket/path"), P("file://storage/path"))
> ```
>
> In pathlib, passing in an absolute path simply overrides everything passed
> in previous arguments and gives you back that absolute path, but from what I
> can tell the current implementation cannot do that and returns a somewhat
> nonsensical result. I’ll provide some improvements maybe after this is
> merged or at least stabilises.
>
> Edited: It seems like this weirdness is inherited from UPath. I guess I’ll
> attempt to submit a fix upstream at some point.
>
In my previous implementation I checked for the same backing store and
raised an exception if they were not equal. I think we should do the same
here, because the behavior of `pathlib.Path` doesn't make sense here:
`pathlib.Path` has the same backing store / filesystem by design.
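
To illustrate the kind of check I mean, here is a minimal sketch. `StorePath` is a hypothetical stand-in, not the actual `ObjectStoragePath` code, and it assumes the backing store can be identified by the URI scheme plus the bucket/host part:

```python
from urllib.parse import urlsplit


class StorePath:
    """Hypothetical stand-in for ObjectStoragePath (not Airflow's actual API)."""

    def __init__(self, *parts):
        store = None  # (scheme, netloc) of the first fully qualified segment
        segments = []
        for part in parts:
            text = str(part)
            split = urlsplit(text)
            if split.scheme:
                candidate = (split.scheme, split.netloc)
                if store is None:
                    store = candidate
                elif candidate != store:
                    # Assumption: mixing stores is an error, not an override.
                    raise ValueError(
                        f"cannot join paths from different stores: "
                        f"{store} != {candidate}"
                    )
                segments.append(split.path)
            else:
                # No scheme: treat the segment as relative to what came before.
                segments.append(text)
        self.store = store
        self.key = "/".join(s.strip("/") for s in segments if s.strip("/"))

    def __str__(self):
        if self.store is None:
            return self.key
        scheme, netloc = self.store
        return f"{scheme}://{netloc}/{self.key}"
```

With this sketch, `StorePath(StorePath("s3://bucket/path"), "sub/key")` joins as expected, while combining an `s3://` path with a `file://` path raises `ValueError` instead of silently producing a nonsensical result.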
> Another not directly related point, I’m thinking we can probably merge
> `conn_id` into the URI itself like this: `s3://conn_id@bucket/`. This seems
> cleaner from the user’s standpoint. The conn_id value should still be parsed
> out in the implementation, but I feel the user shouldn’t be required to pass it
> in as a separate argument. We can discuss this later after this is merged.
I like that, but it should indeed be a separate PR.
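
For that follow-up, parsing the `conn_id` out of the URI could look roughly like the sketch below. The helper name `split_conn_id` is made up for illustration, and it assumes the connection id never contains an `@`:

```python
from urllib.parse import urlsplit


def split_conn_id(uri):
    """Return (conn_id, uri_without_conn_id); conn_id is None if absent.

    Hypothetical helper, not part of Airflow.
    """
    parts = urlsplit(uri)
    # Everything before the last "@" in the authority is the connection id.
    conn_id, sep, bucket = parts.netloc.rpartition("@")
    stripped = parts._replace(netloc=bucket).geturl()
    return (conn_id if sep else None), stripped
```

Here `split_conn_id("s3://conn_id@bucket/")` yields the connection id and the plain `s3://bucket/` URI, and a URI without the `conn_id@` part comes back unchanged with `None` as the connection id.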