Nataneljpwd commented on code in PR #52002:
URL: https://github.com/apache/airflow/pull/52002#discussion_r2160427041
##########
task-sdk/tests/task_sdk/io/test_path.py:
##########
@@ -182,6 +184,27 @@ def test_bucket_key_protocol(self):
assert o.protocol == protocol
+class TestConnPath:
+ def test_connection_inferred_protocol(self):
Review Comment:
I would add a test on what happens when each or one or both of the required
elements are missing
##########
task-sdk/src/airflow/sdk/io/path.py:
##########
@@ -84,6 +86,33 @@ class ObjectStoragePath(CloudPath):
__slots__ = ("_hash_cached",)
+ @classmethod
+ def from_conn(cls, conn: Connection, path: str) -> ObjectStoragePath:
+ """
+ Construct an ObjectStoragePath from an Airflow Connection and a
relative path.
+
+ This method assumes the Connection is of type `objectstore` and
requires the following
+ keys to be present in its 'extra' JSON configuration:
+
+ - "provider": the scheme or protocol (e.g., "s3", "gs", "file")
+ - "base_path": the base bucket, container, or directory (e.g.,
"my-bucket", "tmp/export")
+
+ The final URI is assembled as: <provider>://<base_path>/<path>, and
passed to ObjectStoragePath.
+
+ :param conn: Airflow Connection object of type `objectstore`
+ :param path: Relative path to append to the base path
+ :return: ObjectStoragePath instance
+ :raises ValueError: if the connection type is not `objectstore`
+ """
+ if conn.conn_type == "objectstore":
+ provider = conn.extra_dejson.get("provider")
+ base_path = conn.extra_dejson.get("base_path")
Review Comment:
Won't this be troublesome if the values are not set?
Is it thrown in url unsplit? I would check it as well if the fields exist
and throw a more indicative error than what urlunsplit would throw, to make it
easier to debug
I don't think the check exists in the constructor and you may get an object
storage type provider without the required fields (including the extra dejson)
##########
task-sdk/src/airflow/sdk/io/path.py:
##########
@@ -84,6 +86,33 @@ class ObjectStoragePath(CloudPath):
__slots__ = ("_hash_cached",)
+ @classmethod
+ def from_conn(cls, conn: Connection, path: str) -> ObjectStoragePath:
+ """
+ Construct an ObjectStoragePath from an Airflow Connection and a
relative path.
+
+ This method assumes the Connection is of type `objectstore` and
requires the following
+ keys to be present in its 'extra' JSON configuration:
+
+ - "provider": the scheme or protocol (e.g., "s3", "gs", "file")
+ - "base_path": the base bucket, container, or directory (e.g.,
"my-bucket", "tmp/export")
+
+ The final URI is assembled as: <provider>://<base_path>/<path>, and
passed to ObjectStoragePath.
+
+ :param conn: Airflow Connection object of type `objectstore`
+ :param path: Relative path to append to the base path
+ :return: ObjectStoragePath instance
+ :raises ValueError: if the connection type is not `objectstore`
+ """
+ if conn.conn_type == "objectstore":
Review Comment:
Isn't it better to check if it is not an object store and throw when that
happens?
It will reduce nesting in the code and will make it easier to work with
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]