mamdouhtawfik commented on code in PR #45334:
URL: https://github.com/apache/airflow/pull/45334#discussion_r1900689973


##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -242,7 +243,35 @@ def get_link(
         *,
         ti_key: TaskInstanceKey,
     ) -> str:
-        return XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+        # Retrieve the value from XCom
+        value = XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+
+        # Check if the value is an S3 URL
+        if value and value.startswith("s3://"):

Review Comment:
   What if the implementation stores only an object key rather than a full S3 
URL? And what about schemes other than `s3://`?
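
   A minimal sketch of the kind of check I have in mind, so that bare object 
keys and non-S3 schemes are not silently treated as browser links. The helper 
name `classify_xcom_value` and its return labels are hypothetical and not part 
of this PR or of Airflow:

```python
from urllib.parse import urlparse


def classify_xcom_value(value):
    """Classify a stored XCom value (hypothetical helper, illustration only).

    Returns one of: "empty", "web_url", "object_url", "object_key".
    """
    if not value:
        return "empty"
    scheme = urlparse(value).scheme.lower()
    if scheme in ("http", "https"):
        # Already a link the browser can open.
        return "web_url"
    if scheme:
        # Any other scheme (s3, gs, abfss, ...) points at stored content.
        return "object_url"
    # No scheme at all: assume a bare object key or relative path.
    return "object_key"
```

A key that happens to contain a colon near the start could still be misread as a scheme, so this is only a starting point.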



##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -242,7 +243,35 @@ def get_link(
         *,
         ti_key: TaskInstanceKey,
     ) -> str:
-        return XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+        # Retrieve the value from XCom
+        value = XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+
+        # Check if the value is an S3 URL
+        if value and value.startswith("s3://"):
+            try:
+                # Parse the S3 URL and extract job information
+                return self.construct_databricks_url(value)
+            except Exception:
+                # Return a placeholder link in case of an error
+                return "#"
+
+        # Return the value directly if it's not an S3 URL
+        return value
+
+    @staticmethod
+    def construct_databricks_url(s3_url: str) -> str:
+        """
+        Convert an S3 URL to a Databricks workspace URL.
+
+        Assumes the S3 URL points to a JSON file that contains job details.
+        """
+        parsed = urlparse(s3_url)
+
+        # Example logic: Extract the job run ID from the path
+        job_details = parsed.path.split("/")[-1].replace(".json", "")
+        databricks_workspace_url = f"https://<your-databricks-workspace-url>/#job/{job_details}/run"

Review Comment:
   Is this clickable? How/where is `<your-databricks-workspace-url>` going to 
be replaced?



##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -242,7 +243,35 @@ def get_link(
         *,
         ti_key: TaskInstanceKey,
     ) -> str:
-        return XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+        # Retrieve the value from XCom

Review Comment:
   I generally appreciate comments, but the method name `get_value` already 
says that a value is being retrieved, so this one is redundant.



##########
providers/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -242,7 +243,35 @@ def get_link(
         *,
         ti_key: TaskInstanceKey,
     ) -> str:
-        return XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+        # Retrieve the value from XCom
+        value = XCom.get_value(key=XCOM_RUN_PAGE_URL_KEY, ti_key=ti_key)
+
+        # Check if the value is an S3 URL
+        if value and value.startswith("s3://"):
+            try:
+                # Parse the S3 URL and extract job information
+                return self.construct_databricks_url(value)
+            except Exception:
+                # Return a placeholder link in case of an error
+                return "#"
+
+        # Return the value directly if it's not an S3 URL
+        return value
+
+    @staticmethod
+    def construct_databricks_url(s3_url: str) -> str:
+        """
+        Convert an S3 URL to a Databricks workspace URL.
+
+        Assumes the S3 URL points to a JSON file that contains job details.
+        """
+        parsed = urlparse(s3_url)
+
+        # Example logic: Extract the job run ID from the path
+        job_details = parsed.path.split("/")[-1].replace(".json", "")

Review Comment:
   Are we assuming that the JSON file path contains the job details? That's not 
true: the details (the complete URL) are stored _inside_ the file.
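
   Roughly what I would expect instead, as a sketch only: read the file and 
return the URL stored in it, rather than deriving anything from the path. The 
names `resolve_run_page_url` and `read_text` are hypothetical, and I am 
assuming here that the file holds the URL as a JSON-encoded string:

```python
import json


def resolve_run_page_url(reference, read_text):
    """Return the run page URL stored inside the referenced file.

    `read_text` is a hypothetical callable that takes the reference and
    returns the file content as text. The content is assumed to be the
    complete URL serialized as a JSON string.
    """
    payload = json.loads(read_text(reference))
    if not isinstance(payload, str) or not payload.startswith(("http://", "https://")):
        raise ValueError(f"No usable URL found in {reference!r}")
    return payload
```

With an in-memory stand-in such as `{"s3://bucket/run.json": '"https://host/#job/1/run/2"'}.__getitem__` passed as `read_text`, the function returns the stored URL unchanged.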



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
