jedcunningham commented on code in PR #52876:
URL: https://github.com/apache/airflow/pull/52876#discussion_r2193285649


##########
providers/amazon/src/airflow/providers/amazon/aws/bundles/s3.py:
##########
@@ -137,10 +137,20 @@ def refresh(self) -> None:
             )
 
     def view_url(self, version: str | None = None) -> str | None:

Review Comment:
   Is this used for non-versioned bundles? That feels like a bug...



##########
providers/git/src/airflow/providers/git/bundles/git.py:
##########
@@ -218,32 +217,53 @@ def _convert_git_ssh_url_to_https(url: str) -> str:
         return f"{domain}/{repo_path}"
 
     def view_url(self, version: str | None = None) -> str | None:
+        """
+        Return a URL for viewing the DAGs in the repository.
+
+        This method is deprecated and will be removed in a future release. Use 
`view_url_template` instead.
+        """
         if not version:
             return None
-        url = self.repo_url
-        if not url:
+        template = self.view_url_template()
+        if not template:
             return None
+        return template.format(version=version)
+
+    def view_url_template(self) -> str | None:
+        if hasattr(self, "_view_url_template") and self._view_url_template:
+            # Backward compatibility for released Airflow versions

Review Comment:
   This comment is a bit misleading, this is either backcompat or when 
`view_url_template` isn't configured manually right?



##########
airflow-core/src/airflow/dag_processing/bundles/manager.py:
##########
@@ -124,12 +174,55 @@ def parse_config(self) -> None:
     @provide_session
     def sync_bundles_to_db(self, *, session: Session = NEW_SESSION) -> None:
         self.log.debug("Syncing DAG bundles to the database")
+
+        def _signed_template(new_template_: str | None, bundle_name: str) -> 
str | None:
+            if new_template_:
+                if not _is_safe_bundle_url(new_template_):
+                    self.log.warning(
+                        "Bundle %s has unsafe URL template '%s', skipping URL 
update",
+                        bundle_name,
+                        new_template_,
+                    )
+                    new_template_ = None
+                else:
+                    # Sign the URL for integrity verification
+                    new_template_ = _sign_bundle_url(new_template_, 
bundle_name)
+                    self.log.debug("Signed URL template for bundle %s", 
bundle_name)
+            return new_template_
+
         stored = {b.name: b for b in session.query(DagBundleModel).all()}
+
         for name in self._bundle_config.keys():
             if bundle := stored.pop(name, None):
                 bundle.active = True
+                # Update URL template and parameters if they've changed
+                bundle_instance = self.get_bundle(name)
+                new_template = bundle_instance.view_url_template()
+                new_params = self._extract_template_params(bundle_instance)

Review Comment:
   We should dedup what we can between this if/else block.



##########
airflow-core/src/airflow/dag_processing/bundles/base.py:
##########
@@ -316,10 +323,31 @@ def view_url(self, version: str | None = None) -> str | 
None:
         URL to view the bundle on an external website. This is shown to users 
in the Airflow UI, allowing them to navigate to this url for more details about 
that version of the bundle.
 
         This needs to function without `initialize` being called.
-
         :param version: Version to view
         :return: URL to view the bundle
         """
+        warnings.warn(
+            "The 'view_url' method is deprecated and will be removed in a 
future version. "
+            "Use 'view_url_template' instead.",

Review Comment:
   Should we point folks at `render_url` instead of in addition to 
`view_url_template`?



##########
providers/git/src/airflow/providers/git/bundles/git.py:
##########
@@ -218,32 +217,53 @@ def _convert_git_ssh_url_to_https(url: str) -> str:
         return f"{domain}/{repo_path}"
 
     def view_url(self, version: str | None = None) -> str | None:
+        """
+        Return a URL for viewing the DAGs in the repository.
+
+        This method is deprecated and will be removed in a future release. Use 
`view_url_template` instead.
+        """
         if not version:
             return None
-        url = self.repo_url
-        if not url:
+        template = self.view_url_template()
+        if not template:
             return None
+        return template.format(version=version)
+
+    def view_url_template(self) -> str | None:
+        if hasattr(self, "_view_url_template") and self._view_url_template:
+            # Backward compatibility for released Airflow versions
+            return self._view_url_template
+
+        if not self.repo_url:
+            return None
+
+        url = self.repo_url
         if url.startswith("git@"):
             url = self._convert_git_ssh_url_to_https(url)
         if url.endswith(".git"):
             url = url[:-4]
+
         parsed_url = urlparse(url)
         host = parsed_url.hostname
         if not host:
             return None
+
         if parsed_url.username or parsed_url.password:
             new_netloc = host
             if parsed_url.port:
                 new_netloc += f":{parsed_url.port}"
             url = parsed_url._replace(netloc=new_netloc).geturl()
+
         host_patterns = {
-            "github.com": f"{url}/tree/{version}",
-            "gitlab.com": f"{url}/-/tree/{version}",
-            "bitbucket.org": f"{url}/src/{version}",
+            "github.com": f"{url}/tree/{{version}}",
+            "gitlab.com": f"{url}/-/tree/{{version}}",
+            "bitbucket.org": f"{url}/src/{{version}}",
         }
-        if self.subdir:
-            host_patterns = {k: f"{v}/{self.subdir}" for k, v in 
host_patterns.items()}
+
+        # Add subdir placeholder if applicable
         for allowed_host, template in host_patterns.items():
             if host == allowed_host or host.endswith(f".{allowed_host}"):
+                if self.subdir:
+                    return f"{template}/{self.subdir}"

Review Comment:
   We don't seem to be using the whole `template_fields` and `params` 
complexity, and just adding subdir straight on here. Do we need that concept at 
all? I don't see why the template can't just have that baked in.
   
   I could see supporting more things down the road (e.g. relative path), but 
this template_fields / params would be static per bundle anyways.



##########
providers/amazon/src/airflow/providers/amazon/aws/bundles/s3.py:
##########
@@ -137,10 +137,20 @@ def refresh(self) -> None:
             )
 
     def view_url(self, version: str | None = None) -> str | None:
+        """
+        Return a URL for viewing the DAGs in S3. Currently, versioning is not 
supported.
+
+        This method is deprecated and will be removed in a future release. Use 
`view_url_template` instead.

Review Comment:
   We should leave a breadcrumb as to when we can remove it, e.g. when we drop 
support for 3.0.
   
   Maybe we conditionally add a deprecation warning based on the core version?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to