Copilot commented on code in PR #63471:
URL: https://github.com/apache/airflow/pull/63471#discussion_r3048482267


##########
providers/git/src/airflow/providers/git/hooks/git.py:
##########
@@ -149,16 +150,35 @@ def _process_git_auth_url(self):
             encoded_user = urlquote(self.user_name, safe="")
             encoded_token = urlquote(self.auth_token, safe="")
             self.repo_url = self.repo_url.replace("https://", f"https://{encoded_user}:{encoded_token}@", 1)
+            self._set_http_auth_env()
         elif self.auth_token and self.repo_url.startswith("http://"):
             encoded_user = urlquote(self.user_name, safe="")
             encoded_token = urlquote(self.auth_token, safe="")
             self.repo_url = self.repo_url.replace("http://", f"http://{encoded_user}:{encoded_token}@", 1)
+            self._set_http_auth_env()
         elif self.repo_url.startswith("http://"):
             # if no auth token, use the repo url as is
             pass
         elif not self.repo_url.startswith("git@") and not self.repo_url.startswith("https://"):
             self.repo_url = os.path.expanduser(self.repo_url)
 
+    def _set_http_auth_env(self):
+        """
+        Set git config env vars to force HTTP authentication via extraHeader.
+
+        Git does not send credentials for public repositories since the server
+        does not respond with a 401 challenge. This forces the Authorization
+        header to be sent on every request, allowing authenticated rate limits.
+
+        Uses GIT_CONFIG_* environment variables (git >= 2.31) to inject an
+        ``http.extraHeader`` with a Basic auth token.
+        """
+        credentials = f"{self.user_name}:{self.auth_token}"
+        encoded = base64.b64encode(credentials.encode()).decode()
+        self.env["GIT_CONFIG_COUNT"] = "1"
+        self.env["GIT_CONFIG_KEY_0"] = "http.extraHeader"
+        self.env["GIT_CONFIG_VALUE_0"] = f"Authorization: Basic {encoded}"

Review Comment:
   `http.extraHeader` is configured globally via 
`GIT_CONFIG_KEY_0=http.extraHeader`, so the Authorization header may be sent to 
*any* HTTP(S) endpoint Git contacts during this operation (redirects, 
alternates, or other HTTP remotes), which is a credential-leak risk. Consider 
scoping the config to the target host by using Git’s per-URL config form (e.g. 
`http.<url>.extraHeader` derived from the repo URL without embedded 
credentials), so the header is only attached for requests to that host/prefix.
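
A minimal sketch of the scoping idea, not the PR's implementation: the helper name `scoped_extra_header_env` and its parameters are hypothetical, and it assumes that scoping to the repository's scheme, host, and port is the desired granularity. It builds the same `GIT_CONFIG_*` variables as the diff, but with an `http.<url>.extraHeader` key derived from the repo URL after any embedded credentials are dropped.

```python
import base64
from urllib.parse import urlsplit


def scoped_extra_header_env(repo_url: str, user_name: str, auth_token: str) -> dict:
    """Return GIT_CONFIG_* variables with the header scoped to the repo's origin.

    Hypothetical helper for illustration only; it is not part of the Git provider.
    """
    parts = urlsplit(repo_url)
    # parts.hostname excludes any "user:token@" userinfo embedded in the URL,
    # so credentials never end up in the config key.
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need brackets when placed back into a URL.
        host = f"[{host}]"
    origin = f"{parts.scheme}://{host}"
    if parts.port is not None:
        origin += f":{parts.port}"

    encoded = base64.b64encode(f"{user_name}:{auth_token}".encode()).decode()
    return {
        "GIT_CONFIG_COUNT": "1",
        # Per-URL form: http.<url>.extraHeader instead of http.extraHeader.
        "GIT_CONFIG_KEY_0": f"http.{origin}/.extraHeader",
        "GIT_CONFIG_VALUE_0": f"Authorization: Basic {encoded}",
    }
```

For `https://user:tok@github.com/apache/airflow.git` this yields the key `http.https://github.com/.extraHeader`, so the setting only matches URLs under that origin. Whether a narrower prefix (for example, including the repository path) is preferable is a design choice left to the PR author.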



##########
providers/git/src/airflow/providers/git/hooks/git.py:
##########
@@ -149,16 +150,35 @@ def _process_git_auth_url(self):
             encoded_user = urlquote(self.user_name, safe="")
             encoded_token = urlquote(self.auth_token, safe="")
             self.repo_url = self.repo_url.replace("https://", f"https://{encoded_user}:{encoded_token}@", 1)
+            self._set_http_auth_env()
         elif self.auth_token and self.repo_url.startswith("http://"):
             encoded_user = urlquote(self.user_name, safe="")
             encoded_token = urlquote(self.auth_token, safe="")
             self.repo_url = self.repo_url.replace("http://", f"http://{encoded_user}:{encoded_token}@", 1)
+            self._set_http_auth_env()
         elif self.repo_url.startswith("http://"):
             # if no auth token, use the repo url as is
             pass
         elif not self.repo_url.startswith("git@") and not self.repo_url.startswith("https://"):
             self.repo_url = os.path.expanduser(self.repo_url)
 
+    def _set_http_auth_env(self):
+        """
+        Set git config env vars to force HTTP authentication via extraHeader.
+
+        Git does not send credentials for public repositories since the server
+        does not respond with a 401 challenge. This forces the Authorization
+        header to be sent on every request, allowing authenticated rate limits.
+
+        Uses GIT_CONFIG_* environment variables (git >= 2.31) to inject an
+        ``http.extraHeader`` with a Basic auth token.
+        """

Review Comment:
   This relies on `GIT_CONFIG_*` env vars which require git >= 2.31, but the 
code doesn’t detect/handle older git versions. In environments with older git, 
the new env vars will be ignored and public-repo auth will still not be forced 
(leaving the original rate-limit issue unresolved). Please consider adding a 
runtime fallback (e.g. using an older supported mechanism such as 
`GIT_CONFIG_PARAMETERS` / `-c` injection) or at least emitting a clear 
warning/error when an auth token is configured but the installed git is too old.
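
One way the version check could look, as a sketch under stated assumptions: the names `MIN_GIT_FOR_CONFIG_ENV`, `parse_git_version`, `supports_git_config_env`, and `installed_git_version_output` are hypothetical, and the parser assumes `git --version` prints a line containing `git version X.Y`.

```python
import re
import subprocess
from typing import Optional, Tuple

# GIT_CONFIG_COUNT / GIT_CONFIG_KEY_<n> / GIT_CONFIG_VALUE_<n> need git >= 2.31.
MIN_GIT_FOR_CONFIG_ENV = (2, 31)


def parse_git_version(version_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `git --version` output, or None if unrecognized."""
    match = re.search(r"git version (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_git_config_env(version_output: str) -> bool:
    """Return True when the reported git version honors GIT_CONFIG_* variables."""
    version = parse_git_version(version_output)
    return version is not None and version >= MIN_GIT_FOR_CONFIG_ENV


def installed_git_version_output() -> str:
    """Run `git --version` and return its stdout (raises if git is not installed)."""
    result = subprocess.run(["git", "--version"], capture_output=True, text=True, check=True)
    return result.stdout
```

When `supports_git_config_env(installed_git_version_output())` is false and an auth token is configured, the hook could log a warning or fall back to passing the setting with `git -c http.extraHeader=...`; which of these fits the provider best is an open question for the PR.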



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
