jgoedeke commented on issue #51877:
URL: https://github.com/apache/airflow/issues/51877#issuecomment-3406032151

   
   In my use case, I need to work with Azure Data Lake Gen2 URIs of the form:
   
   `abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>`
   
   Here, the portion before the @ is <file_system>, not a username or 
credentials, and there is no colon present to indicate a password.
   
   My suggestion: when parsing the netloc, treat the part before the @ as 
userinfo (and issue a warning or remove it) only if it actually matches the 
username:password pattern (i.e., contains a colon). For example:
   
   ````python
   # `parsed` is the already-parsed URI; `warnings` must be imported.
   parts = parsed.netloc.split("@", 1)
   if len(parts) == 2:
       before_at, after_at = parts
       # Only treat as credentials if there is a colon
       if ":" in before_at:
           warnings.warn(
               "An Asset URI should not contain auth info (e.g. username or "
               "password). It has been automatically dropped.",
               UserWarning,
               stacklevel=3,
           )
           normalized_netloc = after_at
       else:
           normalized_netloc = parsed.netloc
   else:
       normalized_netloc = parsed.netloc
   ````
   
   This change would allow legitimate ABFSS URIs to be accepted as-is, while 
still protecting against accidental inclusion of credentials for schemes that 
expect them.
   
   For reference, here is how this logic applies to an ABFSS URI:
   
   `abfss://<file_system>@<account_name>.dfs.core.windows.net/path` → no colon 
before the @, so nothing is stripped and the URI is accepted unchanged.
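
To make the behaviour concrete, here is a minimal, self-contained sketch of the proposed check. It is not Airflow's actual code: the function name `normalize_netloc`, the use of `urllib.parse.urlsplit`, and the container, account, and host names are assumptions made purely for illustration.

```python
import warnings
from urllib.parse import urlsplit


def normalize_netloc(uri: str) -> str:
    """Return the netloc of ``uri``, dropping userinfo only if it contains a colon.

    Hypothetical helper for illustration; not part of Airflow.
    """
    netloc = urlsplit(uri).netloc
    before_at, sep, after_at = netloc.partition("@")
    # Only treat the part before "@" as credentials if it looks like
    # "username:password", i.e. there is an "@" and a colon before it.
    if sep and ":" in before_at:
        warnings.warn(
            "An Asset URI should not contain auth info (e.g. username or "
            "password). It has been automatically dropped.",
            UserWarning,
            stacklevel=2,
        )
        return after_at
    return netloc


# ABFSS: the part before "@" is a file system name, so it is kept.
print(normalize_netloc("abfss://container@account.dfs.core.windows.net/path"))
# container@account.dfs.core.windows.net

# username:password form: the userinfo is dropped and a warning is emitted.
print(normalize_netloc("https://user:secret@example.com/data"))
# example.com
```

One consequence of this rule is that a bare username without a password (for example `user@host`) would also be left in place, since it is indistinguishable from the ABFSS form.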

