kubatyszko opened a new issue #10255:
URL: https://github.com/apache/airflow/issues/10255


   **Apache Airflow version**: MASTER
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): 1.16
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: AWS
   - **OS** (e.g. from /etc/os-release): Amazon Linux
   - **Kernel** (e.g. `uname -a`):
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   The HTTP hook defaults to the `http` schema, and I could not get it to use the `https` scheme that I had set in the connection URI.
   This may only happen when using a secrets backend, since connections set in the Airflow UI have their fields mapped directly.
   
   **What you expected to happen**: A connection URI of https://host.com should yield `https` as the connection schema.
   
   **How to reproduce it**:
   Pull Airflow from master and create a sample DAG that uses the HTTP hook with an https connection, where the connection string is pulled from a secrets manager (in my case, AWS Secrets Manager).
   
   
   
   **Anything else we need to know**:
   This issue seems to be limited to using a secrets backend for connection information, since connections configured via the Airflow UI have their "schema" field mapped directly.
   
   More information
   
   ```
           if self.http_conn_id:
               conn = self.get_connection(self.http_conn_id)
               if conn.host and "://" in conn.host:
                   self.base_url = conn.host
               else:
                   # schema defaults to HTTP
                   schema = conn.schema if conn.schema else "http"  # highlighted
                   schema = conn.conn_type if conn.conn_type else "http"  # highlighted
                   host = conn.host if conn.host else ""
                   self.base_url = schema + "://" + host
   
   ```
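The effect of the highlighted lines can be reproduced outside Airflow. The following is a minimal sketch, not the hook's actual code: `build_base_url`, its parameters, and the `use_conn_type` flag are hypothetical names that mirror the branch above. It assumes that a connection parsed from `https://host.com` ends up with `conn_type='https'`, `host='host.com'`, and an empty `schema`.

```python
def build_base_url(host, schema, conn_type, use_conn_type=False):
    """Hypothetical stand-in for the base_url branch quoted above."""
    if host and "://" in host:
        # A host that already carries a scheme is used as-is.
        return host
    if use_conn_type:
        # Alternative highlighted above: fall back to the URI scheme.
        chosen = conn_type if conn_type else "http"
    else:
        # Behaviour described in this report: only `schema` is consulted.
        chosen = schema if schema else "http"
    return chosen + "://" + (host if host else "")


# Assumed field values for a connection created from "https://host.com".
print(build_base_url("host.com", "", "https"))                      # http://host.com
print(build_base_url("host.com", "", "https", use_conn_type=True))  # https://host.com
```

With an empty `schema`, the first call silently falls back to `http`, which matches the behaviour reported here.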
   Code snippet from `airflow.models.connection`, with highlights:
   
   ```
       def _parse_from_uri(self, uri: str):
           uri_parts = urlparse(uri)
           conn_type = uri_parts.scheme
           if conn_type == 'postgresql':
               conn_type = 'postgres'
           elif '-' in conn_type:
               conn_type = conn_type.replace('-', '_')
           self.conn_type = conn_type  # highlighted
           self.host = _parse_netloc_to_hostname(uri_parts)  # highlighted
           quoted_schema = uri_parts.path[1:]  # highlighted
           self.schema = unquote(quoted_schema) if quoted_schema else quoted_schema  # highlighted
           self.login = unquote(uri_parts.username) \
   ```
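To see which field each part of a URI lands in, the parsing steps above can be approximated with the standard library alone. This is a sketch under assumptions: `parse_connection_uri` is a hypothetical helper, and `uri_parts.hostname` stands in for Airflow's `_parse_netloc_to_hostname`, whose behaviour may differ.

```python
from urllib.parse import unquote, urlparse


def parse_connection_uri(uri):
    """Hypothetical helper mirroring the field mapping in the snippet above."""
    uri_parts = urlparse(uri)
    conn_type = uri_parts.scheme
    if conn_type == "postgresql":
        conn_type = "postgres"
    elif "-" in conn_type:
        conn_type = conn_type.replace("-", "_")
    # The URI path (minus the leading slash) becomes the schema field.
    quoted_schema = uri_parts.path[1:]
    return {
        "conn_type": conn_type,
        # Assumption: .hostname approximates _parse_netloc_to_hostname.
        "host": uri_parts.hostname,
        "schema": unquote(quoted_schema) if quoted_schema else quoted_schema,
    }


print(parse_connection_uri("https://host.com:443"))
# {'conn_type': 'https', 'host': 'host.com', 'schema': ''}
print(parse_connection_uri("https://host.com:443/r"))
# {'conn_type': 'https', 'host': 'host.com', 'schema': 'r'}
```

The URI scheme (`https`) ends up in `conn_type`, while `schema` is taken from the path and is empty for a URI with no path.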
   
   Quick verification of my approach:
   
   ```
   >>> urlparse("https://host.com:443")
   ParseResult(scheme='https', netloc='host.com:443', path='', params='', query='', fragment='')
   >>> urlparse("https://host.com:443/r").path[1:]
   'r'
   ```

