[ 
https://issues.apache.org/jira/browse/AIRFLOW-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775133#comment-16775133
 ] 

Craig S. Connell commented on AIRFLOW-2910:
-------------------------------------------

I ran across this ticket and some Stack Overflow posts when I was trying to set 
up an HTTPS connection.  For what it is worth, I was unable to reproduce the 
error that the original submitter reported.  My opinion is that the Conn Type 
drop-down list offering only HTTP as an option confuses people and leads them 
to believe that HTTPS isn't a choice.

The information below is the same as what I reported in the Stack Overflow 
thread.  I'm putting it here so that:
 * it helps other people who run across this Jira issue
 * it can help the team decide what to do with this ticket.

 
I set up the following in Airflow 1.10.2.

In my initial test I was requesting templates from SendGrid, so the 
connection was set up like this:
{code:java}
Conn Id : sendgrid_templates_test 
Conn Type : HTTP 
Host : https://api.sendgrid.com/ 
Extra : { "authorization": "Bearer [my token]"}{code}
and then in the DAG code:
{code:java}
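from airflow.operators.http_operator import SimpleHttpOperator

get_templates = SimpleHttpOperator(
    task_id='get_templates',
    method='GET',
    endpoint='/v3/templates',
    http_conn_id='sendgrid_templates_test',
    trigger_rule="all_done",  # fire even when an upstream branch is skipped
    xcom_push=True,
    dag=dag,
){code}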
and that worked. Note that my request runs after a branch operator, so I 
needed to set the trigger rule to "all_done" to make sure it fires even when 
one of the branches is skipped. That has nothing to do with the question, but 
I wanted to point it out.

To be clear, I did get an InsecureRequestWarning because I did not have 
certificate verification enabled, but the task still completed, as the 
resulting logs below show:


{noformat}
[2019-02-21 16:15:01,333] {http_operator.py:89} INFO - Calling HTTP method
[2019-02-21 16:15:01,336] {logging_mixin.py:95} INFO - [2019-02-21 
16:15:01,335] {base_hook.py:83} INFO - Using connection to: id: 
sendgrid_templates_test. Host:  https://api.sendgrid.com/, Port: None, Schema: 
None, Login: None, Password: XXXXXXXX, extra: {'authorization': 'Bearer [my 
token]'}
[2019-02-21 16:15:01,338] {logging_mixin.py:95} INFO - [2019-02-21 
16:15:01,337] {http_hook.py:126} INFO - Sending 'GET' to url:  
https://api.sendgrid.com//v3/templates
[2019-02-21 16:15:01,956] {logging_mixin.py:95} WARNING - 
/home/csconnell/.pyenv/versions/airflow/lib/python3.6/site-packages/urllib3/connectionpool.py:847:
 InsecureRequestWarning: Unverified HTTPS request is being made. Adding 
certificate verification is strongly advised. See: 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
[2019-02-21 16:15:05,242] {logging_mixin.py:95} INFO - [2019-02-21 
16:15:05,241] {jobs.py:2527} INFO - Task exited with return code 0{noformat}
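
The URL in the log above (note the double slash) is consistent with the hook using the Host field as-is when it already contains a scheme, and only falling back to the Schema field, then to plain "http", when it does not. The following is a minimal sketch of that behaviour as inferred from the log, not a copy of the hook's code; the function name `build_url` and its parameters are hypothetical.

```python
def build_url(host, schema, port, endpoint):
    """Hypothetical helper that mimics the URL seen in the log above."""
    if "://" in host:
        # Host already carries a scheme (e.g. "https://..."), so keep it.
        base_url = host
    else:
        # No scheme in Host: use the Schema field, defaulting to plain http.
        base_url = (schema or "http") + "://" + host
    if port:
        base_url = base_url + ":" + str(port)
    # Plain concatenation: a trailing "/" on the host plus a leading "/"
    # on the endpoint produces the "//" visible in the log.
    return base_url + endpoint


# Connection from this comment: scheme typed directly into the Host field.
print(build_url("https://api.sendgrid.com/", None, None, "/v3/templates"))
# Alternative set-up: bare host name, with "https" in the Schema field.
print(build_url("api.sendgrid.com", "https", None, "/v3/templates"))
# Bare host name and no Schema: the request would go out over plain http.
print(build_url("api.sendgrid.com", None, None, "/v3/templates"))
```

Under these assumptions, either putting "https://" in the Host field or putting "https" in the Schema field yields an HTTPS URL, regardless of what the Conn Type drop-down offers.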

> models.Connection cannot use https
> ----------------------------------
>
>                 Key: AIRFLOW-2910
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2910
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: isaac martin
>            Priority: Major
>
> The SimpleHttpOperator, and anything else relying on 
> airflow.models.Connection, cannot make use of https due to what appears to be 
> a bug in the way it parses user-provided urls. The bug ends up replacing any 
> https uri with an http uri.
> To reproduce:
>  * Create a new airflow implementation.
>  * Set a connection environment var: 
> AIRFLOW_CONN_ETL_API=https://yourdomain.com
>  * Instantiate a SimpleHttpOperator which uses the above for its http_conn_id 
> argument.
>  * Notice with horror that your requests are made to http://yourdomain.com
> To fix:
> Proposal 1
> Line 590 of airflow.models.py assigns nothing to Connection.schema. 
> Change:
> self.schema = temp_uri.path[1:]
> to
> self.schema = temp_uri[0]
>  
> Proposal 2:
> Line 40 of airflow.hooks.http_hook.py starts a block which tries to set the 
> base_url. We could add a new elif which checks self.conn_type, as 
> self.conn_type is correctly populated with 'https'.
> For example:
> elif conn.conn_type:
>     self.base_url = conn.conn_type + "://" + conn.host
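
To illustrate what the quoted Proposal 1 is getting at, here is a small sketch using only the standard library's `urlparse` on the example URI from the reproduction steps. It assumes the connection URI is parsed roughly this way; the helper name `split_conn_uri` is hypothetical and is not part of Airflow.

```python
from urllib.parse import urlparse


def split_conn_uri(uri):
    """Hypothetical helper: show which parts of a URI hold which values."""
    parsed = urlparse(uri)
    return {
        "scheme": parsed[0],             # same value as parsed.scheme
        "host": parsed.hostname,
        "path_minus_slash": parsed.path[1:],
    }


parts = split_conn_uri("https://yourdomain.com")
# The scheme is available at index 0, while the path (minus its leading
# slash) is empty for a URI with no path component.
print(parts)
```

For a URI with no path, `path[1:]` is an empty string, so a schema taken from it ends up blank, whereas index 0 of the parse result carries "https".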



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
