[ 
https://issues.apache.org/jira/browse/AIRFLOW-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249613#comment-17249613
 ] 

ASF GitHub Bot commented on AIRFLOW-7044:
-----------------------------------------

ashb commented on a change in pull request #12944:
URL: https://github.com/apache/airflow/pull/12944#discussion_r543210741



##########
File path: tests/providers/ssh/hooks/test_ssh.py
##########
@@ -51,8 +51,18 @@ def generate_key_string(pkey: paramiko.PKey, passphrase: 
Optional[str] = None):
     return key_str
 
 
+def generate_host_key(pkey: paramiko.PKey):
+    key_fh = StringIO()
+    pkey.write_private_key(key_fh)
+    key_fh.seek(0)
+    key_obj = paramiko.RSAKey(file_obj=key_fh)
+    return key_obj
+
+
 TEST_PKEY = paramiko.RSAKey.generate(4096)
 TEST_PRIVATE_KEY = generate_key_string(pkey=TEST_PKEY)
+TEST_HOST_PKEY = generate_host_key(pkey=TEST_PKEY)
+TEST_HOST_KEY = TEST_HOST_PKEY.get_base64()

Review comment:
       Why not just have `generate_host_key` return the base64 encoded version?

##########
File path: airflow/providers/ssh/hooks/ssh.py
##########
@@ -149,7 +151,9 @@ def __init__(
                     and str(extra_options["look_for_keys"]).lower() == 'false'
                 ):
                     self.look_for_keys = False
-
+                if "host_key" in extra_options and self.no_host_key_check is 
False:
+                    encoded_host_key = 
decodebytes(extra_options["host_key"].encode('utf-8'))
+                    self.host_key = paramiko.RSAKey(data=encoded_host_key)

Review comment:
       ```suggestion
                       decoded_host_key = 
decodebytes(extra_options["host_key"].encode('utf-8'))
                       self.host_key = paramiko.RSAKey(data=decoded_host_key)
   ```

##########
File path: docs/apache-airflow-providers-ssh/connections/ssh.rst
##########
@@ -47,9 +47,10 @@ Extra (optional)
     * ``private_key_passphrase`` - Content of the private key passphrase used 
to decrypt the private key.
     * ``timeout`` - An optional timeout (in seconds) for the TCP connect. 
Default is ``10``.
     * ``compress`` - ``true`` to ask the remote client/server to compress 
traffic; ``false`` to refuse compression. Default is ``true``.
-    * ``no_host_key_check`` - Set to ``false`` to restrict connecting to hosts 
with no entries in ``~/.ssh/known_hosts`` (Hosts file). This provides maximum 
protection against trojan horse attacks, but can be troublesome when the 
``/etc/ssh/ssh_known_hosts`` file is poorly maintained or connections to new 
hosts are frequently made. This option forces the user to manually add all new 
hosts. Default is ``true``, ssh will automatically add new host keys to the 
user known hosts files.
+    * ``no_host_key_check`` - Set to ``false`` to restrict connecting to hosts 
with either no entries in ``~/.ssh/known_hosts`` (Hosts file) or not present in 
the ``host_key`` extra. This provides maximum protection against trojan horse 
attacks, but can be troublesome when the ``/etc/ssh/ssh_known_hosts`` file is 
poorly maintained or connections to new hosts are frequently made. This option 
forces the user to manually add all new hosts. Default is ``true``, ssh will 
automatically add new host keys to the user known hosts files.
     * ``allow_host_key_change`` - Set to ``true`` if you want to allow 
connecting to hosts that has host key changed or when you get 'REMOTE HOST 
IDENTIFICATION HAS CHANGED' error.  This wont protect against Man-In-The-Middle 
attacks. Other possible solution is to remove the host entry from 
``~/.ssh/known_hosts`` file. Default is ``false``.
     * ``look_for_keys`` - Set to ``false`` if you want to disable searching 
for discoverable private key files in ``~/.ssh/``
+    * ``host_key`` - The base64 encoded ssh-rsa public key of the host. 
Specifying this, along with ``no_host_key_check=False`` allows you to only make 
the connection if the public key of the endpoint matches this value.

Review comment:
       ```suggestion
       * ``host_key`` - The base64 encoded ssh-rsa public key of the host, as 
you would find in the ``known_hosts`` file. Specifying this, along with 
``no_host_key_check=False`` allows you to only make the connection if the 
public key of the endpoint matches this value.
   ```

##########
File path: airflow/providers/sftp/hooks/sftp.py
##########
@@ -109,12 +111,22 @@ def __init__(self, ftp_conn_id: str = 'sftp_default', 
*args, **kwargs) -> None:
                     )
                     self.key_file = extra_options.get('private_key')
 
+                if "host_key" in extra_options and self.no_host_key_check is 
False:
+                    encoded_host_key = 
decodebytes(extra_options["host_key"].encode('utf-8'))
+                    self.host_key = paramiko.RSAKey(data=encoded_host_key)
+

Review comment:
       This isn't needed here -- it's handled by SSHHook already.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> SSH connection (and hook) should support public host_key usage
> --------------------------------------------------------------
>
>                 Key: AIRFLOW-7044
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-7044
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>    Affects Versions: 2.0.0
>            Reporter: Aaron Fowles
>            Assignee: Aaron Fowles
>            Priority: Minor
>              Labels: newbie, security, sftp, ssh
>
> It would be good to be able to enforce a public host key check against a 
> known value when making a SSH or SFTP connection.
> Currently, people are forced into using
> {code:java}
> 'no_host_key_check' = True{code}
> which could allow a Man-in-the-middle attack.
> There are two components as far as I can see:
>  * The connection should support specify the key_type and key (either as 
> fields or in extra)
>  * The hook should write get and write those values (along with the hostname) 
> to the ~/.ssh/known_hosts file if
> {code:java}
> 'no_host_key_check' = False{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to