baolsen opened a new issue #10874:
URL: https://github.com/apache/airflow/issues/10874


   **Apache Airflow version**: 1.10.8
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: 4 VCPU 8GB RAM VM
   - **OS** (e.g. from /etc/os-release): RHEL 7.7
   - **Kernel** (e.g. `uname -a`): Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   Subclassing SSHOperator and calling its execute() method repeatedly creates 
a new SSH connection on every call in order to run the command.
   
   Not sure if this is a bug or an enhancement / feature. I can re-log as a 
feature request if needed.
   
   **What you expected to happen**:
   
   The SSH client / connection should be reused if it has already been established.
   
   **How to reproduce it**:
   
   Subclass SSHOperator.
   In your subclass's execute() method, call super().execute() a few times.
   Observe in the logs that a new SSH connection is created each time.
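
The reproduction steps above can be modelled without an Airflow install. The following is only a minimal sketch: `FakeSSHClient`, `FakeSSHHook`, `FakeSSHOperator` and `MultiCommandOperator` are hypothetical stand-ins that imitate the behaviour described in this report (a fresh client per `get_conn()` call, used inside a `with` block). They are not the real Airflow or Paramiko classes.

```python
class FakeSSHClient:
    """Hypothetical stand-in for a Paramiko client; counts every client opened."""

    opened = 0

    def __init__(self):
        FakeSSHClient.opened += 1
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the with-block closes the connection.
        self.closed = True
        return False

    def exec_command(self, command):
        return f"ran: {command}"


class FakeSSHHook:
    """Stand-in hook: stores the client on self.client but never reuses it."""

    def __init__(self):
        self.client = None

    def get_conn(self):
        self.client = FakeSSHClient()  # always a fresh client
        return self.client


class FakeSSHOperator:
    """Stand-in operator: uses the hook's client as a context manager."""

    def __init__(self, ssh_hook, command=None):
        self.ssh_hook = ssh_hook
        self.command = command

    def execute(self, context=None):
        with self.ssh_hook.get_conn() as client:
            return client.exec_command(self.command)


class MultiCommandOperator(FakeSSHOperator):
    """Subclass that calls super().execute() once per command."""

    def __init__(self, ssh_hook, commands):
        super().__init__(ssh_hook)
        self.commands = commands

    def execute(self, context=None):
        results = []
        for command in self.commands:
            self.command = command
            results.append(super().execute(context))
        return results


hook = FakeSSHHook()
op = MultiCommandOperator(hook, ["cmd 1", "cmd 2", "cmd 3"])
results = op.execute()
connections_opened = FakeSSHClient.opened
print(f"commands run: {len(results)}, connections opened: {connections_opened}")
```

In this model every `super().execute()` call opens and then closes its own client, which matches the repeated connect/authenticate sequence in the logs further down.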
   
   **Anything else we need to know**:
   
   The SSHHook.get_conn() method creates a new Paramiko SSH client on every call. 
Although it stores the client on self.client before returning, it never checks 
self.client on the next call, so a new connection is created each time.
   I think this is because the SSHOperator uses the client returned by get_conn() 
as a context manager, so the hook has to create a new client whenever a previous 
context manager has already closed the last one.
   
   Fixing this would mean changing SSHOperator.execute() so that it does not use 
ssh_hook.get_conn() as a context manager, since that opens and closes the 
session on every call. Perhaps the connection could be closed in the operator's 
post_execute() method rather than in execute().
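
A rough sketch of that idea, again using hypothetical stand-in classes rather than the real Airflow code: the hook hands back its cached client while it is still open, `execute()` no longer wraps the client in a `with` block, and the connection is closed once in `post_execute()`. The `close_conn()` helper and the `closed` flag are assumptions made for this illustration only.

```python
class FakeSSHClient:
    """Hypothetical stand-in for a Paramiko client; counts every client opened."""

    opened = 0

    def __init__(self):
        FakeSSHClient.opened += 1
        self.closed = False

    def exec_command(self, command):
        if self.closed:
            raise RuntimeError("client is closed")
        return f"ran: {command}"

    def close(self):
        self.closed = True


class ReusingSSHHook:
    """Stand-in hook that reuses self.client while it is still open."""

    def __init__(self):
        self.client = None

    def get_conn(self):
        if self.client is None or self.client.closed:
            self.client = FakeSSHClient()
        return self.client

    def close_conn(self):
        # Hypothetical helper: close and forget the cached client.
        if self.client is not None:
            self.client.close()
            self.client = None


class ReusingSSHOperator:
    """Stand-in operator: no with-block in execute(), cleanup in post_execute()."""

    def __init__(self, ssh_hook, command=None):
        self.ssh_hook = ssh_hook
        self.command = command

    def execute(self, context=None):
        client = self.ssh_hook.get_conn()
        return client.exec_command(self.command)

    def post_execute(self, context=None, result=None):
        self.ssh_hook.close_conn()


class MultiCommandOperator(ReusingSSHOperator):
    """Subclass that calls super().execute() once per command."""

    def __init__(self, ssh_hook, commands):
        super().__init__(ssh_hook)
        self.commands = commands

    def execute(self, context=None):
        results = []
        for command in self.commands:
            self.command = command
            results.append(super().execute(context))
        return results


hook = ReusingSSHHook()
op = MultiCommandOperator(hook, ["cmd 1", "cmd 2", "cmd 3"])
results = op.execute()
# In this sketch post_execute() is called by hand to stand in for the task runner.
op.post_execute(context=None, result=results)
connections_opened = FakeSSHClient.opened
print(f"commands run: {len(results)}, connections opened: {connections_opened}")
```

One trade-off of this approach is that the connection stays open if `post_execute()` is never reached, for example when `execute()` raises, so a real change would probably need some cleanup on failure as well.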
   
   ***Example logs***
   
   [2020-09-11 07:04:37,960] {ssh_operator.py:89} INFO - ssh_conn_id is ignored 
when ssh_hook is provided.
   [2020-09-11 07:04:37,960] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:37,960] {ssh_hook.py:166} WARNING - Remote Identification Change is not 
verified. This wont protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:37,961] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:37,961] {ssh_hook.py:170} WARNING - No Host Key Verification. This wont 
protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:37,976] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:37,975] {transport.py:1819} INFO - Connected (version 2.0, client 
OpenSSH_7.4)
   [2020-09-11 07:04:38,161] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,161] {transport.py:1819} INFO - Auth banner: b'Authorized uses only. 
All activity may be monitored and reported.\n'
   [2020-09-11 07:04:38,161] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,161] {transport.py:1819} INFO - Authentication (publickey) successful!
   [2020-09-11 07:04:38,161] {ssh_operator.py:109} INFO - Running command: 
[REDACTED COMMAND 1]
   ...
   [2020-09-11 07:04:38,383] {ssh_operator.py:89} INFO - ssh_conn_id is ignored 
when ssh_hook is provided.
   [2020-09-11 07:04:38,383] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,383] {ssh_hook.py:166} WARNING - Remote Identification Change is not 
verified. This wont protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:38,383] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,383] {ssh_hook.py:170} WARNING - No Host Key Verification. This wont 
protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:38,399] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,399] {transport.py:1819} INFO - Connected (version 2.0, client 
OpenSSH_7.4)
   [2020-09-11 07:04:38,545] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,545] {transport.py:1819} INFO - Auth banner: b'Authorized uses only. 
All activity may be monitored and reported.\n'
   [2020-09-11 07:04:38,546] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,546] {transport.py:1819} INFO - Authentication (publickey) successful!
   [2020-09-11 07:04:38,546] {ssh_operator.py:109} INFO - Running command: 
[REDACTED COMMAND 2]
   ....
   [2020-09-11 07:04:38,722] {ssh_operator.py:89} INFO - ssh_conn_id is ignored 
when ssh_hook is provided.
   [2020-09-11 07:04:38,722] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,722] {ssh_hook.py:166} WARNING - Remote Identification Change is not 
verified. This wont protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:38,723] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,723] {ssh_hook.py:170} WARNING - No Host Key Verification. This wont 
protect against Man-In-The-Middle attacks
   [2020-09-11 07:04:38,734] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,734] {transport.py:1819} INFO - Connected (version 2.0, client 
OpenSSH_7.4)
   [2020-09-11 07:04:38,867] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,867] {transport.py:1819} INFO - Auth banner: b'Authorized uses only. 
All activity may be monitored and reported.\n'
   [2020-09-11 07:04:38,868] {logging_mixin.py:112} INFO - [2020-09-11 
07:04:38,867] {transport.py:1819} INFO - Authentication (publickey) successful!
   [2020-09-11 07:04:38,868] {ssh_operator.py:109} INFO - Running command: 
[REDACTED COMMAND 3]



