Mottimo commented on issue #52289:
URL: https://github.com/apache/airflow/issues/52289#issuecomment-3025358518
Hi @Nataneljpwd, the PR works in the sense that the connection drop
is no longer observed. So, as far as this issue is concerned, the PR fixes it.
Related to this issue, there is another observation: the continuous
opening and closing of connections done by the latest versions of the SFTP
provider does not appear very efficient. In fact, comparing the following logs
collected from the sensor task previously reported, I see that:
Version 1 - PR #52641 - Three SSH connections for one task
```
[2025-07-01, 21:54:07 CEST] {local_task_job_runner.py:124} ▶ Pre task
execution logs
[2025-07-01, 21:54:07 CEST] {baseoperator.py:424} WARNING -
SFTPSensor.execute cannot be called outside TaskInstance!
[2025-07-01, 21:54:07 CEST] {base.py:84} INFO - Retrieving connection
'sftp-via-proxy'
[2025-07-01, 21:54:07 CEST] {base.py:84} INFO - Retrieving connection
'sftp-via-proxy'
[2025-07-01, 21:54:07 CEST] {sftp.py:100} INFO - Poking path /etc/fstab
[2025-07-01, 21:54:07 CEST] {sftp.py:103} INFO - Soft fail is set as True
[2025-07-01, 21:54:07 CEST] {ssh.py:288} WARNING - Remote Identification
Change is not verified. This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:07 CEST] {ssh.py:298} WARNING - No Host Key Verification.
This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:07 CEST] {transport.py:1944} INFO - Connected (version
2.0, client OpenSSH_8.0)
[2025-07-01, 21:54:07 CEST] {transport.py:1944} INFO - Authentication
(password) successful!
[2025-07-01, 21:54:07 CEST] {sftp.py:169} INFO - [chan 0] Opened sftp
connection (server version 3)
[2025-07-01, 21:54:07 CEST] {logging_mixin.py:190} INFO - SFTP hook -
Automatically closing connections, currently 0 clients remain
[2025-07-01, 21:54:07 CEST] {sftp.py:169} INFO - [chan 0] sftp session
closed.
[2025-07-01, 21:54:07 CEST] {ssh.py:288} WARNING - Remote Identification
Change is not verified. This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:07 CEST] {ssh.py:298} WARNING - No Host Key Verification.
This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:07 CEST] {transport.py:1944} INFO - Connected (version
2.0, client OpenSSH_8.0)
[2025-07-01, 21:54:08 CEST] {transport.py:1944} INFO - Authentication
(password) successful!
[2025-07-01, 21:54:08 CEST] {sftp.py:169} INFO - [chan 0] Opened sftp
connection (server version 3)
[2025-07-01, 21:54:08 CEST] {logging_mixin.py:190} INFO - SFTP hook -
Automatically closing connections, currently 0 clients remain
[2025-07-01, 21:54:08 CEST] {sftp.py:169} INFO - [chan 0] sftp session
closed.
[2025-07-01, 21:54:08 CEST] {sftp.py:133} INFO - Path /etc/fstab is a valid
file, adding it to the candidates list
[2025-07-01, 21:54:08 CEST] {sftp.py:149} INFO - Actual file to check: fstab
[2025-07-01, 21:54:08 CEST] {ssh.py:288} WARNING - Remote Identification
Change is not verified. This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:08 CEST] {ssh.py:298} WARNING - No Host Key Verification.
This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:54:08 CEST] {transport.py:1944} INFO - Connected (version
2.0, client OpenSSH_8.0)
[2025-07-01, 21:54:08 CEST] {transport.py:1944} INFO - Authentication
(password) successful!
[2025-07-01, 21:54:08 CEST] {sftp.py:169} INFO - [chan 0] Opened sftp
connection (server version 3)
[2025-07-01, 21:54:08 CEST] {logging_mixin.py:190} INFO - SFTP hook -
Automatically closing connections, currently 0 clients remain
[2025-07-01, 21:54:08 CEST] {sftp.py:169} INFO - [chan 0] sftp session
closed.
[2025-07-01, 21:54:08 CEST] {sftp.py:153} INFO - Adding valid file fstab
last modified: 20240301063817 to the result list
[2025-07-01, 21:54:08 CEST] {sftp.py:181} INFO - File search successfully
completed, the result is stored into the XCom returned value
[2025-07-01, 21:54:08 CEST] {base.py:339} INFO - Success criteria met.
Exiting.
```
Version 2 - Disabling the auto closing - One SSH connection for one task
```
[2025-07-01, 21:53:31 CEST] {local_task_job_runner.py:124} ▶ Pre task
execution logs
[2025-07-01, 21:53:31 CEST] {baseoperator.py:424} WARNING -
SFTPSensor.execute cannot be called outside TaskInstance!
[2025-07-01, 21:53:31 CEST] {base.py:84} INFO - Retrieving connection
'sftp-via-proxy'
[2025-07-01, 21:53:31 CEST] {base.py:84} INFO - Retrieving connection
'sftp-via-proxy'
[2025-07-01, 21:53:31 CEST] {sftp.py:100} INFO - Poking path /etc/fstab
[2025-07-01, 21:53:31 CEST] {sftp.py:103} INFO - Soft fail is set as True
[2025-07-01, 21:53:31 CEST] {ssh.py:288} WARNING - Remote Identification
Change is not verified. This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:53:31 CEST] {ssh.py:298} WARNING - No Host Key Verification.
This won't protect against Man-In-The-Middle attacks
[2025-07-01, 21:53:31 CEST] {transport.py:1944} INFO - Connected (version
2.0, client OpenSSH_8.0)
[2025-07-01, 21:53:31 CEST] {transport.py:1944} INFO - Authentication
(password) successful!
[2025-07-01, 21:53:31 CEST] {sftp.py:169} INFO - [chan 0] Opened sftp
connection (server version 3)
[2025-07-01, 21:53:31 CEST] {logging_mixin.py:190} INFO - SFTP hook - The
auto connection closing is disabled, currently 0 clients remain
[2025-07-01, 21:53:31 CEST] {sftp.py:133} INFO - Path /etc/fstab is a valid
file, adding it to the candidates list
[2025-07-01, 21:53:31 CEST] {sftp.py:149} INFO - Actual file to check: fstab
[2025-07-01, 21:53:31 CEST] {logging_mixin.py:190} INFO - SFTP hook - The
auto connection closing is disabled, currently 0 clients remain
[2025-07-01, 21:53:31 CEST] {sftp.py:153} INFO - Adding valid file fstab
last modified: 20240301063817 to the result list
[2025-07-01, 21:53:31 CEST] {sftp.py:181} INFO - File search successfully
completed, the result is stored into the XCom returned value
[2025-07-01, 21:53:31 CEST] {base.py:339} INFO - Success criteria met.
Exiting.
[2025-07-01, 21:53:31 CEST] {taskinstance.py:353} ▶ Post task execution logs
```
Version 1 opens multiple connections, closing the previous one each
time. So why are we using a cached method if it's removed at every step?
Version 2, which restores the behaviour of the previous versions of the
provider, is more efficient.
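
To make the difference concrete, here is a minimal sketch of the two behaviours. It is not the provider's actual code: `SketchHook`, `FakeConnection`, `auto_close` and `poke` are hypothetical names used only for illustration, and the "connection" is a stand-in object rather than a real SSH session. It only models a client counter that closes the cached connection when it drops to zero, as the "currently 0 clients remain" log lines suggest.

```python
from contextlib import contextmanager


class FakeConnection:
    """Stand-in for an SSH/SFTP connection (hypothetical, no network I/O)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SketchHook:
    """Hypothetical hook that caches one connection and counts its clients."""

    def __init__(self, auto_close=True):
        self.auto_close = auto_close
        self._conn = None
        self._clients = 0
        self.connections_opened = 0  # how many times we had to reconnect

    @contextmanager
    def get_conn(self):
        # Reuse the cached connection if there is one, otherwise open a new one.
        if self._conn is None:
            self._conn = FakeConnection()
            self.connections_opened += 1
        self._clients += 1
        try:
            yield self._conn
        finally:
            self._clients -= 1
            # With auto closing, the cache is dropped as soon as nobody uses it.
            if self.auto_close and self._clients == 0:
                self.close()

    def close(self):
        if self._conn is not None:
            self._conn.close()
            self._conn = None


def poke(hook):
    """Run three separate remote operations, like the three steps in the logs."""
    for _ in range(3):
        with hook.get_conn():
            pass  # e.g. check the path, list candidates, read the mtime
    hook.close()  # explicit cleanup at the end of the task
    return hook.connections_opened


print(poke(SketchHook(auto_close=True)))   # one connection per operation
print(poke(SketchHook(auto_close=False)))  # one connection for the whole task
```

Under these assumptions, a possible middle ground would be to keep the auto closing but wrap the whole poke in a single outer `with hook.get_conn():`, so the counter never reaches zero between steps and the connection is still closed at the end.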