AngryHelper commented on pull request #19852:
URL: https://github.com/apache/airflow/pull/19852#issuecomment-997398403
> I cannot see my comment - I thought I submitted it but somehow I cannot
see it in GitHub.
> One very important comment - I believe that tha validation and execution
logic should all be moved to the Hook (as new batch_transfer method?). Operator
should be thin layer over hooks capabilities, because you cannot combine
several operators easily to run as single tasks, where Hooks are precisely
desgined to handle the case. So if your logic is in Hook, you could easily add
@task decorated code where in one step you batch download files, and in the
next step the files can be used (and for example sent elsewhere). When this
logic is in Operator, you cannot do it easily.
So I propose to move everything you have in _check_conn and execute() to
hook and call hook for those methods. This way it will be much more usable.
>
```
@task
def my_task:
sftp_hook = SftpHook()
sftp_hook.batch_transfer(GET,....)
another_hook = AnotherHook()
another_hook.do_something_with_the_files(....)
```
@potiuk We have different sftp client dependencies in SFTPHook and
SFTPOperators.
In SFTPHook we have pysftp, but in SFTPOperators we have paramiko.
I don’t want to rewrite the operator, but I understand that we should use
the same hook methods in a good way.
Judging by the pysftp source code, the ability to transport a directory is
implemented there, but not a separate list of files.
```
@task
def my_task:
sftp_hook = SftpHook()
client = sftp_hook.get_conn()
client.get_d(localpath, remotepath)
client.put_d(localpath, remotepath)
```
What you think about it?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]