dstandish edited a comment on issue #20398: URL: https://github.com/apache/airflow/issues/20398#issuecomment-998894798
My take is that generally speaking _hooks_ should not manage state like this; instead, that is the job of the thing using the hook (be it an operator or a function as in `@task`). When it's managed by the caller then you don't have side effects and hidden behavior that you need to remember or document. When there's a good reason fine but when not I'd say it's better not to. In that bigquery **hook** example there's not much benefit in setting that attribute. Instead one can manage it in the operator: ```python running_job_id = hook.run_query(...) ``` In the **operator** there can certainly be legitimate reasons for setting an instance attribute like this because it can be used in `on_kill`. XCom is totally different issue and certainly no complaints there. My concern is just with the hook. And with hooks I wouldn't necessarily say it should _never_ be done. We did it [here](https://github.com/apache/airflow/blob/main/airflow/hooks/subprocess.py#L76) with subprocess hook. But there was at least a half-decent reason. We use that attribute in on_kill. We can't return it from `run_command` because the method waits sychronously for the command to complete. We considered exactly this question [here](https://github.com/apache/airflow/pull/13423#issuecomment-753511787), and thought about returning a runner object that managed the state of the command but figured it was not worth the complexity tradeoff. But maybe we should have done something like that. But in the snowflake case, as well as the bigquery case, the query ids are not known until after they've completed so there's really no reason to store them on the instance. It's just retrospective metadata at that point. And looking ahead, if we were to add a `run_async` method to snowflake hook (which could be useful with deferrable operators) then we should return the query id, not store it. Because then you can reuse the hook repeatedly, for multiple commands, and the hook doesn't have to make any assumptions about what the behavior of the operator willl be. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
