potiuk commented on issue #33178:
URL: https://github.com/apache/airflow/issues/33178#issuecomment-1674262939
Hmm.. But we willl still have at least two sessions at the same time that
might be used in parallel - potentially:
* @provide_session from _run_task
* `dangling_session`
I was thinking - looking at the logging I added:
```
[2023-08-10 21:43:01,362] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10T21:43:01.362+0000] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10 21:43:01,364] {session.py:51} INFO - DESTROY_SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10 21:43:01,365] {session.py:39} INFO - CREATE SESSION
wrapper:render_template_fields:render_templates:
[2023-08-10 21:43:01,365] {session.py:51} INFO - DESTROY_SESSION
wrapper:render_template_fields:render_templates:
[2023-08-10 21:43:01,365] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10T21:43:01.364+0000] {session.py:51} INFO - DESTROY_SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10T21:43:01.365+0000] {session.py:39} INFO - CREATE SESSION
wrapper:render_template_fields:render_templates:
[2023-08-10T21:43:01.365+0000] {session.py:51} INFO - DESTROY_SESSION
wrapper:render_template_fields:render_templates:
[2023-08-10T21:43:01.365+0000] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10 21:43:01,369] {session.py:51} INFO - DESTROY_SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10 21:43:01,369] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10T21:43:01.369+0000] {session.py:51} INFO - DESTROY_SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10T21:43:01.369+0000] {session.py:39} INFO - CREATE SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
[2023-08-10 21:43:01,372] {session.py:51} INFO - DESTROY_SESSION
wrapper:_execute_task_with_callbacks:_run_raw_task:
```
That seems like an unnecesary resource usage. Opening a new session is
generally expensive (both client and server side) - yes - we have pgbouncer
etc. - but this means that every mapped task needs minimum 2 sessions to run,
where 1 would be enough - rendering templated fields could be easily done using
the `_execute_task_with_callbacks` one.
Since rendeing mapped tasks is ALWAYS done in task maybe a better solution
will be to store the session in the singleton in inside the `_run_raw_task` and
use that stored session in `mapped_operator.py` . That sounds like much more
reasonable usage of the session we already have, and I don't seen an obvious
problem with such an approach..
It would save quite a lot of resources.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]