GitHub user Rishabh1627rawat created a discussion: Using
DatabricksSubmitRunOperator inside @task — is pool applied correctly
Hi everyone,
I’m using Airflow 2.x with the `@task` decorator (TaskFlow API), and I’m trying
to better understand how Airflow handles execution and pools when an operator
is called inside a Python task.
Right now, I’m using this pattern:
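```python
from airflow.decorators import task
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator


@task(pool="databricks_superset_med", retries=3)
def run_databricks(run_payload, **context):
    # The operator is created at run time, inside the already-running task.
    op = DatabricksSubmitRunOperator(
        task_id="data_transformation",
        databricks_conn_id="databricks",
        json=run_payload,
        wait_for_termination=True,
    )
    # Call execute() directly, reusing the outer task's context.
    return op.execute(context=context)
```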
In this setup:
* The `@task` has a pool assigned.
* Inside that function, I create a `DatabricksSubmitRunOperator`.
* I manually call `op.execute(context=context)`.
This successfully triggers my Databricks notebook, and because
`wait_for_termination=True`, the task waits until the notebook run finishes.
However, I want to better understand what is happening internally.
Specifically:
* Is the pool applied only to the outer Python `@task`?
* From the scheduler’s perspective, is the `DatabricksSubmitRunOperator`
treated as a separate task?
* Or is it completely invisible because it is executed manually inside the
Python task?
* If I set a `pool` on the inner operator as well, why does it not take effect?
* Would it be better practice to define `DatabricksSubmitRunOperator` directly
in the DAG instead of wrapping it inside a `@task`?
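
For comparison, this is roughly what I mean by defining the operator directly in the DAG. It is only a minimal sketch: the `dag_id`, the start date, and the contents of `run_payload` are placeholders I made up for illustration, and `schedule=None` assumes Airflow 2.4 or newer (older 2.x versions use `schedule_interval`).

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

# Placeholder payload for illustration only; the real one is built elsewhere.
run_payload = {
    "existing_cluster_id": "<cluster-id>",
    "notebook_task": {"notebook_path": "<notebook-path>"},
}

with DAG(
    dag_id="databricks_direct_operator_example",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule=None,  # assumes Airflow 2.4+
    catchup=False,
) as dag:
    # Here the operator itself is the task, so pool and retries are set on it
    # rather than on a wrapping @task.
    data_transformation = DatabricksSubmitRunOperator(
        task_id="data_transformation",
        databricks_conn_id="databricks",
        json=run_payload,
        wait_for_termination=True,
        pool="databricks_superset_med",
        retries=3,
    )
```

In this variant `pool` and `retries` are ordinary `BaseOperator` arguments passed to the operator itself, which is the part I would like to compare against the wrapped version above.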
I understand that manually calling `.execute()` may bypass some of Airflow’s
orchestration mechanisms. I’m especially curious how this pattern affects:
* Pool slot acquisition
* Task lifecycle tracking
* Scheduler awareness
* UI visibility
This is not a functional issue — everything runs successfully. I just want to
understand the internal behavior better and ensure I’m following best practices
without unintentionally bypassing Airflow’s concurrency controls.
I would appreciate any clarification on how the scheduler treats this pattern
internally.
GitHub link: https://github.com/apache/airflow/discussions/62403