GitHub user akhundovte created a discussion: PythonVirtualenvOperator: `render_template_as_native_obj=True` causes `PicklingError` when using `{{ ti }}` / `{{ task_instance }}` (cloudpickle/structlog)

Hi Airflow team,

The PythonVirtualenvOperator docs (“Passing in arguments”) mention that Airflow 
does not support serializing ti / task_instance. I understand this limitation 
and that a workaround is to pass only identifiers and, if needed, fetch task 
instance data via the Airflow API.
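
For reference, the identifier-only workaround could look roughly like the sketch below. It assumes that `dag_id`, `run_id`, `task_id`, `try_number` and `map_index` are readable from `ti` inside templates; `OP_KWARGS` and this reworked `venv_callable` are illustrative names, not part of the repro further down.

```python
# Sketch only: pass plain identifiers instead of the TaskInstance object.
# Assumption: these attributes are available on `ti` in the template context.
OP_KWARGS = {
    "dag_id": "{{ ti.dag_id }}",
    "run_id": "{{ ti.run_id }}",
    "task_id": "{{ ti.task_id }}",
    "try_number": "{{ ti.try_number }}",
    "map_index": "{{ ti.map_index }}",
}


def venv_callable(dag_id, run_id, task_id, try_number, map_index):
    # Only primitive values arrive here, so no Airflow object has to be pickled.
    ident = f"{dag_id}/{run_id}/{task_id}[{map_index}] try={try_number}"
    print("running as", ident)
    return ident
```

Such a dict would be passed as `op_kwargs=OP_KWARGS` to the operator. Each template renders to a plain string or number, which should serialize regardless of the rendering mode.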

However, I found a reproducible case where the behavior changes depending on 
render_template_as_native_obj, and enabling native rendering makes the task 
fail during argument serialization.

## Minimal repro
```python
from pendulum import datetime

from airflow.sdk import dag
from airflow.providers.standard.operators.python import PythonVirtualenvOperator


def venv_callable(
    ti,
    task_instance
):
    print("ti =", ti)
    print("task_instance =", task_instance)


@dag(
    start_date=datetime(2026, 1, 1),
    schedule=None,
    catchup=False,
    render_template_as_native_obj=False,
)
def test_simple():
    PythonVirtualenvOperator(
        task_id="repro",
        python_callable=venv_callable,
        python_version="3.10",
        serializer="cloudpickle",
        op_kwargs={
            "x": 1,
            "ti": "{{ ti }}",
            "task_instance": "{{ task_instance }}",
        },
        requirements=[
            "apache-airflow==3.1.6"
        ],
        system_site_packages=False,
    )


test_simple()
```
### Observation
- With `render_template_as_native_obj=False`, the task succeeds and both values are printed.
- If I change only that flag to `render_template_as_native_obj=True`, the task fails while serializing arguments for the subprocess.

## Error with render_template_as_native_obj=True
```
[2026-01-29 23:06:37] INFO - Use 'cloudpickle' as serializer.
[2026-01-29 23:06:37] ERROR - Task failed with exception
PicklingError: Only BytesLoggers to sys.stdout and sys.stderr can be pickled.
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1004 in run
...
File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/standard/operators/python.py", line 529 in _write_args
File "/home/airflow/.local/lib/python3.12/site-packages/cloudpickle/cloudpickle.py", line 1537 in dumps
...
File "/home/airflow/.local/lib/python3.12/site-packages/structlog/_output.py", line 278 in __getstate__
```
```
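
My reading of the traceback is that the pickler reaches a logger held by the object and that logger refuses to be pickled. The stdlib-only sketch below imitates that situation. `FakeLogger` and `FakeTaskInstance` are hypothetical stand-ins, not Airflow or structlog classes, and plain `pickle` is used instead of cloudpickle.

```python
import pickle


class FakeLogger:
    """Hypothetical stand-in for a logger that refuses to be pickled."""

    def __getstate__(self):
        raise pickle.PicklingError("this logger cannot be pickled")


class FakeTaskInstance:
    """Hypothetical stand-in for an object that holds such a logger."""

    def __init__(self):
        self.task_id = "repro"
        self.log = FakeLogger()

    def __repr__(self):
        return f"<FakeTaskInstance task_id={self.task_id}>"


def try_pickle(value):
    """Return True if `value` can be pickled, False on PicklingError."""
    try:
        pickle.dumps(value)
        return True
    except pickle.PicklingError:
        return False


ti = FakeTaskInstance()
as_string = str(ti)  # roughly what string rendering would hand to op_kwargs
as_object = ti       # roughly what native rendering would hand to op_kwargs

print("string picklable:", try_pickle(as_string))
print("object picklable:", try_pickle(as_object))
```

If the real objects behave like these stand-ins, that would explain why the string-rendered value passes serialization while the native object does not.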
## Questions
1) Can you confirm whether this behavior difference is an expected consequence 
of render_template_as_native_obj=True (Jinja NativeEnvironment), where {{ ti }} 
/ {{ task_instance }} render to native objects that then get pickled for 
PythonVirtualenvOperator?

2) Is there any plan to provide a supported mechanism for isolated environments 
instead of passing the “live” TaskInstance object, for example:
   - a lightweight serializable proxy/handle (e.g., dag_id, run_id, task_id, 
try_number, map_index) and
   - a limited set of operations (or a small client) to retrieve basic 
TaskInstance info from within PythonVirtualenvOperator without serializing the 
full object?
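
To make the idea in question 2 concrete, here is a rough sketch of what such a handle might look like. `TaskInstanceHandle` is a hypothetical name and not an existing Airflow class, and the example values are made up. Only the primitive fields cross the process boundary, so the isolated environment would not need to import anything from Airflow to read them.

```python
import pickle
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskInstanceHandle:
    """Hypothetical lightweight handle; not an existing Airflow class."""

    dag_id: str
    run_id: str
    task_id: str
    try_number: int
    map_index: int = -1

    def as_kwargs(self):
        # Plain dict of primitives, safe to serialize with any pickler.
        return asdict(self)


# Example values are invented for illustration.
handle = TaskInstanceHandle("test_simple", "manual__2026-01-29", "repro", 1)

# Serialize only the primitive fields, then rebuild the handle on the other side.
payload = pickle.dumps(handle.as_kwargs())
restored = TaskInstanceHandle(**pickle.loads(payload))
print(restored)
```

A small client could then take these fields and look up further TaskInstance details through the API, instead of receiving the live object.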

I understand one can call the Airflow API directly, but an official/supported 
approach (or at least clearer validation / a more explicit error message for 
this case) would be very helpful.

## Environment
- Airflow: 3.1.6
- Provider: standard (PythonVirtualenvOperator)
- Python: 3.12 (runner), virtualenv: 3.10
- serializer: cloudpickle

Thanks!




GitHub link: https://github.com/apache/airflow/discussions/61231
