alekscp opened a new issue, #25990:
URL: https://github.com/apache/airflow/issues/25990
### Apache Airflow Provider(s)
google
### Versions of Apache Airflow Providers
apache-airflow-providers-google==8.2.0
### Apache Airflow version
2.2.3
### Operating System
MacOS
### Deployment
Composer
### Deployment details
_No response_
### What happened
When using
`airflow.providers.google.cloud.operators.dataproc.DataprocCreateBatchOperator`
and passing Jinja templates in its `batch` parameter, the operator does not
render the templates; the literal template strings are passed through instead.
Given this code:
```python
batch = DataprocCreateBatchOperator(
    task_id="batch_operator_templating_bug_poc",
    project_id=PROJECT_ID,
    region=REGION,
    batch={
        "pyspark_batch": {
            "main_python_file_uri": MAIN_PYTHON_FILE,
            "args": [
                "--logical-date",
                "{{ ds }}",
                "--task-instance",
                "{{ task_instance }}",
                "--computed-arg",
                f"{get_computed_arg()}",
                "--arbitrary-arg",
                "arbitrary-value",
            ],
        },
        "environment_config": {
            "execution_config": {"subnetwork_uri": REGION},
        },
    },
    batch_id="batch-operator-templating-bug-poc-{{ macros.datetime.now().strftime('%Y%m%d%H%M%S') }}",
)
```
The following arguments are generated:
<img width="777" alt="Screen Shot 2022-08-26 at 6 18 52 PM"
src="https://user-images.githubusercontent.com/8513369/186948976-f03070f4-69b4-4fab-a21b-8762eca64bce.png">
Note that the templated value in `batch_id` does get rendered correctly,
giving the following output:
<img width="483" alt="Screen Shot 2022-08-26 at 6 19 38 PM"
src="https://user-images.githubusercontent.com/8513369/186949103-9998498b-89f2-4b7c-9261-6afa7a704a45.png">
### What you think should happen instead
`{{ ds }}` and the other templated values should be rendered to
their corresponding values.
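For illustration, here is a minimal stdlib-only sketch of the recursive rendering I would expect for nested structures like `batch` (the `render` helper and its `context` dict are hypothetical stand-ins for Airflow's Jinja machinery, not the real implementation):

```python
import re

# Hypothetical stand-in for Airflow's Jinja rendering of template fields.
# The expectation: rendering recurses into dicts and lists, substituting
# every "{{ name }}" it finds in string values.
def render(value, context):
    if isinstance(value, str):
        return re.sub(
            r"\{\{\s*(\w+)\s*\}\}",
            lambda m: str(context.get(m.group(1), m.group(0))),
            value,
        )
    if isinstance(value, dict):
        return {k: render(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [render(v, context) for v in value]
    return value

context = {"ds": "2022-08-26"}
batch = {"pyspark_batch": {"args": ["--logical-date", "{{ ds }}"]}}
print(render(batch, context))
# {'pyspark_batch': {'args': ['--logical-date', '2022-08-26']}}
```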
### How to reproduce
The problem can be reproduced using the following code:
```python
# dag.py
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import (
    DataprocCreateBatchOperator,
)

PROJECT_ID = "<your-project-id>"
REGION = "<your-region>"
MAIN_PYTHON_FILE = "gs://<gcs-bucket-where-you-store-your-jobs>/poc_batch_operator_bug.py"


def get_computed_arg():
    return datetime(2022, 1, 1)


with DAG(
    "dataproc_create_batch_operator_templating_bug",
    default_args={},
    schedule_interval=None,
    start_date=datetime(2020, 1, 1),
    catchup=False,
) as dag:
    batch = DataprocCreateBatchOperator(
        task_id="batch_operator_templating_bug_poc",
        project_id=PROJECT_ID,
        region=REGION,
        batch={
            "pyspark_batch": {
                "main_python_file_uri": MAIN_PYTHON_FILE,
                "args": [
                    "--logical-date",
                    "{{ ds }}",
                    "--task-instance",
                    "{{ task_instance }}",
                    "--computed-arg",
                    f"{get_computed_arg()}",
                    "--arbitrary-arg",
                    "arbitrary-value",
                ],
            },
            "environment_config": {
                "execution_config": {"subnetwork_uri": REGION},
            },
        },
        batch_id="batch-operator-templating-bug-poc-{{ macros.datetime.now().strftime('%Y%m%d%H%M%S') }}",
    )

    batch
```
```python
# poc_batch_operator_bug.py
import click


@click.command()
@click.option("--logical-date")
@click.option("--task-instance")
@click.option("--computed-arg")
@click.option("--arbitrary-arg")
def cli(
    logical_date,
    task_instance,
    computed_arg,
    arbitrary_arg,
):
    print(f"logical_date: {logical_date}")
    print(f"task_instance: {task_instance}")
    print(f"computed_arg: {computed_arg}")
    print(f"arbitrary_arg: {arbitrary_arg}")


if __name__ == "__main__":
    cli()
```
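My guess (unverified against the provider source) is that this comes down to the operator's `template_fields`: Airflow only runs Jinja over the attributes named there. A toy model of that selection logic, with made-up task and field names, shows how `batch_id` can render while `batch` stays literal:

```python
# Toy model of Airflow's template_fields mechanism; names are made up
# and this is NOT the real operator or rendering code.
def render_string(value, context):
    # Minimal stand-in for Jinja: only substitutes "{{ ds }}", and only
    # in plain strings; dicts and lists pass through untouched.
    if isinstance(value, str):
        return value.replace("{{ ds }}", context["ds"])
    return value

def render_fields(task, template_fields, context):
    # Only the attributes listed in template_fields are rendered.
    for name in template_fields:
        task[name] = render_string(task[name], context)
    return task

task = {
    "batch_id": "poc-{{ ds }}",
    "batch": {"args": ["{{ ds }}"]},
}
context = {"ds": "2022-08-26"}

# With only batch_id listed, batch keeps its literal "{{ ds }}":
rendered = render_fields(task, ["batch_id"], context)
print(rendered["batch_id"])          # poc-2022-08-26
print(rendered["batch"]["args"][0])  # {{ ds }}
```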
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.