jliu0812 opened a new issue, #38005:
URL: https://github.com/apache/airflow/issues/38005
### Apache Airflow Provider(s)
amazon
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==8.19.0
### Apache Airflow version
2.8.2
### Operating System
Debian GNU/Linux 12 (bookworm)
### Deployment
Docker-Compose
### Deployment details
Used breeze tool to deploy.
### What happened
When using the `EmrServerlessStartJobOperator`, Airflow's `expand` (dynamic
task mapping) functionality cannot be used. The DAG fails to serialize and a
DAG import error is shown in the webserver. The cause is that
`EmrServerlessStartJobOperator.operator_extra_links` is evaluated while the
task is a `MappedOperator`, and the property assumes it can perform
dictionary-style lookups on the operator; a `MappedOperator` is neither a
dictionary nor iterable, so the lookup raises an error.
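As a minimal, self-contained sketch of the failure mode (not the provider's actual code; the class and function names here are hypothetical), a dict-style membership test against an object that defines neither `__contains__` nor `__iter__` raises exactly this kind of `TypeError`:

```python
class FakeMappedOperator:
    """Hypothetical stand-in for airflow.models.mappedoperator.MappedOperator.

    It deliberately defines neither __contains__ nor __iter__, mirroring
    "not a dictionary and not iterable" from the error described above.
    """


def links_for(operator):
    # A property like operator_extra_links might probe the operator's
    # configuration with an `in` check, which only works on dict-like or
    # iterable objects.
    return "sparkSubmit" in operator


try:
    links_for(FakeMappedOperator())
except TypeError as exc:
    print(exc)  # argument of type 'FakeMappedOperator' is not iterable
```

The same `in` check succeeds on the plain `dict` kwargs of an unmapped operator, which would explain why only the mapped variant fails to serialize.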
### What you think should happen instead
DAG should import successfully without any errors.
### How to reproduce
The following unmapped usage of `EmrServerlessStartJobOperator` works:
```python
from datetime import datetime
from airflow.models.dag import DAG
from airflow.providers.amazon.aws.operators.emr import (
EmrServerlessStartJobOperator,
)
DAG_ID = "example_emr_serverless"
emr_serverless_app_id = "01234abcd"
role_arn = "arn:test"
with DAG(
    dag_id=DAG_ID,
    schedule="@once",
    start_date=datetime(2021, 1, 1),
    tags=["example"],
    catchup=False,
):
    start_job = EmrServerlessStartJobOperator(
        task_id="start_emr_serverless_job",
        application_id=emr_serverless_app_id,
        execution_role_arn=role_arn,
        job_driver={
            "sparkSubmit": {
                "entryPoint": "test.jar",
                "entryPointArguments": ["--arg", "1"],
                "sparkSubmitParameters": "--conf sample",
            }
        },
        configuration_overrides={
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": "s3://test/logs"}
            }
        },
    )
```
By contrast, the following mapped usage of `EmrServerlessStartJobOperator`
fails to serialize:
```python
from datetime import datetime
from airflow.models.dag import DAG
from airflow.providers.amazon.aws.operators.emr import (
EmrServerlessStartJobOperator,
)
DAG_ID = "example_emr_serverless"
emr_serverless_app_id = "01234abcd"
role_arn = "arn:test"
with DAG(
    dag_id=DAG_ID,
    schedule="@once",
    start_date=datetime(2021, 1, 1),
    tags=["example"],
    catchup=False,
):
    start_job = EmrServerlessStartJobOperator.partial(
        task_id="start_emr_serverless_job",
        application_id=emr_serverless_app_id,
        execution_role_arn=role_arn,
        configuration_overrides={
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": "s3://test/logs"}
            }
        },
    ).expand(
        job_driver=[
            {
                "sparkSubmit": {
                    "entryPoint": "test.jar",
                    "entryPointArguments": ["--arg", "1"],
                    "sparkSubmitParameters": "--conf sample",
                }
            },
            {
                "sparkSubmit": {
                    "entryPoint": "test.jar",
                    "entryPointArguments": ["--arg", "2"],
                    "sparkSubmitParameters": "--conf sample",
                }
            },
        ]
    )
```
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)