djuarezg opened a new issue, #38461:
URL: https://github.com/apache/airflow/issues/38461
### Apache Airflow version
Other Airflow 2 version (please specify below)
### If "Other Airflow 2 version" selected, which one?
2.8.1
### What happened?
We extend the `SparkSubmitOperator` and specify a `queue` to select which of the
desired workers will run our tasks.
```
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

(...)

class CustomSparkSubmitOperator(SparkSubmitOperator):
    """Extended spark submit operator."""

    template_fields = list(SparkSubmitOperator.template_fields)
    template_fields.append("_java_class")


def spark(
    self,
    application,
    java_class,
    task_id,
    priority_weight=1,
    alert_cc=None,
    auto_tags=None,
    **task_kwargs,
):
    """Create a task that runs a spark job."""
    _LOGGER.debug(f"Creating task: {task_id}")
    env = get_env()
    default_bp_config_path = (
        f"/data/business-parameters/business-parameters-master/data/all.{env}.conf"
    )
    extra_java_options = nlstrip(
        f"""
        -DApplication={application}
        (...)
        """
    )
    logstore_options = (
        f" params.get('s3a_logstore_class', '{DEFAULT_LOGSTORE}') "
    )
    jar_path = f"{JAR_FOLDER} params.jar_name "
    conf = {
        "spark.driver.memory": "{{ params.driver_memory }}",
        (...)
    }
    task = CustomSparkSubmitOperator(
        task_id=task_id,
        java_class=java_class,
        conn_id=f"spark_{env}_client",
        priority_weight=priority_weight,
        queue="airflow_spark",
        application=jar_path,
        verbose=True,
        spark_binary=SPARK_BINARY_PATH,
        conf=conf,
        env_vars=get_task_env(None),
        **get_task_kwargs(
            task_kwargs,
            alert_cc=alert_cc,
            dag=self.default_dag,
        ),
    )
    task.auto_tags = ["spark"]
    if auto_tags:
        task.auto_tags.extend(auto_tags)
    return task
```
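This behavior would be consistent with an attribute-name collision: if, starting with provider 4.6.0, the operator stored its Spark/YARN queue under `self.queue`, it would silently overwrite the executor queue set through `BaseOperator`. The following is a minimal sketch of that suspected mechanism using plain stand-in classes; all names below are illustrative, not the provider's actual code:

```python
# Hypothetical sketch of the suspected attribute collision. These classes
# merely stand in for airflow.models.BaseOperator and the provider's
# SparkSubmitOperator; the collision shown here is an assumption.

class BaseOperator:
    """Stand-in for Airflow's BaseOperator."""

    def __init__(self, queue="default"):
        # Executor queue: decides which worker picks up the task.
        self.queue = queue


class SparkSubmitOperator(BaseOperator):
    """Stand-in for the provider operator, assuming it stores the
    Spark/YARN queue under the same `queue` attribute name."""

    def __init__(self, yarn_queue=None, **kwargs):
        super().__init__(**kwargs)
        # If the provider writes its Spark queue to `self.queue`,
        # the executor queue set above is silently overwritten.
        self.queue = yarn_queue


task = SparkSubmitOperator(queue="airflow_spark")
print(task.queue)  # prints None: the executor queue was clobbered
```

Under this assumption, the `queue="airflow_spark"` passed in the snippet above never reaches the executor, which then falls back to its default queue.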
When this operator is used, at least the `queue` value is replaced by the default
queue common to all DAGs instead of the configured `queue="airflow_spark"`.
### What you think should happen instead?
The `queue` value should be kept and used to determine which worker node the task runs on.
### How to reproduce
Set a custom `queue` on a `SparkSubmitOperator` with provider version 4.5.0 or
earlier, then compare the behavior with version 4.6.0 or later.
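Until the root cause is fixed, one possible mitigation (our assumption, not an officially documented fix) is pinning the provider to the last version in which the custom `queue` was honored:

```shell
# Pin the Spark provider to 4.5.0, where queue= still reached the executor
pip install "apache-airflow-providers-apache-spark==4.5.0"
```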
### Operating System
`cat /etc/os-release`: NAME="Rocky Linux" VERSION="8.8 (Green Obsidian)"
### Versions of Apache Airflow Providers
```
/data/airflow/env/bin/pip list | grep providers
apache-airflow-providers-amazon          8.19.0
apache-airflow-providers-apache-spark    4.5.0
apache-airflow-providers-celery          3.6.1
apache-airflow-providers-common-io       1.3.0
apache-airflow-providers-common-sql      1.11.1
apache-airflow-providers-elasticsearch   5.3.3
apache-airflow-providers-ftp             3.7.0
apache-airflow-providers-http            4.10.0
apache-airflow-providers-imap            3.5.0
apache-airflow-providers-postgres        5.10.2
apache-airflow-providers-redis           3.6.0
apache-airflow-providers-slack           8.6.1
apache-airflow-providers-smtp            1.6.1
apache-airflow-providers-sqlite          3.7.1
apache-airflow-providers-ssh             3.10.1
```
### Deployment
Virtualenv installation
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)