Re: [I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-04-11 Thread via GitHub


Lee-W closed issue #38461: Applying queue with SparkSubmitOperator is ignored 
from 2.6.0 onwards
URL: https://github.com/apache/airflow/issues/38461


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-04-09 Thread via GitHub


pateash commented on issue #38461:
URL: https://github.com/apache/airflow/issues/38461#issuecomment-2044385967

   Thanks @Lee-W @RNHTTR @djuarezg to flag this out.
   I have raised a PR for the fix, please have a look.
   #38852 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-04-08 Thread via GitHub


pateash commented on issue #38461:
URL: https://github.com/apache/airflow/issues/38461#issuecomment-2042992467

   Checking this, can any one please assign this to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-03-25 Thread via GitHub


djuarezg commented on issue #38461:
URL: https://github.com/apache/airflow/issues/38461#issuecomment-2018308367

   Probably caused due to https://github.com/apache/airflow/issues/35911


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-03-25 Thread via GitHub


djuarezg opened a new issue, #38461:
URL: https://github.com/apache/airflow/issues/38461

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.8.1
   
   ### What happened?
   
   We extend the SparkSubmitOperator and specify a queue to select which of the 
desired runners are going to run our tasks.
   
   
   ```
   from airflow.providers.apache.spark.operators.spark_submit import 
SparkSubmitOperator
   (...)
   class CustomSparkSubmitOperator(SparkSubmitOperator):
   """Extended spark submit operator."""
   
   template_fields = list(SparkSubmitOperator.template_fields)
   template_fields.append("_java_class")
   
   def spark(
   self,
   application,
   java_class,
   task_id,
   priority_weight=1,
   alert_cc=None,
   auto_tags=None,
   **task_kwargs,
   ):
   """Create a task that runs a spark job."""
   _LOGGER.debug(f"Creating task: {task_id}")
   
   env = get_env()
   default_bp_config_path = (
   
f"/data/business-parameters/business-parameters-master/data/all.{env}.conf"
   )
   extra_java_options = nlstrip(
   f"""
   -DApplication={application}
   (...)
   """
   )
   
   logstore_options = (
   f" params.get('s3a_logstore_class', '{DEFAULT_LOGSTORE}') "
   )
   
   jar_path = f"{JAR_FOLDER} params.jar_name "
   
   conf = {
   "spark.driver.memory": "{{ params.driver_memory }}",
   (...)
   }
   
   task = CustomSparkSubmitOperator(
   task_id=task_id,
   java_class=java_class,
   conn_id=f"spark_{env}_client",
   priority_weight=priority_weight,
   queue="airflow_spark",
   application=jar_path,
   verbose=True,
   spark_binary=SPARK_BINARY_PATH,
   conf=conf,
   env_vars=get_task_env(None),
   **get_task_kwargs(
   task_kwargs,
   alert_cc=alert_cc,
   dag=self.default_dag,
   ),
   )
   task.auto_tags = ["spark"]
   if auto_tags:
   task.auto_tags.extend(auto_tags)
   return task
   ```
   
   This is then used but the value of at least queue is set to the default 
value generic to all dags, not `queue="airflow_spark"`
   
   
   
   ### What you think should happen instead?
   
   queue should be kept and used to determine which node to run on
   
   ### How to reproduce
   
   Set up a queue using SparkSubmitOperator on provider 4.5.0 or before vs 
4.6.0 or after
   
   ### Operating System
   
   cat /etc/os-release  NAME="Rocky Linux" VERSION="8.8 (Green Obsidian)"
   
   ### Versions of Apache Airflow Providers
   
   ```
   /data/airflow/env/bin/pip list | grep providers
   apache-airflow-providers-amazon  8.19.0
   apache-airflow-providers-apache-spark4.5.0
   apache-airflow-providers-celery  3.6.1
   apache-airflow-providers-common-io   1.3.0
   apache-airflow-providers-common-sql  1.11.1
   apache-airflow-providers-elasticsearch   5.3.3
   apache-airflow-providers-ftp 3.7.0
   apache-airflow-providers-http4.10.0
   apache-airflow-providers-imap3.5.0
   apache-airflow-providers-postgres5.10.2
   apache-airflow-providers-redis   3.6.0
   apache-airflow-providers-slack   8.6.1
   apache-airflow-providers-smtp1.6.1
   apache-airflow-providers-sqlite  3.7.1
   apache-airflow-providers-ssh 3.10.1
   
   ```
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [I] Applying queue with SparkSubmitOperator is ignored from 2.6.0 onwards [airflow]

2024-03-25 Thread via GitHub


boring-cyborg[bot] commented on issue #38461:
URL: https://github.com/apache/airflow/issues/38461#issuecomment-2018307705

   Thanks for opening your first issue here! Be sure to follow the issue 
template! If you are willing to raise PR to address this issue please do so, no 
need to wait for approval.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org