allabright opened a new issue, #56453:
URL: https://github.com/apache/airflow/issues/56453

   ### Apache Airflow Provider(s)
   
   apache-spark
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-apache-spark==5.3.2
   
   ### Apache Airflow version
   
   3.1.0
   
   ### Operating System
   
   Mac Tahoe Version 26.0.1 (25A362)
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Running locally using docker-compose from this page: 
https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html
   
   ### What happened
   
   This issue has already been raised here: https://github.com/apache/airflow/issues/46169
   
   The "spark://" scheme of the URI is being parsed out of the connection host, leading to an error in spark-submit. Changing the host manually in the GUI from "spark-master" to "spark://spark-master" makes it work. However, specifying the connection as an environment variable or via the command line results in the "spark://" part being stripped.
   
   ### What you think should happen instead
   
   The "spark://" part of the URI should be kept as part of the spark-submit 
command.
   
   ### How to reproduce
   
   Run Airflow using the docker-compose file linked above, and add the following command to the airflow-init service:
   
   /entrypoint airflow connections get spark_default >/dev/null 2>&1 || /entrypoint airflow connections add 'spark_default' --conn-uri 'spark://spark-master:7077'
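   
   A minimal DAG along these lines is enough to trigger the submit; the dag/task ids are placeholders, and the application path matches the one in the error below:

```python
from airflow.sdk import DAG  # Airflow 3.x import path
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(dag_id="spark_test", schedule=None, catchup=False):
    SparkSubmitOperator(
        task_id="submit_job",
        conn_id="spark_default",                       # the connection created above
        application="/opt/airflow/src/spark_test.py",  # any PySpark script visible to the worker
    )
```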
   
   Then run a job that uses this connection (for example, the DAG above). The expected command is:
   
   spark-submit --master spark://spark-master:7077
   
   The actual command is:
   
   spark-submit --master spark-master:7077
   
   This leads to the following error:
   
   [2025-10-07 13:49:44] INFO - : org.apache.spark.SparkException: Could not parse Master URL: 'spark-master:7077' source=airflow.task.hooks.airflow.providers.apache.spark.hooks.spark_submit.SparkSubmitHook loc=spark_submit.py:644
   
   Task failed with exception source=task loc=task_runner.py:972
   AirflowException: Cannot execute: spark-submit --master spark-master:7077 --name arrow-spark --verbose --deploy-mode client /opt/airflow/src/spark_test.py. Error code is: 1.
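   
   For what it's worth, the missing scheme can also be confirmed from inside one of the Airflow containers by inspecting the stored connection (fields as exposed by airflow.models.Connection):

```python
from airflow.hooks.base import BaseHook

# Look up the connection the same way a hook would.
conn = BaseHook.get_connection("spark_default")
print(conn.conn_type, conn.host, conn.port)
# -> spark spark-master 7077  (the "spark://" scheme only survives as conn_type)
```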
   
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

