Vikram Janarthanan created SPARK-46860: ------------------------------------------
             Summary: Credentials with https url not working for --jars, --files, --archives & --py-files options on spark-submit command
                 Key: SPARK-46860
                 URL: https://issues.apache.org/jira/browse/SPARK-46860
             Project: Spark
          Issue Type: Task
          Components: k8s
    Affects Versions: 3.3.4
         Environment: Spark 3.3.3 deployed on K8s
            Reporter: Vikram Janarthanan

We are trying to run a Spark application whose dependent files and main PySpark script are both served from a secure web server, and we are looking for a way to pass the dependencies as well as the PySpark script from that web server.

Deploying the Spark application from the web server to the k8s cluster without a username and password worked. With a username/password in the URL, we get:

    Exception in thread "main" java.io.IOException: Server returned HTTP response code: 401 for URL: https://username:passw...@domain.com/application/pysparkjob.py

*Working options on spark-submit:*

    spark-submit ...... \
      --repositories https://username:passw...@domain.com/repo1/repo \
      --jars https://domain.com/jars/runtime.jar \
      --files https://domain.com/files/query.sql \
      --py-files https://domain.com/pythonlib/pythonlib.zip \
      https://domain.com/app1/pysparkapp.py

Note: only the --repositories option works with a username and password.

*Spark-submit using https url with username/password not working:*

    spark-submit ...... \
      --jars https://username:passw...@domain.com/jars/runtime.jar \
      --files https://username:passw...@domain.com/files/query.sql \
      --py-files https://username:passw...@domain.com/pythonlib/pythonlib.zip \
      https://username:passw...@domain.com/app1/pysparkapp.py

Error:

    25/01/23 09:19:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Exception in thread "main" java.io.IOException: Server returned HTTP response code: 401 for URL: https://username:passw...@domain.com/repository/spark-artifacts/pysparkdemo/1.0/pysparkdemo-1.0.tgz
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:2000)
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:224)
        at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:809)
        at org.apache.spark.util.DependencyUtils$.downloadFile(DependencyUtils.scala:264)
        at org.apache.spark.util.DependencyUtils$.$anonfun$downloadFileList$2(DependencyUtils.scala:233)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
        at scala.collection.TraversableLike.map(TraversableLike.scala:286)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
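
A possible reading of the 401, offered as an assumption rather than a confirmed diagnosis: the stack trace shows the file being fetched through java.net's HttpURLConnection, which as far as I know does not by itself turn `user:password@` URL userinfo into an `Authorization` header, so the server would see an unauthenticated request. The sketch below is illustrative Python, not Spark code; the helper name `split_basic_auth` is hypothetical. It shows the step a client has to perform explicitly for HTTP Basic authentication: strip the userinfo from the URL and send it as a header instead.

```python
import base64
from urllib.parse import urlsplit, urlunsplit, unquote


def split_basic_auth(url):
    """Split a URL with embedded credentials into (clean_url, headers).

    Hypothetical helper for illustration only; it is not part of Spark.
    If the URL carries no userinfo, it is returned unchanged with no headers.
    """
    parts = urlsplit(url)
    if parts.username is None:
        return url, {}

    # Drop the "user:password@" prefix but keep host and optional port as written.
    netloc = parts.netloc.rpartition("@")[2]

    # Userinfo may be percent-encoded in the URL; decode before building the token.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")

    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return clean_url, {"Authorization": "Basic " + token}


if __name__ == "__main__":
    # Placeholder credentials and host, mirroring the shape of the URLs in the report.
    url, headers = split_basic_auth("https://user:secret@domain.com/jars/runtime.jar")
    print(url)
    print(sorted(headers))
```

A client that issues the request with the returned URL and headers authenticates through the header rather than the URL. Whether spark-submit can be made to do the equivalent for --jars, --files, --archives and --py-files is exactly what this issue asks.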