[
https://issues.apache.org/jira/browse/SPARK-46860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938064#comment-17938064
]
Krzysztof Ruta edited comment on SPARK-46860 at 3/25/25 6:56 AM:
-----------------------------------------------------------------
The former PR [#50375|https://github.com/apache/spark/pull/50375] failed on a
single (flaky?) test that had not failed before (I had run the whole workflow
several times). The current one
[#50377|https://github.com/apache/spark/pull/50377] passes all checks.
The suspicious test was:
<testcase
  classname="org.apache.spark.sql.streaming.FlatMapGroupsWithStateWithInitialStateSuite"
  name="flatMapGroupsWithState - initial state and initial batch have same keys and skipEmittingInitialStateKeys=false - state format version 1"
  time="0.84">
> Credentials with https url not working for --jars, --files, --archives &
> --py-files options on spark-submit command
> -------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-46860
> URL: https://issues.apache.org/jira/browse/SPARK-46860
> Project: Spark
> Issue Type: Task
> Components: k8s
> Affects Versions: 3.3.3, 3.5.0, 3.3.4
> Environment: Spark 3.3.3 deployed on K8s
> Reporter: Vikram Janarthanan
> Priority: Major
> Labels: pull-request-available
>
> We are trying to run a Spark application whose dependent files, as well as
> the main PySpark script, are served from a secure webserver.
> We are looking for a solution to pass the dependencies as well as the PySpark
> script from the webserver.
> We have tried deploying the Spark application from the webserver to the k8s
> cluster without a username and password and it worked, but when we tried with
> a username/password we got: *Exception in thread "main"
> java.io.IOException: Server returned HTTP response code: 401 for URL:
> https://username:password@domain.com/application/pysparkjob.py*
> *Working options on spark-submit:*
> spark-submit ......
> --repositories https://username:password@domain.com/repo1/repo \
> --jars https://domain.com/jars/runtime.jar \
> --files https://domain.com/files/query.sql \
> --py-files https://domain.com/pythonlib/pythonlib.zip \
> https://domain.com/app1/pysparkapp.py
> Note: only the --repositories option works with a username and password.
> *Spark-submit using https url with username/password not working:*
> spark-submit ......
> --jars https://username:password@domain.com/jars/runtime.jar \
> --files https://username:password@domain.com/files/query.sql \
> --py-files https://username:password@domain.com/pythonlib/pythonlib.zip \
> https://username:password@domain.com/app1/pysparkapp.py
>
> Error:
> 25/01/23 09:19:57 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Exception in thread "main" java.io.IOException: Server returned HTTP response code: 401 for URL: https://username:password@domain.com/repository/spark-artifacts/pysparkdemo/1.0/pysparkdemo-1.0.tgz
>   at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:2000)
>   at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1589)
>   at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:224)
>   at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:809)
>   at org.apache.spark.util.DependencyUtils$.downloadFile(DependencyUtils.scala:264)
>   at org.apache.spark.util.DependencyUtils$.$anonfun$downloadFileList$2(DependencyUtils.scala:233)
>   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
>   at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
>   at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:286)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>
>
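A likely explanation for the 401 above: the JDK's HttpURLConnection (which Utils.doFetchFile uses) does not turn userinfo embedded in a URL into credentials, so `https://user:pass@host/...` is fetched unauthenticated. A minimal sketch of the kind of translation a fix would need, converting URL userinfo into an explicit Basic Authorization header (the class and method names here are hypothetical, not Spark's actual patch):

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class UrlAuth {
    /**
     * Returns the "Authorization" header value for a URL with embedded
     * credentials (https://user:pass@host/...), or null when the URL
     * carries no userinfo.
     */
    public static String basicAuthHeader(String url) {
        String userInfo = URI.create(url).getUserInfo();
        if (userInfo == null) {
            return null;
        }
        // RFC 7617 Basic auth: base64("user:pass")
        return "Basic " + Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // A manually opened connection could then do:
        //   conn.setRequestProperty("Authorization", basicAuthHeader(url));
        System.out.println(
            basicAuthHeader("https://user:secret@domain.com/app1/pysparkapp.py"));
    }
}
```

This also explains why `--repositories` worked in the report: dependency resolution goes through Ivy, which does honor credentials in repository URLs, while `--jars`/`--files`/`--py-files` go through Spark's own HTTP fetch path.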
--
This message was sent by Atlassian Jira
(v8.20.10#820010)