James McShane created SPARK-33227:
-------------------------------------

             Summary: Add Jar with Azure SAS token fails with URL encoded 
characters
                 Key: SPARK-33227
                 URL: https://issues.apache.org/jira/browse/SPARK-33227
             Project: Spark
          Issue Type: Bug
          Components: Spark Submit
    Affects Versions: 2.4.3
            Reporter: James McShane


I am running spark-submit with an Azure SAS token URL to access the jar file. 
When the sig parameter of the SAS token contains URL-encoded characters before 
the end, I get a 403 error when Spark tries to download the jar. It appears to 
be related to the change in URL encoding that occurs within DependencyUtils: 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala#L137

Error message:

+ exec /usr/local/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.0.0.44 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class MyClass 'https://storageaccount.blob.core.windows.net/blob/my-jar.jar?sv=2019-12-12&ss=b&srt=sco&sp=r&se=*****&st=*******&spr=https&sig=sigwith%2Band%2Fending%3D'

java.io.IOException: Server returned HTTP response code: 403 for URL: https://storageaccount.blob.core.windows.net/blob/ivm-0.2.40-Spark-2.2.jar?sv=2019-12-12&ss=b&srt=sco&sp=r&se=**********&st=*********&spr=https&sig=sigwith+and/ending=
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
    at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:713)
    at org.apache.spark.deploy.DependencyUtils$.downloadFile(DependencyUtils.scala:137)
    at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
    at org.apache.spark.deploy.SparkSubmit$$anonfun$prepareSubmitEnvironment$7.apply(SparkSubmit.scala:367)
    at scala.Option.map(Option.scala:146)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:366)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


It may not be clear in the example above, but when I submit the SAS token URL, 
the signature looks like:

sig=sigwith%2Band%2Fending%3D

The URL reported with the 403 error in the stack trace instead contains:

sig=sigwith+and/ending=

Is there something I can do to ensure that these characters do not get URL 
decoded in this way?
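
For illustration, here is a minimal Java sketch of one way this kind of 
decoding can arise on the JVM. It is not taken from Spark and does not claim 
to be the exact code path in DependencyUtils; the class and method names are 
hypothetical, and the shortened URL is only an example. It shows that 
java.net.URI.getQuery() returns the decoded query, and that rebuilding a URI 
from decoded components does not re-escape '+', '/' or '=', because those 
characters are legal in a query string.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of Spark.
class SasUrlDecodingDemo {

    // Query string exactly as written, with percent-escapes intact.
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    // Query string with percent-escapes decoded (%2B -> +, %2F -> /, %3D -> =).
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();
    }

    // Rebuilds the URI from its decoded components. The multi-argument
    // constructor only quotes characters that are illegal in a URI, so the
    // decoded '+', '/' and '=' stay unescaped and the signature changes.
    static String rebuildFromDecodedParts(String url) {
        URI u = URI.create(url);
        try {
            return new URI(u.getScheme(), u.getAuthority(), u.getPath(),
                           u.getQuery(), u.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened example URL; the sig value mirrors the one in this report.
        String url = "https://storageaccount.blob.core.windows.net/blob/my-jar.jar"
                + "?sp=r&sig=sigwith%2Band%2Fending%3D";
        System.out.println(rawQuery(url));                // escapes preserved
        System.out.println(decodedQuery(url));            // escapes decoded
        System.out.println(rebuildFromDecodedParts(url)); // signature no longer matches
    }
}
```

If something like this is what happens, any code that passes the original 
string through unchanged (or uses the raw accessors such as getRawQuery()) 
would keep the signature intact, while code that round-trips through the 
decoded accessors would not.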



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
