Hao Ren created SPARK-17855:
-------------------------------
Summary: Spark worker throws exception when uber jar's HTTP URL
contains query string
Key: SPARK-17855
URL: https://issues.apache.org/jira/browse/SPARK-17855
Project: Spark
Issue Type: Bug
Components: Spark Core
Reporter: Hao Ren
Priority: Minor
spark-submit supports jar URLs that use the http protocol.
If the URL contains a query string, the *worker.DriverRunner.downloadUserJar*
method throws a "Did not see expected jar" exception. This is because the
method checks for the existence of a downloaded jar whose name includes the
query string.
This is a problem when the jar is hosted on a web service that requires
additional information to retrieve the file. For example, to download a
jar from an S3 bucket over HTTP, the URL carries the signature, date, and
other parameters in its query string.
{code}
https://s3.amazonaws.com/deploy/spark-job.jar
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=<your-access-key-id>/20130721/us-east-1/s3/aws4_request
&X-Amz-Date=20130721T201207Z
&X-Amz-Expires=86400
&X-Amz-SignedHeaders=host
&X-Amz-Signature=<signature-value>
{code}
The worker will look for a jar named
"spark-job.jar?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<your-access-key-id>/20130721/us-east-1/s3/aws4_request&X-Amz-Date=20130721T201207Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=<signature-value>"
instead of
"spark-job.jar"
Hence, the query string should be removed before checking whether the jar exists.
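One possible way to derive the file name without the query string is sketched below. This is only an illustration, not the actual Spark code: `JarNameSketch` and `jarFileName` are hypothetical names, and the sketch assumes the URL is a well-formed hierarchical URI. The literal `<...>` placeholders from the example above would be rejected by `java.net.URI`, so made-up values are substituted.

```scala
import java.net.URI

object JarNameSketch {
  // Hypothetical helper, not part of Spark: derive the local file name
  // from a jar URL using only the URI path component, which excludes
  // the query string and any fragment.
  def jarFileName(jarUrl: String): String = {
    // getPath returns e.g. "/deploy/spark-job.jar" for an http(s) URL.
    val path = new URI(jarUrl).getPath
    // Keep everything after the last '/' in the path.
    path.substring(path.lastIndexOf('/') + 1)
  }

  def main(args: Array[String]): Unit = {
    // Placeholder credential value; slashes in the query string do not
    // affect the result because only the path is inspected.
    val signed = "https://s3.amazonaws.com/deploy/spark-job.jar" +
      "?X-Amz-Algorithm=AWS4-HMAC-SHA256" +
      "&X-Amz-Credential=EXAMPLEKEY/20130721/us-east-1/s3/aws4_request" +
      "&X-Amz-Expires=86400"
    println(jarFileName(signed)) // prints "spark-job.jar"
  }
}
```

Taking the name from the URI path rather than from the raw URL string means the existence check would look for "spark-job.jar" whether or not the URL is signed.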