[jira] [Commented] (SPARK-34438) Python Driver is not correctly detected using presigned URLs
[ https://issues.apache.org/jira/browse/SPARK-34438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284535#comment-17284535 ]

Apache Spark commented on SPARK-34438:
--------------------------------------

User 'scravy' has created a pull request for this issue:
https://github.com/apache/spark/pull/31565

> Python Driver is not correctly detected using presigned URLs
> ------------------------------------------------------------
>
>                 Key: SPARK-34438
>                 URL: https://issues.apache.org/jira/browse/SPARK-34438
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 3.0.0, 3.0.1, 3.0.2, 3.1.0
>            Reporter: Julian Fleischer
>            Priority: Minor
>
> In AWS one can generate so-called presigned URLs. spark-submit accepts URLs
> for the driver program, e.g. {{http://my-web-server/driver.py}}. A presigned
> URL carries a query string, e.g. {{http://my-web-server/driver.py?signature}}.
> The check for whether the given URL is a Python driver simply tests whether
> the whole string ends in {{.py}}, which the presigned URL does not, as it
> ends in {{signature}}.
> The relevant check is in {{SparkSubmit.scala}}, Line 1051 (commit tagged
> {{v3.0.1}}):
> [https://github.com/apache/spark/blob/v3.0.1/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L1051]
>
> Here is a more realistic example URL:
> {{https://bucket-name.s3.us-east-1.amazonaws.com/driver.py?X-Amz-Algorithm=AWS4-HMAC-SHA256=AKIATBNPKWPCNUMWMLUR%2F20210214%2Fus-east-1%2Fs3%2Faws4_request=20210214T062047Z=172800=host=49ef39b6bb7090001af9312692788892551916a6ac0ff6c961ce52efb9acc235}}
> A fix could be to parse the given path as a {{java.net.URI}} and check
> whether its path component ends in {{.py}} (as opposed to the whole string).
> To circumvent this issue I am currently appending a fragment to the URL
> so that it ends in {{.py}}, i.e.
> {{http://my-web-server/driver.py?signature#.py}}, which does work.
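The difference between the two checks described above can be sketched in a few lines. This is only an illustration of the idea, written in Python with `urllib.parse` instead of `java.net.URI`; it is not the code in `SparkSubmit.scala` and not the content of the pull request. The function names `ends_with_py` and `path_ends_with_py` are hypothetical.

```python
from urllib.parse import urlsplit


def ends_with_py(resource: str) -> bool:
    """Naive check: test the whole string, query and fragment included."""
    return resource.endswith(".py")


def path_ends_with_py(resource: str) -> bool:
    """Test only the path component, ignoring any query string or fragment."""
    return urlsplit(resource).path.endswith(".py")


# Example URLs taken from the issue description.
plain = "http://my-web-server/driver.py"
presigned = "http://my-web-server/driver.py?signature"
workaround = "http://my-web-server/driver.py?signature#.py"

for url in (plain, presigned, workaround):
    # Print both results side by side for comparison.
    print(url, ends_with_py(url), path_ends_with_py(url))
```

The whole-string check rejects the presigned URL and accepts the `#.py` workaround only because the fragment happens to end in `.py`; the path-based check accepts all three without needing the workaround.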
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34438) Python Driver is not correctly detected using presigned URLs
[ https://issues.apache.org/jira/browse/SPARK-34438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284531#comment-17284531 ]

Julian Fleischer commented on SPARK-34438:
------------------------------------------

I am proposing a patch here: https://github.com/apache/spark/pull/31565