[ https://issues.apache.org/jira/browse/SPARK-55071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Khakhlyuk updated SPARK-55071:
-----------------------------------
Description:
Currently, spark.addArtifact in PySpark Connect does not support absolute
Windows paths.
For example, this code
{code:python}
spark.addArtifact("C:\\path\\to\\file.py", pyfile=True)
{code}
will result in the following error
{code:java}
PySparkRuntimeError: [UNSUPPORTED_OPERATION] c scheme is not supported. {code}
The error is caused by the urlparse call in
[artifact.py|https://github.com/apache/spark/blob/ac13473fff64919e8e7756e3a42ce3a68627dd73/python/pyspark/sql/connect/client/artifact.py#L188].
urlparse interprets a local Windows path such as `C:\\...` as a URI whose
scheme is 'c' (the drive letter, lowercased). The client then raises an error
because this scheme is not supported.
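
The parsing behaviour described above can be reproduced with the standard library alone. The sketch below also shows one possible way to guard against it. The helper names is_windows_absolute_path and scheme_of are hypothetical and are not part of PySpark, and the drive-letter regex is an assumption about what should count as a Windows path, not the fix adopted for this issue.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern (an assumption, not PySpark code): a single drive
# letter followed by ":" and a slash or backslash, e.g. "C:\" or "D:/".
_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:[\\/]")


def is_windows_absolute_path(path: str) -> bool:
    """Return True if the string starts with a Windows drive prefix."""
    return _WINDOWS_DRIVE.match(path) is not None


def scheme_of(path: str) -> str:
    """Return the URI scheme, treating drive-letter paths as schemeless."""
    if is_windows_absolute_path(path):
        return ""
    return urlparse(path).scheme


if __name__ == "__main__":
    windows_path = "C:\\path\\to\\file.py"
    # urlparse takes the drive letter for a scheme and lowercases it.
    print(repr(urlparse(windows_path).scheme))
    # The guarded helper reports no scheme for the same input.
    print(repr(scheme_of(windows_path)))
    # Real URIs are still parsed as before.
    print(repr(scheme_of("file:///tmp/file.py")))
```

With a check like this placed before the scheme dispatch, a drive-letter path could be handled as a local file, while inputs such as file:// URIs would still go through urlparse unchanged.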
was:
Currently, spark.addArtifact in pyspark connect does not support absolute
Windows paths.
E.g. this code
{code:java}
spark.addArtifact("C:\\path\\to\\file.py", pyfile=True){code}
will result in the following error
{code:java}
PySparkRuntimeError: [UNSUPPORTED_OPERATION] c scheme is not supported. {code}
This error is caused by urlparse function in
[artifact.py.|https://github.com/apache/spark/blob/ac13473fff64919e8e7756e3a42ce3a68627dd73/python/pyspark/sql/connect/client/artifact.py#L188]
It incorrectly interprets local Windows path, e.g. `C:\\path\\to\\file`, as a
URI with 'C' scheme and throws an error because this URI scheme is not known
and not supported.
> Make spark.addArtifact work with Windows paths
> ----------------------------------------------
>
> Key: SPARK-55071
> URL: https://issues.apache.org/jira/browse/SPARK-55071
> Project: Spark
> Issue Type: Bug
> Components: Connect, PySpark
> Affects Versions: 4.1.1
> Reporter: Alex Khakhlyuk
> Priority: Major