[ https://issues.apache.org/jira/browse/SPARK-55071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Khakhlyuk updated SPARK-55071:
-----------------------------------
Description:
Currently, spark.addArtifact in PySpark Connect does not support absolute
Windows paths.
For example, this code
{code:python}
spark.addArtifact("C:\\path\\to\\file.py", pyfile=True)
{code}
results in the following error:
{code}
PySparkRuntimeError: [UNSUPPORTED_OPERATION] c scheme is not supported.
{code}
The error is caused by the urlparse call in
[artifact.py|https://github.com/apache/spark/blob/ac13473fff64919e8e7756e3a42ce3a68627dd73/python/pyspark/sql/connect/client/artifact.py#L188].
urlparse interprets a local Windows path, e.g. `C:\\path\\to\\file`, as a
URI whose scheme is the drive letter 'C'. addArtifact then raises an error
because this URI scheme is not known and not supported.
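The parsing behaviour described above can be reproduced with the standard library alone. The sketch below also shows one possible guard; the helper name `split_scheme` and the drive-letter regex are hypothetical illustrations, not part of PySpark and not necessarily the fix that will be adopted.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern (an assumption, not taken from PySpark): a single
# drive letter, a colon, then a backslash or forward slash.
_WINDOWS_DRIVE_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def split_scheme(path: str) -> str:
    """Return the URI scheme of `path`, or "" for a Windows drive path.

    Hypothetical helper for illustration only.
    """
    if _WINDOWS_DRIVE_PATH.match(path):
        # Treat "C:\..." as a local file path instead of a URI.
        return ""
    return urlparse(path).scheme


# urlparse lowercases the drive letter and reports it as the scheme,
# which matches the "c scheme is not supported" message in the report.
raw_scheme = urlparse("C:\\path\\to\\file.py").scheme
guarded_scheme = split_scheme("C:\\path\\to\\file.py")
```

With a guard of this kind, genuine URIs such as `file:///tmp/file.py` still report their scheme, while drive-letter paths fall through to the local-file code path.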
was:
Currently, `spark.addArtifact` in pyspark connect does not support absolute
Windows paths.
E.g. this code
```
spark.addArtifact("C:\\Users\\alex.khakhlyuk\\hey.py", pyfile=True)
```
will result in the following error
```
PySparkRuntimeError: [UNSUPPORTED_OPERATION] c scheme is not supported.
```
This error is caused by `urlparse` function in
[artifact.py.|https://github.com/apache/spark/blob/ac13473fff64919e8e7756e3a42ce3a68627dd73/python/pyspark/sql/connect/client/artifact.py#L188]
It incorrectly interprets local Windows path, e.g. `C:\path\to\file` as a URI
with 'C' scheme and throws an error because this URI scheme is not known and
not supported.
> Make spark.addArtifact work with Windows paths
> ----------------------------------------------
>
> Key: SPARK-55071
> URL: https://issues.apache.org/jira/browse/SPARK-55071
> Project: Spark
> Issue Type: Bug
> Components: Connect, PySpark
> Affects Versions: 4.1.1
> Reporter: Alex Khakhlyuk
> Priority: Major