[
https://issues.apache.org/jira/browse/SPARK-27911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Armbrust updated SPARK-27911:
-
Description:
Today, users of pyspark (and Scala) need to manually specify the version of
Scala that their Spark installation is using when adding a Spark package to
their application. This extra configuration is confusing to users who may not
even know which version of Scala they are using (for example, if they installed
Spark using {{pip}}). The confusion is exacerbated by Spark releases that have
changed the default Scala version from {{2.11}} to {{2.12}} and back to {{2.11}}:
https://spark.apache.org/releases/spark-release-2-4-2.html
https://spark.apache.org/releases/spark-release-2-4-3.html
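For illustration, a minimal sketch of the status quo, using the Delta Lake
coordinates from the issues linked at the end of this description (the exact
artifact version here is an assumption):
{code:python}
from pyspark.sql import SparkSession

# Status quo: the Scala binary version has to be hard-coded in the package
# coordinate. A user on a pip-installed Spark 2.4.3 (built for Scala 2.11)
# who copies a _2.12 coordinate from a Spark 2.4.2 example will fail at
# runtime with Scala binary-incompatibility errors.
spark = (SparkSession.builder
         .config("spark.jars.packages", "io.delta:delta-core_2.11:0.1.0")
         .getOrCreate())
{code}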
Since Spark can know which version of Scala it was compiled for, we should give
users the option to automatically choose the correct version. This could be as
simple as substituting a {{$scalaVersion}} token when resolving a package
(similar to SBT's support for automatically handling Scala dependencies).
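A hedged sketch of what that substitution could look like from PySpark; the
{{$scalaVersion}} token is only the placeholder suggested above, not an
existing Spark feature:
{code:python}
from pyspark.sql import SparkSession

# Hypothetical syntax: before handing the coordinate to the Ivy/Maven
# resolver, Spark would replace $scalaVersion with the Scala binary version
# it was compiled against (e.g. 2.11 or 2.12), much like SBT's %% operator
# appends the Scala suffix automatically.
spark = (SparkSession.builder
         .config("spark.jars.packages",
                 "io.delta:delta-core_$scalaVersion:0.1.0")
         .getOrCreate())
{code}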
Here are some concrete examples of users getting it wrong and getting confused:
https://github.com/delta-io/delta/issues/6
https://github.com/delta-io/delta/issues/63
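In the meantime, a user can at least check which Scala version their Spark
installation was built with (a workaround sketch that relies on the internal
{{_jvm}} py4j gateway, and only answers the question after the session has
started, i.e. too late to influence {{spark.jars.packages}}):
{code:python}
from pyspark.sql import SparkSession

# Workaround sketch: query the JVM for the Scala version Spark was compiled
# against. This uses the internal _jvm gateway and only works once the
# session (and therefore package resolution) has already happened.
spark = SparkSession.builder.getOrCreate()
scala_version = spark.sparkContext._jvm.scala.util.Properties.versionNumberString()
print(scala_version)  # e.g. "2.11.12" on Spark 2.4.3
{code}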
was:
Today, users of pyspark (and Scala) need to manually specify the version of
Scala that their Spark installation is using when adding a Spark package to
their application. This extra configuration is confusing to users who may not even
know which version of Scala they are using (for example, if they installed
Spark using {{pip}}). The confusion here is exacerbated by releases in Spark
that have changed the default from {{2.11}} -> {{2.12}} -> {{2.11}}.
https://spark.apache.org/releases/spark-release-2-4-2.html
https://spark.apache.org/releases/spark-release-2-4-3.html
Since Spark can know which version of Scala it was compiled for, we should give
users the option to automatically choose the correct version. This could be as
simple as a substitution for {{$scalaVersion}} or something when resolving a
package (similar to SBT's support for automatically handling Scala dependencies).
Here are some concrete examples of users getting it wrong and getting confused:
https://github.com/delta-io/delta/issues/6
https://github.com/delta-io/delta/issues/63
> PySpark Packages should automatically choose correct Scala version
> --
>
> Key: SPARK-27911
> URL: https://issues.apache.org/jira/browse/SPARK-27911
> Project: Spark
> Issue Type: New Feature
> Components: PySpark
> Affects Versions: 2.4.3
> Reporter: Michael Armbrust
> Priority: Major
>
> Today, users of pyspark (and Scala) need to manually specify the version of
> Scala that their Spark installation is using when adding a Spark package to
> their application. This extra configuration is confusing to users who may not
> even know which version of Scala they are using (for example, if they
> installed Spark using {{pip}}). The confusion is exacerbated by Spark releases
> that have changed the default Scala version from {{2.11}} to {{2.12}} and back
> to {{2.11}}:
> https://spark.apache.org/releases/spark-release-2-4-2.html
> https://spark.apache.org/releases/spark-release-2-4-3.html
> Since Spark can know which version of Scala it was compiled for, we should
> give users the option to automatically choose the correct version. This
> could be as simple as substituting a {{$scalaVersion}} token when resolving a
> package (similar to SBT's support for automatically handling Scala
> dependencies).
> Here are some concrete examples of users getting it wrong and getting
> confused:
> https://github.com/delta-io/delta/issues/6
> https://github.com/delta-io/delta/issues/63
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org