[ https://issues.apache.org/jira/browse/SPARK-57548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fangchen Li updated SPARK-57548:
--------------------------------
Description:
ScalaReflection (sql-api) eagerly initializes `val universe =
scala.reflect.runtime.universe` in its object initializer. On the Scala 3.8+
stdlib this crashes with `FatalError: class Array does not have a member apply`,
an open Scala regression whose report names Spark: scala/scala3#25896.
Because the val sits in the static initializer, merely touching ScalaReflection
triggers the crash, and the generated ser/deser code does touch it (via
encodeFieldNameToIdentifier / findConstructor). So consuming Spark's 2.13 jars
from a Scala 3 process (the for3Use2_13 path used today) crashes even for code
that only runs a pre-built encoder, with no TypeTag and no derivation involved.
The minimal working fix is two lines:
# val universe → lazy val universe, so that class loading doesn't force it.
# encodeFieldNameToIdentifier → scala.reflect.NameTransformer.encode, so that encoding a field name doesn't touch universe.
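For illustration only, a minimal standalone sketch of why the two changes matter. It does not use Spark's code: InitLog, EagerHolder, LazyHolder and encodeFieldName are hypothetical names, and InitLog.expensive stands in for the scala.reflect.runtime.universe initialization that crashes. The only real library call is scala.reflect.NameTransformer.encode, which lives in scala-library.

```scala
import scala.collection.mutable.ListBuffer
import scala.reflect.NameTransformer

// Hypothetical helper: records which initializers have actually run.
object InitLog {
  val events: ListBuffer[String] = ListBuffer.empty[String]
  def expensive(tag: String): String = { events += tag; tag }
}

// Eager: the val is evaluated as part of the object's initializer,
// so touching any member of EagerHolder runs it.
object EagerHolder {
  val universe: String = InitLog.expensive("eager")
  def encodeFieldName(name: String): String = NameTransformer.encode(name)
}

// Lazy: the initializer runs only when `universe` itself is first read.
object LazyHolder {
  lazy val universe: String = InitLog.expensive("lazy")
  def encodeFieldName(name: String): String = NameTransformer.encode(name)
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Touches LazyHolder but never reads `universe`.
    println(LazyHolder.encodeFieldName("a-b"))  // a$minusb
    println(InitLog.events.toList)              // List()

    // Touching EagerHolder forces its `universe` initializer.
    println(EagerHolder.encodeFieldName("a-b")) // a$minusb
    println(InitLog.events.toList)              // List(eager)
  }
}
```

In this sketch, a method that only needs name encoding can be called on LazyHolder without ever running the expensive initializer, which is the behavior the two-line fix aims for in ScalaReflection.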
Reference (2-line patch + a Scala-3 round-trip through Spark's unmodified
codegen): [https://github.com/bearing-research/ProtoCatalyst]
(spark-reflection-patch/, docs/scala3-encoder/REPORT.md §3).
was:
ScalaReflection (sql-api) eagerly initializes `val universe =
scala.reflect.runtime.universe` in its object initializer. On the Scala 3.8+
stdlib this crashes — `FatalError: class Array does not have a member apply` —
an open Scala regression that names Spark: scala/scala3#25896.
Because it's in the static initializer, merely touching ScalaReflection trips
it, and the generated ser/deser code does (via encodeFieldNameToIdentifier /
findConstructor). So consuming Spark's 2.13 jars from a Scala 3 process (the
for3Use2_13 path used today) crashes even for code that only runs a pre-built
encoder without TypeTag and derivation.
Reference (2-line patch + a Scala-3 round-trip through Spark's unmodified
codegen): [https://github.com/bearing-research/ProtoCatalyst]
(spark-reflection-patch/, docs/scala3-encoder/REPORT.md §3).
> Avoid eager scala.reflect.runtime.universe initialization in ScalaReflection
> ----------------------------------------------------------------------------
>
> Key: SPARK-57548
> URL: https://issues.apache.org/jira/browse/SPARK-57548
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 4.3.0
> Reporter: Fangchen Li
> Priority: Major
>