[ 
https://issues.apache.org/jira/browse/SPARK-57548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fangchen Li updated SPARK-57548:
--------------------------------
    Description: 
ScalaReflection (sql-api) eagerly initializes `val universe = 
scala.reflect.runtime.universe` in its object initializer. On the Scala 3.8+ 
stdlib this initialization crashes with `FatalError: class Array does not have 
a member apply`, an open Scala regression whose report names Spark: 
scala/scala3#25896.
 
Because the val lives in the static initializer, merely touching 
ScalaReflection triggers the crash, and the generated ser/deser code does touch 
it (via encodeFieldNameToIdentifier / findConstructor). So consuming Spark's 
2.13 jars from a Scala 3 process (the for3Use2_13 path used today) crashes even 
for code that only runs a pre-built encoder and needs neither a TypeTag nor 
encoder derivation.
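
To illustrate the initialization-order point, here is a minimal sketch that does 
not use Spark or scala-reflect at all. `InitDemo`, `loadUniverse`, `Eager` and 
`Deferred` are hypothetical names introduced only for this example; 
`loadUniverse` stands in for the expensive (or, on Scala 3.8+, crashing) load of 
scala.reflect.runtime.universe.

```scala
// Hypothetical stand-ins; this is not Spark's ScalaReflection.
object InitDemo {
  // Records which initializers have run, so the ordering is observable.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Stand-in for loading scala.reflect.runtime.universe.
  def loadUniverse(tag: String): String = { log += tag; "universe" }

  object Eager {
    // Runs as part of the object initializer, i.e. the first time ANY
    // member of Eager is touched.
    val universe: String = loadUniverse("eager")
    def unrelated(s: String): String = s.toUpperCase
  }

  object Deferred {
    // Runs only when `universe` itself is first accessed.
    lazy val universe: String = loadUniverse("lazy")
    def unrelated(s: String): String = s.toUpperCase
  }
}
```

Calling `InitDemo.Eager.unrelated(...)` already runs `loadUniverse`, whereas 
`InitDemo.Deferred.unrelated(...)` does not; if the load throws, only the eager 
variant fails for callers that never needed the universe.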
 
Reference (2-line patch + a Scala-3 round-trip through Spark's unmodified 
codegen): [https://github.com/bearing-research/ProtoCatalyst] 
(spark-reflection-patch/, docs/scala3-encoder/REPORT.md §3).

  was:
ScalaReflection (sql-api) eagerly initializes `val universe = 
scala.reflect.runtime.universe` in its object initializer. On the Scala 3.8+ 
stdlib this crashes — `FatalError: class Array does not have a member apply` — 
an open Scala regression that names Spark: scala/scala3#25896.
 
Because it's in the static initializer, merely touching ScalaReflection trips 
it, and the generated ser/deser code does (via encodeFieldNameToIdentifier / 
findConstructor) — so even a pre-built AgnosticEncoder can't run from Scala 3.
 
Proposed (sql-api, ~3 lines, behavior-preserving on 2.13):
 - `val universe` -> `lazy val universe`
 - encodeFieldNameToIdentifier via scala.reflect.NameTransformer.encode
 - findConstructor's scala-reflect fallback -> Java reflection
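
As a rough sketch of what the three proposed changes could look like in 
isolation (not the actual patch): `ReflectionSketch`, its method signatures, and 
the placeholder `universe` value are assumptions made for illustration, and 
Spark's real encodeFieldNameToIdentifier / findConstructor may have different 
signatures and fallback logic. The constructor lookup here matches exact 
parameter types only, which is a simplification.

```scala
import scala.reflect.NameTransformer

// Hypothetical stand-in for the relevant parts of ScalaReflection.
object ReflectionSketch {
  // Counts initializations so the deferral can be observed.
  var universeInits = 0

  // 1. `lazy val` instead of `val`: the body runs on first access only.
  //    A placeholder string stands in for scala.reflect.runtime.universe.
  lazy val universe: String = { universeInits += 1; "universe-placeholder" }

  // 2. Name encoding through the standard library, without the universe.
  def encodeFieldNameToIdentifier(fieldName: String): String =
    NameTransformer.encode(fieldName)

  // 3. Constructor lookup through Java reflection (exact type match only).
  def findConstructor(
      cls: Class[_],
      paramTypes: Seq[Class[_]]): Option[java.lang.reflect.Constructor[_]] =
    cls.getConstructors.find(_.getParameterTypes.toSeq == paramTypes)
}
```

With this shape, `ReflectionSketch.encodeFieldNameToIdentifier("a-b")` returns 
`a$minusb` and `findConstructor(classOf[String], Seq(classOf[String]))` finds 
the `String(String)` constructor, neither of which forces `universe`.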
 
Independent of, and smaller than, replacing encoderFor itself (separate 
follow-up).
 
Reference (2-line patch + a Scala-3 round-trip through Spark's unmodified 
codegen): https://github.com/bearing-research/ProtoCatalyst 
(spark-reflection-patch/, docs/scala3-encoder/REPORT.md §3).


> Avoid eager scala.reflect.runtime.universe initialization in ScalaReflection
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-57548
>                 URL: https://issues.apache.org/jira/browse/SPARK-57548
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Spark Core, SQL
>    Affects Versions: 4.3.0
>            Reporter: Fangchen Li
>            Priority: Major
>
> ScalaReflection (sql-api) eagerly initializes `val universe = 
> scala.reflect.runtime.universe` in its object initializer. On the Scala 3.8+ 
> stdlib this crashes — `FatalError: class Array does not have a member apply` 
> — an open Scala regression that names Spark: scala/scala3#25896.
>  
> Because it's in the static initializer, merely touching ScalaReflection trips 
> it, and the generated ser/deser code does (via encodeFieldNameToIdentifier / 
> findConstructor). So consuming Spark's 2.13 jars from a Scala 3 process (the 
> for3Use2_13 path used today) crashes even for code that only runs a pre-built 
> encoder without TypeTag and derivation.
>  
> Reference (2-line patch + a Scala-3 round-trip through Spark's unmodified 
> codegen): [https://github.com/bearing-research/ProtoCatalyst] 
> (spark-reflection-patch/, docs/scala3-encoder/REPORT.md §3).


