Fangchen Li created SPARK-57549:
-----------------------------------
Summary: Scala 3 compile-time AgnosticEncoder derivation (replace
ScalaReflection.encoderFor)
Key: SPARK-57549
URL: https://issues.apache.org/jira/browse/SPARK-57549
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 4.3.0
Reporter: Fangchen Li
ScalaReflection.encoderFor[T: TypeTag] derives Dataset[T] encoders via Scala 2
runtime reflection (TypeTag + scala.reflect.runtime.universe), which has no
Scala 3 equivalent. It is the structural blocker for a Scala 3 build of Spark
SQL — the one piece that needs a real reimplementation rather than a mechanical
port.
The seam is already there: AgnosticEncoder is Spark's reflection-free encoder
description, and ExpressionEncoder.apply(enc: AgnosticEncoder[T]) accepts one
with no TypeTag. So only the derivation needs to be replaced; everything
downstream (ExpressionEncoder, ser/deser codegen, Spark Connect) is reused
unchanged.
Proposed:
# Add a Scala 3 Mirror/inline derivation, deriveAgnosticEncoder[T], that emits
AgnosticEncoder directly: one self-contained file in sql-api, depending only on
AgnosticEncoder and the Scala stdlib (no new IR).
# On the Scala 3 build, ExpressionEncoder.apply[T]() calls it instead of
encoderFor; the reflective body stays for 2.13.
# Drop the ~16 TypeTag context bounds in encoder-producing signatures
(Encoders, SparkSession.implicits, Dataset/functions/Aggregator). Scala 3
forces this regardless, because TypeTag does not exist there.
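For readers unfamiliar with the technique, the following is a minimal sketch of
Mirror-based inline derivation in Scala 3. It is not the prototype's code and
does not use Spark's API: Enc, IntEnc, LongEnc, StringEnc, ProductEnc, Person
and demo are hypothetical stand-ins invented for illustration, with Enc playing
the role that AgnosticEncoder would play. It only shows how field names and
field encoders can be collected at compile time without a TypeTag.

```scala
import scala.compiletime.{constValue, erasedValue, summonInline}
import scala.deriving.Mirror

// Hypothetical stand-in for an encoder description; NOT Spark's AgnosticEncoder.
sealed trait Enc[T]
case object IntEnc extends Enc[Int]
case object LongEnc extends Enc[Long]
case object StringEnc extends Enc[String]
final case class ProductEnc[T](fields: List[(String, Enc[?])]) extends Enc[T]

object Enc:
  // Leaf encoders for a few primitive types.
  given Enc[Int] = IntEnc
  given Enc[Long] = LongEnc
  given Enc[String] = StringEnc

  // Turn the tuple of literal label types into a list of field names.
  inline def labels[Ls <: Tuple]: List[String] =
    inline erasedValue[Ls] match
      case _: EmptyTuple => Nil
      case _: (h *: t)   => constValue[h].toString :: labels[t]

  // Summon one encoder per field type; a missing encoder is a compile error.
  inline def fieldEncoders[Ts <: Tuple]: List[Enc[?]] =
    inline erasedValue[Ts] match
      case _: EmptyTuple => Nil
      case _: (h *: t)   => summonInline[Enc[h]] :: fieldEncoders[t]

  // Derive an encoder for any case class whose fields all have encoders.
  inline given derived[T](using m: Mirror.ProductOf[T]): Enc[T] =
    ProductEnc[T](
      labels[m.MirroredElemLabels].zip(fieldEncoders[m.MirroredElemTypes])
    )

final case class Person(name: String, age: Int)

@main def demo(): Unit =
  // Resolved entirely at compile time; no TypeTag or runtime reflection.
  val enc = summon[Enc[Person]]
  println(enc)
```

In this sketch a signature that previously needed a TypeTag context bound would
instead take the derived instance as an ordinary context parameter (for
example, def f[T: Enc]), which is the shape of change item 3 describes.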
Reference (working prototype):
# The drop-in file:
https://github.com/bearing-research/ProtoCatalyst/blob/main/encoder-spark/src/main/scala/protocatalyst/encoder/spark/AgnosticDerivation.scala
(one file, no protocatalyst dependencies: rename the package and it is
encoderFor's Scala 3 replacement; compiled under
org.apache.spark.sql.catalyst.encoders on every build).
# Validated against Spark's own reflective encoderFor goldens (structural
parity) and round-tripped through Spark's unmodified ser/deser codegen. Docs:
docs/scala3-encoder/REPORT.md, docs/scala3-encoder/MIGRATION.md.