[
https://issues.apache.org/jira/browse/SPARK-55439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Milicevic resolved SPARK-55439.
-------------------------------------
Resolution: Duplicate
I didn't realize I had permission to edit the original work item:
https://issues.apache.org/jira/browse/SPARK-53504.
I'm closing this one and will update the original one!
> Types Framework
> ---------------
>
> Key: SPARK-55439
> URL: https://issues.apache.org/jira/browse/SPARK-55439
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: David Milicevic
> Priority: Major
> Attachments: TYPES_FRAMEWORK_DESIGN_V2.md
>
>
> *Summary:*
> Introduce a Types Framework to centralize scattered type-specific pattern
> matching.
> *Description:*
> Adding a new data type to Spark currently requires modifying 50+ files with
> scattered type-specific logic expressed through diverse patterns (taking the
> TIME type as an example: '_: TimeType', 'TimeNanoVector', '.hasTime()',
> 'LocalTimeEncoder', 'instanceof TimeType', etc.). There is no compiler
> assistance to ensure completeness, and integration points are easy to miss.
> This effort introduces a *Types Framework* that centralizes type-specific
> infrastructure operations in Ops interface classes. Instead of modifying
> dozens of files, a developer implementing a new type creates two Ops classes
> and registers them in factory objects. The compiler enforces interface
> completeness.
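> A rough sketch of the intended mechanics (PhyTypeOps comes from the hierarchy
> in the Architecture section below; all other names here are placeholders, not
> the actual API):
>
> {code:java}
> // Hypothetical sketch. Implementing an Ops trait forces a complete
> // implementation of every operation in that category, which is where
> // the compiler-enforced completeness comes from.
> trait PhyTypeOps {
>   def physicalType: PhysicalDataType
> }
>
> // Registration: one case in a factory object instead of 50+ scattered
> // match statements across the codebase.
> object PhyTypeOpsFactory {
>   def supports(dt: DataType): Boolean = dt.isInstanceOf[TimeType]
>   def apply(dt: DataType): PhyTypeOps = dt match {
>     case _: TimeType => TimeTypeOps
>   }
> }
> {code}
>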
> *Concrete example - TimeType:*
> TimeType (the proof-of-concept for this framework) has integration points
> spread across 50+ files for infrastructure concerns like physical type
> mapping, literals, type converters, encoders, formatters, Arrow SerDe, proto
> conversion, JDBC, Python, Thrift, and storage formats. With the framework,
> these are consolidated into two Ops classes (~240 lines total). A developer
> adding a new type with similar complexity would create two analogous files
> instead of touching 50+ files.
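> As an illustration of that consolidation (a sketch only - the attached design
> doc has the actual signatures; trait names follow the Architecture section
> below, and the method names are assumptions), the catalyst-side Ops object
> might look like:
>
> {code:java}
> // Hypothetical consolidated catalyst-side Ops object for TimeType.
> object TimeTypeOps extends PhyTypeOps with LiteralTypeOps with ExternalTypeOps {
>   // Physical representation: nanoseconds since midnight, stored as a Long.
>   override def physicalType: PhysicalDataType = PhysicalLongType
>
>   // Literal creation and defaults.
>   override def defaultLiteral: Literal = Literal(0L, TimeType())
>
>   // External (java.time.LocalTime) <-> internal (Long nanos) conversion.
>   override def toInternal(external: Any): Any =
>     external.asInstanceOf[java.time.LocalTime].toNanoOfDay
>   override def toExternal(internal: Any): Any =
>     java.time.LocalTime.ofNanoOfDay(internal.asInstanceOf[Long])
> }
> {code}
>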
> *Architecture:*
> The framework defines a hierarchy of Ops traits that each cover a specific
> category of type operations:
>
> {code:java}
> TypeOps (sql/catalyst)
>   +-- PhyTypeOps      - Physical type representation
>   +-- LiteralTypeOps  - Literal creation and defaults
>   +-- ExternalTypeOps - External <-> internal type conversion
>   +-- ProtoTypeOps    - Spark Connect proto serialization
>   +-- ClientTypeOps   - JDBC, Arrow, Python, Thrift integration
>
> TypeApiOps (sql/api)
>   +-- FormatTypeOps   - String formatting
>   +-- EncodeTypeOps   - Row encoding (AgnosticEncoder)
> {code}
>
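> On the sql/api side, a hedged sketch of what the two traits might require
> (signatures are assumptions; LocalTimeEncoder is the existing encoder
> mentioned in the description):
>
> {code:java}
> // Hypothetical sql/api-side traits and a TimeType implementation.
> trait TypeApiOps
>
> trait FormatTypeOps extends TypeApiOps {
>   def format(internal: Any): String
> }
>
> trait EncodeTypeOps extends TypeApiOps {
>   def encoder: AgnosticEncoder[_]
> }
>
> object TimeApiOps extends FormatTypeOps with EncodeTypeOps {
>   // String formatting of the internal Long-nanos value.
>   override def format(internal: Any): String =
>     java.time.LocalTime.ofNanoOfDay(internal.asInstanceOf[Long]).toString
>   // Row encoding via the existing AgnosticEncoder for java.time.LocalTime.
>   override def encoder: AgnosticEncoder[_] = AgnosticEncoders.LocalTimeEncoder
> }
> {code}
>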
> Existing integration points use a *check-and-delegate* pattern, guarded by a
> feature flag, to dispatch to the framework while preserving legacy behavior
> as a fallback:
>
> {code:java}
> def someOperation(dt: DataType) = dt match {
>   // New path: dispatch to the framework when the feature flag is on and
>   // the framework has an Ops implementation for this type.
>   case _ if SQLConf.get.typesFrameworkEnabled && SomeOps.supports(dt) =>
>     SomeOps(dt).someMethod()
>   // Legacy path: existing types keep their current behavior unchanged.
>   case DateType => ... // legacy types unchanged
> }
> {code}
>
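> The flag itself would presumably be an ordinary SQLConf entry, along these
> lines (the config key, version, and default are assumptions, not the final
> definition):
>
> {code:java}
> // Hypothetical SQLConf definition backing typesFrameworkEnabled.
> val TYPES_FRAMEWORK_ENABLED = buildConf("spark.sql.typesFramework.enabled")
>   .doc("When true, type-specific operations dispatch through the Types " +
>     "Framework; otherwise they fall back to legacy per-type handling.")
>   .version("4.2.0")
>   .booleanConf
>   .createWithDefault(false)
> {code}
>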
> TimeType serves as the proof-of-concept implementation. Once the framework is
> validated, additional types can be implemented or migrated incrementally.
> *Scope:*
> * In scope: Physical type representation, literal creation, type conversion,
> string formatting, row encoding, proto serialization, client integration
> (JDBC, Arrow, Python, Thrift), storage formats, testing infrastructure,
> documentation
> * Out of scope: Type-specific expressions and arithmetic, SQL parser
> changes. The framework provides primitives that expressions use, not the
> expressions themselves.
> See the attached design document for full details.