Max Gekk created SPARK-57464:
--------------------------------

             Summary: Types Framework - add a string parse hook (string to 
internal value) symmetric to format
                 Key: SPARK-57464
                 URL: https://issues.apache.org/jira/browse/SPARK-57464
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 5.0.0
            Reporter: Max Gekk


Umbrella: SPARK-53504 (Types Framework).

The framework's client-side TypeApiOps trait provides string output via format 
/ formatUTF8 / toSQLValue (internal value -> string), and cast-to-string is 
fully routed through it (e.g. SPARK-57285 for nanosecond timestamps). There is 
no symmetric inverse: no parse / parseUTF8 hook to turn a string into the 
type's internal value. As a result, the string -> value direction (CAST(string 
AS T), and to_timestamp/to_time style parsing with the default format) is still 
implemented per type outside the framework (e.g. in Cast.scala), so each new 
framework type adds another scattered copy of this logic.

Proposal:
- Add an optional parse hook to TypeApiOps, e.g. def parse(s: UTF8String): 
Option[Any] (or def parseString(s: String)), returning the type's internal 
representation, symmetric to format. Default None so existing types fall 
through to legacy handling.
- Route the cast-from-string path (Cast / ToStringBase counterpart) through the 
hook the same way cast-to-string already flows through format, with codegen and 
interpreted parity.
- Provide the reference implementation for TimeType, and let the nanosecond 
timestamp types (SPARK-56822) reuse it.
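
A minimal sketch of the proposed shape, for illustration only. It assumes plain 
String instead of UTF8String so that it runs without Spark on the classpath, 
and it assumes the TIME internal value is nanoseconds since midnight stored as 
a Long. The names TypeApiOpsSketch, TimeOpsSketch, CastSketch and 
castFromString are hypothetical and are not part of the framework.

```scala
import java.time.LocalTime
import java.time.format.DateTimeParseException

// Hypothetical stand-in for TypeApiOps; the real trait has more members.
trait TypeApiOpsSketch {
  // Existing direction: internal value -> string.
  def format(value: Any): String

  // Proposed direction: string -> internal value.
  // Default None means "no hook here, fall through to legacy handling".
  def parse(s: String): Option[Any] = None
}

// Reference-style implementation for a TIME-like type.
// Assumption: the internal value is nanoseconds since midnight (Long).
object TimeOpsSketch extends TypeApiOpsSketch {
  override def format(value: Any): String = value match {
    case nanos: Long => LocalTime.ofNanoOfDay(nanos).toString
    case other => throw new IllegalArgumentException(s"Unexpected value: $other")
  }

  // Note: this sketch also returns None for malformed input, so it does not
  // distinguish "no hook" from "parse failure"; a real hook would need to.
  override def parse(s: String): Option[Any] =
    try Some(LocalTime.parse(s.trim).toNanoOfDay)
    catch { case _: DateTimeParseException => None }
}

object CastSketch {
  // Hypothetical cast-from-string dispatch: try the hook first, then fall
  // back to the per-type legacy parser when the hook yields None.
  def castFromString(ops: TypeApiOpsSketch, s: String)(legacy: String => Any): Any =
    ops.parse(s).getOrElse(legacy(s))
}
```

With this shape, a type that overrides parse is handled by the framework, while 
a type that keeps the default still reaches its legacy parser. Codegen parity 
is not covered by the sketch.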

Notes / scope boundary:
- The framework hook uses the type's default/fraction formatter. Datasource 
readers that honor user-configurable patterns (JSON/CSV/XML timestampFormat, 
locale, zone) are out of scope here; those keep their own configurable 
formatters.
- This is a framework primitive, consumed by expressions/cast rather than an 
expression itself (consistent with SPARK-53504 scope).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
