[ https://issues.apache.org/jira/browse/SPARK-57588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-57588:
-----------------------------------
Labels: pull-request-available starter (was: starter)
> CAST to TIME(p) reports the wrong target precision in the CAST_INVALID_INPUT error
> ----------------------------------------------------------------------------------
>
> Key: SPARK-57588
> URL: https://issues.apache.org/jira/browse/SPARK-57588
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.3.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: pull-request-available, starter
>
> Umbrella: SPARK-57550 (Extend support for the TIME data type).
> h2. What
> When a string cannot be parsed while casting to {{TIME(p)}}, the resulting
> {{CAST_INVALID_INPUT}} error always reports the target type as {{"TIME(6)"}},
> regardless of the actual target precision. The error message is therefore
> inaccurate for any precision other than 6.
> h2. How to reproduce
> {code:sql}
> SET spark.sql.ansi.enabled=true;
> SELECT CAST('not a time' AS TIME(9));
> {code}
> Actual: {{[CAST_INVALID_INPUT] The value 'not a time' of the type "STRING" cannot be cast to "TIME(6)" ...}}
> Expected: the {{targetType}} should be {{"TIME(9)"}} (the precision the user
> actually requested).
> h2. Root cause
> {{SparkDateTimeUtils.stringToTimeAnsi}} hardcodes the default {{TimeType()}}
> (precision 6) when constructing the error, instead of the target type:
> {code:scala}
> def stringToTimeAnsi(s: UTF8String, context: QueryContext = null): Long = {
>   stringToTime(s).getOrElse {
>     throw ExecutionErrors.invalidInputInCastToDatetimeError(s, TimeType(), context)
>   }
> }
> {code}
> The {{Cast}} string-to-TIME paths ({{castToTime}} / {{castToTimeCode}} in
> {{Cast.scala}}) call {{stringToTimeAnsi}} and then truncate the result to
> {{to.precision}}, so the target precision never reaches the error. This
> differs from the nanosecond TIMESTAMP casts, which already pass the real
> type, e.g.
> {{invalidInputInCastToDatetimeError(s, TimestampLTZNanosType(precision), ...)}}.
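> The flow described above can be sketched with stand-in code. This is a
> minimal illustration, not Spark's source: {{parseOrThrow}}, {{truncateTo}} and
> {{castToTimeSketch}} are hypothetical names, and the value is assumed to be a
> count of nanoseconds since midnight. It shows that the precision is consulted
> only after the parse step, so a parse failure has no access to it.
> {code:scala}
> import java.time.LocalTime
> import scala.util.Try
>
> object CastFlowSketch {
>   // Hypothetical stand-in for stringToTimeAnsi: the error always names TIME(6).
>   def parseOrThrow(s: String): Long =
>     Try(LocalTime.parse(s.trim).toNanoOfDay).getOrElse {
>       throw new IllegalArgumentException(s"'$s' cannot be cast to TIME(6)")
>     }
>
>   // Drop fractional digits beyond `precision` (0 to 9), assuming nanosecond input.
>   def truncateTo(nanos: Long, precision: Int): Long = {
>     require(precision >= 0 && precision <= 9, s"bad precision: $precision")
>     val unit = math.pow(10, 9 - precision).toLong
>     nanos - nanos % unit
>   }
>
>   // The target precision is used only here, after parsing has succeeded or thrown.
>   def castToTimeSketch(s: String, precision: Int): Long =
>     truncateTo(parseOrThrow(s), precision)
> }
> {code}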
> h2. Proposed fix
> * Add a precision-aware overload of {{stringToTimeAnsi}} (e.g. one that takes
> a {{TimeType}} or precision argument) that reports {{TimeType(p)}} in the
> error.
> * Update {{castToTime}} and {{castToTimeCode}} in {{Cast}} (interpreted and
> codegen) to pass the cast's target {{to}} type.
> * Keep a no-arg/default overload for the non-cast callers (e.g.
> {{DefaultTimeFormatter.parse}}) that legitimately default to {{TIME(6)}}.
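> A rough sketch of the overload pair proposed above, using stand-in types
> rather than Spark's classes. {{TimeTypeSketch}}, {{CastInvalidInputSketch}}
> and the toy parser are assumptions made for illustration; the real change
> would use {{TimeType}}, {{UTF8String}}, {{QueryContext}} and
> {{ExecutionErrors.invalidInputInCastToDatetimeError}}.
> {code:scala}
> import java.time.LocalTime
> import scala.util.Try
>
> // Hypothetical stand-in for Spark's TimeType; only what the sketch needs.
> final case class TimeTypeSketch(precision: Int = 6) {
>   require(precision >= 0 && precision <= 9, s"bad precision: $precision")
>   def sql: String = s"TIME($precision)"
> }
>
> // Hypothetical stand-in for the CAST_INVALID_INPUT error.
> final class CastInvalidInputSketch(val value: String, val targetType: String)
>   extends RuntimeException(s"[CAST_INVALID_INPUT] '$value' cannot be cast to $targetType")
>
> object StringToTimeSketch {
>   // Toy parser: ISO local time to nanoseconds since midnight.
>   def stringToTime(s: String): Option[Long] =
>     Try(LocalTime.parse(s.trim).toNanoOfDay).toOption
>
>   // Precision-aware variant: the error names the requested target type.
>   def stringToTimeAnsi(s: String, to: TimeTypeSketch): Long =
>     stringToTime(s).getOrElse(throw new CastInvalidInputSketch(s, to.sql))
>
>   // Default variant for non-cast callers, which keep reporting TIME(6).
>   def stringToTimeAnsi(s: String): Long = stringToTimeAnsi(s, TimeTypeSketch())
>
>   def main(args: Array[String]): Unit = {
>     // Every precision in [0, 9] should be echoed back in the error.
>     for (p <- 0 to 9) {
>       val reported =
>         try { stringToTimeAnsi("not a time", TimeTypeSketch(p)); "no error" }
>         catch { case e: CastInvalidInputSketch => e.targetType }
>       assert(reported == s"TIME($p)")
>     }
>   }
> }
> {code}
> The sketch leaves out the {{context}} parameter. Scala permits only one
> overloaded alternative of a method to declare default arguments, so how the
> existing {{context: QueryContext = null}} default is carried across the two
> overloads is left open here.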
> h2. Acceptance criteria
> * {{CAST('x' AS TIME(p))}} (ANSI on) reports {{targetType = "TIME(p)"}} for
> p in [0, 9], in both the interpreted and codegen paths.
> * A unit test asserts the target precision is reflected (e.g. {{TIME(9)}}),
> in addition to the existing {{TIME(6)}} default-path cases in
> {{DateTimeUtilsSuite}} / {{TimeFormatterSuite}}.