Max Gekk created SPARK-57588:
--------------------------------
Summary: CAST to TIME(p) reports the wrong target precision in the
CAST_INVALID_INPUT error
Key: SPARK-57588
URL: https://issues.apache.org/jira/browse/SPARK-57588
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
Umbrella: SPARK-57550 (Extend support for the TIME data type).
h2. What
When a string cannot be parsed while casting to {{TIME(p)}}, the resulting
{{CAST_INVALID_INPUT}} error always reports the target type as {{"TIME(6)"}},
regardless of the actual target precision. The error message is therefore
inaccurate for any precision other than 6.
h2. How to reproduce
{code:sql}
SET spark.sql.ansi.enabled=true;
SELECT CAST('not a time' AS TIME(9));
{code}
Actual: {{[CAST_INVALID_INPUT] The value 'not a time' of the type "STRING" cannot be cast to "TIME(6)" ...}}
Expected: the {{targetType}} should be {{"TIME(9)"}} (the precision the user actually requested).
h2. Root cause
{{SparkDateTimeUtils.stringToTimeAnsi}} hardcodes the default {{TimeType()}}
(precision 6) when constructing the error, instead of the target type:
{code:scala}
def stringToTimeAnsi(s: UTF8String, context: QueryContext = null): Long = {
  stringToTime(s).getOrElse {
    throw ExecutionErrors.invalidInputInCastToDatetimeError(s, TimeType(), context)
  }
}
{code}
The string-to-TIME paths in {{Cast.scala}} ({{castToTime}} / {{castToTimeCode}})
call {{stringToTimeAnsi}} and then truncate the result to {{to.precision}}, so
the target precision is only applied after parsing and never reaches the error.
This differs from the nanosecond TIMESTAMP casts, which already pass the real
type, e.g.
{{invalidInputInCastToDatetimeError(s, TimestampLTZNanosType(precision), ...)}}.
h2. Proposed fix
* Add a precision-aware overload of {{stringToTimeAnsi}} (e.g. carry a {{TimeType}} / precision argument) that reports {{TimeType(p)}} in the error.
* Update {{Cast}} {{castToTime}} and {{castToTimeCode}} (interpreted + codegen) to pass the cast's target {{to}} type.
* Keep a no-arg/default overload for the non-cast callers (e.g. {{DefaultTimeFormatter.parse}}) that legitimately default to {{TIME(6)}}.
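The shape of the proposed overload can be sketched outside of Spark. Everything in the block below is a simplified stand-in, not the real Spark code: {{TimeType}}, {{CastInvalidInput}}, the {{java.time}}-based {{stringToTime}} and the nanoseconds-of-day representation are assumptions made only so that the example runs on its own. It shows a default overload that still reports {{TIME(6)}} next to a target-aware overload that the cast path would call.

```scala
import java.time.LocalTime
import scala.util.Try

// Simplified stand-ins for illustration only; these are NOT the Spark classes.
object TimeCastSketch {
  final case class TimeType(precision: Int = 6) {
    require(precision >= 0 && precision <= 9, "unsupported precision: " + precision)
    def sql: String = "TIME(" + precision + ")"
  }

  // Hypothetical error type that carries the reported target type.
  final class CastInvalidInput(val value: String, val targetType: String)
    extends RuntimeException(
      "[CAST_INVALID_INPUT] The value '" + value +
        "' of the type \"STRING\" cannot be cast to \"" + targetType + "\"")

  // Toy parser (assumption): ISO local time -> nanoseconds since midnight.
  def stringToTime(s: String): Option[Long] =
    Try(LocalTime.parse(s.trim).toNanoOfDay).toOption

  // Default overload for non-cast callers: keeps reporting TIME(6).
  def stringToTimeAnsi(s: String): Long = stringToTimeAnsi(s, TimeType())

  // Precision-aware overload: the error reports the requested target type.
  def stringToTimeAnsi(s: String, target: TimeType): Long =
    stringToTime(s).getOrElse(throw new CastInvalidInput(s, target.sql))

  // Cast path: pass the target type to the parser, then truncate to its precision.
  def castToTime(s: String, to: TimeType): Long = {
    val nanos = stringToTimeAnsi(s, to)
    val unit = (1 to (9 - to.precision)).foldLeft(1L)((acc, _) => acc * 10L)
    nanos - nanos % unit
  }
}
```

In this sketch, {{TimeCastSketch.castToTime("not a time", TimeCastSketch.TimeType(9))}} fails with {{targetType}} equal to {{"TIME(9)"}}, while the single-argument {{stringToTimeAnsi("not a time")}} still reports {{"TIME(6)"}}.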
h2. Acceptance criteria
* {{CAST('x' AS TIME(p))}} (ANSI on) reports {{targetType = "TIME(p)"}} for p in [0, 9], in both the interpreted and codegen paths.
* A unit test asserts the target precision is reflected (e.g. {{TIME(9)}}), in addition to the existing {{TIME(6)}} default-path cases in {{DateTimeUtilsSuite}} / {{TimeFormatterSuite}}.
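The precision loop in the first criterion could be checked along the following lines. This is only a standalone sketch: {{CastInvalidInput}}, {{mismatchedPrecisions}}, {{buggy}} and {{fixed}} are hypothetical names, and the two {{Int => Unit}} functions merely imitate a failing cast before and after the fix. A real test would instead run the actual cast through both the interpreted and codegen paths.

```scala
// Standalone sketch; none of these names come from Spark.
object TimePrecisionCheck {
  // Hypothetical stand-in for the error raised by a failed cast.
  final case class CastInvalidInput(targetType: String)
    extends RuntimeException("cannot cast to " + targetType)

  // Returns every precision p in [0, 9] whose reported target type is not TIME(p).
  def mismatchedPrecisions(failingCast: Int => Unit): Seq[Int] =
    (0 to 9).filter { p =>
      val reported: Option[String] =
        try { failingCast(p); None }
        catch { case e: CastInvalidInput => Some(e.targetType) }
      !reported.contains("TIME(" + p + ")")
    }

  // Imitates the behaviour described in the issue: always reports TIME(6).
  val buggy: Int => Unit = _ => throw CastInvalidInput("TIME(6)")

  // Imitates the behaviour after the fix: reports the requested precision.
  val fixed: Int => Unit = p => throw CastInvalidInput("TIME(" + p + ")")
}
```

With these stand-ins, {{mismatchedPrecisions(buggy)}} flags every precision except 6, and {{mismatchedPrecisions(fixed)}} is empty.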