Max Gekk created SPARK-57588:
--------------------------------

             Summary: CAST to TIME(p) reports the wrong target precision in the CAST_INVALID_INPUT error
                 Key: SPARK-57588
                 URL: https://issues.apache.org/jira/browse/SPARK-57588
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Max Gekk


Umbrella: SPARK-57550 (Extend support for the TIME data type).

h2. What

When a string cannot be parsed while casting to {{TIME(p)}}, the resulting
{{CAST_INVALID_INPUT}} error always reports the target type as {{"TIME(6)"}},
regardless of the actual target precision. The error message is therefore
inaccurate for any precision other than 6.

h2. How to reproduce

{code:sql}
SET spark.sql.ansi.enabled=true;
SELECT CAST('not a time' AS TIME(9));
{code}

Actual: {{[CAST_INVALID_INPUT] The value 'not a time' of the type "STRING" cannot be cast
to "TIME(6)" ...}}

Expected: the {{targetType}} should be {{"TIME(9)"}} (the precision the user
actually requested).

h2. Root cause

{{SparkDateTimeUtils.stringToTimeAnsi}} hardcodes the default {{TimeType()}}
(precision 6) when constructing the error, instead of the target type:

{code:scala}
def stringToTimeAnsi(s: UTF8String, context: QueryContext = null): Long = {
  stringToTime(s).getOrElse {
    throw ExecutionErrors.invalidInputInCastToDatetimeError(s, TimeType(), context)
  }
}
{code}

The string-to-TIME paths in {{Cast.scala}} ({{castToTime}} for the interpreted
path and {{castToTimeCode}} for codegen) call {{stringToTimeAnsi}} and then
truncate the result to {{to.precision}}, but the target precision is never
passed on to the error. This differs from the nanosecond TIMESTAMP casts, which
already pass the real type, e.g.
{{invalidInputInCastToDatetimeError(s, TimestampLTZNanosType(precision), ...)}}.

h2. Proposed fix

* Add a precision-aware overload of {{stringToTimeAnsi}} (e.g. taking a
  {{TimeType}} or precision argument) that reports {{TimeType(p)}} in the error.
* Update {{Cast}} {{castToTime}} and {{castToTimeCode}} (interpreted + codegen)
  to pass the cast's target {{to}} type.
* Keep a no-arg/default overload for the non-cast callers (e.g.
  {{DefaultTimeFormatter.parse}}) that legitimately default to {{TIME(6)}}.
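
The overload pair described above could be sketched as follows. This is a self-contained illustration, not Spark code: {{TimeType}}, {{CastInvalidInput}} and {{stringToTime}} here are simplified stand-ins defined inside the sketch, parsing is delegated to {{java.time.LocalTime}}, and returning nanoseconds of the day is a choice made for the example. Only the shape of the two {{stringToTimeAnsi}} overloads is the point.

```scala
import java.time.LocalTime
import java.time.format.DateTimeParseException

object TimeCastSketch {
  // Stand-in for Spark's TimeType; the default precision of 6 mirrors TimeType().
  final case class TimeType(precision: Int = 6) {
    require(precision >= 0 && precision <= 9, s"precision out of range: $precision")
    def sql: String = s"TIME($precision)"
  }

  // Stand-in for the CAST_INVALID_INPUT error; only the target type is modeled.
  final class CastInvalidInput(val value: String, val targetType: String)
    extends RuntimeException(
      "[CAST_INVALID_INPUT] The value '" + value +
        "' of the type \"STRING\" cannot be cast to \"" + targetType + "\"")

  // Simplified parser (assumption): ISO local time only, result in nanoseconds of the day.
  def stringToTime(s: String): Option[Long] =
    try Some(LocalTime.parse(s.trim).toNanoOfDay)
    catch { case _: DateTimeParseException => None }

  // Precision-aware variant: the error carries the type the caller asked for.
  def stringToTimeAnsi(s: String, target: TimeType): Long =
    stringToTime(s).getOrElse(throw new CastInvalidInput(s, target.sql))

  // Default overload kept for non-cast callers that legitimately mean TIME(6).
  def stringToTimeAnsi(s: String): Long = stringToTimeAnsi(s, TimeType())
}
```

With this shape, a cast path would call {{stringToTimeAnsi(s, to)}} so that a failed parse reports {{TIME(9)}} for a {{TIME(9)}} target, while existing one-argument callers keep reporting {{TIME(6)}}.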

h2. Acceptance criteria

* {{CAST('x' AS TIME(p))}} (ANSI on) reports {{targetType = "TIME(p)"}} for p
  in [0, 9], in both the interpreted and codegen paths.
* A unit test asserts the target precision is reflected (e.g. {{TIME(9)}}), in
  addition to the existing {{TIME(6)}} default-path cases in
  {{DateTimeUtilsSuite}} / {{TimeFormatterSuite}}.
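
The first criterion amounts to a loop over every precision. The sketch below shows the shape of such a check against a toy model of the cast, not against Spark itself: {{castToTimeModel}}, {{CastInvalidInput}} and {{reportedTargetType}} are hypothetical names invented for this example, and a real test would go through the Spark test suites and exercise both the interpreted and codegen paths.

```scala
import java.time.LocalTime
import scala.util.Try

object TimeCastCheck {
  // Hypothetical stand-in for the error; only targetType is modeled.
  final class CastInvalidInput(val targetType: String)
    extends RuntimeException("cannot be cast to \"" + targetType + "\"")

  // Toy model of a string-to-TIME(p) cast: parse, then truncate to p fractional digits.
  // On a parse failure the error names the requested precision, not a fixed default.
  def castToTimeModel(s: String, precision: Int): Long = {
    val nanos = Try(LocalTime.parse(s.trim).toNanoOfDay)
      .getOrElse(throw new CastInvalidInput(s"TIME($precision)"))
    val unit = Iterator.fill(9 - precision)(10L).product // 10^(9 - precision)
    nanos - nanos % unit
  }

  // Target type reported when an unparseable string is cast to TIME(precision).
  def reportedTargetType(precision: Int): String =
    Try(castToTimeModel("x", precision)).failed.get
      .asInstanceOf[CastInvalidInput].targetType

  def main(args: Array[String]): Unit =
    (0 to 9).foreach(p => assert(reportedTargetType(p) == s"TIME($p)"))
}
```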


