Max Gekk created SPARK-57585:
--------------------------------
Summary: Resolve a common TIME(p) type for mixed-precision operands in set and conditional operations
Key: SPARK-57585
URL: https://issues.apache.org/jira/browse/SPARK-57585
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Max Gekk
h2. What
When two operands have the same datetime family but different fractional-seconds
precision - e.g. {{TIME(6)}} and {{TIME(3)}} - Spark fails to compute a common
type, so set and conditional operations over mixed {{TIME(p)}} require an
explicit {{CAST}} and otherwise raise an analysis error. This affects:
* {{UNION}} / {{INTERSECT}} / {{EXCEPT}}
* {{COALESCE}}, {{CASE}} / {{IF}}, {{NULLIF}}
* {{GREATEST}} / {{LEAST}}
* {{IN}} subqueries / {{IN}} lists, array/map literals with mixed {{TIME(p)}} elements
* store assignment (e.g. {{INSERT}} into a {{TIME(p)}} column from a different precision)
Root cause: {{TypeCoercionHelper.findWiderDateTimeType}} returns {{None}} for
any pair involving {{TIME}}, so no wider type is derived for two {{TIME(p)}}
operands.
h2. Why
ANSI SQL (ISO/IEC 9075-2, the result-type rule for a set of comparable types)
requires that across datetime operands of differing fractional-seconds
precision, the result type is the datetime type with the *largest*
fractional-seconds precision. Today Spark diverges by erroring instead of
widening. The {{CAST}} that widening relies on already exists
({{TIME(p1) -> TIME(p2)}} via {{truncateTimeToPrecision}}), so this is purely a
common-type-resolution gap.
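
To illustrate why widening is safe while narrowing is not, here is a minimal Python sketch of precision truncation. It is a model only, not Spark's {{truncateTimeToPrecision}}: the nanoseconds-since-midnight representation, the 0..9 precision range, and the names {{truncate_to_precision}} and {{to_nanos}} are assumptions made for this example.

```python
# Hypothetical model: a TIME value is an integer count of nanoseconds since midnight.
NANOS_PER_SECOND = 1_000_000_000


def truncate_to_precision(nanos: int, p: int) -> int:
    """Drop fractional-second digits beyond precision p (assumed range 0..9)."""
    if not 0 <= p <= 9:
        raise ValueError(f"unsupported precision: {p}")
    unit = 10 ** (9 - p)          # size of one tick at precision p, in nanoseconds
    return nanos - nanos % unit   # round down to a multiple of that tick


def to_nanos(hours: int, minutes: int, seconds: int, frac_nanos: int = 0) -> int:
    """Build a nanoseconds-since-midnight value from its parts."""
    return ((hours * 60 + minutes) * 60 + seconds) * NANOS_PER_SECOND + frac_nanos


t3 = to_nanos(12, 34, 56, 789_000_000)  # 12:34:56.789, representable in TIME(3)
t6 = to_nanos(12, 34, 56, 789_123_000)  # 12:34:56.789123, representable in TIME(6)

widened = truncate_to_precision(t3, 6)   # TIME(3) -> TIME(6): value is unchanged
narrowed = truncate_to_precision(t6, 3)  # TIME(6) -> TIME(3): ".000123" is dropped
```

In this model, a value that already fits precision 3 passes through a precision-6 truncation untouched, whereas the reverse direction discards digits. That asymmetry is the reason narrowing should stay behind an explicit {{CAST}}.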
The same gap was explicitly deferred for the nanosecond {{TIMESTAMP}} types in
SPARK-57490 ("implicit type coercion / findWiderDateTimeType widening for
mixed-precision nanos operands (UNION, COALESCE, IF)").
h2. Scope
* Extend {{TypeCoercionHelper.findWiderDateTimeType}} (and the ANSI variant) to return {{TIME(max(p1, p2))}} for two {{TIME}} operands.
* Widening (smaller -> larger precision) is a lossless up-cast / valid store assignment; narrowing remains explicit-{{CAST}}-only.
* Cover both the default and ANSI type-coercion paths.
* Tests: type-coercion unit tests plus golden-file coverage for {{UNION}}, {{COALESCE}}, {{CASE}}, {{GREATEST}}/{{LEAST}} over mixed {{TIME(p)}}, and store-assignment widening.
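
The proposed rule can be sketched as follows. This is a Python model of the intended behavior under stated assumptions, not the Scala change itself: the {{TimeType}} and {{DateType}} classes and the function {{find_wider_time_type}} are hypothetical stand-ins for Spark's types and for {{findWiderDateTimeType}}.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class TimeType:
    """Hypothetical stand-in for TIME(p)."""
    precision: int


@dataclass(frozen=True)
class DateType:
    """Hypothetical stand-in for DATE, used to show the out-of-scope case."""


DataType = Union[TimeType, DateType]


def find_wider_time_type(left: DataType, right: DataType) -> Optional[TimeType]:
    """Return TIME(max(p1, p2)) for two TIME operands, otherwise None."""
    if isinstance(left, TimeType) and isinstance(right, TimeType):
        return TimeType(max(left.precision, right.precision))
    # TIME mixed with another datetime family stays unresolved in this sketch.
    return None


wider = find_wider_time_type(TimeType(6), TimeType(3))
mixed_family = find_wider_time_type(TimeType(6), DateType())
```

Here {{wider}} is {{TimeType(6)}} regardless of operand order, and {{mixed_family}} is {{None}}, which mirrors the intent that only same-family precision differences are widened.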
h2. Out of scope
* Common-type resolution between {{TIME}} and other datetime families ({{DATE}}, {{TIMESTAMP}}) - those remain incomparable.
* The analogous nanosecond {{TIMESTAMP}} precision widening (tracked separately).
h2. Acceptance criteria
* {{SELECT t6 FROM ... UNION SELECT t3 FROM ...}} resolves to {{TIME(6)}} without an explicit cast; values from the {{TIME(3)}} side are widened losslessly.
* {{COALESCE(time3, time6)}}, {{CASE}} branches, and {{GREATEST}}/{{LEAST}} over mixed {{TIME(p)}} return {{TIME(max(p))}}.
* Inserting a {{TIME(3)}} value into a {{TIME(6)}} column succeeds; the reverse (narrowing) still requires an explicit cast.
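
The criteria above can be restated as two small checks. This is a hedged Python sketch: {{common_time_precision}} and {{can_store_assign}} are hypothetical helper names, and the model works on bare precisions rather than on Spark's actual type objects.

```python
from functools import reduce
from typing import Sequence


def common_time_precision(precisions: Sequence[int]) -> int:
    """Fold the pairwise max(p1, p2) rule over every operand's precision."""
    if not precisions:
        raise ValueError("at least one operand is required")
    return reduce(max, precisions)


def can_store_assign(source_precision: int, target_precision: int) -> bool:
    """Implicit store assignment is allowed only when it does not narrow."""
    return source_precision <= target_precision


coalesce_result = common_time_precision([3, 6])      # COALESCE(time3, time6)
greatest_result = common_time_precision([0, 6, 3])   # GREATEST over three operands
insert_widening = can_store_assign(3, 6)             # TIME(3) value into TIME(6) column
insert_narrowing = can_store_assign(6, 3)            # TIME(6) value into TIME(3) column
```

Because {{max}} is associative and commutative, folding it pairwise gives the same result for any operand order, which is what lets n-ary expressions such as {{COALESCE}} and {{GREATEST}} reuse the two-operand rule.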