Csaba Ringhofer created IMPALA-11355:
----------------------------------------
Summary: Add STRING overloads for functions that consume time of
day
Key: IMPALA-11355
URL: https://issues.apache.org/jira/browse/IMPALA-11355
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 4.0.0
Reporter: Csaba Ringhofer
IMPALA-9531 dropped support for "dateless timestamps", e.g. cast("12:05:05" as
timestamp) now returns NULL.
This broke functions like minute("12:05:05"): minute() expects a timestamp, so
Impala adds an implicit cast, and what actually runs is
minute(cast("12:05:05" as timestamp)), which returns NULL. The same expression
works in Hive and in other databases such as MySQL.
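
The failure path described above can be modeled outside Impala. The following
is only an illustrative Python sketch, not Impala code: the function names are
hypothetical, None stands in for SQL NULL, and the accepted formats are an
assumption chosen to mimic a cast that rejects input without a date part.

```python
from datetime import datetime
from typing import Optional


def cast_string_to_timestamp(s: str) -> Optional[datetime]:
    """Hypothetical stand-in for a cast that rejects dateless input."""
    # Only formats that carry a date part are accepted (assumption).
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    return None  # models SQL NULL


def minute_of_timestamp(ts: Optional[datetime]) -> Optional[int]:
    """Models minute(TIMESTAMP): a NULL input propagates to a NULL output."""
    return None if ts is None else ts.minute


def minute_via_implicit_cast(s: str) -> Optional[int]:
    """Models minute(cast(s as timestamp)), i.e. what runs for a STRING arg."""
    return minute_of_timestamp(cast_string_to_timestamp(s))
```

In this model, minute_via_implicit_cast("12:05:05") yields None because the
cast fails first, while a string with a date part still yields its minute.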
IMO the best solution would be to add overloads of these functions that take a
STRING argument:
- this would solve the issue above
- it would be consistent with Hive, where minute() expects a STRING, not a
TIMESTAMP
- it would also be faster: the string has to be parsed during the cast anyway,
and extracting the minute part of the resulting timestamp needs a large
integer division
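
To illustrate the division mentioned in the last point, here is a small Python
sketch. It assumes the time of day is held as a count of nanoseconds since
midnight; that representation and all names below are assumptions made for
illustration, not a description of Impala's internals.

```python
NANOS_PER_SECOND = 1_000_000_000
NANOS_PER_MINUTE = 60 * NANOS_PER_SECOND


def nanos_of_day(hour: int, minute: int, second: int, nanos: int = 0) -> int:
    """Pack a time of day into nanoseconds since midnight (assumed layout)."""
    return (hour * 3600 + minute * 60 + second) * NANOS_PER_SECOND + nanos


def minute_from_nanos_of_day(n: int) -> int:
    """Recover the minute field: one division by 6e10 plus a modulo."""
    return (n // NANOS_PER_MINUTE) % 60
```

For 12:05:05 the packed value is 43,505,000,000,000, and getting the minute
back takes a 64-bit division by 60,000,000,000. A parser that reads the string
directly already has the minute as a separate field and can skip this step.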
At first glance this should be done for the following functions:
HOUR(TIMESTAMP ts)
MINUTE(TIMESTAMP ts)
SECOND(TIMESTAMP ts)
MILLISECOND(TIMESTAMP ts)
TRUNC(TIMESTAMP / DATE ts, STRING unit) would also make sense, though I am not
sure if it ever worked with dateless timestamps (in Hive this also takes STRING).
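
A rough Python sketch of what STRING variants of the four functions listed
above could do: parse the time of day once and return the requested field. All
names are hypothetical, None stands in for NULL, and the accepted syntax (an
optional, unvalidated date prefix followed by HH:MM:SS and an optional
fraction) is an assumption, not Impala's or Hive's actual parsing rules.

```python
import re
from typing import Optional, Tuple

# Optional date prefix, then H:MM:SS or HH:MM:SS, then up to 9 fraction digits.
_TIME_RE = re.compile(
    r"(?:\d{4}-\d{2}-\d{2}[ T])?(\d{1,2}):(\d{2}):(\d{2})(?:\.(\d{1,9}))?"
)


def parse_time_of_day(s: str) -> Optional[Tuple[int, int, int, int]]:
    """Return (hour, minute, second, nanos), or None if s is not usable."""
    m = _TIME_RE.fullmatch(s.strip())
    if m is None:
        return None
    hour, minute, second = int(m.group(1)), int(m.group(2)), int(m.group(3))
    if hour > 23 or minute > 59 or second > 59:
        return None
    frac = m.group(4)
    # Right-pad the fraction to 9 digits so "123" means 123 milliseconds.
    nanos = int(frac.ljust(9, "0")) if frac else 0
    return hour, minute, second, nanos


def hour_str(s: str) -> Optional[int]:
    t = parse_time_of_day(s)
    return None if t is None else t[0]


def minute_str(s: str) -> Optional[int]:
    t = parse_time_of_day(s)
    return None if t is None else t[1]


def second_str(s: str) -> Optional[int]:
    t = parse_time_of_day(s)
    return None if t is None else t[2]


def millisecond_str(s: str) -> Optional[int]:
    t = parse_time_of_day(s)
    return None if t is None else t[3] // 1_000_000
```

With this sketch, minute_str("12:05:05") gives 5 without ever building a
timestamp, and input that does not look like a time of day still gives None.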
For functions like MONTH this is not needed for correctness, as we can convert
"timeless dates" to timestamps or dates without problem. For performance it
would still make sense to work on STRINGs directly.