Csaba Ringhofer created IMPALA-11355:
----------------------------------------

             Summary: Add STRING overloads for functions that consume time of day
                 Key: IMPALA-11355
                 URL: https://issues.apache.org/jira/browse/IMPALA-11355
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.0.0
            Reporter: Csaba Ringhofer


IMPALA-9531 dropped support for "dateless timestamps", e.g. cast("12:05:05" as 
timestamp) now returns NULL.

This broke functions like minute("12:05:05"): minute() expects a timestamp, so 
Impala adds an implicit cast and what actually runs is 
minute(cast("12:05:05" as timestamp)), which returns NULL. The same expression 
works in Hive and in other databases such as MySQL.
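
The chain described above can be modeled in a few lines of Python. This is only 
a toy model of the behavior, not Impala code: cast_to_timestamp and minute_ts 
are hypothetical names, and the accepted formats are simplified to 
"YYYY-MM-DD HH:MM:SS" and "YYYY-MM-DD".

```python
from datetime import datetime
from typing import Optional


def cast_to_timestamp(s: str) -> Optional[datetime]:
    """Toy model of cast(<string> as timestamp) after IMPALA-9531.

    Only strings with a date part are accepted; a bare time of day
    such as "12:05:05" yields None (standing in for SQL NULL).
    """
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue
    return None


def minute_ts(ts: Optional[datetime]) -> Optional[int]:
    """Toy model of minute(TIMESTAMP): NULL in, NULL out."""
    return None if ts is None else ts.minute


# minute("12:05:05") is evaluated as minute(cast("12:05:05" as timestamp)).
print(minute_ts(cast_to_timestamp("12:05:05")))             # None
print(minute_ts(cast_to_timestamp("2022-06-14 12:05:05")))  # 5
```

The NULL comes from the cast, not from minute() itself, which is why a STRING 
overload would sidestep the problem.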

IMO the best solution would be to add overloads for these functions that take 
a STRING argument:
- this would solve the issue above
- would be consistent with Hive where minute() expects a STRING, not a TIMESTAMP
- it would also be faster, as the string->timestamp parsing is done anyway 
during casting, and extracting the minute part of a timestamp needs a large 
integer division

At first glance, this should be done for the following functions:
HOUR(TIMESTAMP ts)
MINUTE(TIMESTAMP ts)
SECOND(TIMESTAMP ts)
MILLISECOND(TIMESTAMP ts)
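
As a rough illustration of what STRING overloads for the four functions above 
could compute, here is a minimal Python sketch that reads the fields straight 
from the string. The names (hour_str, minute_str, second_str, millisecond_str) 
and the accepted format (HH:MM:SS with an optional fraction of up to 9 digits) 
are assumptions made for this example, not Impala's API. It only handles bare 
time-of-day strings; a real overload would presumably also have to accept full 
timestamp strings.

```python
import re
from typing import Optional, Tuple

# Assumed format: HH:MM:SS with an optional fractional part (1-9 digits).
_TIME_OF_DAY = re.compile(r"(\d{1,2}):(\d{2}):(\d{2})(?:\.(\d{1,9}))?", re.ASCII)


def _parse_time_of_day(s: str) -> Optional[Tuple[int, int, int, int]]:
    """Return (hour, minute, second, nanosecond), or None for invalid input."""
    m = _TIME_OF_DAY.fullmatch(s.strip())
    if m is None:
        return None
    hour, minute, second = int(m.group(1)), int(m.group(2)), int(m.group(3))
    if hour > 23 or minute > 59 or second > 59:
        return None
    # Right-pad the fraction to 9 digits so ".5" means 500,000,000 ns.
    nanos = int((m.group(4) or "").ljust(9, "0"))
    return hour, minute, second, nanos


def hour_str(s: str) -> Optional[int]:
    p = _parse_time_of_day(s)
    return None if p is None else p[0]


def minute_str(s: str) -> Optional[int]:
    p = _parse_time_of_day(s)
    return None if p is None else p[1]


def second_str(s: str) -> Optional[int]:
    p = _parse_time_of_day(s)
    return None if p is None else p[2]


def millisecond_str(s: str) -> Optional[int]:
    p = _parse_time_of_day(s)
    return None if p is None else p[3] // 1_000_000


print(hour_str("12:05:05"), minute_str("12:05:05"), second_str("12:05:05"))
print(millisecond_str("12:05:05.123"))
```

In this sketch invalid input returns None, standing in for SQL NULL, and each 
field is taken directly from the parsed string without first building a 
timestamp value.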
 
TRUNC(TIMESTAMP / DATE ts, STRING unit) would also make sense, though I am not 
sure if it ever worked with dateless timestamps (in Hive this also takes a STRING)

For functions like MONTH this is not needed for correctness, as we can convert 
"timeless dates" to timestamps or dates without problem. For performance it 
would still make sense to work on STRINGs directly.






--
This message was sent by Atlassian Jira
(v8.20.7#820007)
