[
https://issues.apache.org/jira/browse/IMPALA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758554#comment-16758554
]
Greg Rahn commented on IMPALA-4018:
-----------------------------------
Your understanding is correct.
I see the plan as such:
- Since CAST(...FORMAT...) is net new syntax it should support the ISO SQL
masks only. Other engines that implement the ISO SQL standard will then have
compatible SQL. There will be no confusion as "it follows the ISO standard".
- Once the new ISO SQL masks are supported, we can explore a way to allow
users to use the ISO SQL masks with the legacy conversion functions. This would
allow legacy user SQL code to keep working but also enable migrations from
typical db systems to work as well. Running in mixed-mode (some legacy Java,
some new ISO SQL masks) should be possible, and probably best via a session
setting via query option. Details to be worked out. Noting impact on test
combinations here also.
- Another challenge here is that once DATE is supported in Impala, to_date()
needs to return a DATE, not STRING. The new CAST(...FORMAT...) syntax will
allow CAST( ... AS DATE ...) to work as expected from the beginning, so we
should only introduce support for that in the new syntax once DATE is supported.
On the point of different masks for to_timestamp() and friends vs CAST(... as
TIMESTAMP FORMAT ...): I see limiting the new ISO:2016 syntax to only ISO date
masks as less confusing, especially for folks less familiar with Hive/Impala
SQL specifics. It will be new syntax, but it will have the expected behaviors
for those familiar with ISO SQL. It also means that things that used to work
still do thus minimizing breaking or behavior changes.
The point on Hive/Impala compatible views and syntax is a fair one, and I'm
aware of it, but several differences exist today and likely will in the future,
but for things like this I see the right solution is that all engines converge
on ISO/ANSI SQL compatibility. We can certainly let the Hive community (and
others) know of Impala's adoption of this new ISO SQL syntax.
> Add support for SQL:2016 datetime templates/patterns/masks to CAST(... AS ...
> FORMAT <template>)
> ------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4018
> URL: https://issues.apache.org/jira/browse/IMPALA-4018
> Project: IMPALA
> Issue Type: New Feature
> Components: Frontend
> Affects Versions: Impala 2.2.4
> Reporter: Greg Rahn
> Assignee: Gabor Kaszab
> Priority: Critical
> Labels: ansi-sql, compatibility, sql-language
>
> *Summary*
> The format masks/templates for currently are implemented using the [Java
> SimpleDateFormat
> patterns|http://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html],
> and although this is what Hive has implemented, it is not what most standard
> SQL systems implement. For example see
> [Vertica|https://my.vertica.com/docs/7.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Formatting/TemplatePatternsForDateTimeFormatting.htm],
>
> [Netezza|http://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_ntz_sql_extns_templ_patterns_date_time_conv.html],
>
> [Oracle|https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#SQLRF00212],
> and
> [PostgreSQL|https://www.postgresql.org/docs/9.5/static/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE].
>
> *Examples of incompatibilities*
> {noformat}
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('May 15, 2015 12:00:00', 'mon dd, yyyy hh:mi:ss');
> -- Impala
> select to_timestamp('May 15, 2015 12:00:00', 'MMM dd, yyyy HH:mm:ss');
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07','yyyy-mm-dd hh24:mi:ss');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07','yyyy-MM-dd HH:mm:ss');
> -- Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-mm-dd hh24:mi:ss.ff');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-MM-dd
> HH:mm:ss.SSSSSS');
> {noformat}
> *Considerations*
> Because this is a change in default behavior for to_timestamp(), if possible,
> having a feature flag to revert to the legacy Java SimpleDateFormat patterns
> should be strongly considered. This would allow users to chose the behavior
> they desire and scope it to a session if need be.
> SQL:2016 defines the following datetime templates
> {noformat}
> <datetime template> ::=
> { <datetime template part> }...
> <datetime template part> ::=
> <datetime template field>
> | <datetime template delimiter>
> <datetime template field> ::=
> <datetime template year>
> | <datetime template rounded year>
> | <datetime template month>
> | <datetime template day of month>
> | <datetime template day of year>
> | <datetime template 12-hour>
> | <datetime template 24-hour>
> | <datetime template minute>
> | <datetime template second of minute>
> | <datetime template second of day>
> | <datetime template fraction>
> | <datetime template am/pm>
> | <datetime template time zone hour>
> | <datetime template time zone minute>
> <datetime template delimiter> ::=
> <minus sign>
> | <period>
> | <solidus>
> | <comma>
> | <apostrophe>
> | <semicolon>
> | <colon>
> | <space>
> <datetime template year> ::=
> YYYY | YYY | YY | Y
> <datetime template rounded year> ::=
> RRRR | RR
> <datetime template month> ::=
> MM
> <datetime template day of month> ::=
> DD
> <datetime template day of year> ::=
> DDD
> <datetime template 12-hour> ::=
> HH | HH12
> <datetime template 24-hour> ::=
> HH24
> <datetime template minute> ::=
> MI
> <datetime template second of minute> ::=
> SS
> <datetime template second of day> ::=
> SSSSS
> <datetime template fraction> ::=
> FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9
> <datetime template am/pm> ::=
> A.M. | P.M.
> <datetime template time zone hour> ::=
> TZH
> <datetime template time zone minute> ::=
> TZM
> {noformat}
> SQL:2016 also introduced the FORMAT clause for CAST which is the standard way
> to do string <> datetime conversions
> {noformat}
> <cast specification> ::=
> CAST <left paren>
> <cast operand> AS <cast target>
> [ FORMAT <cast template> ]
> <right paren>
> <cast operand> ::=
> <value expression>
> | <implicitly typed value specification>
> <cast target> ::=
> <domain name>
> | <data type>
> <cast template> ::=
> <character string literal>
> {noformat}
> For example:
> {noformat}
> CAST(<datetime> AS <char string type> [FORMAT <template>])
> CAST(<char string> AS <datetime type> [FORMAT <template>])
> cast(dt as string format 'DD-MM-YYYY')
> cast('01-05-2017' as date format 'DD-MM-YYYY')
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]