[
https://issues.apache.org/jira/browse/IMPALA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830360#comment-16830360
]
Matthias commented on IMPALA-4018:
----------------------------------
I have a strong opinion about APIs that do not surprise (see [Principle of
least astonishment|http://wiki.c2.com/?PrincipleOfLeastAstonishment]), i.e.:
{quote}The Principle of Least Astonishment states that the result of performing
some operation should be obvious, consistent, and predictable, based upon the
name of the operation and other clues.
{quote}
The defined "SQL" language is the API for the user and the same logic applies
to it. I do understand, and support the desire to align more with the SQL:2016
standard, but this is not where the code started from (see this comment which
is also aligned with what Hive is doing):
{quote}The original intention was to eventually mimic what the SimpleDateFormat
offers
([https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html])
{quote}
I guess that's the reason why we are talking about feature flags in the first
place.
IMHO, putting one format pattern behind a feature flag but not the other would
be very much against the principle of least astonishment. Therefore, we either
change all of them with or without a feature flag. The latter is probably a bad
idea considering the user base and backwards compatibility. Thus, breaking is
not an option and the therefore the only option I see is a feature flag for all.
[~grahn]: What speaks against having ONE feature flag for all format patterns.
The user can select (per session, per server, per ... <whatever>) whether he
wants to use a Java-style format pattern or an SQL:2016 format pattern. That
format pattern would then be used for ALL functions, be it `CAST()`,
`TO_TIMESTAMP()`, ... . This would keep the API consistent, but leave it up to
the user to choose. From what I tell this would still fulfill your requirement,
but please correct me if I'm wrong.
Having pluggable implementations (selected by feature flag) would allow other
formatters. What about the one from
[ICU|http://userguide.icu-project.org/formatparse/datetime]?
Some small side notes:
* There are already some inconsistencies, like `TRUNC(TIMESTAMP timestamp,
STRING unit)` would accept `MI` instead of `mm`, but I guess that's because
it's a unit and not a format pattern.
> Add support for SQL:2016 datetime templates/patterns/masks to CAST(... AS ...
> FORMAT <template>)
> ------------------------------------------------------------------------------------------------
>
> Key: IMPALA-4018
> URL: https://issues.apache.org/jira/browse/IMPALA-4018
> Project: IMPALA
> Issue Type: New Feature
> Components: Frontend
> Affects Versions: Impala 2.2.4
> Reporter: Greg Rahn
> Assignee: Gabor Kaszab
> Priority: Critical
> Labels: ansi-sql, compatibility, sql-language
>
> *Summary*
> The format masks/templates for currently are implemented using the [Java
> SimpleDateFormat
> patterns|http://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html],
> and although this is what Hive has implemented, it is not what most standard
> SQL systems implement. For example see
> [Vertica|https://my.vertica.com/docs/7.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Formatting/TemplatePatternsForDateTimeFormatting.htm],
>
> [Netezza|http://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_ntz_sql_extns_templ_patterns_date_time_conv.html],
>
> [Oracle|https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#SQLRF00212],
> and
> [PostgreSQL|https://www.postgresql.org/docs/9.5/static/functions-formatting.html#FUNCTIONS-FORMATTING-DATETIME-TABLE].
>
> *Examples of incompatibilities*
> {noformat}
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('May 15, 2015 12:00:00', 'mon dd, yyyy hh:mi:ss');
> -- Impala
> select to_timestamp('May 15, 2015 12:00:00', 'MMM dd, yyyy HH:mm:ss');
> -- PostgreSQL/Netezza/Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07','yyyy-mm-dd hh24:mi:ss');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07','yyyy-MM-dd HH:mm:ss');
> -- Vertica/Oracle
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-mm-dd hh24:mi:ss.ff');
> -- Impala
> select to_timestamp('2015-02-14 20:19:07.123456','yyyy-MM-dd
> HH:mm:ss.SSSSSS');
> {noformat}
> *Considerations*
> Because this is a change in default behavior for to_timestamp(), if possible,
> having a feature flag to revert to the legacy Java SimpleDateFormat patterns
> should be strongly considered. This would allow users to chose the behavior
> they desire and scope it to a session if need be.
> SQL:2016 defines the following datetime templates
> {noformat}
> <datetime template> ::=
> { <datetime template part> }...
> <datetime template part> ::=
> <datetime template field>
> | <datetime template delimiter>
> <datetime template field> ::=
> <datetime template year>
> | <datetime template rounded year>
> | <datetime template month>
> | <datetime template day of month>
> | <datetime template day of year>
> | <datetime template 12-hour>
> | <datetime template 24-hour>
> | <datetime template minute>
> | <datetime template second of minute>
> | <datetime template second of day>
> | <datetime template fraction>
> | <datetime template am/pm>
> | <datetime template time zone hour>
> | <datetime template time zone minute>
> <datetime template delimiter> ::=
> <minus sign>
> | <period>
> | <solidus>
> | <comma>
> | <apostrophe>
> | <semicolon>
> | <colon>
> | <space>
> <datetime template year> ::=
> YYYY | YYY | YY | Y
> <datetime template rounded year> ::=
> RRRR | RR
> <datetime template month> ::=
> MM
> <datetime template day of month> ::=
> DD
> <datetime template day of year> ::=
> DDD
> <datetime template 12-hour> ::=
> HH | HH12
> <datetime template 24-hour> ::=
> HH24
> <datetime template minute> ::=
> MI
> <datetime template second of minute> ::=
> SS
> <datetime template second of day> ::=
> SSSSS
> <datetime template fraction> ::=
> FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9
> <datetime template am/pm> ::=
> A.M. | P.M.
> <datetime template time zone hour> ::=
> TZH
> <datetime template time zone minute> ::=
> TZM
> {noformat}
> SQL:2016 also introduced the FORMAT clause for CAST which is the standard way
> to do string <> datetime conversions
> {noformat}
> <cast specification> ::=
> CAST <left paren>
> <cast operand> AS <cast target>
> [ FORMAT <cast template> ]
> <right paren>
> <cast operand> ::=
> <value expression>
> | <implicitly typed value specification>
> <cast target> ::=
> <domain name>
> | <data type>
> <cast template> ::=
> <character string literal>
> {noformat}
> For example:
> {noformat}
> CAST(<datetime> AS <char string type> [FORMAT <template>])
> CAST(<char string> AS <datetime type> [FORMAT <template>])
> cast(dt as string format 'DD-MM-YYYY')
> cast('01-05-2017' as date format 'DD-MM-YYYY')
> {noformat}
> *Update*
> Here is the proposal for the new datetime patterns and their semantics:
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]