While I am not directly answering this question, I want to point out that other
databases have very flexible formats.
Here is an excerpt from Postgres:
https://www.postgresql.org/docs/current/datatype-datetime.html
Example Description
1999-01-08 ISO 8601; January 8 in any mode (recommended format)
January 8, 1999 unambiguous in any datestyle input mode
1/8/1999 January 8 in MDY mode; August 1 in DMY mode
1/18/1999 January 18 in MDY mode; rejected in other modes
01/02/03 January 2, 2003 in MDY mode; February 1, 2003 in DMY mode;
February 3, 2001 in YMD mode
1999-Jan-08 January 8 in any mode
Jan-08-1999 January 8 in any mode
08-Jan-1999 January 8 in any mode
99-Jan-08 January 8 in YMD mode, else error
08-Jan-99 January 8, except error in YMD mode
Jan-08-99 January 8, except error in YMD mode
19990108 ISO 8601; January 8, 1999 in any mode
990108 ISO 8601; January 8, 1999 in any mode
1999.008 year and day of year
J2451187 Julian date
January 8, 99 BC year 99 BC
Another thing that Postgres has is the "DateStyle parameter":
DateStyle (string)
Sets the display format for date and time values, as well as the rules for
interpreting ambiguous date input values. For historical reasons, this variable
contains two independent components: the output format specification (ISO,
Postgres, SQL, or German) and the input/output specification for year/month/day
ordering (DMY, MDY, or YMD). These can be set separately or together. The
keywords Euro and European are synonyms for DMY; the keywords US, NonEuro, and
NonEuropean are synonyms for MDY. See Section 8.5 for more information. The
built-in default is ISO, MDY, but initdb will initialize the configuration file
with a setting that corresponds to the behavior of the chosen lc_time locale.
Unfortunately it's not clear how Calcite could support parameters. In postgres
this is done using:
SET datestyle TO iso
In general I think it's a good policy to be as liberal as possible with inputs
as long as you do not need to compromise on the correctness of the language.
Having several modes (strict, lenient) for literals would be nice.
Mihai
-----Original Message-----
From: Stamatis Zampetakis
Sent: Friday, September 29, 2023 5:05 AM
To: [email protected]
Subject: CALCITE-5678/CALCITE-5957: Datetime literal validation
Hey everyone,
CALCITE-5678, which landed recently in Avatica, enforces strict validation of
datetime literals and is inline with the SQL standard specification. However,
this strict validation also leads to behavior changes (e.g., CALCITE-5957)
since some previously "valid" dates are now considered invalid.
Roughly four people have expressed an opinion on the aforementioned JIRAs but
since this is a change that will likely have wider impact I would prefer to
gather some more inputs from the community.
I created a very brief anonymous survey [1] with a few sample literals to aid
the decision on what Calcite/Avatica should consider valid and invalid from now
onwards.
I will close the survey in 96 hours from now. Please fill in the survey and/or
comment under the respective tickets.
Based on the feedback we can opt to do one of the following:
* Revert CALCITE-5678 (restore the liberal parsing of datetime literals)
* Merge CALCITE-5957 (stricter parsing but leading zeros are optional)
* Reject CALCITE-5957 (only strict SQL standard compliant literals)
Best,
Stamatis
[1] https://forms.gle/23f3yVhJCKZJdqQu8