While I am not directly answering this question, I want to point out that other 
databases have very flexible formats.
Here is an excerpt from Postgres: 
https://www.postgresql.org/docs/current/datatype-datetime.html

Example         Description
1999-01-08      ISO 8601; January 8 in any mode (recommended format)
January 8, 1999         unambiguous in any datestyle input mode
1/8/1999        January 8 in MDY mode; August 1 in DMY mode
1/18/1999       January 18 in MDY mode; rejected in other modes
01/02/03        January 2, 2003 in MDY mode; February 1, 2003 in DMY mode; 
February 3, 2001 in YMD mode
1999-Jan-08     January 8 in any mode
Jan-08-1999     January 8 in any mode
08-Jan-1999     January 8 in any mode
99-Jan-08       January 8 in YMD mode, else error
08-Jan-99       January 8, except error in YMD mode
Jan-08-99       January 8, except error in YMD mode
19990108        ISO 8601; January 8, 1999 in any mode
990108  ISO 8601; January 8, 1999 in any mode
1999.008        year and day of year
J2451187        Julian date
January 8, 99 BC        year 99 BC

Another thing that Postgres has is the "DateStyle parameter":

DateStyle (string)

    Sets the display format for date and time values, as well as the rules for 
interpreting ambiguous date input values. For historical reasons, this variable 
contains two independent components: the output format specification (ISO, 
Postgres, SQL, or German) and the input/output specification for year/month/day 
ordering (DMY, MDY, or YMD). These can be set separately or together. The 
keywords Euro and European are synonyms for DMY; the keywords US, NonEuro, and 
NonEuropean are synonyms for MDY. See Section 8.5 for more information. The 
built-in default is ISO, MDY, but initdb will initialize the configuration file 
with a setting that corresponds to the behavior of the chosen lc_time locale.

Unfortunately it's not clear how Calcite could support parameters. In postgres 
this is done using:

SET datestyle TO iso

In general I think it's a good policy to be as liberal as possible with inputs 
as long as you do not need to compromise on the correctness of the language. 
Having several modes (strict, lenient) for literals would be nice.

Mihai

-----Original Message-----
From: Stamatis Zampetakis 
Sent: Friday, September 29, 2023 5:05 AM
To: [email protected]
Subject: CALCITE-5678/CALCITE-5957: Datetime literal validation

Hey everyone,

CALCITE-5678, which landed recently in Avatica, enforces strict validation of 
datetime literals and is inline with the SQL standard specification. However, 
this strict validation also leads to behavior changes (e.g., CALCITE-5957) 
since some previously "valid" dates are now considered invalid.

Roughly four people have expressed an opinion on the aforementioned JIRAs but 
since this is a change that will likely have wider impact I would prefer to 
gather some more inputs from the community.

I created a very brief anonymous survey [1] with a few sample literals to aid 
the decision on what Calcite/Avatica should consider valid and invalid from now 
onwards.

I will close the survey in 96 hours from now. Please fill in the survey and/or 
comment under the respective tickets.

Based on the feedback we can opt to do one of the following:
* Revert CALCITE-5678 (restore the liberal parsing of datetime literals)
* Merge CALCITE-5957 (stricter parsing but leading zeros are optional)
* Reject CALCITE-5957 (only strict SQL standard compliant literals)

Best,
Stamatis

[1] https://forms.gle/23f3yVhJCKZJdqQu8

Reply via email to