[
https://issues.apache.org/jira/browse/CALCITE-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890424#comment-17890424
]
Dmitry Sysolyatin commented on CALCITE-6624:
--------------------------------------------
I did some further investigation into the issue with parsing literals and data
types. Although this task isn't directly related to parsing literals, it might
be useful to consider if Calcite should eventually handle other custom data
type literals in a similar way to how it does for BigQuery.
# MySQL does not support defining {{{}DATETIME{}}}` literals with syntax like
{{SELECT DATETIME '2020-01-01 10:10:10'}} [1]. The test I wrote initially was
based on the task comment CALCITE-6177, but it is incorrect for MySQL syntax.
The correct syntax should be: {{{}SELECT CAST(CAST(TIMESTAMP '1000-01-01
00:00:00' AS DATETIME) AS DATETIME){}}}. MySQL's SQL syntax is quite messy.
They have both {{TIMESTAMP}} and {{DATETIME}} types, but when someone uses
{{{}TIMESTAMP '1000-01-01 00:00:00'{}}}, it results in a {{DATETIME}} type [1].
??The TIMESTAMP syntax produces a DATETIME value in MySQL because DATETIME has
a range that more closely corresponds to the standard SQL TIMESTAMP type, which
has a year range from 0001 to 9999. (The MySQL TIMESTAMP year range is 1970 to
2038.)??
# I tried to find other databases that have custom data types and allow
defining literals like {{<TYPENAME> <TYPEVALUE>}} for custom types. It seems
only BigQuery and PostgreSQL [2] do this. Other databases, use functions
approach such as {{toDateTime64(<value>)}} (Clickhouse),
{{XMLType('xml_string')}} (Oracle) or just plain CAST.
If In BigQuery case, it's just {{{}DATETIME{}}}, then PostgreSQL has many
custom types, like {{{}float8, float4, timestamptz, inet, cidr{}}}, and more.
I’m not sure if it’s a good idea to define a keyword for each data type in
`Parser.jj`. I think the PostgreSQL implementation should define custom type
names as `nonReservedKeywordsToAdd` in the `config.fmpp` file of the Babel
modules (under the `keywords` section) and add parsing for such literals in the
`literalParserMethods` section. Probably the same can be valid for BigQuery
literal as well.
Regarding parsing custom type names, as [~julianhyde] mentioned. Yes, I think
it is better approach. There is a generic way to parse custom type names using
`CompoundIdentifier`:
{code:java}
typeName = CompoundIdentifier() {
typeNameSpec = new SqlUserDefinedTypeNameSpec(typeName,
s.end(this));
}
{code}
However, this doesn’t work for `DATETIME` because it’s marked as a reserved
keyword. There are two options to fix this:
# Add <DATETIME> to nonReservedKeywordsToAdd section of babel config.fmpp
# Add DATETIME data type name parsing inside dataTypeParserMethods section
I prefer the first approach.
[1] [https://dev.mysql.com/doc/refman/8.4/en/date-and-time-literals.html]
[2]
[https://www.postgresql.org/docs/15/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS-GENERIC]
> SqlParser should parse MySQL DATETIME type
> ------------------------------------------
>
> Key: CALCITE-6624
> URL: https://issues.apache.org/jira/browse/CALCITE-6624
> Project: Calcite
> Issue Type: Bug
> Components: babel, core
> Affects Versions: 1.37.0
> Reporter: Dmitry Sysolyatin
> Assignee: Dmitry Sysolyatin
> Priority: Major
> Labels: pull-request-available
>
> MySQL has two different data types: TIMESTAMP and DATETIME. The difference
> between them is the range they support.
> From the documentation [1]
> ??The TIMESTAMP data type is used for values that contain both date and time
> parts. TIMESTAMP has a range of '1970-01-01 00:00:01' UTC to '2038-01-19
> 03:14:07' UTC.??
> ??The DATETIME type is used for values that contain both date and time parts.
> MySQL retrieves and displays DATETIME values in 'YYYY-MM-DD hh:mm:ss' format.
> The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.??
> Calcite's TIMESTAMP likely supports both ranges, and for unparse logic, the
> MySQL dialect class always uses DATETIME because the TIMESTAMP range is a
> subset of the DATETIME range.
> The only missing part is parsing the DATETIME datatype. For example
> {code:java}
> SELECT CAST(timestamp_field AS DATETIME) FROM <table>
> {code}
> [1] [https://dev.mysql.com/doc/refman/8.4/en/datetime.html]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)