[
https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Will Noble updated CALCITE-6055:
--------------------------------
Description:
*Background*:
In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour,
minute, and second, henceforth referred to as clock-calendar parameters. This
does not define an unambiguous instant in time, since timestamps in different
time zones can have different parameters at the same instant. It is analogous
to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}}
type a different definition: “an absolute point in time, independent of any
time zone or convention such as Daylight Savings Time.”
ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}}
simply attaches a time zone to the existing {{TIMESTAMP}} type, making it
_both_ a set of clock-calendar parameters _and_ an unambiguous instant in time.
The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle.
According to Oracle, it “is normalized to the database time zone, and the time
zone offset is not stored as part of the column data. When users retrieve the
data, Oracle returns it in the users' local session time zone.” This act of
conversion between the local time zone and the database time zone means that
users interacting with this kind of timestamp will observe the same instant in
time even though they see different clock-calendar parameters. However, the
local time zone is not considered “part of the data”. Thus, it has the
semantics of an un-zoned, absolute point in time, analogous to a BigQuery
{{TIMESTAMP}}.
This creates the confusing situation where a BigQuery {{DATETIME}} is actually
an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO
{{TIMESTAMP WITH LOCAL TIME ZONE}}.
*Problem*:
This is very similar to CALCITE-5424, which only applied to literals (e.g.
{{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the
{{DateTimeTypeName}}
construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS
TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a
{{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that
and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion,
we can look up the type names in the UDT map. If no mapping is found for
{{DATETIME}} (which would occur for any dialect besides BigQuery), an error
would occur during validation, because that type does not exist in that
dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the
canonical {{TIMESTAMP}} type, preserving the existing behavior for all other
dialects.
*In summary*:
{{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH
LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other
dialects.
{{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for
BigQuery, but throw a validation error for all other dialects.
was:
*Background*:
In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour,
minute, and second, henceforth referred to as clock-calendar parameters. This
does not define an unambiguous instant in time, since timestamps in different
time zones can have different parameters at the same instant. It is analogous
to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}}
type a different definition: “an absolute point in time, independent of any
time zone or convention such as Daylight Savings Time.”
ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}}
simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both
a set of clock-calendar parameters _and_ an unambiguous instant in time. The
final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According
to Oracle, it “is normalized to the database time zone, and the time zone
offset is not stored as part of the column data. When users retrieve the data,
Oracle returns it in the users' local session time zone.” This act of
conversion between the local time zone and the database time zone means that
users interacting with this kind of timestamp will observe the same instant in
time even though they see different clock-calendar parameters. However, the
local time zone is not considered “part of the data”. Thus, it has the
semantics of an un-zoned, absolute point in time, analogous to a BigQuery
{{TIMESTAMP}}.
This creates the confusing situation where a BigQuery {{DATETIME}} is actually
an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO
{{TIMESTAMP WITH LOCAL TIME ZONE}}.
*Problem*:
This is very similar to CALCITE-5424, which only applied to literals (e.g.
{{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the
{{DateTimeTypeName}}
construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS
TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a
{{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that
and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion,
we can look up the type names in the UDT map. If no mapping is found for
{{DATETIME}} (which would occur for any dialect besides BigQuery), an error
would occur during validation, because that type does not exist in that
dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the
canonical {{TIMESTAMP}} type, preserving the existing behavior for all other
dialects.
*In summary*:
{{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH
LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other
dialects.
{{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for
BigQuery, but throw a validation error for all other dialects.
> Customize handling of type name based on type system
> ----------------------------------------------------
>
> Key: CALCITE-6055
> URL: https://issues.apache.org/jira/browse/CALCITE-6055
> Project: Calcite
> Issue Type: Bug
> Reporter: Will Noble
> Priority: Minor
> Labels: pull-request-available
>
> *Background*:
> In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour,
> minute, and second, henceforth referred to as clock-calendar parameters. This
> does not define an unambiguous instant in time, since timestamps in different
> time zones can have different parameters at the same instant. It is analogous
> to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}}
> type a different definition: “an absolute point in time, independent of any
> time zone or convention such as Daylight Savings Time.”
> ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}}
> simply attaches a time zone to the existing {{TIMESTAMP}} type, making it
> _both_ a set of clock-calendar parameters _and_ an unambiguous instant in
> time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle.
> According to Oracle, it “is normalized to the database time zone, and the
> time zone offset is not stored as part of the column data. When users
> retrieve the data, Oracle returns it in the users' local session time zone.”
> This act of conversion between the local time zone and the database time zone
> means that users interacting with this kind of timestamp will observe the
> same instant in time even though they see different clock-calendar
> parameters. However, the local time zone is not considered “part of the
> data”. Thus, it has the semantics of an un-zoned, absolute point in time,
> analogous to a BigQuery {{TIMESTAMP}}.
> This creates the confusing situation where a BigQuery {{DATETIME}} is
> actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an
> ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}.
> *Problem*:
> This is very similar to CALCITE-5424, which only applied to literals (e.g.
> {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the
> {{DateTimeTypeName}}
> construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
> which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS
> TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a
> {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that
> and the {{TIMESTAMP}} type names. During validation and sql-to-rel
> conversion, we can look up the type names in the UDT map. If no mapping is
> found for {{DATETIME}} (which would occur for any dialect besides BigQuery),
> an error would occur during validation, because that type does not exist in
> that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on
> the canonical {{TIMESTAMP}} type, preserving the existing behavior for all
> other dialects.
> *In summary*:
> {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP
> WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all
> other dialects.
> {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}}
> for BigQuery, but throw a validation error for all other dialects.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)