Hi Feng,

Thanks for proposing this FLIP. It makes a lot of sense to finally support querying tables at a specific point in time, and hopefully also over time ranges soon, as a follow-up to time-versioned tables.

Here is some feedback from my side:

1. Syntax

Can you elaborate a bit on the Calcite restrictions?

Does Calcite currently support `AS OF` syntax for this but not `FOR SYSTEM_TIME AS OF`?

It would be great to support `AS OF` also for time-versioned joins and have a unified and short syntax.

Once a fix is merged in Calcite for this, we can make this available in Flink earlier by copying the corresponding classes until the next Calcite upgrade is performed.

2. Semantics

How do we interpret the timestamp? In Flink we have two timestamp types (TIMESTAMP and TIMESTAMP_LTZ). If users specify `AS OF TIMESTAMP '2023-04-27 00:00:00'`, in which time zone will the timestamp be? Will we convert it to TIMESTAMP_LTZ?

We definitely need to clarify this because the past has shown that daylight saving time makes our lives hard.
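
To illustrate the concern, here is a minimal sketch in plain Python (not Flink code). It assumes the zone-less literal is interpreted in a session time zone, similar in spirit to `table.local-time-zone`; whether the FLIP should do that is exactly the open question. The helper `to_epoch_millis` is hypothetical and exists only for this example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

LITERAL = "2023-04-27 00:00:00"


def to_epoch_millis(literal: str, session_zone: str, fold: int = 0) -> int:
    """Interpret a zone-less timestamp literal in the given session time zone.

    Hypothetical helper: this models one possible TIMESTAMP -> TIMESTAMP_LTZ
    conversion, not what Flink or the FLIP actually does.
    """
    local = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S")
    # fold selects the first (0) or second (1) occurrence of an ambiguous
    # wall-clock time, e.g. when clocks are set back at the end of DST.
    instant = local.replace(tzinfo=ZoneInfo(session_zone), fold=fold)
    return int(instant.timestamp()) * 1000


# The same literal maps to three different instants.
utc_ms = to_epoch_millis(LITERAL, "UTC")
berlin_ms = to_epoch_millis(LITERAL, "Europe/Berlin")
shanghai_ms = to_epoch_millis(LITERAL, "Asia/Shanghai")

# Daylight saving: 02:30 on 2023-10-29 happens twice in Europe/Berlin.
first_ms = to_epoch_millis("2023-10-29 02:30:00", "Europe/Berlin", fold=0)
second_ms = to_epoch_millis("2023-10-29 02:30:00", "Europe/Berlin", fold=1)

print(utc_ms - berlin_ms)    # offset of Berlin from UTC on that date, in ms
print(utc_ms - shanghai_ms)  # offset of Shanghai from UTC, in ms
print(second_ms - first_ms)  # gap between the two readings of 02:30, in ms
```

Depending on the session time zone, the same `AS OF` literal would select a different snapshot, and around a daylight saving switch a single literal can even be ambiguous within one zone.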

Thanks,
Timo

On 25.05.23 10:57, Feng Jin wrote:
Hi, everyone.

I’d like to start a discussion about FLIP-308: Support Time Travel In Batch
Mode [1]


Time travel is a SQL syntax used to query historical versions of data. It
allows users to specify a point in time and retrieve the data and schema of
a table as it appeared at that time. With time travel, users can easily
analyze and compare historical versions of data.


With the widespread use of data lake systems such as Paimon, Iceberg, and
Hudi, time travel can provide more convenience for users' data analysis.


Looking forward to your opinions, any suggestions are welcomed.



[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel+In+Batch+Mode



Best.

Feng

