Hi Feng,
thanks for proposing this FLIP. It makes a lot of sense to finally
support querying tables at a specific point in time and, hopefully soon,
also over time ranges, following up on time-versioned tables.
Here is some feedback from my side:
1. Syntax
Can you elaborate a bit on the Calcite restrictions?
Does Calcite currently support `AS OF` syntax for this but not `FOR
SYSTEM_TIME AS OF`?
It would be great to support `AS OF` also for time-versioned joins and
have a unified and short syntax.
Once a fix is merged in Calcite for this, we can make this available in
Flink earlier by copying the corresponding classes until the next
Calcite upgrade is performed.
2. Semantics
How do we interpret the timestamp? In Flink we have two timestamp types
(TIMESTAMP and TIMESTAMP_LTZ). If users specify AS OF TIMESTAMP
'2023-04-27 00:00:00', in which time zone will the timestamp be? Will we
convert it to TIMESTAMP_LTZ?
We definitely need to clarify this because the past has shown that
daylight saving time makes our lives hard.
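To make the concern concrete, here is a small Python sketch. It is not
Flink code and says nothing about how Flink will resolve the literal; the
helper `to_epoch_millis` and the chosen zones are my own assumptions for
illustration. It shows that the same zone-less literal maps to different
instants depending on the zone used to interpret it, and that a local
time in a DST fall-back hour maps to two instants.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

FORMAT = "%Y-%m-%d %H:%M:%S"


def to_epoch_millis(literal: str, zone: str, fold: int = 0) -> int:
    """Interpret a zone-less timestamp literal in `zone` (hypothetical helper).

    `fold` selects the first (0) or second (1) occurrence of a local time
    that happens twice when clocks are set back.
    """
    local = datetime.strptime(literal, FORMAT)
    aware = local.replace(tzinfo=ZoneInfo(zone), fold=fold)
    return int(aware.timestamp() * 1000)


# The literal from the question, read in three different zones.
literal = "2023-04-27 00:00:00"
utc_ms = to_epoch_millis(literal, "UTC")
berlin_ms = to_epoch_millis(literal, "Europe/Berlin")    # UTC+2 in April
shanghai_ms = to_epoch_millis(literal, "Asia/Shanghai")  # UTC+8

# A local time that occurs twice in Berlin when DST ends on 2023-10-29.
ambiguous = "2023-10-29 02:30:00"
first_ms = to_epoch_millis(ambiguous, "Europe/Berlin", fold=0)
second_ms = to_epoch_millis(ambiguous, "Europe/Berlin", fold=1)

print("UTC vs Berlin offset (h):  ", (utc_ms - berlin_ms) / 3_600_000)
print("UTC vs Shanghai offset (h):", (utc_ms - shanghai_ms) / 3_600_000)
print("Ambiguous hour gap (h):    ", (second_ms - first_ms) / 3_600_000)
```

So depending on the interpretation, the snapshot selected by AS OF could
differ by several hours, and in a fall-back hour the literal alone does
not even identify a single instant.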
Thanks,
Timo
On 25.05.23 10:57, Feng Jin wrote:
Hi, everyone.
I’d like to start a discussion about FLIP-308: Support Time Travel In Batch
Mode [1]
Time travel is a SQL syntax used to query historical versions of data. It
allows users to specify a point in time and retrieve the data and schema of
a table as it appeared at that time. With time travel, users can easily
analyze and compare historical versions of data.
With the widespread use of data lake systems such as Paimon, Iceberg, and
Hudi, time travel can provide more convenience for users' data analysis.
Looking forward to your opinions; any suggestions are welcome.
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-308%3A+Support+Time+Travel+In+Batch+Mode
Best,
Feng