Hi Julian,
The syntax you posted looks very good to me. Besides of that, we hope to extend
the <point in time> to support column reference(i.e. LATERAL + temporal
syntax).
The ‘@ADayAgo’ bind variable is not a good example for our requirement. It is
still a query constant, but what we want is a variable timestamp changing with
every record. The example in “Streams, joins and temporal tables” [1] fits our
requirement much more. It uses the table-value function to express and not
introduce new syntax. But it it is not a standard to manipulate temporal
table. And it can only express querying the table content as of a specified
point in time, but can not express a period time (can it?). That’s why we
prefer the standard temporal syntax.
The SQL standard doesn’t explicitly say the <point in time> must be a constant,
just all the databases only support constant. But I think, it is very useful to
support column reference. For example:
SELECT *
FROM Orders AS o
JOIN LATERAL ProductPrices FOR SYSTEM_TIME AS OF o.orderTime AS p
ON o.productId = p.productId
This is actually what we want.
What do you think?
[1]
https://docs.google.com/document/d/1RvnLEEQK92axdAaZ9XIU5szpkbGqFMBtzYiIY4dHe0Q/edit#
<https://docs.google.com/document/d/1RvnLEEQK92axdAaZ9XIU5szpkbGqFMBtzYiIY4dHe0Q/edit#>
- Jark Wu
> 在 2017年7月29日,上午2:16,Julian Hyde <[email protected]> 写道:
>
> '@' is specific to SQL Server, and a form of bind variable, so I don't
> think we should go with that approach.
>
> The standard explicitly says "QSTPS shall not contain a <column
> reference> or an <SQL parameter reference>." We may choose to ignore
> that, but I think we can do what you want without bind variables:
>
> ... FROM dept FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - INTERVAL '1' DAY
>
> will suffice. (Recall that CURRENT_TIMESTAMP is a function, but its
> value is constant throughout the execution of the query. So it behaves
> somewhat like a bind variable.)
>
> Can we go with the syntax in the standard? (The Microsoft syntax
> diagram has a bug in it - the '| ...' line should be '[ ... ]', I
> think - and Oracle's syntax throws in TIMESTAMP as a keyword, which
> would create "AS OF TIMESTAMP TIMESTAMP" if the expression is a
> timestamp literal.)
>
> The standard has this:
>
> <table primary> ::=
> <table or query name>
> [ <query system time period specification> ]
> [ <correlation or recognition> ]
> | ...
>
> <correlation or recognition> ::=
> [ AS ] <correlation name>
> [ <parenthesized derived column list> ]
> | ...
>
> <query system time period specification> ::=
> FOR SYSTEM_TIME AS OF <point in time 1>
> | FOR SYSTEM_TIME BETWEEN [ ASYMMETRIC | SYMMETRIC ]
> <point in time 1> AND <point in time 2>
> | FOR SYSTEM_TIME FROM <point in time 1> TO <point in time 2>
>
> <point in time 1> ::=
> <point in time>
>
> <point in time 2> ::=
> <point in time>
>
> <point in time> ::=
> <datetime value expression>
>
> Julian
>
> On Fri, Jul 28, 2017 at 2:49 AM, 伍翀(云邪) <[email protected]> wrote:
>> Hi Julian,
>>
>> Thanks for the valuable input. We have read “Streams, joins and temporal
>> tables” you proposed before I started the discussion. The “temporal table
>> view” proposed in the doc is very interesting and well-defined. But we hope
>> to support this feature based on the standard SQL semantics.
>>
>> But my impression of SQL:2011 (based on what is in Oracle) was that
>> you could execute the WHOLE QUERY as of a particular timestamp, but
>> you couldn't choose for table A to be at timestamp X and table B to be
>> at timestamp Y. Furthermore, it only allows the timestamp to be
>> constant, whereas we require the timestamp to be automatically
>> varying.
>>
>>
>> Regarding to “WHOLE QUERY” or “Table”, I didn't find any declaration in
>> SQL:2011, but I think it effects the Table not the query. In SQL Server’s
>> “FROM” syntax[1], it is a subset of table syntax. In Oracle’s example[2], it
>> follows after table not the whole query. And I also tried in SQL Server, the
>> following SQL works well. The TemporalRates and TemporalRates2 are both
>> temporal table with the same schema. The query chooses different timestamp
>> of table TemporalRates and TemporalRates2:
>>
>> SELECT * FROM TemporalRates2 FOR SYSTEM_TIME AS OF '2017-07-28
>> 07:49:07.5649523' AS T2
>> JOIN TemporalRates FOR SYSTEM_TIME AS OF '2017-07-28 04:26:19.5126849' AS
>> T1
>> ON T1.id = T2.id;
>>
>> So I think the only controversial point is it only allows constant
>> timestamp. Then I find the following example from SQL Server doc [3] :
>>
>> DECLARE @ADayAgo datetime2
>> SET @ADayAgo = DATEADD (day, -1, sysutcdatetime())
>> /*Comparison between two points in time for subset of rows*/
>> SELECT D_1_Ago.[DeptID], D.[DeptID],
>> D_1_Ago.[DeptName], D.[DeptName],
>> D_1_Ago.[SysStartTime], D.[SysStartTime],
>> D_1_Ago.[SysEndTime], D.[SysEndTime]
>> FROM [dbo].[Department] FOR SYSTEM_TIME AS OF @ADayAgo AS D_1_Ago
>> JOIN [Department] AS D ON D_1_Ago.[DeptID] = [D].[DeptID]
>> AND D_1_Ago.[DeptID] BETWEEN 1 and 5 ;
>>
>> The time @ADayAgo is a variable not a constant here, so I’m wondering
>> whether it is acceptable to support LATERAL temporal to make the timestamp
>> automatically varying. Which is very useful in many scenarios.
>>
>> [1] https://docs.microsoft.com/en-us/sql/t-sql/queries/from-transact-sql
>> [2] https://oracle-base.com/articles/12c/temporal-validity-12cr1
>> [3]
>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table
>>
>> - Jark Wu
>>
>> 在 2017年7月28日,上午9:03,Julian Hyde <[email protected]> 写道:
>>
>> Jark,
>>
>> One more thing.
>>
>> I am reading Tyler Akidau’s "Robust Streaming SQL in Apache Apex, Beam,
>> Calcite, & Flink”[1] now. It will be a couple of days before I have concrete
>> feedback but I will say now that Tyler has been giving temporal joins a lot
>> of thought, seems to have incorporated my thinking, and generally does
>> excellent work.
>>
>> So, definitely give his document serious consideration, as I am.
>>
>> When we all agree that we have the concepts right, I think it likely that we
>> can embrace the syntax and semantics of temporal support that were
>> introduced in SQL:2011. It’s important that we stay within the SQL standard
>> for areas that it already covers. And by the way, I added PERIOD support to
>> Calcite a while ago because that looked useful and un-contraversial.
>>
>> Julian
>>
>> [1] http://s.apache.org/streaming-sql-spec
>>
>> [2] https://issues.apache.org/jira/browse/CALCITE-715
>>
>>
>> On Jul 27, 2017, at 10:17 AM, Julian Hyde <[email protected]> wrote:
>>
>> This looks very interesting.
>>
>> Are you familiar with the proposal I created in September, "Streams,
>> joins and temporal tables"[1]? I started thinking about
>> stream-to-table joins, where the tables where time-varying, and ended
>> up with temporal database semantics.
>>
>> But my impression of SQL:2011 (based on what is in Oracle) was that
>> you could execute the WHOLE QUERY as of a particular timestamp, but
>> you couldn't choose for table A to be at timestamp X and table B to be
>> at timestamp Y. Furthermore, it only allows the timestamp to be
>> constant, whereas we require the timestamp to be automatically
>> varying.
>>
>> I think you have come to similar conclusions. I would like to hear how
>> your proposal fits with mine.
>>
>>
>> Julian
>>
>> [1]
>> https://lists.apache.org/thread.html/4914256ad347e1d2a72506e2773e59590312d820f04cfcddaffaef1e@%3Cdev.calcite.apache.org%3E
>>
>> On Wed, Jul 26, 2017 at 11:39 PM, 伍翀(云邪) <[email protected]> wrote:
>>
>> Hi all,
>>
>> We are looking for the temporal support in Calcite, or so-called
>> system-versioned temporal table. The temporal database feature was
>> introduced in ANSI SQL 2011 [1] and was supported in SQL Server 2017 [2]. A
>> system-versioned temporal table is designed to keep a full history of data
>> changes and allow easy point in time analysis. It is very useful for
>> scenarios that require tracking history of data changes.
>>
>> Because system-versioned tables are intended primarily for tracking
>> historical data changes, queries on system-versioned tables often tend to be
>> concerned with retrieving the table content as of a given point in time or
>> between any two given points in time. SQL:2011 provides FOR SYSTEM_TIME AS
>> OF syntactic extension for this specific purpose. For example, the following
>> query retrieves the rows of Emp that were current as of Jan. 2, 2011:
>> SELECT ENo,EName,Sys_Start,Sys_End
>> FROM Emp FOR SYSTEM_TIME AS OF TIMESTAMP '2011-01-02 00:00:00'
>>
>> In addition, we need the time expression can be a relational expression
>> whose value is from another table. For example, the following query joins
>> the Orders to the Prices as the price was at the order time:
>> SELECT STREAM *
>> FROM Orders AS o
>> JOIN LATERAL ProductPrices FOR SYSTEM_TIME AS OF o.orderTime AS p
>> ON o.productId = p.productId
>>
>> So I would like to introduce the syntactic extension in Calcite. What do you
>> think about this? Any comments or suggestions are welcome!
>>
>> [1]
>> https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf
>> [2]
>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables
>> [3]
>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table
>> [4] http://sqlhints.com/tag/for-system_time-as-of/
>>
>> Bests,
>> Jark Wu
>>
>>