Can you clarify whether you are interested in (a) streaming queries, (b) 
temporal non-streaming queries, or both? If (b), it is covered by the standard, 
and we should follow the standard, period. If (a), syntax and semantics are 
still under discussion. 

If you want both, could we perhaps separate the streaming and non-streaming 
cases? Could you log a jira case for supporting "FOR SYSTEM_TIME AS OF” in 
regular (non-streaming) queries?

> The SQL standard doesn’t explicitly say the <point in time> must be a 
> constant, just all the databases only support constant.

I disagree. The standard says that it cannot contain a column reference:

> If <query system time period specification> QSTPS is specified, then:
> a) The table identified by the <table or query name> shall be a 
> system-versioned table.
> b) If BETWEEN is speci ed and neither SYMMETRIC nor ASYMMETRIC is specified, 
> then ASYMMETRIC is implicit.
> c) QSTPS shall not contain a <column reference> or an <SQL parameter 
> reference>.

We can go beyond the standard, but let’s be clear what it says.

Julian




> On Jul 31, 2017, at 5:15 AM, 伍翀(云邪) <[email protected]> wrote:
> 
> Hi Julian,
> 
> The syntax you posted looks very good to me. Besides of that, we hope to 
> extend the <point in time> to support column reference(i.e. LATERAL + 
> temporal  syntax). 
> 
> The ‘@ADayAgo’ bind variable is not a good example for our requirement. It is 
> still a query constant, but what we want is a variable timestamp changing 
> with every record. The example in “Streams, joins and temporal tables” [1] 
> fits our requirement much more. It uses the table-value function to express 
> and not introduce new syntax. But it  it is not a standard to manipulate 
> temporal table. And it can only express querying the table content as of a 
> specified point in time, but can not express a period time (can it?). That’s 
> why we prefer the standard temporal syntax.
> 
> The SQL standard doesn’t explicitly say the <point in time> must be a 
> constant, just all the databases only support constant. But I think, it is 
> very useful to support column reference. For example: 
> 
>  SELECT  *
>    FROM Orders AS o
>    JOIN LATERAL ProductPrices FOR SYSTEM_TIME AS OF o.orderTime AS p
>      ON o.productId = p.productId
> 
> This is actually what we want. 
> 
> What do you think?
> 
> [1] 
> https://docs.google.com/document/d/1RvnLEEQK92axdAaZ9XIU5szpkbGqFMBtzYiIY4dHe0Q/edit#
>  
> <https://docs.google.com/document/d/1RvnLEEQK92axdAaZ9XIU5szpkbGqFMBtzYiIY4dHe0Q/edit#>
> - Jark Wu 
> 
>> 在 2017年7月29日,上午2:16,Julian Hyde <[email protected]> 写道:
>> 
>> '@' is specific to SQL Server, and a form of bind variable, so I don't
>> think we should go with that approach.
>> 
>> The standard explicitly says "QSTPS shall not contain a <column
>> reference> or an <SQL parameter reference>." We may choose to ignore
>> that, but I think we can do what you want without bind variables:
>> 
>> ... FROM dept FOR SYSTEM_TIME AS OF CURRENT_TIMESTAMP - INTERVAL '1' DAY
>> 
>> will suffice. (Recall that CURRENT_TIMESTAMP is a function, but its
>> value is constant throughout the execution of the query. So it behaves
>> somewhat like a bind variable.)
>> 
>> Can we go with the syntax in the standard? (The Microsoft syntax
>> diagram has a bug in it - the '| ...' line should be '[ ... ]', I
>> think - and Oracle's syntax throws in TIMESTAMP as a keyword, which
>> would create "AS OF TIMESTAMP TIMESTAMP" if the expression is a
>> timestamp literal.)
>> 
>> The standard has this:
>> 
>> <table primary> ::=
>>     <table or query name>
>>     [ <query system time period specification> ]
>>     [ <correlation or recognition> ]
>>   | ...
>> 
>> <correlation or recognition> ::=
>>     [ AS ] <correlation name>
>>     [ <parenthesized derived column list> ]
>>   | ...
>> 
>> <query system time period specification> ::=
>>     FOR SYSTEM_TIME AS OF <point in time 1>
>>   | FOR SYSTEM_TIME BETWEEN [ ASYMMETRIC | SYMMETRIC ]
>>     <point in time 1> AND <point in time 2>
>>   | FOR SYSTEM_TIME FROM <point in time 1> TO <point in time 2>
>> 
>> <point in time 1> ::=
>>   <point in time>
>> 
>> <point in time 2> ::=
>>   <point in time>
>> 
>> <point in time> ::=
>>   <datetime value expression>
>> 
>> Julian
>> 
>> On Fri, Jul 28, 2017 at 2:49 AM, 伍翀(云邪) <[email protected]> wrote:
>>> Hi Julian,
>>> 
>>> Thanks for the valuable input. We have read  “Streams, joins and temporal
>>> tables” you proposed before I started the discussion. The “temporal table
>>> view” proposed in the doc is very interesting and well-defined. But we hope
>>> to support this feature based on the standard SQL semantics.
>>> 
>>> But my impression of SQL:2011 (based on what is in Oracle) was that
>>> you could execute the WHOLE QUERY as of a particular timestamp, but
>>> you couldn't choose for table A to be at timestamp X and table B to be
>>> at timestamp Y. Furthermore, it only allows the timestamp to be
>>> constant, whereas we require the timestamp to be automatically
>>> varying.
>>> 
>>> 
>>> Regarding to “WHOLE QUERY” or “Table”, I didn't find any declaration in
>>> SQL:2011, but I think it effects the Table not the query. In SQL Server’s
>>> “FROM” syntax[1], it is a subset of table syntax. In Oracle’s example[2], it
>>> follows after table not the whole query. And I also tried in SQL Server, the
>>> following SQL works well. The TemporalRates and TemporalRates2 are both
>>> temporal table with the same schema. The query chooses different timestamp
>>> of table TemporalRates and TemporalRates2:
>>> 
>>> SELECT * FROM TemporalRates2 FOR SYSTEM_TIME AS OF '2017-07-28
>>> 07:49:07.5649523' AS T2
>>> JOIN TemporalRates FOR SYSTEM_TIME AS OF '2017-07-28 04:26:19.5126849' AS
>>> T1
>>> ON T1.id = T2.id;
>>> 
>>> So I think the only controversial point is it only allows constant
>>> timestamp. Then I find the following example from SQL Server doc [3] :
>>> 
>>> DECLARE @ADayAgo datetime2
>>> SET @ADayAgo = DATEADD (day, -1, sysutcdatetime())
>>> /*Comparison between two points in time for subset of rows*/
>>> SELECT D_1_Ago.[DeptID], D.[DeptID],
>>> D_1_Ago.[DeptName], D.[DeptName],
>>> D_1_Ago.[SysStartTime], D.[SysStartTime],
>>> D_1_Ago.[SysEndTime], D.[SysEndTime]
>>> FROM [dbo].[Department] FOR SYSTEM_TIME AS OF @ADayAgo AS D_1_Ago
>>> JOIN [Department] AS D ON  D_1_Ago.[DeptID] = [D].[DeptID]
>>> AND D_1_Ago.[DeptID] BETWEEN 1 and 5 ;
>>> 
>>> The time @ADayAgo is a variable not a constant here, so I’m wondering
>>> whether it is acceptable to support LATERAL temporal to make the timestamp
>>> automatically varying. Which is very useful in many scenarios.
>>> 
>>> [1] https://docs.microsoft.com/en-us/sql/t-sql/queries/from-transact-sql
>>> [2] https://oracle-base.com/articles/12c/temporal-validity-12cr1
>>> [3]
>>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table
>>> 
>>> - Jark Wu
>>> 
>>> 在 2017年7月28日,上午9:03,Julian Hyde <[email protected]> 写道:
>>> 
>>> Jark,
>>> 
>>> One more thing.
>>> 
>>> I am reading Tyler Akidau’s "Robust Streaming SQL in Apache Apex, Beam,
>>> Calcite, & Flink”[1] now. It will be a couple of days before I have concrete
>>> feedback but I will say now that Tyler has been giving temporal joins a lot
>>> of thought, seems to have incorporated my thinking, and generally does
>>> excellent work.
>>> 
>>> So, definitely give his document serious consideration, as I am.
>>> 
>>> When we all agree that we have the concepts right, I think it likely that we
>>> can embrace the syntax and semantics of temporal support that were
>>> introduced in SQL:2011. It’s important that we stay within the SQL standard
>>> for areas that it already covers. And by the way, I added PERIOD support to
>>> Calcite a while ago because that looked useful and un-contraversial.
>>> 
>>> Julian
>>> 
>>> [1] http://s.apache.org/streaming-sql-spec
>>> 
>>> [2] https://issues.apache.org/jira/browse/CALCITE-715
>>> 
>>> 
>>> On Jul 27, 2017, at 10:17 AM, Julian Hyde <[email protected]> wrote:
>>> 
>>> This looks very interesting.
>>> 
>>> Are you familiar with the proposal I created in September, "Streams,
>>> joins and temporal tables"[1]? I started thinking about
>>> stream-to-table joins, where the tables where time-varying, and ended
>>> up with temporal database semantics.
>>> 
>>> But my impression of SQL:2011 (based on what is in Oracle) was that
>>> you could execute the WHOLE QUERY as of a particular timestamp, but
>>> you couldn't choose for table A to be at timestamp X and table B to be
>>> at timestamp Y. Furthermore, it only allows the timestamp to be
>>> constant, whereas we require the timestamp to be automatically
>>> varying.
>>> 
>>> I think you have come to similar conclusions. I would like to hear how
>>> your proposal fits with mine.
>>> 
>>> 
>>> Julian
>>> 
>>> [1]
>>> https://lists.apache.org/thread.html/4914256ad347e1d2a72506e2773e59590312d820f04cfcddaffaef1e@%3Cdev.calcite.apache.org%3E
>>> 
>>> On Wed, Jul 26, 2017 at 11:39 PM, 伍翀(云邪) <[email protected]> wrote:
>>> 
>>> Hi all,
>>> 
>>> We are looking for the temporal support in Calcite, or so-called
>>> system-versioned temporal table. The temporal database feature was
>>> introduced in ANSI SQL 2011 [1] and was supported in SQL Server 2017 [2]. A
>>> system-versioned temporal table is designed to keep a full history of data
>>> changes and allow easy point in time analysis. It is very useful for
>>> scenarios that require tracking history of data changes.
>>> 
>>> Because system-versioned tables are intended primarily for tracking
>>> historical data changes, queries on system-versioned tables often tend to be
>>> concerned with retrieving the table content as of a given point in time or
>>> between any two given points in time. SQL:2011 provides FOR SYSTEM_TIME AS
>>> OF syntactic extension for this specific purpose. For example, the following
>>> query retrieves the rows of Emp that were current as of Jan. 2, 2011:
>>>  SELECT ENo,EName,Sys_Start,Sys_End
>>>  FROM Emp FOR SYSTEM_TIME AS OF TIMESTAMP '2011-01-02 00:00:00'
>>> 
>>> In addition, we need the time expression can be a relational expression
>>> whose value is from another table. For example, the following query joins
>>> the Orders to the Prices as the price was at the order time:
>>>  SELECT STREAM *
>>>  FROM Orders AS o
>>>  JOIN LATERAL ProductPrices FOR SYSTEM_TIME AS OF o.orderTime AS p
>>>    ON o.productId = p.productId
>>> 
>>> So I would like to introduce the syntactic extension in Calcite. What do you
>>> think about this? Any comments or suggestions are welcome!
>>> 
>>> [1]
>>> https://sigmodrecord.org/publications/sigmodRecord/1209/pdfs/07.industry.kulkarni.pdf
>>> [2]
>>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables
>>> [3]
>>> https://docs.microsoft.com/en-us/sql/relational-databases/tables/querying-data-in-a-system-versioned-temporal-table
>>> [4] http://sqlhints.com/tag/for-system_time-as-of/
>>> 
>>> Bests,
>>> Jark Wu
>>> 
>>> 
> 

Reply via email to