[ 
https://issues.apache.org/jira/browse/PHOENIX-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839329#comment-16839329
 ] 

Julian commented on PHOENIX-967:
--------------------------------

I have a business use case for which I wish to perform a query to find only the 
latest event status rows over a given timeframe, but for which the traditional 
time series approach taken in Phoenix-914 is not a good fit. I find the 
solution for Phoenix-914 is not at a generic approach to using hbase 
timestamps, but rather focuses on a given problem. I'm looking rather looking 
for a way to use the general hbase features via sql.

The Use Case is as follows -

1) We receive events on basis of an incident number, which is the primary row 
key.
2) We keep updating the row as we receive more events for that incident with 
the latest status information. Thus it is important we maintain a single row 
with the current status which also contains the incident created time, last 
updated time etc, whether the it is still open or closed etc. Versions may also 
be interesting in future to see the history of the incident, but right now we 
are only interested in most recent version.
3) Some of the End User queries need to find any incidents that have been 
updated in the last 7 days, thus are time based, however they also need to be 
able to query a specific incident for example by it's incident number without 
triggering long scans. Note, two years of incidents are in the table, the 
majority of which will not have been updated since being closed, so querying on 
last 7 days is a limited data set of the full data.

In the current phoenix implementation, there seems no way for me to query the 
latest version of rows over a time range in Phoenix without using PHOENIX-914 
feature, but I can't due to the primary key requirements (we don't want time in 
the key). Rather I am looking simply for a way to specify the TIMERANGE, in a 
similar way as I can when performing an HBASE timerange scan directly on Hbase. 
and get only the latest version in my case.

In addition to the above, I am also not seemingly able to set the row / column 
timestamps to the event timestamp in the way I want to also, as seems this is 
only possibly by making a connection and setting a connection level property 
but that would need to be for every event (not handy) or use the PHOENIX-914 
features but that means putting the timestamp into the primary key which I do 
not want. So nothing is fitting here well.

It's a shame, as the Use Case is not that complex, hbase wise but also 
potentially phoenix wise. Ultimately I am unable to use Phoenix for this use 
case and we have to look at other solutions.

Note, secondary indexing we are testing, however this is not delivering the 
performance results we need whilst the time range scan via hbase api is working 
perfectly. This however is not End User friendly and we were hoping to enable 
SQL via Hive on Phoenix on Hbase for this use case for some end user queries. 

 

 

 

> Consider (some of) SQL 2011 temporal support
> --------------------------------------------
>
>                 Key: PHOENIX-967
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-967
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Priority: Major
>
> Language extensions for temporal data were added in SQL 2011 (*1), providing 
> among other things the ability to define period definitions as metadata to 
> table, system-versioned tables (CREATE TABLE ... WITH SYSTEM VERSIONING), the 
> ability to query by system version time, and syntax for conditions involving 
> periods: CONTAINS, OVERLAPS, EQUALS, PRECEDES, SUCCEEDS, IMMEDIATELY 
> PRECEDES, and IMMEDIATELY SUCCEEDS.
> While not proposing support for all of SQL 2011 temporal features, it seems 
> that system versioning, the new temporal comparison operators, and the new 
> SELECT syntax for versioned queries map pretty naturally to multiversoned 
> HBase table schema and timerange filtering. And at least one commercial 
> database supports temporal queries using an older nonstandard syntax (*2). 
> E.g. from *1:\\
> - {{SELECT ... FOR SYSTEM_TIME AS OF TIMESTAMP <timestamp>...}}
> - {{SELECT ... FOR SYSTEM_TIME BETWEEN TIMESTAMP <timestamp-start> AND 
> TIMESTAMP <timestamp-end> ...}}
> I didn't pay to download the 2011 standard doc so can't say if TIMESTAMP is 
> required or if we can stand in any type that can be serialized to a long 
> integer. 
> And from *2: \\
> - {{SELECT ... AS OF TIMESTAMP <timestamp> ...}}
> - {{SELECT ... VERSIONS BETWEEN TIMESTAMP <timestamp-start> AND 
> <timestamp-end> ...}}
> *1: 
> http://www.sigmod.org/publications/sigmod-record/0612/publications/1209/pdfs/07.industry.kulkarni.pdf
> *2: 
> http://www.oracle.com/technetwork/database/storage/total-recall-whitepaper-171749.pdf
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to