[ 
https://issues.apache.org/jira/browse/PHOENIX-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336427#comment-16336427
 ] 

James Taylor edited comment on PHOENIX-4552 at 1/23/18 9:55 PM:
----------------------------------------------------------------

bq. Is it possible to have upsert set the timestamp based on the CurrentSCN? 
Querying with CurrentSCN would then behave exactly as it does now and it 
wouldn't involve any SQL changes.
Setting CurrentSCN is only supported for querying, not for updating (for good 
reason, as "rewriting history" by using a timestamp older than the current one 
is just not a good idea). But you don't really need it because your data will 
use the latest timestamp which would overwrite your previous data. The one 
caveat is that you'd see older column values for columns that you don't set in 
subsequent CSV loads. Would that happen in your case?

If not, this JIRA is really just a read-time optimization (i.e. we could 
prevent older store files that haven't been compacted away from being read). 
This optimization isn't something you'd get from setting the CurrentSCN, as 
that's an upper bound on the timestamp of cells. You need a lower bound (which 
is what this JIRA would give you).


was (Author: jamestaylor):
bq. Is it possible to have upsert set the timestamp based on the CurrentSCN? 
Querying with CurrentSCN would then behave exactly as it does now and it 
wouldn't involve any SQL changes.
Setting CurrentSCN is only supported for querying, not for updating (for good 
reason, as "rewriting history" by using a timestamp older than the current one 
is just not a good idea). But you don't really need it because your data will 
use the latest timestamp which would overwrite your previous data. The one 
caveat is that you'd see older column values for columns that you don't set in 
subsequent CSV loads. Would that happen in your case?

If not, this JIRA is really just a read-time optimization (i.e. we could 
prevent older store files that haven't been compacted away from being read).

> Add support for a ROW_TIMESTAMP that doesn't affect the primary key
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-4552
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4552
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Stephen Petschulat
>            Priority: Minor
>
> By declaring a ROW_TIMESTAMP constraint on a Phoenix table, it does two 
> things 1) expose the hbase native timestamp as this column and 2) prepend 
> your primary key with this timestamp as well.
> It would be useful to have a similar feature that only exposes the hbase 
> native timestamp. This would allow explicit setting of the timestamp when 
> upserting data while allowing multiple hbase versions. It is possible to then 
> query for that specific key and version(s).
> Potential approach:
> {code:sql}
> CREATE TABLE COMMENTS (
>    COMMENT_ID INT NOT NULL,
>    REVISION_NUM BIGINT NOT NULL ROW_TIMESTAMP,    // NEW use of keyword
>    COMMENT_BODY TEXT
>    CONSTRAINT PK PRIMARY KEY(COMMENT_ID))
> UPSERT INTO COMMENTS (123, 1, 'edit 1 comment')
> UPSERT INTO COMMENTS (123, 2, 'edit 2 of comment')
> UPSERT INTO COMMENTS (123, 3, 'edit 3 of comment')
> {code}
>  
> Current behavior of ROW_TIMESTAMP would create a new primary for each upsert, 
> so querying by primary key is no longer straightforward when you don't know 
> the version number at query time. 
> {code:sql}
> SELECT * FROM COMMENTS WHERE COMMENT_ID = 123  // => returns most recent 
> version 'edit 3 of comment'
> SELECT * FROM COMMENTS WHERE COMMENT_ID = 123 AND REVISION_NUM = 1   // => 
> returns explicit version 'edit 1 comment'
> {code}
>  
> It can also be useful to return multiple versions (related: PHOENIX-590)
> {code:sql}
> SELECT * FROM COMMENTS WHERE COMMENT_ID = 123 AND REVISION_NUM < 3   // => 
> returns 2 rows
> {code}
>  
> Or just the highest version less than or equal to a particular version 
> (allowing snapshot queries):
> {code:sql}
> // set CurrentSCN=2 on connection
> SELECT * FROM COMMENTS WHERE COMMENT_ID = 123 // => returns 'edit 2 of 
> comment'
> {code}
> CurrentSCN already allows this type of snapshot query but not against an 
> explicitly set timestamp with multiple versions. The primary key injection 
> prevents this. The above query would behave similar to:
> {code:java}
> scan 'COMMENTS', {TIMERANGE => [0, <maxversionid+1>]}
> {code}
>  This returns the highest versioned value for each key that is less than a 
> specified maximum version number.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to