[ 
https://issues.apache.org/jira/browse/HBASE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501743#comment-13501743
 ] 

Kannan Muthukkaruppan commented on HBASE-2376:
----------------------------------------------

Lars wrote: <<<Flashback queries only makes sense with TTL>>>. This is not 
true. A simple CF with VERSIONS=1 and no TTL (i.e. a TTL of infinity) can also 
benefit from the ability to run flashback queries. Flashback is simply the 
ability to query the DB as of a previous point in time. Why should we overload 
that functionality with versions, TTL, etc.?

I think it is useful to think of flashback as completely independent of other 
settings like TTL, MAXVERSIONS, MINVERSIONS, etc. Those should be picked at 
schema design time based on the application's requirements. For example, you 
may have many tables in your system with different TTL and VERSIONS 
requirements, or different CFs within a table with differing TTL and VERSIONS 
requirements.

But on top of all those, suppose across all my tables I want to be able to 
query the entire DB as of a previous point in time. From a user's point of 
view, the only setting they need to worry about is the "time period" (back in 
time) up to which flash back queries are supported.

For example, you might have one CF with VERSIONS=1 where you keep hourly 
rollup data that you want to retain for 1 month (TTL), and another CF, also 
with VERSIONS=1, where you keep daily rollup data that you want to retain for 
3 years. Separately, you want to be able to run flashback queries up to, say, 
7 days back. This "7 days" should be a completely separate setting, and there 
is no reason to conflate it with TTL and VERSIONS.
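
To make the separation concrete, here is a minimal sketch in Python. It is a 
toy model, not HBase code: ColumnFamily, FLASHBACK_WINDOW and 
flashback_allowed are hypothetical names invented for illustration. The point 
is only that the check for whether a flashback query is allowed never looks at 
a CF's TTL or VERSIONS.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds in a day


@dataclass(frozen=True)
class ColumnFamily:
    """Toy stand-in for a CF's schema settings (hypothetical, not an HBase class)."""
    name: str
    versions: int
    ttl: int  # retention in seconds


# Schema-level choices, picked per CF from application requirements.
hourly = ColumnFamily("hourly", versions=1, ttl=30 * DAY)
daily = ColumnFamily("daily", versions=1, ttl=3 * 365 * DAY)

# Assumed separate knob: how far back flashback queries are supported.
FLASHBACK_WINDOW = 7 * DAY


def flashback_allowed(now: int, as_of: int, window: int = FLASHBACK_WINDOW) -> bool:
    """True if 'as_of' lies within the supported flashback window ending at 'now'."""
    return now - window <= as_of <= now
```

With this model, a query as of 3 days ago is accepted and one as of 8 days ago 
is rejected, for both CFs, regardless of their very different TTLs.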

Now, API-wise, we need a way to say that we are doing a flashback query, 
i.e. "Scan @ T" instead of a regular "Scan". Oracle DB, for instance, has 
special syntax for flashback queries:

SELECT * FROM employee 
  AS OF TIMESTAMP <TS>
  WHERE name = 'JOHN';

Regarding <<< So the snapshot scanner is special in that only through this 
specific scanner you can look further back than the TTL.>>>: I think that is by 
design. Note: Scan @ T (a flashback query) is different from a Scan with 
setTimeRange(0, T). A delete of a key done at T+1 is immaterial to a Scan @ T 
query, whereas a Scan with setTimeRange(0, T) will still see the effect of 
the delete done at T+1.
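
The difference can be sketched with a small toy model of a single key's 
history. This is plain Python, not the HBase client API: Event, scan_at and 
scan_time_range are hypothetical names, and the model assumes a delete at 
timestamp d masks every put with a timestamp <= d. The time range is treated 
as half-open, [min_ts, max_ts).

```python
from typing import List, NamedTuple, Optional


class Event(NamedTuple):
    ts: int
    kind: str  # "put" or "delete"
    value: Optional[str] = None


def _latest_live_put(puts: List[Event], deletes: List[Event]) -> Optional[str]:
    """Newest put that is not masked by a delete at the same or a later timestamp."""
    live = [p for p in puts if not any(d.ts >= p.ts for d in deletes)]
    return max(live, key=lambda p: p.ts).value if live else None


def scan_at(events: List[Event], t: int) -> Optional[str]:
    """Flashback read: pretend nothing after time t ever happened."""
    puts = [e for e in events if e.kind == "put" and e.ts <= t]
    deletes = [e for e in events if e.kind == "delete" and e.ts <= t]
    return _latest_live_put(puts, deletes)


def scan_time_range(events: List[Event], min_ts: int, max_ts: int) -> Optional[str]:
    """Ordinary read limited to puts in [min_ts, max_ts); later deletes still apply."""
    puts = [e for e in events if e.kind == "put" and min_ts <= e.ts < max_ts]
    deletes = [e for e in events if e.kind == "delete"]
    return _latest_live_put(puts, deletes)


# One key: written at ts=5, deleted at ts=11 (i.e. T+1 for T=10).
history = [Event(5, "put", "v1"), Event(11, "delete")]
```

In this model, scan_at(history, 10) returns "v1" because the delete at 11 is 
ignored, while scan_time_range(history, 0, 10) returns None because the delete 
at 11 still masks the put at 5.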

----

In summary, we should not confuse our users by forcing them to change their 
schema design (i.e. choice of VERSIONS, TTL, etc.) to support flashback 
queries. Flashback support should be configured using a simple extra knob that 
can be set at the system, table, or CF level. We should NOT overload that knob 
with TTL and VERSIONS.
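
As a rough illustration of such a knob, the sketch below resolves a flashback 
window from three levels. The function name and the precedence rule (CF 
overrides table, table overrides system) are my assumptions for the sake of 
the example; nothing above specifies how the levels should interact.

```python
from typing import Optional


def resolve_flashback_window(
    system_days: int,
    table_days: Optional[int] = None,
    cf_days: Optional[int] = None,
) -> int:
    """Pick the most specific setting that is present (assumed precedence).

    None means "not set at this level"; 0 is a real value meaning
    flashback is disabled at that level.
    """
    if cf_days is not None:
        return cf_days
    if table_days is not None:
        return table_days
    return system_days
```

For example, resolve_flashback_window(7) gives 7, while 
resolve_flashback_window(7, table_days=3, cf_days=0) gives 0, i.e. flashback 
turned off for that one CF. TTL and VERSIONS do not appear anywhere in the 
resolution.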

----


 
                
> Add special SnapshotScanner which presents view of all data at some time in 
> the past
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-2376
>                 URL: https://issues.apache.org/jira/browse/HBASE-2376
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client, regionserver
>    Affects Versions: 0.20.3
>            Reporter: Jonathan Gray
>            Assignee: Pritam Damania
>
> In order to support a particular kind of database "snapshot" feature which 
> doesn't require copying data, we came up with the idea for a special 
> SnapshotScanner that would present a view of your data at some point in the 
> past.  The primary use case for this would be to be able to recover 
> particular data/rows (but not all data, like a global rollback) should they 
> have somehow been messed up (application fault, application bug, user error, 
> etc.).

