[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-07 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355259#comment-16355259
 ] 

Ankit Singhal commented on HBASE-19826:
---

bq. So what if a compaction happens before you run the tool? The cell will be 
deleted in compaction if you do not have KeepDeletedCells.TRUE, so whether or 
not you use a time range scan, you can never read it again...

By default, we cap the scan in IndexScrutinyTool at the current timestamp to 
avoid in-flight writes/deletes, and we don't expect those in-flight writes to 
be compacted while the tool is running. But depending on the write rate, this 
can certainly happen. [~jamestaylor], do you know whether we handle this case 
anywhere?

> Provide a option to see rows behind a delete in a time range queries
> 
>
> Key: HBASE-19826
> URL: https://issues.apache.org/jira/browse/HBASE-19826
> Project: HBase
> Issue Type: Improvement
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Priority: Major
> Fix For: 2.0.0
>
>
> We can provide an option (something like seePastDeleteMarkers) in a scan to 
> let the user see the versions behind the delete marker even if 
> keepDeletedCells is set to false in the descriptor.
> With the previous version, we worked around this in the preStoreScannerOpen 
> hook. For reference, see PHOENIX-4277.
> {code}
>   @Override
>   public KeyValueScanner preStoreScannerOpen(
>       final ObserverContext<RegionCoprocessorEnvironment> c,
>       final Store store, final Scan scan, final NavigableSet<byte[]> targetCols,
>       final KeyValueScanner s) throws IOException {
>     // Leave the scanner untouched for raw scans, for stores that already keep
>     // deleted cells, and for scans whose upper bound is open or transactional.
>     if (scan.isRaw()
>         || ScanInfoUtil.isKeepDeletedCells(store.getScanInfo())
>         || scan.getTimeRange().getMax() == HConstants.LATEST_TIMESTAMP
>         || TransactionUtil.isTransactionalTimestamp(scan.getTimeRange().getMax())) {
>       return s;
>     }
>     // Otherwise open the store scanner with KEEP_DELETED_CELLS enabled.
>     ScanInfo scanInfo =
>         ScanInfoUtil.cloneScanInfoWithKeepDeletedCells(store.getScanInfo());
>     return new StoreScanner(store, scanInfo, scan, targetCols,
>         c.getEnvironment().getRegion().getReadpoint(scan.getIsolationLevel()));
>   }
> {code}
> Another approach is to allow setting KEEP_DELETED_CELLS to true in the 
> ScanOptions of preStoreScannerOpen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-03 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351353#comment-16351353
 ] 

Duo Zhang commented on HBASE-19826:
---

So what if a compaction happens before you run the tool? The cell will be 
deleted in compaction if you do not have KeepDeletedCells.TRUE, so whether or 
not you use a time range scan, you can never read it again...
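
The concern above can be illustrated with a toy model. This is only a sketch, not HBase code: the class `CompactionSketch`, its `Cell` type, and the `compact` method are hypothetical, and the model assumes a single column in which a delete marker at timestamp t masks every put at or before t, ignoring TTL and max-versions handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of compacting the cells of one column; this is not HBase code. */
final class CompactionSketch {

    /** A cell is either a put or a delete marker, stamped with a timestamp. */
    static final class Cell {
        final long ts;
        final boolean isDelete;

        Cell(long ts, boolean isDelete) {
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    /**
     * Simplified major compaction (assumption: TTL and versions are ignored).
     * With keepDeletedCells == true every cell survives. Otherwise the delete
     * markers and all puts at or before the newest marker are dropped.
     */
    static List<Cell> compact(List<Cell> cells, boolean keepDeletedCells) {
        if (keepDeletedCells) {
            return new ArrayList<>(cells);
        }
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (c.isDelete) {
                newestDelete = Math.max(newestDelete, c.ts);
            }
        }
        List<Cell> survivors = new ArrayList<>();
        for (Cell c : cells) {
            if (!c.isDelete && c.ts > newestDelete) {
                survivors.add(c);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        // Put at 10, delete marker at 20, newer put at 30.
        List<Cell> cells = Arrays.asList(
            new Cell(10L, false), new Cell(20L, true), new Cell(30L, false));
        System.out.println(compact(cells, false).size());
        System.out.println(compact(cells, true).size());
    }
}
```

In this model, once `compact(cells, false)` has run, the put at timestamp 10 no longer exists, so no scan option applied afterwards can bring it back.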



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350493#comment-16350493
 ] 

Ankit Singhal commented on HBASE-19826:
---

{quote}What is a index scrutiny? When do you need to do this?
{quote}
It's a MapReduce tool [1] that runs a time range scan on the data table and a 
SKIP SCAN on the index table to verify whether the index table is in sync with 
the data table.

[1] https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyTool.java



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350441#comment-16350441
 ] 

Duo Zhang commented on HBASE-19826:
---

What is an index scrutiny? When do you need to do this?



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350402#comment-16350402
 ] 

Ankit Singhal commented on HBASE-19826:
---

Sure (this is the second use case mentioned in my earlier 
[comment|https://issues.apache.org/jira/browse/HBASE-19826?focusedCommentId=16344850&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16344850]):

"While doing index scrutiny on a live table, the time range scan needs to see 
PUTs without having them eclipsed by newer DELETE markers. (A raw scan cannot 
be used here, as it returns all cells even when delete markers fall within the 
time range.)"

To achieve this, we previously replaced the store scanner in the 
preStoreScannerOpen hook with one that has KeepDeletedCells set to true, so 
that our time range queries could see puts that were deleted at a newer 
timestamp.

Let me know if you need more details. Thanks.



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350390#comment-16350390
 ] 

Duo Zhang commented on HBASE-19826:
---

More background please? Thanks.



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350362#comment-16350362
 ] 

Ankit Singhal commented on HBASE-19826:
---

We need the Scan object to check whether it's a time range query.

{code}
@Override
public void preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> ctx,
    Store store, ScanOptions options) throws IOException {
  // Set KEEP_DELETED_CELLS for time range, non-raw scans.
  // "scan" is the Scan object we would need here; the hook does not pass it.
  if (scan.isRaw() || scan.getTimeRange().getMax() == HConstants.LATEST_TIMESTAMP) {
    return;
  }
  options.setKeepDeletedCells(KeepDeletedCells.TRUE);
}
{code}



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350272#comment-16350272
 ] 

Duo Zhang commented on HBASE-19826:
---

And could you please describe the usage in Phoenix? We can then see how to 
implement it with the new CP hooks in 2.0.

Thanks.



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350256#comment-16350256
 ] 

Duo Zhang commented on HBASE-19826:
---

There is no Scan object for compaction and flush so we do not provide it.

And in general, I do not think you can get a stable result if you reset the 
ScanOptions for some scans and not for others.



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-02-02 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350245#comment-16350245
 ] 

Ankit Singhal commented on HBASE-19826:
---

bq. HBase-2.0 is a big release that breaks things, so I think it may also be a 
good chance for Phoenix to drop some legacy support when upgrading to HBase-2.0?
Yes, we are planning to remove some legacy stuff with Phoenix 5.0.

bq. You can try using attribute to carry some Phoenix only logic..
[~Apache9], in HBase 2.0 we do not get the Scan object in 
preStoreScannerOpen(). Is it possible to pass the Scan (at least in immutable 
form) to the preStoreScannerOpen() hook, so that we can decide based on its 
attributes (time range, raw, etc.) and set ScanOptions accordingly?



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344966#comment-16344966
 ] 

Duo Zhang commented on HBASE-19826:
---

{quote}
JDBC is always the first choice when using Phoenix but some people are writing 
the table using HBase API and querying through Phoenix till they migrate their 
legacy application.
{quote}

HBase-2.0 is a big release that breaks things, so I think it may also be a good 
chance for Phoenix to drop some legacy support when upgrading to HBase-2.0?



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344960#comment-16344960
 ] 

Ankit Singhal commented on HBASE-19826:
---

{quote}I do not think it is a good idea to give users the ability to access the 
table by themselves and still expect correct result.

 Anyway, I think changing keep delete cells in cp hook is acceptable, but 
please do not introduce new option to Scan since it is a user API. The max 
versions and filter almost kill us since the behavior is really strange...
{quote}
OK, we can drop the idea of adding a "seePastDeleteMarkers" option to Scan.
{quote}You can try using attribute to carry some Phoenix only logic...
{quote}
Yep, we can just leverage the pre-hooks to make it work for our use cases.
{quote}BTW, will JDBC become the first choice when using Phoenix?
{quote}
JDBC is always the first choice when using Phoenix, but some people write to 
the table using the HBase API and query through Phoenix until they migrate 
their legacy application.

Thank you so much, [~Apache9], for your time and review.



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344927#comment-16344927
 ] 

Duo Zhang commented on HBASE-19826:
---

I do not think it is a good idea to give users the ability to access the table 
by themselves and still expect correct results.

Anyway, I think changing KeepDeletedCells in a CP hook is acceptable, but 
please do not introduce a new option to Scan, since it is a user API. The max 
versions and filter options almost killed us, since their behavior is really 
strange...

You can try using an attribute to carry Phoenix-only logic...

BTW, will JDBC become the first choice when using Phoenix?



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344850#comment-16344850
 ] 

Ankit Singhal commented on HBASE-19826:
---

{quote}Could phoenix hide the HBase table to user? So that you are free to set 
KeepDeletedCells.TRUE as user do not know it...
{quote}
Not really, we let the user have full access to the HBase table. We just don't 
want to hardcode KeepDeletedCells.TRUE in the descriptor for performance 
reasons: all user queries would then read the deleted data unnecessarily (even 
after compaction), which increases scan latency.

Phoenix's current use cases require it to be set dynamically, in user scans or 
in a pre-hook:
 # During compaction, we want to keep deleted cells only if there are lagging 
indexes, which will eventually be built by reading the cells of the data table.
 # While doing index scrutiny on a live table, the time range scan needs to see 
PUTs without having them eclipsed by newer DELETE markers. (A raw scan cannot 
be used here, as it returns all cells even when delete markers fall within the 
time range.)
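
The second use case can be sketched with a toy model. This is an illustration under stated assumptions, not HBase code: the class `DeleteVisibilitySketch`, its `Cell` type, and the `visiblePuts` method are hypothetical, the flag name `seePastDeleteMarkers` is borrowed from the option proposed in the issue description, and the model assumes one column in which a delete marker at timestamp t masks every put at or before t.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the cells in one column; this is not HBase code. */
final class DeleteVisibilitySketch {

    /** A cell is either a put or a delete marker, stamped with a timestamp. */
    static final class Cell {
        final long ts;
        final boolean isDelete;

        Cell(long ts, boolean isDelete) {
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    /**
     * Returns the timestamps of puts visible to a non-raw scan over [minTs, maxTs).
     * Assumption: a delete marker at ts masks every put with a timestamp <= ts.
     * With seePastDeleteMarkers == false, markers newer than the range still mask
     * puts; with true, only markers inside the range do.
     */
    static List<Long> visiblePuts(List<Cell> cells, long minTs, long maxTs,
                                  boolean seePastDeleteMarkers) {
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (!c.isDelete) {
                continue;
            }
            boolean inRange = c.ts >= minTs && c.ts < maxTs;
            boolean afterRange = c.ts >= maxTs;
            if (inRange || (afterRange && !seePastDeleteMarkers)) {
                newestDelete = Math.max(newestDelete, c.ts);
            }
        }
        List<Long> visible = new ArrayList<>();
        for (Cell c : cells) {
            boolean inRange = c.ts >= minTs && c.ts < maxTs;
            if (!c.isDelete && inRange && c.ts > newestDelete) {
                visible.add(c.ts);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // Put at t1 = 10, delete marker at t2 = 20, scan range [9, 19).
        List<Cell> cells = Arrays.asList(new Cell(10L, false), new Cell(20L, true));
        System.out.println(visiblePuts(cells, 9L, 19L, false));
        System.out.println(visiblePuts(cells, 9L, 19L, true));
    }
}
```

In this model the put at 10 is hidden when the flag is false and returned when it is true. If the range is widened to include the marker (for example [9, 21)), the put is masked again even with the flag set, which is the behavior a raw scan would not give.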

 

 



[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344801#comment-16344801
 ] 

Duo Zhang commented on HBASE-19826:
---

{quote}
Actually, we want to see the put (at t1) that arrived before the delete marker 
(at t2, where t2 > t1) in a time range query (t1-1 to t2-1) with a non-raw scan 
on a table that has keep deleted cells set to false in the family descriptor.
{quote}

To be honest, I really hate these features: the behavior is exactly 
KeepDeletedCells.TRUE, but you want to get it when KeepDeletedCells.FALSE is 
set...

Could Phoenix hide the HBase table from the user? Then you would be free to 
set KeepDeletedCells.TRUE, as the user would not know about it...

Of course, if it is only HBASE-19895, I'm OK with it, but I do not want to see 
lots of these strange feature requests again and again...

Thanks.


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344790#comment-16344790
 ] 

Ankit Singhal commented on HBASE-19826:
---

{quote}In 2.0, there is a new behavior which also considers mvcc when deciding 
whether a put can be hidden by a delete marker. If a put happens after a 
delete, the delete marker cannot hide the put, even if the timestamp of the 
delete is newer. Is this what you want?
{quote}
Actually, we want to see the put (at t1) that arrived before the delete marker 
(at t2, where t2 > t1) in a time range query (t1-1 to t2-1) with a non-raw scan 
on a table that has keep deleted cells set to false in the family descriptor.
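
To make the scenario concrete, here is a minimal, self-contained Java sketch. 
It does not use any HBase classes: Cell, scan and the single-column store are 
hypothetical stand-ins, and delete markers are assumed to be column deletes 
that mask every put with an equal or older timestamp. The sketch also assumes 
that a non-raw scan over [min, max) honors delete markers inside or after the 
range when keep deleted cells is false, and only markers inside the range when 
it is true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeRangeDeleteSketch {
    /** Hypothetical stand-in for one cell of a single column. */
    static final class Cell {
        final long ts;
        final boolean isDelete;
        final String value;

        Cell(long ts, boolean isDelete, String value) {
            this.ts = ts;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    /** Values of the puts a non-raw scan over [minTs, maxTs) would return. */
    static List<String> scan(List<Cell> store, long minTs, long maxTs,
                             boolean keepDeletedCells) {
        // Find the newest delete marker this scan is allowed to honor.
        long newestTrackedDelete = Long.MIN_VALUE;
        for (Cell c : store) {
            if (!c.isDelete) {
                continue;
            }
            boolean within = c.ts >= minTs && c.ts < maxTs;
            boolean after = c.ts >= maxTs;
            // Assumption: with keep deleted cells off, markers newer than the
            // range still count; with it on, only markers inside the range do.
            boolean tracked = keepDeletedCells ? within : (within || after);
            if (tracked) {
                newestTrackedDelete = Math.max(newestTrackedDelete, c.ts);
            }
        }
        // A put is visible if it is in range and newer than every tracked marker.
        List<String> visible = new ArrayList<>();
        for (Cell c : store) {
            boolean inRange = c.ts >= minTs && c.ts < maxTs;
            if (!c.isDelete && inRange && c.ts > newestTrackedDelete) {
                visible.add(c.value);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        long t1 = 10, t2 = 20;
        List<Cell> store = Arrays.asList(
            new Cell(t1, false, "v1"),   // put at t1
            new Cell(t2, true, null));   // delete marker at t2 > t1
        System.out.println(scan(store, t1 - 1, t2 - 1, false)); // prints []
        System.out.println(scan(store, t1 - 1, t2 - 1, true));  // prints [v1]
    }
}
```

In this model the put at t1 is hidden with keep deleted cells set to false even 
though the delete marker lies outside the requested range, which is the case 
the scan option is meant to address.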


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344722#comment-16344722
 ] 

Duo Zhang commented on HBASE-19826:
---

In 2.0, there is a new behavior which also considers mvcc when deciding whether 
a put can be hidden by a delete marker. If a put happens after a delete, the 
delete marker cannot hide the put, even if the timestamp of the delete is 
newer. Is this what you want?
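
As a rough illustration of the difference described above, the following 
self-contained Java sketch compares timestamp-only masking with masking that 
also looks at write order. The Cell class, the seqId field and isMasked are 
hypothetical names for this sketch, not HBase APIs, and seqId simply stands in 
for the mvcc sequence in which the cells were written.

```java
public class MvccDeleteSketch {
    /** Hypothetical cell: a cell timestamp plus the order in which it was written. */
    static final class Cell {
        final long ts;     // cell timestamp
        final long seqId;  // write order (stand-in for the mvcc sequence id)

        Cell(long ts, long seqId) {
            this.ts = ts;
            this.seqId = seqId;
        }
    }

    /** Whether the delete marker hides the put under the chosen rule. */
    static boolean isMasked(Cell put, Cell delete, boolean considerMvcc) {
        boolean coveredByTimestamp = put.ts <= delete.ts;
        if (!considerMvcc) {
            // Timestamp-only rule: write order does not matter.
            return coveredByTimestamp;
        }
        // Mvcc-aware rule: the delete must also have been written after the put.
        return coveredByTimestamp && put.seqId < delete.seqId;
    }

    public static void main(String[] args) {
        Cell delete = new Cell(20, 1); // delete written first, with a newer timestamp
        Cell put = new Cell(10, 2);    // put written afterwards, with an older timestamp
        System.out.println(isMasked(put, delete, false)); // prints true
        System.out.println(isMasked(put, delete, true));  // prints false
    }
}
```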


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-30 Thread Ankit Singhal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344703#comment-16344703
 ] 

Ankit Singhal commented on HBASE-19826:
---

Thanks [~stack] and [~yuzhih...@gmail.com]. I divided the Jira into two 
subtasks for a careful review and uploaded the patch that adds a 
KeepDeletedCells option to ScanOptions. 


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334978#comment-16334978
 ] 

Ted Yu commented on HBASE-19826:


ScanOptions is marked 
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC).
We can add a method which allows setting KeepDeletedCells.

CustomizedScanInfoBuilder would implement the new method so that the HStore 
can use the setting when creating the scanner.
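
The shape of that proposal can be sketched in plain Java as follows. This is 
only an illustration under stated assumptions, not the actual patch: the 
nested ScanOptions, ScanInfo, CustomizedBuilder and Hook types below are 
simplified, hypothetical stand-ins that live inside the sketch class, and 
openScanner stands in for whatever the store does before creating its scanner.

```java
public class ScanOptionsSketch {
    enum KeepDeletedCells { FALSE, TRUE, TTL }

    /** Hypothetical, trimmed-down view of what a coprocessor may change. */
    interface ScanOptions {
        KeepDeletedCells getKeepDeletedCells();
        void setKeepDeletedCells(KeepDeletedCells value);
    }

    /** Immutable per-family settings (stand-in for the real scan info). */
    static final class ScanInfo {
        final KeepDeletedCells keepDeletedCells;

        ScanInfo(KeepDeletedCells keepDeletedCells) {
            this.keepDeletedCells = keepDeletedCells;
        }
    }

    /** Starts from the family defaults and records coprocessor overrides. */
    static final class CustomizedBuilder implements ScanOptions {
        private final ScanInfo base;
        private KeepDeletedCells override; // null means "not overridden"

        CustomizedBuilder(ScanInfo base) {
            this.base = base;
        }

        @Override
        public KeepDeletedCells getKeepDeletedCells() {
            return override != null ? override : base.keepDeletedCells;
        }

        @Override
        public void setKeepDeletedCells(KeepDeletedCells value) {
            this.override = value;
        }

        /** Returns the family settings unchanged unless something was overridden. */
        ScanInfo build() {
            return override == null ? base : new ScanInfo(override);
        }
    }

    /** Hypothetical coprocessor hook that only sees the options view. */
    interface Hook {
        void preStoreScannerOpen(ScanOptions options);
    }

    /** What the store would do: let the hook adjust options, then build. */
    static ScanInfo openScanner(ScanInfo familyInfo, Hook hook) {
        CustomizedBuilder builder = new CustomizedBuilder(familyInfo);
        hook.preStoreScannerOpen(builder);
        return builder.build();
    }

    public static void main(String[] args) {
        ScanInfo family = new ScanInfo(KeepDeletedCells.FALSE);
        ScanInfo effective = openScanner(family,
            options -> options.setKeepDeletedCells(KeepDeletedCells.TRUE));
        System.out.println(effective.keepDeletedCells); // prints TRUE
        System.out.println(family.keepDeletedCells);    // prints FALSE
    }
}
```

The point of routing the override through a builder is that the family 
descriptor itself stays untouched; only the scanner opened for this request 
sees the changed setting.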


[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries

2018-01-22 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334865#comment-16334865
 ] 

stack commented on HBASE-19826:
---

This sounds useful. Any progress here? Pull it back in if progress made.
