[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355259#comment-16355259 ]

Ankit Singhal commented on HBASE-19826:
---
bq. So what if a compaction happens before you run the tool? The cell will be deleted in compaction if you do not have KeepDeletedCells.TRUE, then either you use time range scan or not you can never read it anymore...

By default, we cap the scan in IndexScrutinyTool at the current timestamp to avoid in-flight writes and deletes, and we don't expect those in-flight writes to be compacted while the tool is running. But depending on the write rate, this can certainly happen.

[~jamestaylor], do you have any idea whether we handle this case somewhere as well?

> Provide a option to see rows behind a delete in a time range queries
>
>
>                 Key: HBASE-19826
>                 URL: https://issues.apache.org/jira/browse/HBASE-19826
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Major
>             Fix For: 2.0.0
>
>
> We can provide an option (something like seePastDeleteMarkers) in a scan to
> let the user see the versions behind the delete marker even if
> keepDeletedCells is set to false in the descriptor.
> With the previous version, we work around this in the preStoreScannerOpen
> hook. For reference, see PHOENIX-4277:
> {code}
> @Override
> public KeyValueScanner preStoreScannerOpen(
>     final ObserverContext<RegionCoprocessorEnvironment> c,
>     final Store store, final Scan scan, final NavigableSet<byte[]> targetCols,
>     final KeyValueScanner s) throws IOException {
>   // Leave the scanner untouched for raw scans, for stores that already keep
>   // deleted cells, and for scans with no explicit (or a transactional) upper bound.
>   if (scan.isRaw() ||
>       ScanInfoUtil.isKeepDeletedCells(store.getScanInfo()) ||
>       scan.getTimeRange().getMax() == HConstants.LATEST_TIMESTAMP ||
>       TransactionUtil.isTransactionalTimestamp(scan.getTimeRange().getMax())) {
>     return s;
>   }
>   // Otherwise open a store scanner whose ScanInfo keeps deleted cells.
>   ScanInfo scanInfo =
>       ScanInfoUtil.cloneScanInfoWithKeepDeletedCells(store.getScanInfo());
>   return new StoreScanner(store, scanInfo, scan, targetCols,
>       c.getEnvironment().getRegion().getReadpoint(scan.getIsolationLevel()));
> }
> {code}
> Another way is to provide a way to set KEEP_DELETED_CELLS to true in
> ScanOptions of preStoreScannerOpen.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351353#comment-16351353 ]

Duo Zhang commented on HBASE-19826:
---
So what if a compaction happens before you run the tool? If you do not have KeepDeletedCells.TRUE, the cell will be removed by the compaction, and then you can never read it again, whether or not you use a time range scan...
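To illustrate the concern raised in this comment, here is a minimal, self-contained Java sketch. It is a toy model, not HBase code: the class `CompactionSketch`, its `Cell` type, and the methods `majorCompact` and `putsInRange` are hypothetical names introduced only for illustration. It assumes a simplified rule in which a delete marker masks every put of the same column with an equal or older timestamp, and in which a major compaction without KeepDeletedCells physically drops both the masked puts and the markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT HBase code. All names here are hypothetical.
class CompactionSketch {

    // One cell version of a single column: either a put or a delete marker.
    static final class Cell {
        final long timestamp;
        final boolean deleteMarker;

        Cell(long timestamp, boolean deleteMarker) {
            this.timestamp = timestamp;
            this.deleteMarker = deleteMarker;
        }
    }

    // Simplified major compaction: without keepDeletedCells, puts masked by a
    // delete marker are dropped together with the markers themselves.
    static List<Cell> majorCompact(List<Cell> cells, boolean keepDeletedCells) {
        if (keepDeletedCells) {
            return new ArrayList<>(cells);
        }
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (c.deleteMarker) {
                newestDelete = Math.max(newestDelete, c.timestamp);
            }
        }
        List<Cell> survivors = new ArrayList<>();
        for (Cell c : cells) {
            if (!c.deleteMarker && c.timestamp > newestDelete) {
                survivors.add(c);
            }
        }
        return survivors;
    }

    // Counts the puts with minTs <= timestamp < maxTs that are physically present.
    static int putsInRange(List<Cell> cells, long minTs, long maxTs) {
        int count = 0;
        for (Cell c : cells) {
            if (!c.deleteMarker && c.timestamp >= minTs && c.timestamp < maxTs) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A put at t=10 followed by a delete marker at t=20.
        List<Cell> store = Arrays.asList(new Cell(10, false), new Cell(20, true));

        // Before compaction the put is still stored, so a scanner that keeps
        // deleted cells could still return it for the range [9, 20).
        System.out.println(putsInRange(store, 9, 20));                      // 1

        // After a major compaction without keepDeletedCells it is gone for good.
        System.out.println(putsInRange(majorCompact(store, false), 9, 20)); // 0

        // With keepDeletedCells the put survives the compaction.
        System.out.println(putsInRange(majorCompact(store, true), 9, 20));  // 1
    }
}
```

In this model, overriding KeepDeletedCells only at scan time helps only as long as no compaction has already removed the masked put, which is the race being asked about.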
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350493#comment-16350493 ]

Ankit Singhal commented on HBASE-19826:
---
{quote}What is an index scrutiny? When do you need to do this?{quote}
It's a MapReduce tool that runs a time range scan on the data table and a SKIP SCAN on the index table to verify whether the index table is in sync with the data table.

[1] https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/index/IndexScrutinyTool.java
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350441#comment-16350441 ]

Duo Zhang commented on HBASE-19826:
---
What is an index scrutiny? When do you need to do this?
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350402#comment-16350402 ]

Ankit Singhal commented on HBASE-19826:
---
Sure (this is the second use case mentioned in my earlier [comment|https://issues.apache.org/jira/browse/HBASE-19826?focusedCommentId=16344850&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16344850]):

"While doing Index scrutiny on a live table, time range scan wants to see PUTs not eclipsed by newer DELETE markers. (raw scan cannot be utilized here as it will give all cells even if we have delete markers within the time range)"

To achieve this, we previously updated the store scanner by setting KeepDeletedCells to true in the preStoreScannerOpen hook, so that our time range queries would see puts that are deleted at a newer timestamp.

Let me know if you need more details. Thanks.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350390#comment-16350390 ]

Duo Zhang commented on HBASE-19826:
---
More background please? Thanks.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350362#comment-16350362 ]

Ankit Singhal commented on HBASE-19826:
---
We need the Scan object to check whether it's a time range query or not.
{code}
@Override
public void preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> ctx, Store store,
    ScanOptions options) throws IOException {
  // Set KEEP_DELETED_CELLS for time range non-raw scan.
  // Note: "scan" is not a parameter of this hook; it is the object we are asking for.
  if (scan.isRaw() || scan.getTimeRange().getMax() == HConstants.LATEST_TIMESTAMP) {
    return;
  }
  options.setKeepDeletedCells(KeepDeletedCells.TRUE);
}
{code}
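The decision that the hook in this comment is trying to make can be isolated as a pure function. The sketch below is a standalone model, not code from HBase or Phoenix: `KeepDeletedCellsDecision`, `shouldKeepDeletedCells`, and the local `LATEST_TIMESTAMP` constant are hypothetical stand-ins (the constant mirrors `HConstants.LATEST_TIMESTAMP`, which is `Long.MAX_VALUE`).

```java
// Standalone model of the check in the hook; it does not use HBase classes.
class KeepDeletedCellsDecision {

    // Stand-in for HConstants.LATEST_TIMESTAMP (Long.MAX_VALUE in HBase).
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Returns true only for a non-raw scan whose time range has an explicit
    // upper bound, i.e. the case where the hook would set KeepDeletedCells.TRUE.
    static boolean shouldKeepDeletedCells(boolean rawScan, long timeRangeMax) {
        return !rawScan && timeRangeMax != LATEST_TIMESTAMP;
    }

    public static void main(String[] args) {
        System.out.println(shouldKeepDeletedCells(false, 1_000L));           // true
        System.out.println(shouldKeepDeletedCells(true, 1_000L));            // false
        System.out.println(shouldKeepDeletedCells(false, LATEST_TIMESTAMP)); // false
    }
}
```

Both inputs come from the Scan object, which is why the hook cannot make this decision from ScanOptions alone.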
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350272#comment-16350272 ]

Duo Zhang commented on HBASE-19826:
---
And could you please describe the usage in Phoenix? We can see how to implement it with the new CP hooks in 2.0. Thanks.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350256#comment-16350256 ]

Duo Zhang commented on HBASE-19826:
---
There is no Scan object for compaction and flush so we do not provide it. And in general, I do not think you can get a stable result if you reset the ScanOptions for some scans and not for others.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350245#comment-16350245 ]

Ankit Singhal commented on HBASE-19826:
---
bq. HBase-2.0 is a big release that breaks things, so I think it may also be a good chance for Phoenix to drop some legacy support when upgrading to HBase-2.0?

Yes, we are planning to remove some legacy stuff with Phoenix 5.0.

bq. You can try using an attribute to carry some Phoenix-only logic...

[~Apache9], in HBase 2.0 we do not get the Scan object in preStoreScannerOpen(). Is it possible to add the Scan (at least in immutable form) to the preStoreScannerOpen() hook, so that we can decide based on its attributes (time range scan, raw, etc.) and set ScanOptions accordingly?
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344966#comment-16344966 ]

Duo Zhang commented on HBASE-19826:
---
{quote}JDBC is always the first choice when using Phoenix, but some people write to the table using the HBase API and query through Phoenix until they migrate their legacy application.{quote}
HBase-2.0 is a big release that breaks things, so I think it may also be a good chance for Phoenix to drop some legacy support when upgrading to HBase-2.0?
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344960#comment-16344960 ]

Ankit Singhal commented on HBASE-19826:
---
{quote}I do not think it is a good idea to give users the ability to access the table by themselves and still expect correct results. Anyway, I think changing keep deleted cells in a CP hook is acceptable, but please do not introduce a new option to Scan since it is a user API. The max versions and filter almost kill us since the behavior is really strange...{quote}
OK, we can drop the idea of adding a "seePastDeleteMarkers" option to Scan.
{quote}You can try using an attribute to carry some Phoenix-only logic...{quote}
Yep, we can just leverage pre-hooks to make it work for our use cases.
{quote}BTW, will JDBC become the first choice when using Phoenix?{quote}
JDBC is always the first choice when using Phoenix, but some people write to the table using the HBase API and query through Phoenix until they migrate their legacy application.

Thank you so much, [~Apache9], for the time and review.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344927#comment-16344927 ]

Duo Zhang commented on HBASE-19826:
---
I do not think it is a good idea to give users the ability to access the table by themselves and still expect correct results. Anyway, I think changing keep deleted cells in a CP hook is acceptable, but please do not introduce a new option to Scan since it is a user API. The max versions and filter almost kill us since the behavior is really strange...

You can try using an attribute to carry some Phoenix-only logic...

BTW, will JDBC become the first choice when using Phoenix?
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344850#comment-16344850 ]

Ankit Singhal commented on HBASE-19826:
---
{quote}Could Phoenix hide the HBase table from the user? Then you are free to set KeepDeletedCells.TRUE, as the user does not know about it...{quote}
Not really, we let the user have full access to the HBase table. We just don't want to hardcode KeepDeletedCells.TRUE in the descriptor for performance reasons: all user queries would read the deleted data unnecessarily (even after compaction), which would increase scan latency.

The current Phoenix use cases require it to be set dynamically in user scans or in a pre-hook:
# During compaction, we want to keep deleted cells only if there are lagging indexes, which will eventually be built by reading the cells of the data table.
# While doing Index scrutiny on a live table, time range scan wants to see PUTs not eclipsed by newer DELETE markers. (raw scan cannot be utilized here as it will give all cells even if we have delete markers within the time range)
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344801#comment-16344801 ]

Duo Zhang commented on HBASE-19826:
---
{quote}Actually, we want to see the put (at t1) that arrived before the delete marker (at t2, where t2 > t1) in a time range query (t1-1 to t2-1) with a non-raw scan on a table that has keep deleted cells set to false in the family descriptor.{quote}
To be honest, I really hate these features: the behavior is exactly KeepDeletedCells.TRUE, but you want to get it with KeepDeletedCells.FALSE...

Could Phoenix hide the HBase table from the user? Then you are free to set KeepDeletedCells.TRUE, as the user does not know about it...

Of course, if it is only HBASE-19895, I'm OK with it, but I do not want to see lots of these strange feature requests again and again...

Thanks.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344790#comment-16344790 ]

Ankit Singhal commented on HBASE-19826:
---------------------------------------

{quote}
In 2.0, there is a new behavior which also considers mvcc when deciding whether a put can be hidden by a delete marker. If a put happens after a delete, then even if the timestamp of the delete is newer, the delete marker cannot hide the put. Is this what you want?
{quote}
Actually, we want to see the put (at t1) that arrived before the delete marker (at t2, where t2 > t1) in a time range query (t1-1 to t2-1) with a non-raw scan on a table that has keep deleted cells set to false in the family descriptor.
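The scenario in the comment above can be made concrete with a small, self-contained model. This is only a sketch of the visibility rule as it is described in this thread, not HBase code: the class, the `Cell` type, and `visiblePuts` are hypothetical names, and the rule for which delete markers a scan honours (any marker at or after the range minimum when deleted cells are not kept, only markers inside the range when they are) is an assumption that should be checked against the HBase version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeRangeDeleteModel {
    /** A put or a delete marker at a timestamp (hypothetical type, not an HBase class). */
    static final class Cell {
        final long ts;
        final boolean isDelete;
        Cell(long ts, boolean isDelete) { this.ts = ts; this.isDelete = isDelete; }
    }

    /**
     * Returns the timestamps of puts visible to a non-raw scan over [minTs, maxTs).
     * Assumed rule: without keepDeletedCells, every delete marker at or after minTs
     * is honoured, even one newer than maxTs; with keepDeletedCells, only markers
     * inside the time range are honoured.
     */
    static List<Long> visiblePuts(List<Cell> cells, long minTs, long maxTs,
                                  boolean keepDeletedCells) {
        long newestDelete = Long.MIN_VALUE;
        for (Cell c : cells) {
            if (!c.isDelete) continue;
            boolean honoured = keepDeletedCells
                ? (c.ts >= minTs && c.ts < maxTs)
                : (c.ts >= minTs);
            if (honoured) newestDelete = Math.max(newestDelete, c.ts);
        }
        List<Long> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (c.isDelete) continue;
            boolean inRange = c.ts >= minTs && c.ts < maxTs;
            // A put is hidden when an honoured marker sits at or above its timestamp.
            if (inRange && c.ts > newestDelete) visible.add(c.ts);
        }
        return visible;
    }

    public static void main(String[] args) {
        long t1 = 100, t2 = 200;
        List<Cell> cells = Arrays.asList(new Cell(t1, false), new Cell(t2, true));
        System.out.println(visiblePuts(cells, t1 - 1, t2 - 1, false)); // prints []
        System.out.println(visiblePuts(cells, t1 - 1, t2 - 1, true));  // prints [100]
    }
}
```

Under this model the put at t1 is hidden by the marker at t2 unless deleted cells are kept, which is the behaviour the requested option is meant to change for a single scan.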
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344722#comment-16344722 ]

Duo Zhang commented on HBASE-19826:
-----------------------------------

In 2.0, there is a new behavior which also considers mvcc when deciding whether a put can be hidden by a delete marker. If a put happens after a delete, then even if the timestamp of the delete is newer, the delete marker cannot hide the put. Is this what you want?
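The difference between the two rules mentioned in this comment can be sketched as follows. This is a simplified illustration, not HBase's implementation: `Cell`, `seqId` (standing in for the mvcc write number), and both method names are hypothetical, and the comparison is an assumption derived from the wording of the comment.

```java
public class MvccDeleteModel {
    /** A cell with a timestamp and a write sequence number (hypothetical type). */
    static final class Cell {
        final long ts;     // application-visible timestamp
        final long seqId;  // order in which the write arrived (stand-in for mvcc)
        Cell(long ts, long seqId) { this.ts = ts; this.seqId = seqId; }
    }

    /** Timestamp-only rule: a marker hides every put at or below its timestamp. */
    static boolean hiddenByTimestamp(Cell put, Cell delete) {
        return delete.ts >= put.ts;
    }

    /** Rule from the comment: the marker must also have been written after the put. */
    static boolean hiddenWithMvcc(Cell put, Cell delete) {
        return delete.ts >= put.ts && delete.seqId > put.seqId;
    }

    public static void main(String[] args) {
        // Put written AFTER a delete that carries a newer timestamp.
        Cell delete = new Cell(200, 1);
        Cell latePut = new Cell(100, 2);
        System.out.println(hiddenByTimestamp(latePut, delete)); // prints true
        System.out.println(hiddenWithMvcc(latePut, delete));    // prints false

        // Put written BEFORE the delete: hidden under both rules.
        Cell earlyPut = new Cell(100, 0);
        System.out.println(hiddenWithMvcc(earlyPut, delete));   // prints true
    }
}
```

The last case is the one raised in this issue: a put that arrived before the delete marker stays hidden under either rule, so the mvcc-aware behavior alone does not expose it to a time range query.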
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344703#comment-16344703 ]

Ankit Singhal commented on HBASE-19826:
---------------------------------------

Thanks [~stack] and [~yuzhih...@gmail.com],
Divided the Jira into two subtasks for a careful review. Uploaded the patch for adding KeepDeletedCells option in ScanOptions.
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334978#comment-16334978 ]

Ted Yu commented on HBASE-19826:
--------------------------------

ScanOptions is marked @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.COPROC).
We can add a method that allows setting KeepDeletedCells.
CustomizedScanInfoBuilder would implement the new method so that HStore can use it when creating the scanner.

> Provide a option to see rows behind a delete in a time range queries
>
> Key: HBASE-19826
> URL: https://issues.apache.org/jira/browse/HBASE-19826
> Project: HBase
> Issue Type: Bug
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Priority: Major
> Fix For: 2.0.0
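A rough sketch of the shape this proposal could take is shown below. Everything in it is a self-contained stand-in written for illustration: the nested `ScanOptions` interface, the `CustomizedScanInfoBuilder` class, and the simplified `preStoreScannerOpen` signature only borrow names from the comment and do not reproduce the real HBase types or the patch that was eventually uploaded. The guard conditions mirror the Phoenix workaround quoted in the issue description, minus the transactional-timestamp check.

```java
public class ScanOptionsSketch {
    /** Stand-in for HBase's KeepDeletedCells enum. */
    enum KeepDeletedCells { FALSE, TRUE, TTL }

    /** Stand-in for the coprocessor-facing options, with the proposed accessor pair. */
    interface ScanOptions {
        KeepDeletedCells getKeepDeletedCells();
        void setKeepDeletedCells(KeepDeletedCells value);
    }

    /** Stand-in builder: starts from the family setting, lets a coprocessor override it. */
    static final class CustomizedScanInfoBuilder implements ScanOptions {
        private final KeepDeletedCells familyDefault;
        private KeepDeletedCells override; // null until a coprocessor sets it

        CustomizedScanInfoBuilder(KeepDeletedCells familyDefault) {
            this.familyDefault = familyDefault;
        }

        @Override
        public KeepDeletedCells getKeepDeletedCells() {
            return override != null ? override : familyDefault;
        }

        @Override
        public void setKeepDeletedCells(KeepDeletedCells value) {
            this.override = value;
        }
    }

    /** Stand-in for HConstants.LATEST_TIMESTAMP. */
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    /** Hypothetical hook: keep deleted cells only for bounded, non-raw time range scans. */
    static void preStoreScannerOpen(ScanOptions options, boolean rawScan, long timeRangeMax) {
        if (rawScan
                || timeRangeMax == LATEST_TIMESTAMP
                || options.getKeepDeletedCells() != KeepDeletedCells.FALSE) {
            return; // nothing to override
        }
        options.setKeepDeletedCells(KeepDeletedCells.TRUE);
    }

    public static void main(String[] args) {
        CustomizedScanInfoBuilder builder = new CustomizedScanInfoBuilder(KeepDeletedCells.FALSE);
        preStoreScannerOpen(builder, false, 199L);
        System.out.println(builder.getKeepDeletedCells()); // prints TRUE
    }
}
```

Compared with the older workaround, the coprocessor here only adjusts an option and leaves scanner construction to the store, so it does not need to clone a ScanInfo or instantiate a StoreScanner itself.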
[jira] [Commented] (HBASE-19826) Provide a option to see rows behind a delete in a time range queries
[ https://issues.apache.org/jira/browse/HBASE-19826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334865#comment-16334865 ]

stack commented on HBASE-19826:
-------------------------------

This sounds useful. Any progress here? Pull it back in if progress made.