[jira] [Updated] (PHOENIX-7326) Simplify LockManager and make it more efficient
[ https://issues.apache.org/jira/browse/PHOENIX-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7326:
-----------------------------------
    Fix Version/s: 5.2.1
                   5.3.0

> Simplify LockManager and make it more efficient
> -----------------------------------------------
>
>                 Key: PHOENIX-7326
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7326
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir Ozdemir
>            Priority: Major
>             Fix For: 5.2.1, 5.3.0
>
> Phoenix needs to manage its own row locking for secondary indexes, and
> LockManager provides this locking. The row-locking implementation was
> originally copied, for the most part, from the HRegion.getRowLockInternal
> implementation. However, the current implementation is complicated: it can
> be simplified and made more efficient, and the correctness of LockManager
> will be easier to ensure with the simplified implementation.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
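To illustrate the kind of simplification the issue describes, here is a minimal, hypothetical sketch of a per-row lock table. This is not the actual Phoenix LockManager (all names are invented): each row key maps to a reference-counted ReentrantLock, and an entry is removed from the map as soon as the last holder or waiter releases its reference.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not the actual Phoenix LockManager: a per-row lock
// table where each row key maps to a reference-counted ReentrantLock. The
// reference count tracks holders plus waiters, so an entry can be removed
// from the map as soon as the last interested thread lets go of it.
public class SimpleRowLockManager {
    private static final class RefLock {
        final ReentrantLock lock = new ReentrantLock();
        int refCount; // mutated only inside compute(), which is atomic per key
    }

    private final ConcurrentHashMap<String, RefLock> locks = new ConcurrentHashMap<>();

    /** Blocks until the row lock is held. */
    public void lockRow(String rowKey) {
        acquire(rowKey).lock.lock();
    }

    /** Acquires the row lock only if it is free (or held by this thread). */
    public boolean tryLockRow(String rowKey) {
        RefLock ref = acquire(rowKey);
        if (ref.lock.tryLock()) {
            return true;
        }
        release(rowKey); // not acquired: drop the reference we just took
        return false;
    }

    public void unlockRow(String rowKey) {
        locks.get(rowKey).lock.unlock();
        release(rowKey);
    }

    private RefLock acquire(String rowKey) {
        // Atomically fetch-or-create the entry and bump its reference count;
        // a nonzero count prevents a concurrent release() from removing it.
        return locks.compute(rowKey, (k, v) -> {
            if (v == null) {
                v = new RefLock();
            }
            v.refCount++;
            return v;
        });
    }

    private void release(String rowKey) {
        // Returning null from compute() removes the mapping once unused.
        locks.compute(rowKey, (k, v) -> --v.refCount == 0 ? null : v);
    }

    int tableSize() { // visible for testing: number of live lock entries
        return locks.size();
    }
}
```

The design choice worth noting is that the map entry's lifetime is managed entirely by the atomicity of ConcurrentHashMap.compute(), avoiding a separate global lock around the lock table.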
[jira] [Resolved] (PHOENIX-7326) Simplify LockManager and make it more efficient
[ https://issues.apache.org/jira/browse/PHOENIX-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir resolved PHOENIX-7326.
------------------------------------
    Resolution: Fixed
[jira] [Resolved] (PHOENIX-7314) Enable CompactionScanner for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir resolved PHOENIX-7314.
------------------------------------
    Fix Version/s: 5.2.1
                   5.3.0
       Resolution: Fixed

> Enable CompactionScanner for flushes and minor compaction
> ----------------------------------------------------------
>
>                 Key: PHOENIX-7314
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7314
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.2.0
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir Ozdemir
>            Priority: Major
>             Fix For: 5.2.1, 5.3.0
>
> Phoenix TTL is currently applied during major compaction only. When max
> lookback is enabled on a table, Phoenix TTL therefore retains all cell
> versions until the next major compaction. This improvement enables Phoenix
> TTL, and more specifically CompactionScanner, for flushes and minor
> compactions as well, so that live or deleted cell versions beyond the max
> lookback window can be removed during flushes and minor compactions.
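The retention rule behind max lookback can be sketched as follows. This is an illustrative, hypothetical simplification (class and method names are invented, and the real CompactionScanner also handles delete markers and TTL): for a single column, every version whose timestamp falls inside the lookback window is kept, plus the newest version at or before the window's lower edge, so the row remains reconstructable as of any point in the window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of the max lookback retention rule; not the
// actual Phoenix CompactionScanner logic.
public class MaxLookbackFilter {
    public static List<Long> retainTimestamps(List<Long> versionsNewestFirst,
                                              long now, long maxLookbackMs) {
        long windowStart = now - maxLookbackMs;
        List<Long> kept = new ArrayList<>();
        boolean keptOneOutside = false;
        for (long ts : versionsNewestFirst) {
            if (ts >= windowStart) {
                kept.add(ts);          // inside the window: always retained
            } else if (!keptOneOutside) {
                kept.add(ts);          // newest version outside the window is
                keptOneOutside = true; // retained; older ones become removable
            }
        }
        return kept;
    }
}
```

For example, with versions at timestamps 95, 80, 40, 10, now = 100, and a 30 ms window, the versions 95 and 80 are in the window and 40 is the newest outside it, so only 10 can be purged.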
[jira] [Assigned] (PHOENIX-7326) Simplify LockManager and make it more efficient
[ https://issues.apache.org/jira/browse/PHOENIX-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir reassigned PHOENIX-7326:
--------------------------------------
    Assignee: Kadir Ozdemir
[jira] [Created] (PHOENIX-7326) Simplify LockManager and make it more efficient
Kadir Ozdemir created PHOENIX-7326:
-----------------------------------
           Summary: Simplify LockManager and make it more efficient
               Key: PHOENIX-7326
               URL: https://issues.apache.org/jira/browse/PHOENIX-7326
           Project: Phoenix
        Issue Type: Improvement
          Reporter: Kadir Ozdemir
[jira] [Updated] (PHOENIX-7314) Enable CompactionScanner for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7314:
-----------------------------------
    Description:
Phoenix TTL is currently used for major compaction only. When max lookback is
enabled on a table, Phoenix TTL retains all cell versions until the next major
compaction. This improvement enables Phoenix TTL, and more specifically
CompactionScanner, for flushes and minor compaction to remove the live or
deleted cell versions beyond the max lookback window during flushes and minor
compactions.
(was: the same text without the mention of CompactionScanner)
[jira] [Updated] (PHOENIX-7314) Enable CompactionScanner for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7314:
-----------------------------------
    Summary: Enable CompactionScanner for flushes and minor compaction (was: Phoenix TTL for flushes and minor compaction)
[jira] [Assigned] (PHOENIX-7314) Phoenix TTL for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir reassigned PHOENIX-7314:
--------------------------------------
    Assignee: Kadir Ozdemir
[jira] [Updated] (PHOENIX-7314) Phoenix TTL for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7314:
-----------------------------------
    Description:
Phoenix TTL is currently used for major compaction only. When max lookback is
enabled on a table, Phoenix TTL retains all cell versions until the next major
compaction. This improvement enables Phoenix TTL for flushes and minor
compaction to remove the live or deleted cell versions beyond the max lookback
window during flushes and minor compactions.
(was: a version of the same text covering only minor compaction)
[jira] [Updated] (PHOENIX-7314) Phoenix TTL for flushes and minor compaction
[ https://issues.apache.org/jira/browse/PHOENIX-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7314:
-----------------------------------
    Summary: Phoenix TTL for flushes and minor compaction (was: Phoenix TTL for minor compaction)
[jira] [Resolved] (PHOENIX-7313) All cell versions should not be retained during flushes and minor compaction when maxlookback is disabled
[ https://issues.apache.org/jira/browse/PHOENIX-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir resolved PHOENIX-7313.
------------------------------------
    Fix Version/s: 5.2.1
       Resolution: Fixed

> All cell versions should not be retained during flushes and minor compaction
> when maxlookback is disabled
> -----------------------------------------------------------------------------
>
>                 Key: PHOENIX-7313
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7313
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.2.0
>            Reporter: Kadir Ozdemir
>            Assignee: Kadir Ozdemir
>            Priority: Major
>             Fix For: 5.2.1
>
> HBase allows coprocessors to dynamically override the column family data
> retention properties within the coprocessor hooks for flushes and
> compactions. The Phoenix TTL feature overrides these data retention
> properties so that all cells, including delete markers, are preserved, and
> the decision about what to remove is then made in its compaction scanner,
> CompactionScanner. However, doing this when max lookback is disabled
> unnecessarily retains all row versions during flushes and minor compactions.
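The fix described above can be summarized by a tiny, hypothetical decision function (names invented for illustration, not Phoenix code): the scanner's data-retention settings should only be widened to "keep everything" when max lookback is enabled, since only then does CompactionScanner need every cell version to reconstruct the lookback window.

```java
// Hypothetical illustration of the decision described in the issue above;
// not actual Phoenix code.
public class RetentionOverride {
    // Widening to Integer.MAX_VALUE versions is only justified when a max
    // lookback window must remain reconstructable by the compaction scanner;
    // otherwise the column family's configured setting should stand.
    public static int effectiveMaxVersions(long maxLookbackMs, int configuredMaxVersions) {
        boolean maxLookbackEnabled = maxLookbackMs > 0;
        return maxLookbackEnabled ? Integer.MAX_VALUE : configuredMaxVersions;
    }
}
```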
[jira] [Assigned] (PHOENIX-7313) All cell versions should not be retained during flushes and minor compaction when maxlookback is disabled
[ https://issues.apache.org/jira/browse/PHOENIX-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir reassigned PHOENIX-7313:
--------------------------------------
    Assignee: Kadir Ozdemir
[jira] [Created] (PHOENIX-7314) Phoenix TTL for minor compaction
Kadir Ozdemir created PHOENIX-7314:
-----------------------------------
           Summary: Phoenix TTL for minor compaction
               Key: PHOENIX-7314
               URL: https://issues.apache.org/jira/browse/PHOENIX-7314
           Project: Phoenix
        Issue Type: Improvement
  Affects Versions: 5.2.0
          Reporter: Kadir Ozdemir

Phoenix TTL is currently used for major compaction only. When max lookback is
enabled on a table, Phoenix TTL retains all cell versions until the next major
compaction. This improvement enables Phoenix TTL for minor compaction so that
cell versions beyond the max lookback window are removed during minor
compactions.
[jira] [Updated] (PHOENIX-7313) All cell versions should not be retained during flushes and minor compaction when maxlookback is disabled
[ https://issues.apache.org/jira/browse/PHOENIX-7313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7313:
-----------------------------------
    Summary: All cell versions should not be retained during flushes and minor compaction when maxlookback is disabled (was: Scan options should not be overridden during flushes and minor compaction when maxlookback is disabled)
[jira] [Created] (PHOENIX-7313) Scan options should not be overridden during flushes and minor compaction when maxlookback is disabled
Kadir Ozdemir created PHOENIX-7313:
-----------------------------------
           Summary: Scan options should not be overridden during flushes and minor compaction when maxlookback is disabled
               Key: PHOENIX-7313
               URL: https://issues.apache.org/jira/browse/PHOENIX-7313
           Project: Phoenix
        Issue Type: Bug
  Affects Versions: 5.2.0
          Reporter: Kadir Ozdemir
[jira] [Resolved] (PHOENIX-7245) NPE in Phoenix Coproc leading to Region Server crash
[ https://issues.apache.org/jira/browse/PHOENIX-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir resolved PHOENIX-7245.
------------------------------------
    Resolution: Fixed

> NPE in Phoenix Coproc leading to Region Server crash
> -----------------------------------------------------
>
>                 Key: PHOENIX-7245
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7245
>             Project: Phoenix
>          Issue Type: Bug
>          Components: phoenix
>    Affects Versions: 5.1.1, 5.2.0
>            Reporter: Ravi Kishore Valeti
>            Assignee: Kadir Ozdemir
>            Priority: Major
>             Fix For: 5.2.1, 5.3.0, 5.1.4
>
> In our production environment, while investigating region server crashes, we
> found that they are caused by the Phoenix coprocessor throwing a
> NullPointerException in the
> IndexRegionObserver.postBatchMutateIndispensably() method. Below are the
> logs:
> {code:java}
> 2024-02-26 13:52:40,716 ERROR [r.default.FPBQ.Fifo.handler=216,queue=8,port=x] coprocessor.CoprocessorHost - The coprocessor org.apache.phoenix.hbase.index.IndexRegionObserver threw java.lang.NullPointerException
> java.lang.NullPointerException
> 	at org.apache.phoenix.hbase.index.IndexRegionObserver.postBatchMutateIndispensably(IndexRegionObserver.java:1301)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1028)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1025)
> 	at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:558)
> 	at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:631)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1025)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:4134)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4573)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4447)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4369)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1033)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:951)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:916)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
> 	at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> 	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
> 	at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
> 2024-02-26 13:52:40,725 ERROR [r.default.FPBQ.Fifo.handler=216,queue=8,port=x] regionserver.HRegionServer - * ABORTING region server ,x,1708268161243: The coprocessor org.apache.phoenix.hbase.index.IndexRegionObserver threw java.lang.NullPointerException *
> java.lang.NullPointerException
> 	at org.apache.phoenix.hbase.index.IndexRegionObserver.postBatchMutateIndispensably(IndexRegionObserver.java:1301)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1028)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1025)
> 	at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:558)
> 	at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:631)
> 	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1025)
> 	at org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:4134)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4573)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4447)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4369)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1033)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:951)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:916)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
> {code}
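The crash pattern above is generic: any exception that escapes a coprocessor hook such as postBatchMutateIndispensably() causes HBase to abort the region server, so per-batch state that a hook looks up must be null-checked rather than assumed present. A minimal, hypothetical illustration of the defensive pattern (names invented; this is not the actual Phoenix fix):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of defensive per-batch state lookup in a
// coprocessor-style hook; class and method names are invented.
public class BatchStateRegistry {
    private final Map<Long, String> pendingBatches = new ConcurrentHashMap<>();

    public void preBatch(long batchId, String context) {
        pendingBatches.put(batchId, context);
    }

    public String postBatch(long batchId) {
        // The entry may already have been removed by a concurrent failure or
        // cleanup path; dereferencing a null here would let an NPE escape the
        // hook and, in HBase, abort the region server.
        String ctx = pendingBatches.remove(batchId);
        return ctx != null ? ctx : "unknown-batch";
    }
}
```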
[jira] [Updated] (PHOENIX-7245) NPE in Phoenix Coproc leading to Region Server crash
[ https://issues.apache.org/jira/browse/PHOENIX-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7245:
-----------------------------------
    Fix Version/s: 5.1.4
[jira] [Updated] (PHOENIX-7245) NPE in Phoenix Coproc leading to Region Server crash
[ https://issues.apache.org/jira/browse/PHOENIX-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kadir Ozdemir updated PHOENIX-7245:
-----------------------------------
    Affects Version/s: 5.2.0
[jira] [Updated] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7001: --- Description: The use cases for a Change Data Capture (CDC) feature are centered around capturing changes to a given table (or updatable view) as these changes happen in near real-time. A CDC application can retrieve changes in real-time or with some delay, or even retrieve the same set of changes multiple times. This means the CDC use case can be generalized as time range queries, where the time range is typically short, such as the last x minutes or hours, or expressed as a specific time range in the last n days, where n is typically less than 7. A change is an update in a row. That is, a change either updates one or more columns of a table for a given row or deletes a row. It is desirable to provide these changes in the order of their arrival. One can visualize the delivery of these changes through a stream from a Phoenix table to the application, initiated by the application, similar to the delivery of any other Phoenix query results. The difference is that a regular query result includes at most one result row for each row satisfying the query, and deleted rows are not visible in the query result, while the CDC stream/result can include multiple result rows for each row and also includes deleted rows. Some use cases also need to get the pre and/or post image of the row along with a change on the row. The design proposed here leverages Phoenix Max Lookback and Uncovered Global Indexes. The max lookback feature retains recent changes to a table, that is, the changes that have been made typically in the last x days. This means that the max lookback feature already captures the changes to a given table. Currently, the max lookback age is configurable at the cluster level.
We need to extend this capability so that the max lookback age can be configured at the table level, allowing each table to have a different max lookback age based on its CDC application requirements. To deliver the changes in the order of their arrival, we need a time based index. This index should be uncovered, as the changes are already retained in the table by the max lookback feature. The arrival time will be defined as the mutation timestamp generated by the server. An uncovered index allows efficient, ordered access to the changes. Changes to an index table are also preserved by the max lookback feature. A CDC feature can be composed of the following components:
* {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an uncovered index used for CDC. It can inherit from UncoveredIndexRegionScanner. It goes through index table rows using a raw scan to identify data table rows and retrieves these rows using a raw scan. Using the time range, it forms a JSON blob to represent changes to the row, including pre and/or post row images.
* {*}CDC Query Compiler{*}: This is a client side component. It prepares the scan object based on the given CDC query statement.
* {*}CDC DDL Compiler{*}: This is a client side component. It creates the time based uncovered global index based on the given CDC DDL statement and a virtual table of CDC type. CDC will be a new table type.
A CDC DDL syntax to create CDC on a (data) table can be as follows: Create CDC on INCLUDE (pre | post) SALT_BUCKETS= The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC table PK columns start with the timestamp and continue with the data table PK columns. The CDC table includes one non-PK column, which is a JSON column. The change is expressed in this JSON column in multiple ways based on the CDC DDL or query statement.
The change can be expressed as just the mutation for the change, the pre image of the row (the image before the change), the post image, or any combination of these. The CDC table is not a physical table on disk. It is just a virtual table to be used in a CDC query. Phoenix stores just the metadata for this virtual table. A CDC query can be as follows: Select * from where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) This query would return the rows of the CDC table, which are constructed on the server side by CDCUncoveredIndexRegionScanner by joining the uncovered index row versions with the corresponding data table row versions (using raw scans). The above select query can be hinted, using a new CDC hint, to return just the actual change, the pre or post image of the row, or a combination of them, overriding the default JSON column format defined by the CDC DDL statement. The CDC application will run the above query in a loop. When the difference between the current time of the application and the upper limit of the time range of the query becomes less than s milliseconds, say x
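The polling pattern described above (repeatedly querying a sliding time range, and pausing once the window catches up with the current time) can be sketched as pure windowing logic. This is an illustrative simulation only, not Phoenix code; the names `batch_ms` and `min_lag_ms` are assumptions standing in for the unspecified loop parameters in the design.

```python
def cdc_windows(start_ms, batch_ms, min_lag_ms, now_ms):
    """Yield [lower, upper) time ranges for a CDC polling loop.

    The upper bound never gets closer to the current time than
    min_lag_ms, mirroring the described behavior of stopping the
    loop when the window approaches "now".
    """
    lower = start_ms
    while True:
        upper = min(lower + batch_ms, now_ms - min_lag_ms)
        if upper <= lower:      # caught up: caller should sleep and retry
            return
        yield (lower, upper)
        lower = upper           # next query resumes where this one ended

# Example: replay the last hour in 10-minute batches, staying 5 s behind "now".
now = 1_700_000_000_000
ranges = list(cdc_windows(now - 3_600_000, 600_000, 5_000, now))
```

Each yielded range would parameterize one `PHOENIX_ROW_TIMESTAMP() >= … AND PHOENIX_ROW_TIMESTAMP() < …` query; using the previous upper bound as the next lower bound guarantees each change is delivered exactly once per pass.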
[jira] [Assigned] (PHOENIX-7299) ScanningResultIterator should not time out a query after receiving a valid result
[ https://issues.apache.org/jira/browse/PHOENIX-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-7299: -- Assignee: Lokesh Khurana > ScanningResultIterator should not time out a query after receiving a valid > result > - > > Key: PHOENIX-7299 > URL: https://issues.apache.org/jira/browse/PHOENIX-7299 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Assignee: Lokesh Khurana >Priority: Major > > Phoenix query time includes setting up scanners and retrieving the very first > result from each of these scanners. The query timeout check in > ScanningResultIterator introduced by PHOENIX-6918 extends the query timeout > check beyond the first result from a given scanner. ScanningResultIterator > should not check for query timeout after the first valid (not dummy) result > from the server. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7299) ScanningResultIterator should not time out a query after receiving a valid result
Kadir Ozdemir created PHOENIX-7299: -- Summary: ScanningResultIterator should not time out a query after receiving a valid result Key: PHOENIX-7299 URL: https://issues.apache.org/jira/browse/PHOENIX-7299 Project: Phoenix Issue Type: Bug Reporter: Kadir Ozdemir Phoenix query time includes setting up scanners and retrieving the very first result from each of these scanners. The query timeout check in ScanningResultIterator introduced by PHOENIX-6918 extends the query timeout check beyond the first result from a given scanner. ScanningResultIterator should not check for query timeout after the first valid (not dummy) result from the server. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7245) NPE in Phoenix Coproc leading to Region Server crash
[ https://issues.apache.org/jira/browse/PHOENIX-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-7245: -- Assignee: Kadir Ozdemir > NPE in Phoenix Coproc leading to Region Server crash > > > Key: PHOENIX-7245 > URL: https://issues.apache.org/jira/browse/PHOENIX-7245 > Project: Phoenix > Issue Type: Bug > Components: phoenix >Affects Versions: 5.1.1 >Reporter: Ravi Kishore Valeti >Assignee: Kadir Ozdemir >Priority: Major > > In our Production, while investigating Region Server crashes, we found that > they are due to the Phoenix coproc throwing a NullPointerException in the > IndexRegionObserver.postBatchMutateIndispensably() method. > Below are the logs:
> {code:java}
> 2024-02-26 13:52:40,716 ERROR [r.default.FPBQ.Fifo.handler=216,queue=8,port=x] coprocessor.CoprocessorHost - The coprocessor org.apache.phoenix.hbase.index.IndexRegionObserver threw java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.phoenix.hbase.index.IndexRegionObserver.postBatchMutateIndispensably(IndexRegionObserver.java:1301)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1028)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1025)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:558)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:631)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1025)
> at org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:4134)
> at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4573)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4447)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4369)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1033)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:951)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:916)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:415)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
> at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
> at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
> 2024-02-26 13:52:40,725 ERROR [r.default.FPBQ.Fifo.handler=216,queue=8,port=x] regionserver.HRegionServer - * ABORTING region server ,x,1708268161243: The coprocessor org.apache.phoenix.hbase.index.IndexRegionObserver threw java.lang.NullPointerException *
> java.lang.NullPointerException
> at org.apache.phoenix.hbase.index.IndexRegionObserver.postBatchMutateIndispensably(IndexRegionObserver.java:1301)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1028)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1025)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:558)
> at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:631)
> at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1025)
> at org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:4134)
> at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4573)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4447)
> at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:4369)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1033)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:951)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:916)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2892)
> at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
[jira] [Created] (PHOENIX-7228) Document global uncovered and partial Indexes
Kadir Ozdemir created PHOENIX-7228: -- Summary: Document global uncovered and partial Indexes Key: PHOENIX-7228 URL: https://issues.apache.org/jira/browse/PHOENIX-7228 Project: Phoenix Issue Type: Task Reporter: Kadir Ozdemir Fix For: 5.2.0 The two new global secondary index features, uncovered indexes and partial indexes, have been committed to the master branch and will be released as part of the upcoming 5.2.0 release. The [web page for secondary indexes|https://phoenix.apache.org/secondary_indexing.html] should be updated with these new features. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7214) Purging expired rows during minor compaction for immutable tables
[ https://issues.apache.org/jira/browse/PHOENIX-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7214: --- Description: HBase minor compaction does not remove deleted or expired cells since minor compaction works on a subset of HFiles. However, it is safe to remove expired rows for immutable tables. For immutable tables, rows are inserted but not updated. This means a given row will have only one version, so we can safely remove expired rows during minor compaction using CompactionScanner in Phoenix. CompactionScanner currently runs only for major compaction. We can introduce a new table attribute called MINOR_COMPACT_TTL. Phoenix can run CompactionScanner for minor compaction too for tables with MINOR_COMPACT_TTL = TRUE. By doing so, expired rows will be purged during minor compaction for these tables. This will be useful when TTL is less than 7 days, say 2 days, as major compaction typically runs only once a week. was: HBase minor compaction does not remove deleted or expired cells since minor compaction works on a subset of HFiles. However, it is safe to remove expired rows for immutable tables without delete markers. For immutable tables, rows are inserted but not updated. This means a given row will have only one version. If these rows are not deleted by using delete markers but purged with TTL, then we can safely remove expired rows during minor compaction using CompactionScanner in Phoenix. CompactionScanner currently runs only for major compaction. We can introduce a new table attribute called MINOR_COMPACT_TTL. Phoenix can run CompactionScanner for minor compaction too for tables with MINOR_COMPACT_TTL = TRUE. By doing so, expired rows will be purged during minor compaction for these tables. This will be useful when TTL is less than 7 days, say 2 days, as major compaction typically runs only once a week.
> Purging expired rows during minor compaction for immutable tables > - > > Key: PHOENIX-7214 > URL: https://issues.apache.org/jira/browse/PHOENIX-7214 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > HBase minor compaction does not remove deleted or expired cells since > minor compaction works on a subset of HFiles. However, it is safe to remove > expired rows for immutable tables. For immutable tables, rows are inserted > but not updated. This means a given row will have only one version, so > we can safely remove expired rows during minor compaction using > CompactionScanner in Phoenix. > CompactionScanner currently runs only for major compaction. We can introduce > a new table attribute called MINOR_COMPACT_TTL. Phoenix can run > CompactionScanner for minor compaction too for tables with > MINOR_COMPACT_TTL = TRUE. By doing so, expired rows will be purged during > minor compaction for these tables. This will be useful when TTL is less than > 7 days, say 2 days, as major compaction typically runs only once a week. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7214) Purging expired rows during minor compaction for immutable tables
Kadir Ozdemir created PHOENIX-7214: -- Summary: Purging expired rows during minor compaction for immutable tables Key: PHOENIX-7214 URL: https://issues.apache.org/jira/browse/PHOENIX-7214 Project: Phoenix Issue Type: New Feature Reporter: Kadir Ozdemir HBase minor compaction does not remove deleted or expired cells since minor compaction works on a subset of HFiles. However, it is safe to remove expired rows for immutable tables without delete markers. For immutable tables, rows are inserted but not updated. This means a given row will have only one version. If these rows are not deleted by using delete markers but purged with TTL, then we can safely remove expired rows during minor compaction using CompactionScanner in Phoenix. CompactionScanner currently runs only for major compaction. We can introduce a new table attribute called MINOR_COMPACT_TTL. Phoenix can run CompactionScanner for minor compaction too for tables with MINOR_COMPACT_TTL = TRUE. By doing so, expired rows will be purged during minor compaction for these tables. This will be useful when TTL is less than 7 days, say 2 days, as major compaction typically runs only once a week. -- This message was sent by Atlassian Jira (v8.20.10#820010)
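The purge decision the ticket describes reduces to a simple rule: for an immutable table every row has a single version and no delete markers, so a minor compaction scanner can safely drop any cell older than the TTL. A minimal sketch of that rule, with illustrative names (`Cell`, `ttl_ms`) that are not Phoenix's actual CompactionScanner API:

```python
from dataclasses import dataclass

@dataclass
class Cell:
    row: str
    ts_ms: int  # cell timestamp in milliseconds

def minor_compact(cells, ttl_ms, now_ms):
    """Drop expired cells during a simulated minor compaction.

    Safe only under the ticket's immutability assumption: one
    version per row, no delete markers, as proposed for tables
    with MINOR_COMPACT_TTL = TRUE.
    """
    return [c for c in cells if now_ms - c.ts_ms < ttl_ms]

now = 1_000_000_000
cells = [Cell("a", now - 1_000), Cell("b", now - 200_000)]
kept = minor_compact(cells, ttl_ms=100_000, now_ms=now)
```

For mutable tables the same rule would be unsafe, since an expired older version may coexist with a live newer one in a different HFile that the minor compaction cannot see.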
[jira] [Resolved] (PHOENIX-7165) getTable should retrieve PTable from server if not cached
[ https://issues.apache.org/jira/browse/PHOENIX-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-7165. Resolution: Fixed > getTable should retrieve PTable from server if not cached > - > > Key: PHOENIX-7165 > URL: https://issues.apache.org/jira/browse/PHOENIX-7165 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If > the table is not in the cache, it throws TableNotFoundException. > PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the > table is not in the cache, retrieves the table from the server. > Since a user table can be evicted from the cache at any time, all getTable > methods should retrieve PTable objects from the server if they are not cached > on the client side. > Phoenix internal operations should use the PhoenixConnection getTable operations > instead of the PhoenixRuntime ones. The PhoenixRuntime getTable operations should > simply call the corresponding PhoenixConnection getTable operations in order > to unify the implementation of all getTable operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7001: --- Description: The use cases for a Change Data Capture (CDC) feature are centered around capturing changes to a given table (or updatable view) as these changes happen in near real-time. A CDC application can retrieve changes in real-time or with some delay, or even retrieve the same set of changes multiple times. This means the CDC use case can be generalized as time range queries, where the time range is typically short, such as the last x minutes or hours, or expressed as a specific time range in the last n days, where n is typically less than 7. A change is an update in a row. That is, a change either updates one or more columns of a table for a given row or deletes a row. It is desirable to provide these changes in the order of their arrival. One can visualize the delivery of these changes through a stream from a Phoenix table to the application, initiated by the application, similar to the delivery of any other Phoenix query results. The difference is that a regular query result includes at most one result row for each row satisfying the query, and deleted rows are not visible in the query result, while the CDC stream/result can include multiple result rows for each row and also includes deleted rows. Some use cases also need to get the pre and/or post image of the row along with a change on the row. The design proposed here leverages Phoenix Max Lookback and Uncovered (Global or Local) Indexes. The max lookback feature retains recent changes to a table, that is, the changes that have been made typically in the last x days. This means that the max lookback feature already captures the changes to a given table. Currently, the max lookback age is configurable at the cluster level.
We need to extend this capability so that the max lookback age can be configured at the table level, allowing each table to have a different max lookback age based on its CDC application requirements. To deliver the changes in the order of their arrival, we need a time based index. This index should be uncovered, as the changes are already retained in the table by the max lookback feature. The arrival time will be defined as the mutation timestamp generated by the server. An uncovered index allows efficient, ordered access to the changes. Changes to an index table are also preserved by the max lookback feature. A CDC feature can be composed of the following components:
* {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an uncovered index used for CDC. It can inherit from UncoveredIndexRegionScanner. It goes through index table rows using a raw scan to identify data table rows and retrieves these rows using a raw scan. Using the time range, it forms a JSON blob to represent changes to the row, including pre and/or post row images.
* {*}CDC Query Compiler{*}: This is a client side component. It prepares the scan object based on the given CDC query statement.
* {*}CDC DDL Compiler{*}: This is a client side component. It creates the time based uncovered (global/local) index based on the given CDC DDL statement and a virtual table of CDC type. CDC will be a new table type.
A CDC DDL syntax to create CDC on a (data) table can be as follows: Create CDC on INCLUDE (pre | post | latest | all) INDEX = SALT_BUCKETS= The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC table PK columns start with the timestamp and continue with the data table PK columns. The CDC table includes one non-PK column, which is a JSON column. The change is expressed in this JSON column in multiple ways based on the CDC DDL or query statement.
The change can be expressed as just the mutation for the change, the latest image of the row, the pre image of the row (the image before the change), the post image, or any combination of these. The CDC table is not a physical table on disk. It is just a virtual table to be used in a CDC query. Phoenix stores just the metadata for this virtual table. A CDC query can be as follows: Select * from where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) This query would return the rows of the CDC table, which are constructed on the server side by CDCUncoveredIndexRegionScanner by joining the uncovered index row versions with the corresponding data table row versions (using raw scans). The above select query can be hinted, using a new CDC hint, to return just the actual change, the pre, post, or latest image of the row, or a combination of them, overriding the default JSON column format defined by the CDC DDL statement. The CDC application will run the above query in a loop. When the difference between the current time of the application and the upper limit
[jira] [Updated] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7001: --- Description: The use cases for a Change Data Capture (CDC) feature are centered around capturing changes to a given table (or updatable view) as these changes happen in near real-time. A CDC application can retrieve changes in real-time or with some delay, or even retrieve the same set of changes multiple times. This means the CDC use case can be generalized as time range queries, where the time range is typically short, such as the last x minutes or hours, or expressed as a specific time range in the last n days, where n is typically less than 7. A change is an update in a row. That is, a change either updates one or more columns of a table for a given row or deletes a row. It is desirable to provide these changes in the order of their arrival. One can visualize the delivery of these changes through a stream from a Phoenix table to the application, initiated by the application, similar to the delivery of any other Phoenix query results. The difference is that a regular query result includes at most one result row for each row satisfying the query, and deleted rows are not visible in the query result, while the CDC stream/result can include multiple result rows for each row and also includes deleted rows. Some use cases also need to get the pre and/or post image of the row along with a change on the row. The design proposed here leverages Phoenix Max Lookback and Uncovered (Global or Local) Indexes. The max lookback feature retains recent changes to a table, that is, the changes that have been made typically in the last x days. This means that the max lookback feature already captures the changes to a given table. Currently, the max lookback age is configurable at the cluster level.
We need to extend this capability so that the max lookback age can be configured at the table level, allowing each table to have a different max lookback age based on its CDC application requirements. To deliver the changes in the order of their arrival, we need a time based index. This index should be uncovered, as the changes are already retained in the table by the max lookback feature. The arrival time will be defined as the mutation timestamp generated by the server. An uncovered index allows efficient, ordered access to the changes. Changes to an index table are also preserved by the max lookback feature. A CDC feature can be composed of the following components:
* {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an uncovered index used for CDC. It can inherit from UncoveredIndexRegionScanner. It goes through index table rows using a raw scan to identify data table rows and retrieves these rows using a raw scan. Using the time range, it forms a JSON blob to represent changes to the row, including pre and/or post row images.
* {*}CDC Query Compiler{*}: This is a client side component. It prepares the scan object based on the given CDC query statement.
* {*}CDC DDL Compiler{*}: This is a client side component. It creates the time based uncovered (global/local) index based on the given CDC DDL statement and a virtual table of CDC type. CDC will be a new table type.
A CDC DDL syntax to create CDC on a (data) table can be as follows: Create CDC on INCLUDE (pre | post | latest | all) MAX_LOOKBACK_AGE = INDEX = SALT_BUCKETS= The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC table PK columns start with the timestamp and continue with the data table PK columns. The CDC table includes one non-PK column, which is a JSON column. The change is expressed in this JSON column in multiple ways based on the CDC DDL or query statement.
The change can be expressed as just the mutation for the change, the latest image of the row, the pre image of the row (the image before the change), the post image, or any combination of these. The CDC table is not a physical table on disk. It is just a virtual table to be used in a CDC query. Phoenix stores just the metadata for this virtual table. A CDC query can be as follows: Select * from where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) This query would return the rows of the CDC table, which are constructed on the server side by CDCUncoveredIndexRegionScanner by joining the uncovered index row versions with the corresponding data table row versions (using raw scans). The above select query can be hinted, using a new CDC hint, to return just the actual change, the pre, post, or latest image of the row, or a combination of them, overriding the default JSON column format defined by the CDC DDL statement. The CDC application will run the above query in a loop. When the difference between the current time of the application
[jira] [Updated] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7001: --- Description: The use cases for a Change Data Capture (CDC) feature are centered around capturing changes to a given table (or updatable view) as these changes happen in near real-time. A CDC application can retrieve changes in real-time or with some delay, or even retrieve the same set of changes multiple times. This means the CDC use case can be generalized as time range queries, where the time range is typically short, such as the last x minutes or hours, or expressed as a specific time range in the last n days, where n is typically less than 7. A change is an update in a row. That is, a change either updates one or more columns of a table for a given row or deletes a row. It is desirable to provide these changes in the order of their arrival. One can visualize the delivery of these changes through a stream from a Phoenix table to the application, initiated by the application, similar to the delivery of any other Phoenix query results. The difference is that a regular query result includes at most one result row for each row satisfying the query, and deleted rows are not visible in the query result, while the CDC stream/result can include multiple result rows for each row and also includes deleted rows. Some use cases also need to get the pre and/or post image of the row along with a change on the row. The design proposed here leverages Phoenix Max Lookback and Uncovered (Global or Local) Indexes. The max lookback feature retains recent changes to a table, that is, the changes that have been made typically in the last x days. This means that the max lookback feature already captures the changes to a given table. Currently, the max lookback age is configurable at the cluster level.
We need to extend this capability so that the max lookback age can be configured at the table level, allowing each table to have a different max lookback age based on its CDC application requirements. To deliver the changes in the order of their arrival, we need a time based index. This index should be uncovered, as the changes are already retained in the table by the max lookback feature. The arrival time will be defined as the mutation timestamp generated by the server. An uncovered index allows efficient, ordered access to the changes. Changes to an index table are also preserved by the max lookback feature. A CDC feature can be composed of the following components:
* {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an uncovered index used for CDC. It can inherit from UncoveredIndexRegionScanner. It goes through index table rows using a raw scan to identify data table rows and retrieves these rows using a raw scan. Using the time range, it forms a JSON blob to represent changes to the row, including pre and/or post row images.
* {*}CDC Query Compiler{*}: This is a client side component. It prepares the scan object based on the given CDC query statement.
* {*}CDC DDL Compiler{*}: This is a client side component. It creates the time based uncovered (global/local) index based on the given CDC DDL statement and a virtual table of CDC type. CDC will be a new table type.
A CDC DDL syntax to create CDC on a (data) table can be as follows: Create CDC on INCLUDE (pre | post | latest | all) TTL = INDEX = SALT_BUCKETS= The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC table PK columns start with the timestamp and continue with the data table PK columns. The CDC table includes one non-PK column, which is a JSON column. The change is expressed in this JSON column in multiple ways based on the CDC DDL or query statement.
The change can be expressed as just the mutation for the change, the latest image of the row, the pre image of the row (the image before the change), the post image, or any combination of these. The CDC table is not a physical table on disk. It is just a virtual table to be used in a CDC query. Phoenix stores just the metadata for this virtual table. A CDC query can be as follows: Select * from where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) This query would return the rows of the CDC table, which are constructed on the server side by CDCUncoveredIndexRegionScanner by joining the uncovered index row versions with the corresponding data table row versions (using raw scans). The above select query can be hinted, using a new CDC hint, to return just the actual change, the pre, post, or latest image of the row, or a combination of them, overriding the default JSON column format defined by the CDC DDL statement. The CDC application will run the above query in a loop. When the difference between the current time of the application and the upper
[jira] [Updated] (PHOENIX-7165) getTable should retrieve PTable from server if not cached
[ https://issues.apache.org/jira/browse/PHOENIX-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7165: --- Description: PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If the table is not in the cache, it throws TableNotFoundException. PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the table is not in the cache, retrieves the table from the server. Since a user table can be evicted from the cache at any time, all getTable methods should retrieve PTable objects from the server if they are not cached on the client side. Phoenix internal operations should use the PhoenixConnection getTable operations instead of the PhoenixRuntime ones. The PhoenixRuntime getTable operations should simply call the corresponding PhoenixConnection getTable operations in order to unify the implementation of all getTable operations. was: PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If the table is not in the cache, it throws TableNotFoundException. PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the table is not in the cache, retrieves the table from the server. Since a user table can be evicted from the cache at any time, Phoenix compilers should not use PhoenixConnection#getTable; instead they should use PhoenixRuntime#getTable. > getTable should retrieve PTable from server if not cached > - > > Key: PHOENIX-7165 > URL: https://issues.apache.org/jira/browse/PHOENIX-7165 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If > the table is not in the cache, it throws TableNotFoundException. > PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the > table is not in the cache, retrieves the table from the server. 
> Since a user table can be evicted from the cache any time, all getTable > methods should retrieve PTable objects from the server if they are not cached > on the client side. > Phoenix internal operations should use PhoenixConnection getTable operations > instead of PhoenixRuntime ones. PhoenixRuntime getTable operations should > simply call the corresponding PhoenixConnection getTable operations in order > to unify the implementation of all getTable operations. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7165) getTable should retrieve PTable from server if not cached
[ https://issues.apache.org/jira/browse/PHOENIX-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7165: --- Summary: getTable should retrieve PTable from server if not cached (was: Compilers should use PhoenixRuntime#getTable) > getTable should retrieve PTable from server if not cached > - > > Key: PHOENIX-7165 > URL: https://issues.apache.org/jira/browse/PHOENIX-7165 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If > the table is not in the cache, it throws TableNotFoundException. > PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the > table is not in the cache, retrieves the table from the server. > Since a user table can be evicted from the cache at any time, Phoenix compilers > should not use PhoenixConnection#getTable; instead they should use > PhoenixRuntime#getTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7170) Conditional TTL
[ https://issues.apache.org/jira/browse/PHOENIX-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7170: --- Description: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a conditional TTL feature to extend the TTL feature to expire a subset of the rows of a table or updatable view using a different TTL value. A conditional TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. For example, TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 AND 200 THEN <7 days> ELSE <5 days> END The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE statement on a row and decide if the row should be removed. Similarly, on the read path TTLRegionScanner can mask the rows using the CASE statement. The TTL CASE statement can be stored in SYSCAT in header rows. was: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a conditional TTL feature to extend the TTL feature to expire a subset of the rows of a table or updatable view using a different TTL value than the rest of the rows. A conditional TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. 
For example, TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 AND 200 THEN <7 days> ELSE <5 days> END The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE statement on a row and decide if the row should be removed. Similarly, on the read path TTLRegionScanner can mask the rows using the CASE statement. The TTL CASE statement can be stored in SYSCAT in header rows. > Conditional TTL > --- > > Key: PHOENIX-7170 > URL: https://issues.apache.org/jira/browse/PHOENIX-7170 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > Deleting rows using delete markers requires running delete queries to insert > them, one for each row to be deleted. Often applications need to run periodic > jobs to issue delete queries to insert delete markers. Deleting rows using > TTL is more efficient in Phoenix than adding delete markers > since TTL works without inserting delete markers. Phoenix currently > supports table-level and view-level TTL. It is desirable to have a conditional > TTL feature to extend the TTL feature to expire a subset of the rows of a table or > updatable view using a different TTL value. > A conditional TTL can be set using a CASE statement in CREATE and ALTER > statements by adding TTL=. For example, > TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 > AND 200 THEN <7 days> ELSE <5 days> END > The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE > statement on a row and decide if the row should be removed. Similarly, on the > read path TTLRegionScanner can mask the rows using the CASE statement. The > TTL CASE statement can be stored in SYSCAT in header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
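As a sketch, the CASE form described above might appear in DDL as follows. The table name and columns are hypothetical, and the concrete TTL values (in seconds) stand in for the <N days> placeholders; the issue leaves the exact syntax open:

```sql
-- Hypothetical table; TTL values are seconds (10, 7, and 5 days).
CREATE TABLE EVENTS (ID BIGINT PRIMARY KEY, PAYLOAD VARCHAR)
  TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN 864000    -- 10 days
             WHEN ID BETWEEN 101 AND 200 THEN 604800  -- 7 days
             ELSE 432000                              -- 5 days
        END;
```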
[jira] [Updated] (PHOENIX-7170) Conditional TTL
[ https://issues.apache.org/jira/browse/PHOENIX-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7170: --- Description: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a conditional TTL feature to extend the TTL feature to expire a subset of the rows of a table or updatable view using a different TTL value than the rest of the rows. A conditional TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. For example, TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 AND 200 THEN <7 days> ELSE <5 days> END The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE statement on a row and decide if the row should be removed. Similarly, on the read path TTLRegionScanner can mask the rows using the CASE statement. The TTL CASE statement can be stored in SYSCAT in header rows. was: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. A row-level TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. 
For example, TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 AND 200 THEN <7 days> ELSE <5 days> END The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE statement on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using the CASE statement. The TTL CASE statement can be stored in SYSCAT in header rows. > Conditional TTL > --- > > Key: PHOENIX-7170 > URL: https://issues.apache.org/jira/browse/PHOENIX-7170 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > Deleting rows using delete markers requires running delete queries to insert > them, one for each row to be deleted. Often applications need to run periodic > jobs to issue delete queries to insert delete markers. Deleting rows using > TTL is more efficient in Phoenix than adding delete markers > since TTL works without inserting delete markers. Phoenix currently > supports table-level and view-level TTL. It is desirable to have a conditional > TTL feature to extend the TTL feature to expire a subset of the rows of a table or > updatable view using a different TTL value than the rest of the rows. > A conditional TTL can be set using a CASE statement in CREATE and ALTER > statements by adding TTL=. For example, > TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 > AND 200 THEN <7 days> ELSE <5 days> END > The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE > statement on a row and decide if the row should be removed. Similarly, on the > read path TTLRegionScanner can mask the rows using the CASE statement. The > TTL CASE statement can be stored in SYSCAT in header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7170) Conditional TTL
[ https://issues.apache.org/jira/browse/PHOENIX-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7170: --- Summary: Conditional TTL (was: Phoenix Row TTL) > Conditional TTL > --- > > Key: PHOENIX-7170 > URL: https://issues.apache.org/jira/browse/PHOENIX-7170 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > Deleting rows using delete markers requires running delete queries to insert > them, one for each row to be deleted. Often applications need to run periodic > jobs to issue delete queries to insert delete markers. Deleting rows using > TTL is more efficient in Phoenix than adding delete markers > since TTL works without inserting delete markers. Phoenix currently > supports table-level and view-level TTL. It is desirable to have a row-level TTL > feature to extend the TTL feature to delete a subset of the rows of a table or > updatable view. > A row-level TTL can be set using a CASE statement in CREATE and ALTER > statements by adding TTL=. For example, > TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 > AND 200 THEN <7 days> ELSE <5 days> END > The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE > statement on a row and decide if the row should be deleted. Similarly, on the > read path TTLRegionScanner can mask the deleted rows using the CASE > statement. The TTL CASE statement can be stored in SYSCAT in header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7170) Phoenix Row TTL
[ https://issues.apache.org/jira/browse/PHOENIX-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7170: --- Description: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. A row-level TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. For example, TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 AND 200 THEN <7 days> ELSE <5 days> END The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE statement on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using the CASE statement. The TTL CASE statement can be stored in SYSCAT in header rows. was: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. A row-level TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. For example, ROW_TTL = . As for partial indexes, the where clause should be evaluable on a single row. 
The compaction scanner (CompactionScanner) in Phoenix can evaluate a row-TTL where clause on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using row-TTL where clauses. The row-TTL where clauses can be stored in SYSCAT in header rows. > Phoenix Row TTL > --- > > Key: PHOENIX-7170 > URL: https://issues.apache.org/jira/browse/PHOENIX-7170 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > Deleting rows using delete markers requires running delete queries to insert > them, one for each row to be deleted. Often applications need to run periodic > jobs to issue delete queries to insert delete markers. Deleting rows using > TTL is more efficient in Phoenix than adding delete markers > since TTL works without inserting delete markers. Phoenix currently > supports table-level and view-level TTL. It is desirable to have a row-level TTL > feature to extend the TTL feature to delete a subset of the rows of a table or > updatable view. > A row-level TTL can be set using a CASE statement in CREATE and ALTER > statements by adding TTL=. For example, > TTL = CASE WHEN ID BETWEEN 1 AND 100 THEN <10 days> WHEN ID BETWEEN 101 > AND 200 THEN <7 days> ELSE <5 days> END > The compaction scanner (CompactionScanner) in Phoenix can evaluate the CASE > statement on a row and decide if the row should be deleted. Similarly, on the > read path TTLRegionScanner can mask the deleted rows using the CASE > statement. The TTL CASE statement can be stored in SYSCAT in header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7170) Phoenix Row TTL
[ https://issues.apache.org/jira/browse/PHOENIX-7170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7170: --- Description: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. A row-level TTL can be set using a CASE statement in CREATE and ALTER statements by adding TTL=. For example, ROW_TTL = . As for partial indexes, the where clause should be evaluable on a single row. The compaction scanner (CompactionScanner) in Phoenix can evaluate a row-TTL where clause on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using row-TTL where clauses. The row-TTL where clauses can be stored in SYSCAT in header rows. was: Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. As in partial indexes, a row-level TTL can be set using a where clause. This clause can be set using CREATE and ALTER statements by adding ROW_TTL=(). 
For example, ROW_TTL = (WHERE CURRENT_TIME() - LAST_UPDATE > 864 AND ID BETWEEN 1 AND 100) where LAST_UPDATE and ID are columns of the table (or updatable view). As for partial indexes, the where clause should be evaluable on a single row. The compaction scanner (CompactionScanner) in Phoenix can evaluate a row-TTL where clause on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using row-TTL where clauses. The row-TTL where clauses can be stored in SYSCAT in header rows. > Phoenix Row TTL > --- > > Key: PHOENIX-7170 > URL: https://issues.apache.org/jira/browse/PHOENIX-7170 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Priority: Major > > Deleting rows using delete markers requires running delete queries to insert > them, one for each row to be deleted. Often applications need to run periodic > jobs to issue delete queries to insert delete markers. Deleting rows using > TTL is more efficient in Phoenix than adding delete markers > since TTL works without inserting delete markers. Phoenix currently > supports table-level and view-level TTL. It is desirable to have a row-level TTL > feature to extend the TTL feature to delete a subset of the rows of a table or > updatable view. > A row-level TTL can be set using a CASE statement in CREATE and ALTER > statements by adding TTL=. For example, ROW_TTL = . As > for partial indexes, the where clause should be evaluable on a single row. > The compaction scanner (CompactionScanner) in Phoenix can evaluate a > row-TTL where clause on a row and decide if the row should be deleted. > Similarly, on the read path TTLRegionScanner can mask the deleted rows using > row-TTL where clauses. The row-TTL where clauses can be stored in SYSCAT in > header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7170) Phoenix Row TTL
Kadir Ozdemir created PHOENIX-7170: -- Summary: Phoenix Row TTL Key: PHOENIX-7170 URL: https://issues.apache.org/jira/browse/PHOENIX-7170 Project: Phoenix Issue Type: New Feature Reporter: Kadir Ozdemir Deleting rows using delete markers requires running delete queries to insert them, one for each row to be deleted. Often applications need to run periodic jobs to issue delete queries to insert delete markers. Deleting rows using TTL is more efficient in Phoenix than adding delete markers since TTL works without inserting delete markers. Phoenix currently supports table-level and view-level TTL. It is desirable to have a row-level TTL feature to extend the TTL feature to delete a subset of the rows of a table or updatable view. As in partial indexes, a row-level TTL can be set using a where clause. This clause can be set using CREATE and ALTER statements by adding ROW_TTL=(). For example, ROW_TTL = (WHERE CURRENT_TIME() - LAST_UPDATE > 864 AND ID BETWEEN 1 AND 100) where LAST_UPDATE and ID are columns of the table (or updatable view). As for partial indexes, the where clause should be evaluable on a single row. The compaction scanner (CompactionScanner) in Phoenix can evaluate a row-TTL where clause on a row and decide if the row should be deleted. Similarly, on the read path TTLRegionScanner can mask the deleted rows using row-TTL where clauses. The row-TTL where clauses can be stored in SYSCAT in header rows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7165) Compilers should use PhoenixRuntime#getTable
[ https://issues.apache.org/jira/browse/PHOENIX-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-7165: -- Assignee: Kadir Ozdemir > Compilers should use PhoenixRuntime#getTable > > > Key: PHOENIX-7165 > URL: https://issues.apache.org/jira/browse/PHOENIX-7165 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If > the table is not in the cache, it throws TableNotFoundException. > PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the > table is not in the cache, retrieves the table from the server. > Since a user table can be evicted from the cache at any time, Phoenix compilers > should not use PhoenixConnection#getTable; instead they should use > PhoenixRuntime#getTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-7165) Compiler should use PhoenixRuntime#getTable
Kadir Ozdemir created PHOENIX-7165: -- Summary: Compiler should use PhoenixRuntime#getTable Key: PHOENIX-7165 URL: https://issues.apache.org/jira/browse/PHOENIX-7165 Project: Phoenix Issue Type: Bug Reporter: Kadir Ozdemir PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If the table is not in the cache, it throws TableNotFoundException. PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the table is not in the cache, retrieves the table from the server. Since a user table can be evicted from the cache at any time, Phoenix compilers should not use PhoenixConnection#getTable; instead they should use PhoenixRuntime#getTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-7165) Compilers should use PhoenixRuntime#getTable
[ https://issues.apache.org/jira/browse/PHOENIX-7165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7165: --- Summary: Compilers should use PhoenixRuntime#getTable (was: Compiler should use PhoenixRuntime#getTable) > Compilers should use PhoenixRuntime#getTable > > > Key: PHOENIX-7165 > URL: https://issues.apache.org/jira/browse/PHOENIX-7165 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Priority: Major > > PhoenixConnection#getTable gets a PTable from the per-JVM PTable cache. If > the table is not in the cache, it throws TableNotFoundException. > PhoenixRuntime#getTable first calls PhoenixConnection#getTable and, if the > table is not in the cache, retrieves the table from the server. > Since a user table can be evicted from the cache at any time, Phoenix compilers > should not use PhoenixConnection#getTable; instead they should use > PhoenixRuntime#getTable. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PHOENIX-7032) Partial Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-7032. Resolution: Resolved > Partial Global Secondary Indexes > > > Key: PHOENIX-7032 > URL: https://issues.apache.org/jira/browse/PHOENIX-7032 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > The secondary indexes supported in Phoenix have been full indexes such that > for every data table row there is an index row. Generating an index row for > every data table row is not always required. For example, some use cases do > not require index rows for the data table rows in which indexed column values > are null. Such indexes are called sparse indexes. Partial indexes generalize > the concept of sparse indexing and allow users to specify the subset of the > data table rows for which index rows will be maintained. This subset is > specified using a WHERE clause added to the CREATE INDEX DDL statement. > Partial secondary indexes were first proposed by Michael Stonebraker > [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]. Since then several SQL > databases (e.g., > [Postgres|https://www.postgresql.org/docs/current/indexes-partial.html] and > [SQLite|https://www.sqlite.org/partialindex.html]) and NoSQL databases > (e.g., [MongoDB|https://www.mongodb.com/docs/manual/core/index-partial/]) > have supported some form of partial indexes. It is challenging to allow > arbitrary WHERE clauses in DDL statements. For example, Postgres does not > allow subqueries in these where clauses and SQLite supports much more > restrictive where clauses. > Supporting arbitrary where clauses creates challenges for query optimizers in > deciding the usability of a partial index for a given query. If the set of > data table rows that satisfy the query is a subset of the data table rows > that the partial index points back to, then the query can use the index. 
Thus, > the query optimizer has to decide if the WHERE clause of the query implies > the WHERE clause of the index. > Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf] > suggests that an index WHERE clause is a conjunct of simple terms, i.e., > i-clause-1 and i-clause-2 and ... and i-clause-m where each clause is of the > form field operator constant. Hence, the qualification can be > evaluated for each tuple in the indicated relation without consulting > additional tuples. > Phoenix partial indexes will initially support a more general set of index > WHERE clauses that can be evaluated on a single row, with the following > exceptions: > * Subqueries are not allowed. > * Like expressions are allowed with very limited support, such that an index > WHERE clause with like expressions can imply/contain a query only if the query has > the same like expressions as the index WHERE clause. > * Comparisons between columns are allowed without supporting transitivity; > for example, a > b and b > c does not imply a > c. > Partial indexes will initially be supported for global secondary indexes, > i.e., covered global indexes and uncovered global indexes. Local > secondary indexes will be supported in the future. -- This message was sent by Atlassian Jira (v8.20.10#820010)
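A minimal sketch of the DDL this describes, with a hypothetical data table and index (the names and columns are illustrative, not from the issue); the WHERE clause must be evaluable on a single row and may not contain subqueries:

```sql
-- Hypothetical table and partial index; only rows matching the WHERE
-- clause get index rows. A query whose WHERE clause implies
-- AMOUNT > 1000 AND STATUS = 'OPEN' can be served by this index.
CREATE UNCOVERED INDEX IDX_BIG_OPEN_ORDERS
ON ORDERS (CUSTOMER_ID, AMOUNT)
WHERE AMOUNT > 1000 AND STATUS = 'OPEN';
```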
[jira] [Updated] (PHOENIX-7032) Partial Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7032: --- Description: The secondary indexes supported in Phoenix have been full indexes such that for every data table row there is an index row. Generating an index row for every data table row is not always required. For example, some use cases do not require index rows for the data table rows in which indexed column values are null. Such indexes are called sparse indexes. Partial indexes generalize the concept of sparse indexing and allow users to specify the subset of the data table rows for which index rows will be maintained. This subset is specified using a WHERE clause added to the CREATE INDEX DDL statement. Partial secondary indexes were first proposed by Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]. Since then several SQL databases (e.g., [Postgres|https://www.postgresql.org/docs/current/indexes-partial.html] and [SQLite|https://www.sqlite.org/partialindex.html]) and NoSQL databases (e.g., [MongoDB|https://www.mongodb.com/docs/manual/core/index-partial/]) have supported some form of partial indexes. It is challenging to allow arbitrary WHERE clauses in DDL statements. For example, Postgres does not allow subqueries in these where clauses and SQLite supports much more restrictive where clauses. Supporting arbitrary where clauses creates challenges for query optimizers in deciding the usability of a partial index for a given query. If the set of data table rows that satisfy the query is a subset of the data table rows that the partial index points back to, then the query can use the index. Thus, the query optimizer has to decide if the WHERE clause of the query implies the WHERE clause of the index. Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf] suggests that an index WHERE clause is a conjunct of simple terms, i.e., i-clause-1 and i-clause-2 and ... 
and i-clause-m where each clause is of the form field operator constant. Hence, the qualification can be evaluated for each tuple in the indicated relation without consulting additional tuples. Phoenix partial indexes will initially support a more general set of index WHERE clauses that can be evaluated on a single row, with the following exceptions: * Subqueries are not allowed. * Like expressions are allowed with very limited support, such that an index WHERE clause with like expressions can imply/contain a query only if the query has the same like expressions as the index WHERE clause. * Comparisons between columns are allowed without supporting transitivity; for example, a > b and b > c does not imply a > c. Partial indexes will initially be supported for global secondary indexes, i.e., covered global indexes and uncovered global indexes. Local secondary indexes will be supported in the future. was: The secondary indexes supported in Phoenix have been full indexes such that for every data table row there is an index row. Generating an index row for every data table row is not always required. For example, some use cases do not require index rows for the data table rows in which indexed column values are null. Such indexes are called sparse indexes. Partial indexes generalize the concept of sparse indexing and allow users to specify the subset of the data table rows for which index rows will be maintained. This subset is specified using a WHERE clause added to the CREATE INDEX DDL statement. Partial secondary indexes were first proposed by Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]. Since then several SQL databases (e.g., [Postgres|https://www.postgresql.org/docs/current/indexes-partial.html] and [SQLite|https://www.sqlite.org/partialindex.html]) and NoSQL databases (e.g., [MongoDB|https://www.mongodb.com/docs/manual/core/index-partial/]) have supported some form of partial indexes. It is challenging to allow arbitrary WHERE clauses in DDL statements. 
For example, Postgres does not allow subqueries in these where clauses and SQLite supports much more restrictive where clauses. Supporting arbitrary where clauses creates challenges for query optimizers in deciding the usability of a partial index for a given query. If the set of data table rows that satisfy the query is a subset of the data table rows that the partial index points back to, then the query can use the index. Thus, the query optimizer has to decide if the WHERE clause of the query implies the WHERE clause of the index. Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf] suggests that an index WHERE clause is a conjunct of simple terms, i.e., i-clause-1 and i-clause-2 and ... and i-clause-m where each clause is of the form field operator constant. Hence, the qualification can be evaluated for each tuple in the indicated relation without
[jira] [Updated] (PHOENIX-7032) Partial Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-7032: --- Summary: Partial Global Secondary Indexes (was: Partial Secondary Indexes) > Partial Global Secondary Indexes > > > Key: PHOENIX-7032 > URL: https://issues.apache.org/jira/browse/PHOENIX-7032 > Project: Phoenix > Issue Type: New Feature >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > The secondary indexes supported in Phoenix have been full indexes such that > for every data table row there is an index row. Generating an index row for > every data table row is not always required. For example, some use cases do > not require index rows for the data table rows in which indexed column values > are null. Such indexes are called sparse indexes. Partial indexes generalize > the concept of sparse indexing and allow users to specify the subset of the > data table rows for which index rows will be maintained. This subset is > specified using a WHERE clause added to the CREATE INDEX DDL statement. > Partial secondary indexes were first proposed by Michael Stonebraker > [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf]. Since then several SQL > databases (e.g., > [Postgres|https://www.postgresql.org/docs/current/indexes-partial.html] and > [SQLite|https://www.sqlite.org/partialindex.html]) and NoSQL databases > (e.g., [MongoDB|https://www.mongodb.com/docs/manual/core/index-partial/]) > have supported some form of partial indexes. It is challenging to allow > arbitrary WHERE clauses in DDL statements. For example, Postgres does not > allow subqueries in these where clauses and SQLite supports much more > restrictive where clauses. > Supporting arbitrary where clauses creates challenges for query optimizers in > deciding the usability of a partial index for a given query. 
If the set of > data table rows that satisfy the query is a subset of the data table rows > that the partial index points back to, then the query can use the index. Thus, > the query optimizer has to decide if the WHERE clause of the query implies > the WHERE clause of the index. > Michael Stonebraker [here|https://dsf.berkeley.edu/papers/ERL-M89-17.pdf] > suggests that an index WHERE clause be a conjunct of simple terms, i.e., > i-clause-1 AND i-clause-2 AND ... AND i-clause-m, where each clause is of the > form <field> <operator> <constant>. Hence, the qualification can be evaluated for > each tuple in the indicated relation without consulting additional tuples. > The first implementation of Phoenix partial indexes will support a more > general set of index WHERE clauses, in which simple terms, each of the form > <column> <operator> <constant>, are connected through any > combination of AND and OR operators. Formally, an allowed index WHERE clause > can be represented by any expression tree such that non-leaf nodes are AND, > OR, or NOT operators, and leaf nodes are simple terms, each of the form > <column> <operator> <constant>, where a column is a data table > column, an operator is a comparison operator, and a constant is a value from > the domain of the column. > Partial indexes will be supported for all index types, that is, for local > indexes, covered global indexes, and uncovered global indexes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
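The implication check described above (a query can use a partial index only if the query's WHERE clause implies the index's WHERE clause) can be illustrated for the simple conjunct-of-terms case. This is a toy model, not Phoenix's optimizer code; the `Term` class and the numeric-comparison handling are assumptions made for the sketch:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    """A simple term of the form <column> <operator> <constant>."""
    column: str
    op: str       # one of <, <=, >, >=, =
    value: float

def term_implies(q: Term, i: Term) -> bool:
    """True if every row satisfying query term q also satisfies index term i."""
    if q.column != i.column:
        return False
    if i.op == '>':
        return (q.op == '>' and q.value >= i.value) or \
               (q.op in ('>=', '=') and q.value > i.value)
    if i.op == '>=':
        return q.op in ('>', '>=', '=') and q.value >= i.value
    if i.op == '<':
        return (q.op == '<' and q.value <= i.value) or \
               (q.op in ('<=', '=') and q.value < i.value)
    if i.op == '<=':
        return q.op in ('<', '<=', '=') and q.value <= i.value
    if i.op == '=':
        return q.op == '=' and q.value == i.value
    return False

def query_can_use_index(query_terms, index_terms) -> bool:
    """A conjunct query WHERE clause implies a conjunct index WHERE clause
    iff every index term is implied by at least one query term; then the
    data rows matching the query are a subset of those the index covers."""
    return all(any(term_implies(q, i) for q in query_terms)
               for i in index_terms)
```

Extending this to the arbitrary AND/OR/NOT expression trees the Jira proposes requires a more general implication test over the tree structure rather than this per-term subset check.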
[jira] [Created] (PHOENIX-7032) Partial Secondary Indexes
Kadir Ozdemir created PHOENIX-7032: -- Summary: Partial Secondary Indexes Key: PHOENIX-7032 URL: https://issues.apache.org/jira/browse/PHOENIX-7032 Project: Phoenix Issue Type: New Feature Reporter: Kadir Ozdemir Assignee: Kadir Ozdemir -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-7018) Server side index maintainer caching for read, write, and replication
[ https://issues.apache.org/jira/browse/PHOENIX-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-7018: -- Assignee: Viraj Jasani > Server side index maintainer caching for read, write, and replication > - > > Key: PHOENIX-7018 > URL: https://issues.apache.org/jira/browse/PHOENIX-7018 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Viraj Jasani >Priority: Major
[jira] [Created] (PHOENIX-7018) Server side index maintainer caching for read, write, and replication
Kadir Ozdemir created PHOENIX-7018: -- Summary: Server side index maintainer caching for read, write, and replication Key: PHOENIX-7018 URL: https://issues.apache.org/jira/browse/PHOENIX-7018 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir The relationship between a data table and its index is somewhat involved. Phoenix needs to transform a data table row to the corresponding index row, extract a data table row key (i.e., a primary key) from an index table row key (a secondary key), and map data table columns to index table included columns. The metadata for these operations and the operations themselves are encapsulated in the class called IndexMaintainer. Phoenix creates a separate IndexMaintainer object for each index table in memory on the client side. IndexMaintainer objects are then serialized using the protobuf library and sent to servers along with the mutations on the data tables and scans on the index tables. The Phoenix server code (more accurately, Phoenix coprocessors) then uses IndexMaintainer objects to update indexes and leverage indexes for queries. Phoenix coprocessors use the IndexMaintainer objects associated with a given batch of mutations or scan operation only once (i.e., for that batch or scan), yet the Phoenix client sends these objects along with every batch of mutations and every scan. Secondary indexes are used to improve the performance of queries on the secondary index columns. Secondary indexes are required to be consistent with their data tables. Consistency here means that regardless of whether a query is served from a data table or index table, the same result is returned.
A given data table row and the corresponding index table row are likely served by different region servers, and the WALs of these region servers are replicated independently. This means these rows can arrive at different times, which makes data and index tables inconsistent at the destination cluster. Replicating global indexes leads to inefficient use of the replication bandwidth due to the additional overhead of replicating data that can actually be derived from data that has already been replicated. When one considers that an index table is essentially a copy of its data table without the columns that are not included in the index, and that a given data table can have multiple indexes, it is easy to see that replicating indexes can easily double the replication bandwidth requirement for a given data table. A solution for eliminating index table replication is to add just enough metadata to WAL records for the mutations of data tables with indexes and have a replication endpoint and coprocessor endpoint generate index mutations from these records; see PHOENIX-5315. This document extends that solution not only to eliminate index table replication but also to improve the read and write paths for index tables by maintaining consistent server side caching for index maintainers. The idea behind the proposed solution is to cache the index maintainers on the server side and thus eliminate transferring index maintainers during reads and writes as well as replication. The coprocessors that currently require the index maintainers are IndexRegionObserver for the write path and some other coprocessors, including GlobalIndexChecker for read repair, in the read path. The proposed solution leverages the existing capability of adding IndexMaintainer objects to the server side cache implemented by ServerCachingEndpointImpl. The design eliminates global index table replication and also eliminates the server side cache update with IndexMaintainer objects for each batch write.
IndexRegionObserver (the coprocessor that generates index mutations from data table mutations) needs to access IndexMaintainer objects for the indexes on a table or view. The metadata transferred as a mutation attribute will be used to identify the table or view to which a mutation belongs. The metadata will include the tenant Id, table schema, and table name. The cache key for the array of index maintainers for this table or view will be formed from this metadata. When IndexRegionObserver intercepts a mutation on an HBase table (using the preBatchMutate coprocessor hook), it will form the cache key for the array of index maintainers and retrieve it from the server cache. This design requires maintaining metadata caches at region servers. These caches need to be consistent, that is, they should not hold stale metadata. To ensure this, when MetaDataEndpointImpl updates index
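The cache keying scheme described above (tenant Id, table schema, and table name mapped to the table's index maintainers) can be sketched as follows. The class and attribute names here are hypothetical stand-ins; a real implementation would live in the Java region-server coprocessor, and the invalidation path the Jira mentions (driven by MetaDataEndpointImpl on DDL changes) is only hinted at by the `invalidate` method:

```python
from typing import Dict, List, Optional, Tuple

# Stand-in for a deserialized IndexMaintainer; illustrative only.
IndexMaintainer = dict

CacheKey = Tuple[Optional[str], str, str]  # (tenant id, schema, table name)

class ServerMetadataCache:
    """Per-region-server cache mapping a table or view to the array of
    index maintainers for its indexes, so clients need not ship them
    with every batch of mutations and every scan."""

    def __init__(self) -> None:
        self._cache: Dict[CacheKey, List[IndexMaintainer]] = {}

    def key_from_mutation_attrs(self, attrs: dict) -> CacheKey:
        # Built from the metadata carried as a mutation attribute.
        return (attrs.get("tenantId"), attrs["schema"], attrs["table"])

    def get(self, key: CacheKey):
        return self._cache.get(key)

    def put(self, key: CacheKey, maintainers: List[IndexMaintainer]) -> None:
        self._cache[key] = maintainers

    def invalidate(self, key: CacheKey) -> None:
        # Called when index metadata changes, so the cache never serves
        # stale maintainers.
        self._cache.pop(key, None)
```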
[jira] [Created] (PHOENIX-7001) Change Data Capture leveraging Max Lookback and Uncovered Indexes
Kadir Ozdemir created PHOENIX-7001: -- Summary: Change Data Capture leveraging Max Lookback and Uncovered Indexes Key: PHOENIX-7001 URL: https://issues.apache.org/jira/browse/PHOENIX-7001 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir The use cases for a Change Data Capture (CDC) feature are centered around capturing changes to a given table (or updatable view) as these changes happen in near real-time. A CDC application can retrieve changes in real-time or with some delay, or even retrieve the same set of changes multiple times. This means the CDC use case can be generalized as time range queries, where the time range is typically short, such as the last x minutes or hours, or expressed as a specific time range in the last n days, where n is typically less than 7. A change is an update in a row. That is, a change is either updating one or more columns of a table for a given row or deleting a row. It is desirable to provide these changes in the order of their arrival. One can visualize the delivery of these changes through a stream from a Phoenix table to the application that is initiated by the application, similar to the delivery of any other Phoenix query results. The difference is that a regular query result includes at most one result row for each row satisfying the query, and deleted rows are not visible in the query result, while the CDC stream/result can include multiple result rows for each row, and its result includes deleted rows. Some use cases also need to get the pre and/or post image of the row along with a change on the row. The design proposed here leverages Phoenix Max Lookback and Uncovered (Global or Local) Indexes. The max lookback feature retains recent changes to a table, that is, typically the changes made in the last x days. This means that the max lookback feature already captures the changes to a given table. Currently, the max lookback age is configurable at the cluster level.
We need to extend this capability to be able to configure the max lookback age at the table level so that each table can have a different max lookback age based on its CDC application requirements. To deliver the changes in the order of their arrival, we need a time based index. This index should be uncovered, as the changes are already retained in the table by the max lookback feature. The arrival time can be defined as the mutation timestamp generated by the server, or a user-specified timestamp (or any other long integer) column. An uncovered index would give us efficient and ordered access to the changes. Changes to an index table are also preserved by the max lookback feature. A CDC feature can be composed of the following components:
* {*}CDCUncoveredIndexRegionScanner{*}: This is a server side scanner on an uncovered index used for CDC. This can inherit UncoveredIndexRegionScanner. It goes through index table rows using a raw scan to identify data table rows and retrieves these rows using a raw scan. Using the time range, it forms a JSON blob to represent changes to the row, including pre and/or post row images.
* {*}CDC Query Compiler{*}: This is a client side component. It prepares the scan object based on the given CDC query statement.
* {*}CDC DDL Compiler{*}: This is a client side component. It creates the time based uncovered (global/local) index based on the given CDC DDL statement and a virtual table of CDC type. CDC will be a new table type.
A CDC DDL syntax to create CDC on a (data) table can be as follows: Create CDC <cdc name> on <data table name> (PHOENIX_ROW_TIMESTAMP() | <column name>) INCLUDE (pre | post | latest | all) TTL = <value> INDEX = <index type> SALT_BUCKETS = <number>. The above CDC DDL creates a virtual CDC table and an uncovered index. The CDC table PK columns start with the timestamp or user defined column and continue with the data table PK columns. The CDC table includes one non-PK column which is a JSON column.
The change is expressed in this JSON column in multiple ways based on the CDC DDL or query statement. The change can be expressed as just the mutation for the change, the latest image of the row, the pre image of the row (the image before the change), the post image, or any combination of these. The CDC table is not a physical table on disk. It is just a virtual table to be used in a CDC query. Phoenix stores just the metadata for this virtual table. A CDC query can be as follows: Select * from <CDC table name> where PHOENIX_ROW_TIMESTAMP() >= TO_DATE( …) AND PHOENIX_ROW_TIMESTAMP() < TO_DATE( …) This query would return the rows of the CDC table, which are constructed on the server side by CDCUncoveredIndexRegionScanner by joining the uncovered index row versions with the corresponding data table row versions (using raw scans). The above select query can be hinted with a new CDC hint to return just the actual change, pre,
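A toy sketch of how pre and post images could be assembled into the JSON change column from retained row versions: the representation of row versions (a mapping from timestamp to full row image, as the max lookback feature would let a raw scan reconstruct) and the JSON shape are assumptions for illustration, not the actual Phoenix CDC format:

```python
import json

def cdc_change(row_versions, change_ts, include=("pre", "post")):
    """Build the JSON blob for the change at change_ts.

    row_versions: {timestamp: {column: value}} full row images retained
    within the max lookback window, as recovered by a raw scan.
    """
    pre = {}
    for ts in sorted(row_versions):
        if ts < change_ts:
            pre = dict(row_versions[ts])   # last image before the change
    post = dict(row_versions.get(change_ts, {}))  # image at the change
    out = {"timestamp": change_ts}
    if "pre" in include:
        out["pre_image"] = pre
    if "post" in include:
        out["post_image"] = post
    return json.dumps(out)
```

A deleted row would simply have an empty post image at the deletion timestamp, which is how a CDC result can surface deletes that a regular query never returns.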
[jira] [Resolved] (PHOENIX-6832) Uncovered Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-6832. Fix Version/s: 5.2.0 Resolution: Fixed > Uncovered Global Secondary Indexes > -- > > Key: PHOENIX-6832 > URL: https://issues.apache.org/jira/browse/PHOENIX-6832 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0 > > > An index can be called an uncovered index if the index cannot serve a query > alone. The sole purpose of an uncovered index would be identifying the data > table rows to be scanned for the query. This implies that the DDL for an > uncovered index does not have the INCLUDE clause. > Then an index is called a covered index if the index can serve a query alone. > Please note that a covered index does not mean that it can cover all queries. > It just means that it can cover a query. A covered index can still cover some > queries even if the index DDL does not have the INCLUDE clause. This is > because a given query may reference only PK and/or indexed columns, and thus > a covered index without any included columns can serve this query by itself > (i.e., without joining index rows with data table rows). Another use case > for covered indexes without included columns is count(*) queries. > Currently Phoenix uses indexes for count(*) queries by default. > Since uncovered indexes will be used to identify data table rows affected by > a given query and the column values will be picked up from the data table, we > can provide a solution that is much simpler than the solution for covered > indexes by taking advantage of the fact that the data table is the source > of truth, and an index table is used only to map secondary keys to > primary keys to eliminate full table scans. The correctness of such a > solution is ensured if for every data table row, there exists an index row. 
> Then our solution to update the data tables and their indexes in a consistent > fashion for global secondary indexes would be a two-phase update approach, > where we first insert the index table rows, and only if they are successful, > update the data table rows. > This approach does not require reading the existing data table rows, which is > currently required for covered indexes. Also, it does not require two-phase > commit writes for updating and maintaining global secondary index table rows. > Eliminating a data table read operation and an RPC call to update the index > row verification status on the corresponding index row would cut down index > write latency overhead by at least 50% for global uncovered indexes when > compared to global covered indexes. This is because global covered indexes > require one data table read and two index write operations for every data > table update, whereas global uncovered indexes would require only one index > write. For batch writes, the expected performance and latency improvement > would be much higher than 50%, since a batch of random row updates would no > longer require random seeks on the data table for reading existing data > table rows. > PHOENIX-6458, PHOENIX-6501 and PHOENIX-6663 improve the performance and > efficiency of joining index rows with their data table rows when a covered > index cannot cover a given query. We can further leverage this work to support > uncovered indexes. > Uncovered indexes would be a significant performance improvement for > write intensive workloads. Also, a common use case where uncovered indexes > will be desired is the upsert select use case on the data table, where a > subset of rows are updated in a batch. 
In this use case, the select query > performance is greatly improved via a covered index but the upsert part > suffers due to the covered index write overhead especially when the selected > data table rows are not consecutively stored on disk which is the most common > case. > As mentioned before, the DDL for index creation does not include the INCLUDE > clause. We can add the UNCOVERED keyword to indicate the index to be created > is an uncovered index, for example, CREATE UNCOVERED INDEX. > As in the case of covered indexes, we can do read repair for uncovered > indexes too. The difference is that instead of using the verify status for > index rows, we would check if the corresponding data table row exists for a > given index row. Since we would always retrieve the data table rows to join > back with index rows for uncovered indexes, the read repair cost would occur > only for deleting invalid index rows. Also, the existing index reverse > verification and repair feature supported by IndexTool can be used to do bulk > repair operations from
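The two-phase update and read repair rules described above (index row first, data row second; an index row without a matching data row is invalid and deleted on read) can be sketched with in-memory dicts standing in for the HBase tables. This is an illustration of the invariant, not Phoenix's coprocessor code:

```python
def write_row_with_uncovered_index(index_table, data_table,
                                   row_key, row, index_key):
    """Phase 1: insert the index row; only if that succeeds, phase 2:
    update the data table row. A failure between the phases leaves an
    orphan index row, which read repair later removes."""
    index_table[index_key] = row_key   # index maps secondary -> primary key
    data_table[row_key] = row          # the data table is the source of truth

def read_via_index(index_table, data_table, index_key):
    """Read path with repair: since the data table is always consulted,
    an index row whose data row no longer exists is detected and deleted."""
    row_key = index_table.get(index_key)
    if row_key is None:
        return None
    row = data_table.get(row_key)
    if row is None:
        del index_table[index_key]     # repair: drop the invalid index row
        return None
    return row
```

The correctness condition stated in the issue holds here: every data table row has an index row (phase 1 precedes phase 2), so the index never misses a row, and extra index rows are only a repair cost, not a correctness problem.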
[jira] [Resolved] (PHOENIX-6888) Fixing TTL and Max Lookback Issues for Phoenix Tables
[ https://issues.apache.org/jira/browse/PHOENIX-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-6888. Fix Version/s: 5.3.0 Resolution: Fixed > Fixing TTL and Max Lookback Issues for Phoenix Tables > - > > Key: PHOENIX-6888 > URL: https://issues.apache.org/jira/browse/PHOENIX-6888 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.1.3 >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > Fix For: 5.3.0 > > > In HBase, the unit of data is a cell and data retention rules are executed at > the cell level. These rules are defined at the column family level. Phoenix > leverages the data retention features of HBase and exposes them to its users > to provide its TTL feature at the table level. However, these rules (since > they are defined at the cell level instead of the row level) result in > partial row retention, which in turn creates data integrity issues at the > Phoenix level. > Similarly, Phoenix's max lookback feature leverages HBase's deleted-data > retention capabilities to preserve deleted cells within a configurable max > lookback window. This requires two data retention windows, max lookback and TTL. One > end of these windows is the current time and the other end is a moment in the past > (i.e., current time minus the window size). Typically, the max lookback > window is shorter than the TTL window. In the max lookback window, we would > like to preserve the complete history of mutations regardless of how many > cell versions these mutations generated. In the remaining TTL window outside > the max lookback, we would like to apply the data retention rules defined > above. However, HBase provides only one data retention window. Thus, the max > lookback window had to be extended to become the TTL window, and the max > lookback feature ends up unintentionally retaining deleted data for the maximum of the max > lookback and TTL periods. > This Jira is to fix both of these issues. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
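The two-window model described above can be sketched as a per-cell retention decision at compaction time: keep the full history inside the max lookback window, and outside it apply TTL and purge cells covered by a delete. This is a simplified illustration of the desired behavior, not HBase's actual compaction logic:

```python
def keep_cell(cell_ts, covered_by_delete, now, max_lookback, ttl):
    """Decide whether a cell version is retained during compaction.

    cell_ts           -- the cell's timestamp
    covered_by_delete -- True if a delete marker shadows this cell
    max_lookback, ttl -- window sizes, with max_lookback typically < ttl
    """
    if cell_ts >= now - max_lookback:
        return True        # inside max lookback: keep the complete history
    if cell_ts < now - ttl:
        return False       # outside the TTL window: expired
    # In the TTL window but outside max lookback: apply normal retention
    # rules, i.e. deleted data no longer needs to be preserved.
    return not covered_by_delete
```

The bug the Jira fixes is visible in this model: with only one HBase retention window, deleted cells in the middle branch could not be purged until max(max_lookback, ttl) had elapsed.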
[jira] [Created] (PHOENIX-6918) ScanningResultIterator should not retry when the query times out
Kadir Ozdemir created PHOENIX-6918: -- Summary: ScanningResultIterator should not retry when the query times out Key: PHOENIX-6918 URL: https://issues.apache.org/jira/browse/PHOENIX-6918 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir Assignee: Lokesh Khurana ScanningResultIterator drops dummy results and retries Result#next() in a loop as part of the Phoenix server paging feature. ScanningResultIterator currently does not check whether the query has already timed out. This means that ScanningResultIterator lets the server keep working on the scan even though the Phoenix query has already timed out. ScanningResultIterator should check whether the query behind the scan has timed out and, if so, return an operation timeout exception as BaseResultIterators does. -- This message was sent by Atlassian Jira (v8.20.10#820010)
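The proposed behavior can be sketched as a paging loop that checks the query deadline before each retry instead of blindly skipping dummy results. The function and the dict-shaped results are hypothetical stand-ins for ScanningResultIterator and HBase Result objects:

```python
import time

class OperationTimedOutError(Exception):
    """Raised when the overall Phoenix query deadline has passed."""

def next_non_dummy(results, deadline_ms,
                   now_ms=lambda: int(time.time() * 1000)):
    """Skip dummy (server paging) results in a loop, but give up as soon
    as the query deadline has passed, rather than letting the server
    keep working on a scan whose query has already timed out."""
    for r in results:
        if now_ms() > deadline_ms:
            raise OperationTimedOutError("query timed out")
        if r.get("dummy"):      # paging placeholder: retry next()
            continue
        return r
    return None                 # scan exhausted
```

The `now_ms` parameter is injected only so the deadline check is testable; the real iterator would compare against the query's configured timeout.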
[jira] [Assigned] (PHOENIX-6888) Fixing TTL and Max Lookback Issues for Phoenix Tables
[ https://issues.apache.org/jira/browse/PHOENIX-6888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6888: -- Assignee: Kadir Ozdemir > Fixing TTL and Max Lookback Issues for Phoenix Tables > - > > Key: PHOENIX-6888 > URL: https://issues.apache.org/jira/browse/PHOENIX-6888 > Project: Phoenix > Issue Type: Bug >Affects Versions: 5.1.3 >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6888) Fixing TTL and Max Lookback Issues for Phoenix Tables
Kadir Ozdemir created PHOENIX-6888: -- Summary: Fixing TTL and Max Lookback Issues for Phoenix Tables Key: PHOENIX-6888 URL: https://issues.apache.org/jira/browse/PHOENIX-6888 Project: Phoenix Issue Type: Bug Affects Versions: 5.1.3 Reporter: Kadir Ozdemir -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (PHOENIX-6884) Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout
[ https://issues.apache.org/jira/browse/PHOENIX-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-6884. Assignee: Kadir Ozdemir Resolution: Fixed > Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout > - > > Key: PHOENIX-6884 > URL: https://issues.apache.org/jira/browse/PHOENIX-6884 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0, 5.1.4 > > > Phoenix currently uses the same RPC timeout, hbase.rpc.timeout, for both read and write > operations. HBASE-15866 split hbase.rpc.timeout into > hbase.rpc.read.timeout and hbase.rpc.write.timeout. > The paging feature (PHOENIX-6211, PHOENIX-6207 and PHOENIX-5998) slices > server side operations into small chunks (i.e., pages) and allows all queries > to make progress without timeouts. This feature makes Phoenix a better > time-sharing system and thus improves availability. > In order to take full advantage of the paging feature, we need to set the > timeout for scan RPCs to a small value. While it is reasonable to reduce the > RPC timeout for the read path because of the paging feature, it is not safe > to drastically reduce it for the write path. This is because of batch > writes and the synchronous index updates within a batch write. This means we need > to configure read and write RPC timeouts separately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
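The fallback semantics implied above (prefer the split read/write settings from HBASE-15866, otherwise fall back to hbase.rpc.timeout) can be sketched as a simple config lookup. The 60000 ms default mirrors HBase's customary hbase.rpc.timeout default, an assumption here rather than something stated in the issue:

```python
def rpc_timeout(conf: dict, is_read: bool) -> int:
    """Pick the RPC timeout (ms) for an operation: the read or write
    specific setting if present, else the legacy hbase.rpc.timeout."""
    key = "hbase.rpc.read.timeout" if is_read else "hbase.rpc.write.timeout"
    return conf.get(key, conf.get("hbase.rpc.timeout", 60000))
```

With this split, a deployment can set a small hbase.rpc.read.timeout so paged scans fail fast, while batch writes with synchronous index updates keep a generous hbase.rpc.write.timeout.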
[jira] [Updated] (PHOENIX-6884) Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout
[ https://issues.apache.org/jira/browse/PHOENIX-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6884: --- Fix Version/s: 5.2.0 5.1.4 > Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout > - > > Key: PHOENIX-6884 > URL: https://issues.apache.org/jira/browse/PHOENIX-6884 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0, 5.1.4 > > > Phoenix currently uses the same RPC timeout, hbase.rpc.timeout, for both read and write > operations. HBASE-15866 split hbase.rpc.timeout into > hbase.rpc.read.timeout and hbase.rpc.write.timeout. > The paging feature (PHOENIX-6211, PHOENIX-6207 and PHOENIX-5998) slices > server side operations into small chunks (i.e., pages) and allows all queries > to make progress without timeouts. This feature makes Phoenix a better > time-sharing system and thus improves availability. > In order to take full advantage of the paging feature, we need to set the > timeout for scan RPCs to a small value. While it is reasonable to reduce the > RPC timeout for the read path because of the paging feature, it is not safe > to reduce it drastically for the write path. This is because of the batch > writes and synchronous index updates within a batch write. This means we need > to start configuring read and write RPC timeouts separately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6884) Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout
Kadir Ozdemir created PHOENIX-6884: -- Summary: Phoenix to use hbase.rpc.read.timeout and hbase.rpc.write.timeout Key: PHOENIX-6884 URL: https://issues.apache.org/jira/browse/PHOENIX-6884 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir Phoenix currently uses the same RPC timeout, hbase.rpc.timeout, for both read and write operations. HBASE-15866 split hbase.rpc.timeout into hbase.rpc.read.timeout and hbase.rpc.write.timeout. The paging feature (PHOENIX-6211, PHOENIX-6207 and PHOENIX-5998) slices server side operations into small chunks (i.e., pages) and allows all queries to make progress without timeouts. This feature makes Phoenix a better time-sharing system and thus improves availability. In order to take full advantage of the paging feature, we need to set the timeout for scan RPCs to a small value. While it is reasonable to reduce the RPC timeout for the read path because of the paging feature, it is not safe to reduce it drastically for the write path. This is because of the batch writes and synchronous index updates within a batch write. This means we need to start configuring read and write RPC timeouts separately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
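The split introduced by HBASE-15866 can be modeled as a per-operation timeout lookup with a fallback to the legacy shared setting. This is a conceptual sketch, not Phoenix or HBase client code; the default values are illustrative only.

```python
# Illustrative sketch: with HBASE-15866, read and write RPC timeouts can
# diverge. Paged scans tolerate a small read timeout, while batch writes
# with synchronous index updates need a larger write timeout.
DEFAULTS = {
    "hbase.rpc.timeout": 60000,        # legacy shared timeout (ms)
    "hbase.rpc.read.timeout": 10000,   # small: paging bounds per-RPC work
    "hbase.rpc.write.timeout": 60000,  # large: batch writes + index updates
}

def rpc_timeout_ms(conf, operation):
    """Pick the RPC timeout for an operation, falling back to the
    legacy shared hbase.rpc.timeout when the split keys are absent."""
    key = ("hbase.rpc.read.timeout" if operation == "read"
           else "hbase.rpc.write.timeout")
    return conf.get(key, conf["hbase.rpc.timeout"])
```

The fallback mirrors HBase's own behavior of honoring `hbase.rpc.timeout` when the read/write-specific keys are not set.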
[jira] [Created] (PHOENIX-6883) Phoenix metadata caching redesign
Kadir Ozdemir created PHOENIX-6883: -- Summary: Phoenix metadata caching redesign Key: PHOENIX-6883 URL: https://issues.apache.org/jira/browse/PHOENIX-6883 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir PHOENIX-6761 improves the client side metadata caching by eliminating the separate cache for each connection. This improvement results in memory and compute savings since it eliminates copying the CQSI level cache every time a Phoenix connection is created, and also replaces the inefficient CQSI level cache implementation with Guava Cache from Google. Despite this improvement, the overall metadata caching architecture begs for redesign. This is because every operation in Phoenix needs to make multiple RPCs to metadata servers for the SYSTEM.CATALOG table (please see PHOENIX-6860) to ensure the latest metadata changes are visible to clients. These constant RPCs make the region servers serving SYSTEM.CATALOG a hot spot and thus lead to poor performance and availability issues. The UPDATE_CACHE_FREQUENCY configuration parameter specifies how frequently the client cache is updated. However, setting this parameter to a non-zero value results in stale caching. Stale caching can cause data integrity issues. For example, if an index table creation is not visible to the client, Phoenix would skip updating the index table in the write path. That's why this parameter is typically set to zero. However, this defeats the purpose of client side metadata caching. The redesign of the metadata caching architecture directly addresses this issue by making sure that client metadata caching is always used (that is, UPDATE_CACHE_FREQUENCY is set to NEVER) while still ensuring data integrity. This is achieved by three main changes. The first change is to introduce server side metadata caching in all region servers. Currently, server side metadata caching is used only on the region servers serving SYSTEM.CATALOG. 
This metadata caching should be strongly consistent such that metadata updates should include invalidating the corresponding entries in the server side caches. This would ensure the server cache would not become stale. The second change is that the Phoenix client passes the LAST_DDL_TIMESTAMP table attribute along with scan and mutation operations to the server regions (more accurately, to the Phoenix coprocessors). Then the Phoenix coprocessors would check the timestamp on a given operation against the timestamp in its server side cache to validate that the client did not use stale metadata when it prepared the operation. If the client did use stale metadata, then the coprocessor would return an exception (this exception can be called StaleClientMetadataCacheException) to the client. The third change is that upon receiving StaleClientMetadataCacheException, the Phoenix client would make an RPC call to the metadata server to update the client cache, reconstruct the operation with the updated cache, and retry the operation. This redesign would require updating client and server metadata caches only when metadata is stale instead of updating the client metadata cache for each (scan or mutation) operation. This would eliminate hot spotting on the metadata servers and thus the poor performance and availability issues caused by this hot spotting. -- This message was sent by Atlassian Jira (v8.20.10#820010)
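The validate-then-retry protocol of the second and third changes can be sketched as below. This is a conceptual model, not Phoenix code; the exception name mirrors the one proposed in the Jira, and the dictionaries stand in for the real caches and RPCs.

```python
# Conceptual sketch of LAST_DDL_TIMESTAMP validation and client retry.
# StaleClientMetadataCacheException is the name proposed in this Jira;
# the dict-based "caches" and direct calls stand in for real RPCs.
class StaleClientMetadataCacheException(Exception):
    pass

def server_validate(server_ddl_timestamps, table, client_ddl_timestamp):
    """Coprocessor side: reject operations prepared with stale metadata."""
    if server_ddl_timestamps[table] != client_ddl_timestamp:
        raise StaleClientMetadataCacheException(table)

def client_execute(server_ddl_timestamps, client_cache, table, run_op):
    """Client side: run the operation; on a stale-cache exception, refresh
    the cached timestamp once (normally an RPC to the metadata server),
    reconstruct the operation, and retry."""
    try:
        server_validate(server_ddl_timestamps, table, client_cache[table])
        return run_op()
    except StaleClientMetadataCacheException:
        client_cache[table] = server_ddl_timestamps[table]  # refresh cache
        server_validate(server_ddl_timestamps, table, client_cache[table])
        return run_op()
```

Note how the metadata RPC happens only on the stale path; a client with a fresh cache never contacts the metadata server, which is exactly what removes the hot spot.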
[jira] [Updated] (PHOENIX-6854) Salted global indexes do not work for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6854: --- Description: With an index hint, global indexes can be used for queries with uncovered columns. However, when the data table is salted, Phoenix does not project the columns correctly for queries with uncovered columns, and thus the result set returns wrong columns. For example, the following select statement returns 'b' instead of 'bcde'. {code:java} create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, val3 varchar) SALT_BUCKETS=4; upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); create index I1 on T1 (val1) include (val2); select /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; {code} was: With an index hint, global indexes can be used for queries with uncovered columns. However, when the data table is salted, Phoenix does not project the columns correctly for queries with uncovered columns, and thus the result set returns wrong columns. For example, the following select statement returns 'b' instead of 'bcde'. create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, val3 varchar) SALT_BUCKETS=4; upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); CREATE INDEX I1 on T1 (val1) include (val2); SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; > Salted global indexes do not work for queries with uncovered columns > > > Key: PHOENIX-6854 > URL: https://issues.apache.org/jira/browse/PHOENIX-6854 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0, 5.1.3 > > > With an index hint, global indexes can be used for queries with uncovered > columns. However, when the data table is salted, Phoenix does not project the > columns correctly for queries with uncovered columns, and thus the result set > returns wrong columns. For example, the following select statement returns > 'b' instead of 'bcde'. 
> > {code:java} > create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, > val3 varchar) SALT_BUCKETS=4; > upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); > create index I1 on T1 (val1) include (val2); > select /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-6854) Salted global indexes do not work for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6854: --- Description: With an index hint, global indexes can be used for queries with uncovered columns. However, when the data table is salted, Phoenix does not project the columns correctly for queries with uncovered columns, and thus the result set returns wrong columns. For example, the following select statement returns 'b' instead of 'bcde'. create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, val3 varchar) SALT_BUCKETS=4; upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); CREATE INDEX I1 on T1 (val1) include (val2); SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; was: With an index hint, global indexes can be used for queries with uncovered columns. However, when the data table is salted, Phoenix does not project the columns correctly for queries with uncovered columns, and thus the result set returns wrong columns. For example, the following select statement returns 'b' instead of 'bcde'. create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, val3 varchar) SALT_BUCKETS=4; upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); CREATE INDEX I1 on T1 (val1) include (val2); SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; > Salted global indexes do not work for queries with uncovered columns > > > Key: PHOENIX-6854 > URL: https://issues.apache.org/jira/browse/PHOENIX-6854 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0, 5.1.3 > > > With an index hint, global indexes can be used for queries with uncovered > columns. However, when the data table is salted, Phoenix does not project the > columns correctly for queries with uncovered columns, and thus the result set > returns wrong columns. For example, the following select statement returns > 'b' instead of 'bcde'. 
> create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, > val3 varchar) SALT_BUCKETS=4; > upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); > CREATE INDEX I1 on T1 (val1) include (val2); > SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-6854) Salted global indexes do not work for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6854: --- Summary: Salted global indexes do not work for queries with uncovered columns (was: Salted global indexes does not work for queries with uncovered columns) > Salted global indexes do not work for queries with uncovered columns > > > Key: PHOENIX-6854 > URL: https://issues.apache.org/jira/browse/PHOENIX-6854 > Project: Phoenix > Issue Type: Bug >Reporter: Kadir Ozdemir >Priority: Major > Fix For: 5.2.0, 5.1.3 > > > With an index hint, global indexes can be used for queries with uncovered > columns. However, when the data table is salted, Phoenix does not project the > columns correctly for queries with uncovered columns, and thus the result > set returns wrong columns. For example, the following select statement > returns 'b' instead of 'bcde'. > create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, > val3 varchar) SALT_BUCKETS=4; > upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); > CREATE INDEX I1 on T1 (val1) include (val2); > SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6854) Salted global indexes does not work for queries with uncovered columns
Kadir Ozdemir created PHOENIX-6854: -- Summary: Salted global indexes does not work for queries with uncovered columns Key: PHOENIX-6854 URL: https://issues.apache.org/jira/browse/PHOENIX-6854 Project: Phoenix Issue Type: Bug Reporter: Kadir Ozdemir Fix For: 5.2.0, 5.1.3 With an index hint, global indexes can be used for queries with uncovered columns. However, when the data table is salted, Phoenix does not project the columns correctly for queries with uncovered columns, and thus the result set returns wrong columns. For example, the following select statement returns 'b' instead of 'bcde'. create table T1 (id varchar not null primary key, val1 varchar, val2 varchar, val3 varchar) SALT_BUCKETS=4; upsert into T1 values ('b', 'bc', 'bcd', 'bcde'); CREATE INDEX I1 on T1 (val1) include (val2); SELECT /*+ INDEX(T1 I1)*/ val3 from T1 WHERE val1 = 'bc'; -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (PHOENIX-6141) Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK
[ https://issues.apache.org/jira/browse/PHOENIX-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6141: --- Fix Version/s: 5.2.0 5.1.4 (was: 4.17.0) > Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK > --- > > Key: PHOENIX-6141 > URL: https://issues.apache.org/jira/browse/PHOENIX-6141 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Assignee: Palash Chauhan >Priority: Blocker > Fix For: 5.2.0, 5.1.4 > > > Before 4.15, "CREATE/DROP VIEW" was an atomic operation since we were issuing > batch mutations on just the single SYSTEM.CATALOG region. In 4.15 we introduced > SYSTEM.CHILD_LINK to store the parent->child links and so a CREATE VIEW is no > longer atomic since it consists of 2 separate RPCs (1 to SYSTEM.CHILD_LINK > to add the linking row and another to SYSTEM.CATALOG to write metadata for > the new view). > If the second RPC, i.e., the RPC to write metadata to SYSTEM.CATALOG, fails > after the 1st RPC has already gone through, there will be an inconsistency > between both metadata tables. We will see orphan parent->child linking rows > in SYSTEM.CHILD_LINK in this case. This can cause the following issues: > # ALTER TABLE calls on the base table will fail > # DROP TABLE without CASCADE will fail > # The upgrade path has calls like UpgradeUtil.upgradeTable() which will fail > # Any metadata consistency checks can be thrown off > # Unnecessary extra storage of orphan links > The first 3 issues happen because we wrongly deduce that a base table has > child views due to the orphan linking rows. > This Jira aims to make mutations across SYSTEM.CATALOG and > SYSTEM.CHILD_LINK atomic. We can use a 2-phase commit approach like in global > indexing or also potentially explore using a transaction manager. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-6141) Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK
[ https://issues.apache.org/jira/browse/PHOENIX-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6141: -- Assignee: Palash Chauhan > Ensure consistency between SYSTEM.CATALOG and SYSTEM.CHILD_LINK > --- > > Key: PHOENIX-6141 > URL: https://issues.apache.org/jira/browse/PHOENIX-6141 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.0.0, 4.15.0 >Reporter: Chinmay Kulkarni >Assignee: Palash Chauhan >Priority: Blocker > Fix For: 4.17.0 > > > Before 4.15, "CREATE/DROP VIEW" was an atomic operation since we were issuing > batch mutations on just the single SYSTEM.CATALOG region. In 4.15 we introduced > SYSTEM.CHILD_LINK to store the parent->child links and so a CREATE VIEW is no > longer atomic since it consists of 2 separate RPCs (1 to SYSTEM.CHILD_LINK > to add the linking row and another to SYSTEM.CATALOG to write metadata for > the new view). > If the second RPC, i.e., the RPC to write metadata to SYSTEM.CATALOG, fails > after the 1st RPC has already gone through, there will be an inconsistency > between both metadata tables. We will see orphan parent->child linking rows > in SYSTEM.CHILD_LINK in this case. This can cause the following issues: > # ALTER TABLE calls on the base table will fail > # DROP TABLE without CASCADE will fail > # The upgrade path has calls like UpgradeUtil.upgradeTable() which will fail > # Any metadata consistency checks can be thrown off > # Unnecessary extra storage of orphan links > The first 3 issues happen because we wrongly deduce that a base table has > child views due to the orphan linking rows. > This Jira aims to make mutations across SYSTEM.CATALOG and > SYSTEM.CHILD_LINK atomic. We can use a 2-phase commit approach like in global > indexing or also potentially explore using a transaction manager. -- This message was sent by Atlassian Jira (v8.20.10#820010)
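The failure mode described above, a CREATE VIEW split across two RPCs leaving an orphan linking row, can be modeled in a few lines. This is an illustrative sketch, not Phoenix code; lists and dicts stand in for SYSTEM.CHILD_LINK and SYSTEM.CATALOG.

```python
# Illustrative model of the CREATE VIEW consistency problem: two RPCs, so a
# failure between them leaves an orphan parent->child link in CHILD_LINK.
def create_view(child_link, catalog, parent, view, catalog_write_ok=True):
    """Simulate CREATE VIEW; returns False if the second RPC fails."""
    child_link.append((parent, view))       # RPC 1: SYSTEM.CHILD_LINK row
    if not catalog_write_ok:
        return False                        # RPC 2 failed: link is orphaned
    catalog[view] = {"parent": parent}      # RPC 2: SYSTEM.CATALOG metadata
    return True

def orphan_links(child_link, catalog):
    """Linking rows whose child view never made it into SYSTEM.CATALOG."""
    return [(p, v) for (p, v) in child_link if v not in catalog]
```

The orphan rows found by `orphan_links` are what make ALTER TABLE and non-CASCADE DROP TABLE wrongly conclude that the base table still has child views; a 2-phase commit (or transaction manager) would make the two writes atomic so such rows cannot persist.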
[jira] [Resolved] (PHOENIX-6761) Phoenix Client Side Metadata Caching Improvement
[ https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-6761. Resolution: Fixed > Phoenix Client Side Metadata Caching Improvement > > > Key: PHOENIX-6761 > URL: https://issues.apache.org/jira/browse/PHOENIX-6761 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Palash Chauhan >Priority: Major > Attachments: PHOENIX-6761.master.initial.patch > > > CQSI maintains a client-side metadata cache, i.e., schemas, tables, and > functions, that evicts the least recently used table entries when the cache > size grows beyond the configured size. > Each time a Phoenix connection is created, the client-side metadata cache > maintained by the CQSI object creating this connection is cloned for the > connection. Thus, we have two levels of caches, one at the Phoenix connection > level and the other at the CQSI level. > When a Phoenix client needs to update the client side cache, it updates both > caches (on the connection object and on the CQSI object). The Phoenix client > attempts to retrieve a table from the connection level cache. If this table > is not there, the Phoenix client does not check the CQSI level cache; > instead, it retrieves the object from the server and finally updates both the > connection and CQSI level caches. > PMetaDataCache provides caching for tables, schemas and functions but it > maintains separate caches internally, one cache for each type of metadata. > The cache for the tables is actually a cache of PTableRef objects. PTableRef > holds a reference to the table object as well as the estimated size of the > table object, the create time, last access time, and resolved time. The > create time is set to the last access time value provided when the PTableRef > object is inserted into the cache. The resolved time is also provided when > the PTableRef object is inserted into the cache. 
Both the create time and > resolved time are final fields (i.e., they are not updated). PTableRef > provides a setter method to update the last access time. PMetaDataCache > updates the last access time whenever the table is retrieved from the cache. > The LRU eviction policy is implemented using the last access time. The > eviction policy is not implemented for schemas and functions. The > configuration parameter for the frequency of updating the cache is > phoenix.default.update.cache.frequency. This can be defined at the cluster or > table level. When it is set to zero, the cache is not used. > The purpose of cache eviction is to limit the memory consumed by the > cache. The expected behavior is that when a table is removed from the cache, > the table (PTableImpl) object is also garbage collected. However, this does > not really happen because multiple caches hold references to the same object > and each cache maintains its own table refs and thus access times. This means > that the access time for the same table may differ from one cache to another; > and when one cache evicts an object, another cache will hold on to the same > object. > Although individual caches implement the LRU eviction policy, the overall > memory eviction policy for the actual table objects is more like an age-based > cache. If a table is frequently accessed from the connection level caches, > the last access time maintained by the corresponding table ref objects for > this table will be updated. However, these updates on the access times will > not be visible to the CQSI level cache. The table refs in the CQSI level > cache have the same create time and access time. > Since whenever an object is inserted into the local cache of a connection > object, it is also inserted into the cache on the CQSI object, the CQSI level > cache will grow faster than the caches on the connection objects. 
When the > cache reaches its maximum size, the newly inserted tables will result in > evicting one of the existing tables in the cache. Since the access times of > these tables are not updated on the CQSI level cache, it is likely that the > table that has stayed in the cache for the longest period of time will be > evicted (regardless of whether the same table is frequently accessed via the > connection level caches). This obviously defeats the purpose of an LRU cache. > Another problem with the current cache is related to the choice of its > internal data structures and its eviction implementation. The table refs in > the cache are maintained in a hash map which maps a table key (which is a pair > of a tenant id and table name) to a table ref. When the size of a cache (the > total byte size of the table objects referred by the cache) reaches its > configured limit, how much
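The way per-cache access times defeat LRU can be shown concretely. This sketch is not the actual PTableRef/PMetaDataCache code; it is a minimal model of two caches holding separate wrappers, with separate access times, around the same table objects.

```python
# Minimal model of the LRU defect described above: the connection-level and
# CQSI-level caches each wrap the SAME table object in their own ref with
# its own last access time, so a hit in one cache is invisible to the other.
class TableRef:
    def __init__(self, table, now):
        self.table = table
        self.last_access_time = now

def touch(cache, name, now):
    """LRU bookkeeping happens only in the cache that served the hit."""
    cache[name].last_access_time = now

def lru_victim(cache):
    """The entry an LRU policy would evict: the least recently accessed."""
    return min(cache, key=lambda n: cache[n].last_access_time)

table_a, table_b = object(), object()
cqsi = {"A": TableRef(table_a, 1), "B": TableRef(table_b, 2)}
conn = {"A": TableRef(table_a, 1), "B": TableRef(table_b, 2)}

touch(conn, "A", 100)   # "A" is hot at the connection level...
# ...but the CQSI cache never sees that, so its LRU victim is the hot "A".
```

Because the CQSI cache's access times are frozen at insertion, its "LRU" eviction degenerates into oldest-inserted-first, exactly the age-based behavior the description complains about.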
[jira] [Reopened] (PHOENIX-6776) Abort scans of closed connections at ScanningResultIterator
[ https://issues.apache.org/jira/browse/PHOENIX-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reopened PHOENIX-6776: > Abort scans of closed connections at ScanningResultIterator > --- > > Key: PHOENIX-6776 > URL: https://issues.apache.org/jira/browse/PHOENIX-6776 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Lokesh Khurana >Priority: Major > Fix For: 5.2.0, 5.1.3 > > > The server side paging feature introduced by PHOENIX-6211 breaks a scan into > timed scan operations on the server side and returns an intermediate result > for each operation. This intermediate result could be a valid result or a > dummy result. The HBase scans are wrapped by ScanningResultIterator in > Phoenix. If the next call on a scan returns a dummy or empty result, > ScanningResultIterator ignores this result and calls the next method on the > scan again. However, if the Phoenix connection is closed, we should abort the > scan instead of continuing it. This will result in timely abort of > scans and release of resources (especially when phoenix.server.page.size.ms > is set to a small value, e.g., 5 sec). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
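The requested behavior can be sketched as a small loop. This is a conceptual model, not the actual ScanningResultIterator; the string sentinels stand in for HBase dummy/empty results, and the abort is modeled with an exception.

```python
# Conceptual sketch of the requested ScanningResultIterator behavior:
# skip dummy/empty paging results, but abort once the connection is closed.
def next_result(scanner, connection_is_closed):
    """Return the next real result, or None when the scan is exhausted.

    Raises RuntimeError to abort the scan if the connection closed while we
    were skipping dummy results produced by server-side paging.
    """
    for result in scanner:
        if connection_is_closed():
            raise RuntimeError("connection closed; aborting scan")
        if result in ("DUMMY", "EMPTY"):   # paging placeholder, keep going
            continue
        return result
    return None
```

Checking the connection on every intermediate result, rather than only on real results, is what bounds the abort latency to roughly one paging interval (e.g., phoenix.server.page.size.ms).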
[jira] [Updated] (PHOENIX-6832) Uncovered Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6832: --- Description: An index can be called an uncovered index if the index cannot serve a query alone. The sole purpose of an uncovered index would be identifying the data table rows to be scanned for the query. This implies that the DDL for an uncovered index does not have the INCLUDE clause. Then an index is called a covered index if the index can serve a query alone. Please note that a covered index does not mean that it can cover all queries. It just means that it can cover a query. A covered index can still cover some queries even if the index DDL does not have the INCLUDE clause. This is because a given query may reference only PK and/or indexed columns, and thus a covered index without any included columns can serve this query by itself (i.e., without joining index rows with data table rows). Another use case for covered indexes without included columns is count(*) queries. Currently Phoenix uses indexes for count(*) queries by default. Since uncovered indexes will be used to identify data table rows affected by a given query and the column values will be picked up from the data table, we can provide a solution that is much simpler than the solution for covered indexes by taking advantage of the fact that the data table is the source of truth, and an index table is used only to map secondary keys to the primary keys to eliminate full table scans. The correctness of such a solution is ensured if for every data table row, there exists an index row. Then our solution to update the data tables and their indexes in a consistent fashion for global secondary indexes would be a two-phase update approach, where we first insert the index table rows, and only if they are successful, then we update the data table rows. 
This approach does not require reading the existing data table rows, which is currently required for covered indexes. Also, it does not require two-phase commit writes for updating and maintaining global secondary index table rows. Eliminating a data table read operation and an RPC call to update the index row verification status on the corresponding index row would cut down index write latency overhead by at least 50% for global uncovered indexes when compared to global covered indexes. This is because global covered indexes require one data table read and two index write operations for every data table update whereas global uncovered indexes would require only one index write. For batch writes, the expected performance and latency improvement would be much higher than 50% since a batch of random row updates would no longer require random seeks on the data table for reading existing data table rows. PHOENIX-6458, PHOENIX-6501 and PHOENIX-6663 improve the performance and efficiency of joining index rows with their data table rows when a covered index cannot cover a given query. We can further leverage this to support uncovered indexes. The uncovered indexes would be a significant performance improvement for write intensive workloads. Also, a common use case where uncovered indexes will be desired is the upsert select use case on the data table, where a subset of rows are updated in a batch. In this use case, the select query performance is greatly improved via a covered index but the upsert part suffers due to the covered index write overhead, especially when the selected data table rows are not consecutively stored on disk, which is the most common case. As mentioned before, the DDL for index creation does not include the INCLUDE clause. We can add the UNCOVERED keyword to indicate the index to be created is an uncovered index, for example, CREATE UNCOVERED INDEX. As in the case of covered indexes, we can do read repair for uncovered indexes too. 
The difference is that instead of using the verify status for index rows, we would check if the corresponding data table row exists for a given index row. Since we would always retrieve the data table rows to join back with index rows for uncovered indexes, the read repair cost would occur only for deleting invalid index rows. Also, the existing index reverse verification and repair feature supported by IndexTool can be used to do bulk repair operations from time to time. was: An index can be called a covered index if the index cannot serve a query alone. The sole purpose of an uncovered index would be identifying the data table rows to be scanned for the query. This implies that the DDL for an uncovered index does not have the INCLUDE clause. Then an index is called a covered index if the index can serve a query alone. Please note that a covered index does not mean that it can cover all queries. It just means that it can cover a query. A covered index can still cover some
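The two-phase update and the read-repair check described above can be modeled compactly. This is an illustrative sketch, not Phoenix code; dicts stand in for the index and data tables, and the failure flag stands in for a failed index-write RPC.

```python
# Illustrative model of the two-phase update for uncovered indexes: write
# the index row first; only if that succeeds, write the data table row.
# Invariant: every data row has an index row, so the only possible anomaly
# is an index row with no data row, which read repair can detect and delete.
def upsert(index_table, data_table, row_key, secondary_key, row,
           index_write_ok=True):
    """Simulate an upsert; returns False if the index write (phase 1) fails,
    in which case the data table is left untouched."""
    if not index_write_ok:
        return False                         # phase 1 failed: no anomaly
    index_table[secondary_key] = row_key     # phase 1: index row
    data_table[row_key] = row                # phase 2: data row
    return True

def invalid_index_rows(index_table, data_table):
    """Read-repair check: index rows whose data table row does not exist."""
    return [k for k, rk in index_table.items() if rk not in data_table]
```

A crash between the two phases leaves an index row without a data row; since uncovered-index reads always join back to the data table anyway, detecting such rows costs nothing extra, and repair reduces to deleting them.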
[jira] [Assigned] (PHOENIX-6832) Uncovered Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6832: -- Assignee: Kadir Ozdemir > Uncovered Global Secondary Indexes > -- > > Key: PHOENIX-6832 > URL: https://issues.apache.org/jira/browse/PHOENIX-6832 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > An index can be called an uncovered index if the index cannot serve a query > alone. The sole purpose of an uncovered index would be identifying the data > table rows to be scanned for the query. This implies that the DDL for an > uncovered index does not have the INCLUDE clause. > Then an index is called a covered index if the index can serve a query alone. > Please note that a covered index does not mean that it can cover all queries. > It just means that it can cover a query. A covered index can still cover some > queries even if the index DDL does not have the INCLUDE clause. This is > because a given query may reference only PK and/or indexed columns, and thus > a covered index without any included columns can serve this query by itself > (i.e., without joining index rows with data table rows). Another use case > for covered indexes without included columns is count(*) queries. > Currently Phoenix uses indexes for count(*) queries by default. > Since uncovered indexes will be used to identify data table rows affected by > a given query and the column values will be picked up from the data table, we > can provide a solution that is much simpler than the solution for covered > indexes by taking advantage of the fact that the data table is the source > of truth, and an index table is used only to map secondary keys to the > primary keys to eliminate full table scans. The correctness of such a > solution is ensured if for every data table row, there exists an index row. 
> Then our solution to update the data tables and their indexes in a consistent > fashion for global secondary indexes would be a two-phase update approach, > where we first insert the index table rows, and only if they are successful, > then we update the data table rows. > This approach does not require reading the existing data table rows which is > currently required for covered indexes. Also, it does not require two-phase > commit writes for updating and maintaining global secondary index table rows. > Eliminating a data table read operation and an RPC call to update the index > row verification status on the corresponding index row would cut down index > write latency overhead by at least 50% for global uncovered indexes when > compared to global covered indexes. This is because global covered indexes > require one data table read and two index write operations for every data > table update whereas global uncovered indexes would require only one index > write. For batch writes, the expected performance and latency improvement > would be much higher than 50% since a batch of random row updates would no > longer require random seeks on the data table for reading existing data > table rows. > PHOENIX-6458, PHOENIX-6501 and PHOENIX-6663 improve the performance and > efficiency of joining index rows with their data table rows when a covered > index cannot cover a given query. We can further leverage it to support > uncovered indexes. > The uncovered indexes would be a significant performance improvement for > write intensive workloads. Also a common use case where uncovered indexes > will be desired is the upsert select use case on the data table, where a > subset of rows are updated in a batch. 
In this use case, the select query > performance is greatly improved via a covered index but the upsert part > suffers due to the covered index write overhead, especially when the selected > data table rows are not consecutively stored on disk, which is the most common > case. > As mentioned before, the DDL for uncovered index creation does not include the INCLUDE > clause. We can add the UNCOVERED keyword to indicate the index to be created > is an uncovered index, for example, CREATE UNCOVERED INDEX. > As in the case of covered indexes, we can do read repair for uncovered > indexes too. The difference is that instead of using the verify status for > index rows, we would check if the corresponding data table row exists for a > given index row. Since we would always retrieve the data table rows to join > back with index rows for uncovered indexes, the read repair cost would occur > only for deleting invalid index rows. Also, the existing index reverse > verification and repair feature supported by IndexTool can be used to do bulk > repair operations from time to time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
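The index-first, two-phase write order described above can be illustrated with a minimal sketch. This is plain Python with dict-based stand-ins for HBase tables, not Phoenix code; the class and method names are hypothetical:

```python
class TwoPhaseWriter:
    """Illustrative model of the proposed two-phase update: the index row is
    written first, and the data row is written only if the index write
    succeeds. This preserves the invariant that every data table row is
    reachable through an index row."""

    def __init__(self):
        self.index_table = {}   # secondary key -> primary key
        self.data_table = {}    # primary key -> row values

    def upsert(self, primary_key, row, secondary_key):
        # Phase 1: insert the index row mapping secondary key to primary key.
        self.index_table[secondary_key] = primary_key
        # Phase 2: only after the index write succeeds, write the data row.
        # (If phase 1 raised, we would never reach this point.)
        self.data_table[primary_key] = row

    def invariant_holds(self):
        # Correctness condition from the issue: for every data table row,
        # there exists an index row pointing at it.
        indexed_pks = set(self.index_table.values())
        return all(pk in indexed_pks for pk in self.data_table)
```

If a failure occurs between the two phases, the result is an index row without a data row, which is exactly the kind of orphan the read-repair path can clean up later.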
[jira] [Updated] (PHOENIX-6832) Uncovered Global Secondary Indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6832: --- Description: An index can be called an uncovered index if the index cannot serve a query alone. The sole purpose of an uncovered index would be identifying the data table rows to be scanned for the query. This implies that the DDL for an uncovered index does not have the INCLUDE clause. Then an index is called a covered index if the index can serve a query alone. Please note that a covered index does not mean that it can cover all queries. It just means that it can cover a query. A covered index can still cover some queries even if the index DDL does not have the INCLUDE clause. This is because a given query may reference only PK and/or indexed columns, and thus a covered index without any included columns can serve this query by itself (i.e., without joining index rows with data table rows). Another use case for covered indexes without included columns is the count(*) queries. Currently Phoenix uses indexes for count(*) queries by default. Since uncovered indexes will be used to identify data table rows affected by a given query and the column values will be picked up from the data table, we can provide a solution that is much simpler than the solution for covered indexes by taking advantage of the fact that the data table is the source of truth, and an index table is used only to map secondary keys to the primary keys to eliminate full table scans. The correctness of such a solution is ensured if for every data table row, there exists an index row. Then our solution to update the data tables and their indexes in a consistent fashion for global secondary indexes would be a two-phase update approach, where we first insert the index table rows, and only if they are successful, then we update the data table rows. 
This approach does not require reading the existing data table rows which is currently required for covered indexes. Also, it does not require two-phase commit writes for updating and maintaining global secondary index table rows. Eliminating a data table read operation and an RPC call to update the index row verification status on the corresponding index row would cut down index write latency overhead by at least 50% for global uncovered indexes when compared to global covered indexes. This is because global covered indexes require one data table read and two index write operations for every data table update whereas global uncovered indexes would require only one index write. For batch writes, the expected performance and latency improvement would be much higher than 50% since a batch of random row updates would no longer require random seeks on the data table for reading existing data table rows. PHOENIX-6458, PHOENIX-6501 and PHOENIX-6663 improve the performance and efficiency of joining index rows with their data table rows when a covered index cannot cover a given query. We can further leverage this work to support uncovered indexes. The uncovered indexes would be a significant performance improvement for write intensive workloads. Also, a common use case where uncovered indexes will be desired is the upsert select use case on the data table, where a subset of rows are updated in a batch. In this use case, the select query performance is greatly improved via a covered index but the upsert part suffers due to the covered index write overhead, especially when the selected data table rows are not consecutively stored on disk, which is the most common case. As mentioned before, the DDL for uncovered index creation does not include the INCLUDE clause. We can add the UNCOVERED keyword to indicate the index to be created is an uncovered index, for example, CREATE UNCOVERED INDEX. As in the case of covered indexes, we can do read repair for uncovered indexes too. 
The difference is that instead of using the verify status for index rows, we would check if the corresponding data table row exists for a given index row. Since we would always retrieve the data table rows to join back with index rows for uncovered indexes, the read repair cost would occur only for deleting invalid index rows. Also, the existing index reverse verification and repair feature supported by IndexTool can be used to do bulk repair operations from time to time. was: An index can be called an uncovered index if the index cannot serve a query alone. The sole purpose of an uncovered index would be identifying the data table rows to be scanned for the query. This implies that the DDL for an uncovered index does not have the INCLUDE clause. Then an index is called a covered index if the index can serve a query alone. Please note that a covered index does not mean that it can cover all queries. It just means that it can cover a query. A covered index can still cover some queries
[jira] [Created] (PHOENIX-6832) Uncovered Global Secondary Indexes
Kadir Ozdemir created PHOENIX-6832: -- Summary: Uncovered Global Secondary Indexes Key: PHOENIX-6832 URL: https://issues.apache.org/jira/browse/PHOENIX-6832 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir An index can be called an uncovered index if the index cannot serve a query alone. The sole purpose of an uncovered index would be identifying the data table rows to be scanned for the query. This implies that the DDL for an uncovered index does not have the INCLUDE clause. Then an index is called a covered index if the index can serve a query alone. Please note that a covered index does not mean that it can cover all queries. It just means that it can cover a query. A covered index can still cover some queries even if the index DDL does not have the INCLUDE clause. This is because a given query may reference only PK and/or indexed columns, and thus a covered index without any included columns can serve this query by itself (i.e., without joining index rows with data table rows). Another use case for covered indexes without included columns is the count(*) queries. Currently Phoenix uses indexes for count(*) queries by default. Since uncovered indexes will be used to identify data table rows affected by a given query and the column values will be picked up from the data table, we can provide a solution that is much simpler than the solution for covered indexes by taking advantage of the fact that the data table is the source of truth, and an index table is used only to map secondary keys to the primary keys to eliminate full table scans. The correctness of such a solution is ensured if for every data table row, there exists an index row. Then our solution to update the data tables and their indexes in a consistent fashion for global secondary indexes would be a two-phase update approach, where we first insert the index table rows, and only if they are successful, then we update the data table rows. 
This approach does not require reading the existing data table rows which is currently required for covered indexes. Also, it does not require two-phase commit writes for updating and maintaining global secondary index table rows. Eliminating a data table read operation and an RPC call to update the index row verification status on the corresponding index row would cut down index write latency overhead by at least 50% for global uncovered indexes when compared to global covered indexes. This is because global covered indexes require one data table read and two index write operations for every data table update whereas global uncovered indexes would require only one index write. For batch writes, the expected performance and latency improvement would be much higher than 50% since a batch of random row updates would no longer require random seeks on the data table for reading existing data table rows. PHOENIX-6458, PHOENIX-6501 and PHOENIX-6663 improve the performance and efficiency of joining index rows with their data table rows when a covered index cannot cover a given query. We can further leverage this work to support uncovered indexes. The uncovered indexes would be a significant performance improvement for write intensive workloads. Also, a common use case where uncovered indexes will be desired is the upsert select use case on the data table, where a subset of rows are updated in a batch. In this use case, the select query performance is greatly improved via a covered index but the upsert part suffers due to the covered index write overhead, especially when the selected data table rows are not consecutively stored on disk, which is the most common case. As mentioned before, the DDL for uncovered index creation does not include the INCLUDE clause. We can add the UNCOVERED keyword to indicate the index to be created is an uncovered index, for example, CREATE UNCOVERED INDEX. As in the case of covered indexes, we can do read repair for uncovered indexes too. 
The difference is that instead of using the verify status for index rows, we would check if the corresponding data table row exists for a given index row. Since we would always retrieve the data table rows to join back with index rows for uncovered indexes, the read repair cost would occur only for deleting invalid index rows. Also, the existing index reverse verification and repair feature supported by IndexTool can be used to do bulk repair operations from time to time. -- This message was sent by Atlassian Jira (v8.20.10#820010)
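The read-repair rule described in this issue, where an index row is valid only if its data table row exists, can be sketched as follows. This is an illustrative Python stand-in, not the IndexTool or server-side implementation; the dict-based tables and function name are hypothetical:

```python
def read_repair_scan(index_table, data_table):
    """Joins index rows back to the data table. Index rows whose data table
    row is missing are invalid and are deleted in place; this deletion is
    the only read-repair cost for uncovered indexes, since data rows are
    fetched anyway to serve the query."""
    results = []
    # Iterate over a snapshot so we can delete from index_table as we go.
    for secondary_key, primary_key in list(index_table.items()):
        row = data_table.get(primary_key)
        if row is None:
            # No corresponding data table row: repair by deleting the
            # invalid index row instead of consulting a verify status.
            del index_table[secondary_key]
        else:
            results.append((primary_key, row))
    return results
```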
[jira] [Assigned] (PHOENIX-6644) Column name based Result Set getter issue with view indexes
[ https://issues.apache.org/jira/browse/PHOENIX-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6644: -- Assignee: Saurabh Rai > Column name based Result Set getter issue with view indexes > --- > > Key: PHOENIX-6644 > URL: https://issues.apache.org/jira/browse/PHOENIX-6644 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.2 >Reporter: Kadir Ozdemir >Assignee: Saurabh Rai >Priority: Major > > If a column used to define the view is also a projected column in a select > clause, a view index is chosen by the Phoenix query optimizer for the select > statement, and the value of the projected column is retrieved using the > column name based ResultSet getter, Phoenix returns ColumnNotFoundException. > For example, the last assertEquals fails with > {quote}org.apache.phoenix.schema.ColumnNotFoundException: ERROR 504 (42703): > Undefined column. columnName=V1 > {quote} > in the following integration test: > {code:java} > @Test > public void test() throws Exception { > try (Connection conn = DriverManager.getConnection(getUrl())) { > conn.createStatement().execute("create table T "+ > " (id INTEGER not null primary key, v1 varchar, v2 varchar, > v3 varchar)"); > conn.createStatement().execute("CREATE VIEW V AS SELECT * FROM T > WHERE v1 = 'a'"); > conn.createStatement().execute("CREATE INDEX I ON V (v2) INCLUDE > (v3)"); > conn.createStatement().execute("upsert into V values (1, 'a', 'ab', > 'abc')"); > conn.commit(); > ResultSet rs = conn.createStatement().executeQuery("SELECT v1, v3 > from V WHERE v2 = 'ab'"); > assertTrue(rs.next()); > assertEquals("a", rs.getString(1)); > assertEquals("a", rs.getString("v1")); > } > } {code} > Without the view index, the above test passes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-6828) Test failure in master branch : LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny
[ https://issues.apache.org/jira/browse/PHOENIX-6828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6828: -- Assignee: Aman Poonia > Test failure in master branch : > LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny > --- > > Key: PHOENIX-6828 > URL: https://issues.apache.org/jira/browse/PHOENIX-6828 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.2.0 >Reporter: Rushabh Shah >Assignee: Aman Poonia >Priority: Major > > The following tests are failing in master branch > [ERROR] Failures: > [ERROR] LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny:229 > expected:<2> but was:<1> > [ERROR] LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny:229 > expected:<2> but was:<1> > [ERROR] LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny:229 > expected:<2> but was:<1> > [ERROR] LogicalTableNameIT.testUpdatePhysicalIndexTableName_runScrutiny:229 > expected:<2> but was:<1> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithIndex_runScrutiny:169 > expected:<2> but was:<1> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithIndex_runScrutiny:169 > expected:<2> but was:<1> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithIndex_runScrutiny:165 > expected:<3> but was:<1> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithIndex_runScrutiny:165 > expected:<3> but was:<1> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithViews_runScrutiny:353 > expected:<2> but was:<0> > [ERROR] > LogicalTableNameIT.testUpdatePhysicalTableNameWithViews_runScrutiny:353 > expected:<2> but was:<0> > Failed in 2 different PR builds and confirmed locally on master that it is > failing. > 1. > https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1518/4/artifact/yetus-general-check/output/patch-unit-phoenix-core.txt > 2. 
> https://ci-hadoop.apache.org/job/Phoenix/job/Phoenix-PreCommit-GitHub-PR/job/PR-1522/1/testReport/ -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-6821) Batching with auto-commit connections
[ https://issues.apache.org/jira/browse/PHOENIX-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6821: -- Assignee: Hari Krishna Dara > Batching with auto-commit connections > - > > Key: PHOENIX-6821 > URL: https://issues.apache.org/jira/browse/PHOENIX-6821 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Hari Krishna Dara >Priority: Major > > Phoenix commits the commands of a batch individually when executeBatch() is > called if auto commit is enabled on the connection. For example, if a batch > of 100 upsert statements is created using addBatch() within an auto-commit > mode connection then when executeBatch() is called, Phoenix creates 100 HBase > batches each with a single mutation, i.e., one for each upsert. This defeats > the purpose of batching. The correct behavior is to commit the entire batch > of upsert statements using the minimum number of HBase batches. This means if > the entire batch of upsert statements fits in a single HBase batch, then one > HBase batch should be used. > Please note for connections without auto-commit, Phoenix behaves correctly, > that is, the entire batch of upsert commands is committed using the minimum > number of HBase batches. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6821) Batching with auto-commit connections
Kadir Ozdemir created PHOENIX-6821: -- Summary: Batching with auto-commit connections Key: PHOENIX-6821 URL: https://issues.apache.org/jira/browse/PHOENIX-6821 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir Phoenix commits the commands of a batch individually when executeBatch() is called if auto commit is enabled on the connection. For example, if a batch of 100 upsert statements is created using addBatch() within an auto-commit mode connection then when executeBatch() is called, Phoenix creates 100 HBase batches each with a single mutation, i.e., one for each upsert. This defeats the purpose of batching. The correct behavior is to commit the entire batch of upsert statements using the minimum number of HBase batches. This means if the entire batch of upsert statements fits in a single HBase batch, then one HBase batch should be used. Please note for connections without auto-commit, Phoenix behaves correctly, that is, the entire batch of upsert commands is committed using the minimum number of HBase batches. -- This message was sent by Atlassian Jira (v8.20.10#820010)
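The correct behavior described above, committing an executeBatch() in the minimum number of HBase batches rather than one batch per upsert, amounts to simple chunking. A hedged sketch, where `max_batch_size` is a hypothetical stand-in for whatever limits the size of a single HBase batch:

```python
def plan_batches(mutations, max_batch_size):
    """Groups a list of mutations into the minimum number of batches,
    each holding at most max_batch_size mutations. With max_batch_size
    large enough, 100 upserts become a single batch; the buggy behavior
    in this issue corresponds to a batch size of 1 (one RPC per upsert)."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [
        mutations[i:i + max_batch_size]
        for i in range(0, len(mutations), max_batch_size)
    ]
```

With 100 upserts that fit in one HBase batch, this produces one batch instead of the 100 single-mutation batches the auto-commit path currently sends.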
[jira] [Created] (PHOENIX-6791) WHERE optimizer redesign
Kadir Ozdemir created PHOENIX-6791: -- Summary: WHERE optimizer redesign Key: PHOENIX-6791 URL: https://issues.apache.org/jira/browse/PHOENIX-6791 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir The WHERE optimizer in Phoenix derives the row key ranges to be scanned from the primary key (PK) column expressions in a WHERE clause. These key ranges are then used to determine the table regions to scan and generate a SkipScanFilter for each of these scans if applicable. The WHERE expression may include non-PK column (sub) expressions. After identifying the key ranges, the WHERE optimizer removes the nodes for PK columns from the expression tree if these nodes are fully used to determine the key ranges. Since the values in the WHERE expression are expressed by byte arrays, the key ranges are also expressed using byte arrays. KeyRange represents a range for a row key or any sub part of a row key. A key range is composed of two pairs, one for each end of the range, lower and upper. Each pair is formed from a byte array and a boolean value. The boolean value indicates whether the end of the range specified by the byte array is inclusive. If the byte array is empty, the corresponding end of the range is unbounded. KeySlot represents a key part and the list of key ranges for this key part, where a key part can be any sub part of a PK, including the leading, trailing, or middle part of the key. The number of columns in a key part is called its span. For the terminal nodes (i.e., constant values) in the expression tree, KeySlot objects are created with a single key range. When KeySlot objects are rolled up in the expression tree, they can have multiple ranges. For example, a KeySlot object representing an IN expression will have a separate range for each member of the IN expression. Similarly, the KeySlot object for an OR expression can have multiple ranges. 
Please note an IN operator can be replaced by an equivalent OR expression. When the WHERE optimizer visits the nodes of the expression tree, it generates a KeySlots object. KeySlots is essentially a list of KeySlot objects (please note the difference between KeySlots and KeySlot). There are two types of KeySlots: SingleKeySlot and MultiKeySlot. SingleKeySlot represents a single key slot, whereas MultiKeySlot is a list of key slots resulting from an AND expression on SingleKeySlot or MultiKeySlot objects. The key slots are rolled into a MultiKeySlot object when processing an AND expression. The AND operation on two key slots starting their spans with the same PK columns is equivalent to taking the intersection of their ranges. The OR operation implementation is limited and rather simple compared to the AND operation. The OR operation attempts to coalesce key slots if all of the key slots have the same starting PK column. If not, it generates a null KeySlots. When an expression node is used fully in generating a key slot, this expression node is removed from the expression tree. A row key for a given table can be composed of several PK columns. Without any restrictions imposed by predefined rules, intersection of key slots can lead to a large number of key slots, i.e., key ranges. For example, consider a row key composed of three integer columns, PK1, PK2, and PK3, and the expression (PK1, PK2) > (100, 25) AND PK3 = 5. The result would be a very large number of key slots, where each key slot represents a point in the three dimensional space, including (100, 26, 5), (100, 27, 5), …, (100, 2147483647, 5), (101, 1, 5), (101, 2, 5), … . A simple expression (like the one given above) with a relatively small number of PK columns and a simple data type, e.g., integer, is sufficient to show that finding key ranges for an arbitrary expression is an intractable problem. 
Attempting to optimize the queries by enumerating the key ranges can lead to excessive memory allocation and long computation times, and the optimization can defeat its purpose. The current implementation attempts to enumerate all possible key ranges in general. Because of this, the WHERE optimizer has caused out of memory issues and query timeouts due to high CPU usage. Recent bug fixes attempt to catch these cases and prevent them. However, these fixes do not attempt to cover all cases and are formulated based on known cases. In addition to inefficient resource utilization, there are known types of expressions for which the current implementation still returns wrong results. For example, see PHOENIX-6669, where degenerate queries caused by some conditions on non-leading PK columns are not caught by Phoenix, which can then return wrong results. An example to show inconsistencies in the implementation is as follows. An RVC expression can be converted to
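The KeyRange structure described above (a byte-array bound plus an inclusive flag at each end, with an empty byte array meaning unbounded) and the AND-as-intersection operation can be modeled in a few lines. This is an illustrative Python sketch, not Phoenix's actual KeyRange class; the names are hypothetical:

```python
UNBOUND = b""  # empty byte array marks an unbounded end of the range

class KeyRange:
    """A range over byte-array keys; each end carries an inclusive flag."""
    def __init__(self, lower, lower_incl, upper, upper_incl):
        self.lower, self.lower_incl = lower, lower_incl
        self.upper, self.upper_incl = upper, upper_incl

def intersect(a, b):
    """AND semantics on two ranges over the same key part: take the tighter
    bound at each end; return None if the result is empty."""
    # Tighter lower bound (UNBOUND is the loosest possible bound).
    if a.lower == UNBOUND:
        lo, lo_incl = b.lower, b.lower_incl
    elif b.lower == UNBOUND or a.lower > b.lower or (
            a.lower == b.lower and not a.lower_incl):
        lo, lo_incl = a.lower, a.lower_incl
    else:
        lo, lo_incl = b.lower, b.lower_incl
    # Tighter upper bound.
    if a.upper == UNBOUND:
        hi, hi_incl = b.upper, b.upper_incl
    elif b.upper == UNBOUND or a.upper < b.upper or (
            a.upper == b.upper and not a.upper_incl):
        hi, hi_incl = a.upper, a.upper_incl
    else:
        hi, hi_incl = b.upper, b.upper_incl
    # Empty range: bounds crossed, or touching bounds not both inclusive.
    if lo != UNBOUND and hi != UNBOUND and (
            lo > hi or (lo == hi and not (lo_incl and hi_incl))):
        return None
    return KeyRange(lo, lo_incl, hi, hi_incl)
```

Note that intersecting two slots over the same key part stays cheap; the blow-up described in the issue comes from enumerating the cross product of ranges across multiple key parts, which this per-range operation does not capture.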
[jira] [Assigned] (PHOENIX-6761) Phoenix Client Side Metadata Caching Improvement
[ https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6761: -- Assignee: Palash Chauhan (was: Kadir Ozdemir) > Phoenix Client Side Metadata Caching Improvement > > > Key: PHOENIX-6761 > URL: https://issues.apache.org/jira/browse/PHOENIX-6761 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Palash Chauhan >Priority: Major > Attachments: PHOENIX-6761.master.initial.patch > > > CQSI maintains a client-side metadata cache, i.e., schemas, tables, and > functions, that evicts the least recently used table entries when the cache > size grows beyond the configured size. > Each time a Phoenix connection is created, the client-side metadata cache > maintained by the CQSI object creating this connection is cloned for the > connection. Thus, we have two levels of caches, one at the Phoenix connection > level and the other at the CQSI level. > When a Phoenix client needs to update the client side cache, it updates both > caches (on the connection object and on the CQSI object). The Phoenix client > attempts to retrieve a table from the connection level cache. If this table > is not there, then the Phoenix client does not check the CQSI level cache; > instead it retrieves the object from the server and finally updates both the > connection and CQSI level caches. > PMetaDataCache provides caching for tables, schemas and functions but it > maintains separate caches internally, one cache for each type of metadata. > The cache for the tables is actually a cache of PTableRef objects. PTableRef > holds a reference to the table object as well as the estimated size of the > table object, the create time, last access time, and resolved time. The > create time is set to the last access time value provided when the PTableRef > object is inserted into the cache. The resolved time is also provided when > the PTableRef object is inserted into the cache. 
Both the create time and > resolved time are final fields (i.e., they are not updated). PTableRef > provides a setter method to update the last access time. PMetaDataCache > updates the last access time whenever the table is retrieved from the cache. > The LRU eviction policy is implemented using the last access time. The > eviction policy is not implemented for schemas and functions. The > configuration parameter for the frequency of updating the cache is > phoenix.default.update.cache.frequency. This can be defined at the cluster or > table level. When it is set to zero, the cache is not used. > Obviously, the purpose of cache eviction is to limit the memory consumed by the > cache. The expected behavior is that when a table is removed from the cache, > the table (PTableImpl) object is also garbage collected. However, this does > not really happen because multiple caches hold references to the same object > and each cache maintains its own table refs and thus access times. This means > that the access time for the same table may differ from one cache to another, > and when one cache evicts an object, another cache will hold on to the same > object. > Although individual caches implement the LRU eviction policy, the overall > memory eviction policy for the actual table objects is more like an age-based > cache. If a table is frequently accessed from the connection level caches, > the last access time maintained by the corresponding table ref objects for > this table will be updated. However, these updates on the access times will > not be visible to the CQSI level cache. The table refs in the CQSI level > cache have the same create time and access time. > Since whenever an object is inserted into the local cache of a connection > object, it is also inserted into the cache on the CQSI object, the CQSI level > cache will grow faster than the caches on the connection objects. 
When the > cache reaches its maximum size, the newly inserted tables will result in > evicting one of the existing tables in the cache. Since the access times of > these tables are not updated on the CQSI level cache, it is likely that the > table that has stayed in the cache for the longest period of time will be > evicted (regardless of whether the same table is frequently accessed via the > connection level caches). This obviously defeats the purpose of an LRU cache. > Another problem with the current cache is related to the choice of its > internal data structures and its eviction implementation. The table refs in > the cache are maintained in a hash map which maps a table key (which is a pair > of a tenant id and table name) to a table ref. When the size of a cache (the > total byte size of the table objects referred by the cache) reaches its
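The LRU-defeating interaction between the two cache levels can be demonstrated with a toy model. This is a hedged Python sketch, not the PMetaDataCache/PTableRef implementation: each cache tracks its own last-access times, as the issue describes, so accesses at the connection level are invisible to the CQSI-level cache:

```python
class SimpleLruCache:
    """Toy cache that evicts the entry with the oldest last-access time
    once capacity is exceeded. Two instances sharing the same table objects
    model the connection-level and CQSI-level caches, each with its own
    (non-shared) access times."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}       # key -> cached object
        self.access_time = {}   # key -> logical clock of last access
        self.clock = 0

    def put(self, key, value):
        self.clock += 1
        self.entries[key] = value
        self.access_time[key] = self.clock
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry *by this cache's clock*.
            victim = min(self.access_time, key=self.access_time.get)
            del self.entries[victim]
            del self.access_time[victim]

    def get(self, key):
        self.clock += 1
        if key in self.entries:
            self.access_time[key] = self.clock  # visible only to this cache
            return self.entries[key]
        return None
```

In the scenario from the issue, a table that is hot at the connection level is still evicted first from the CQSI-level cache, because the CQSI-level access time was never refreshed.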
[jira] [Updated] (PHOENIX-6761) Phoenix Client Side Metadata Caching Improvement
[ https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6761: --- Attachment: PHOENIX-6761.master.initial.patch > Phoenix Client Side Metadata Caching Improvement > > > Key: PHOENIX-6761 > URL: https://issues.apache.org/jira/browse/PHOENIX-6761 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > Attachments: PHOENIX-6761.master.initial.patch > > > CQSI maintains a client-side metadata cache, i.e., schemas, tables, and > functions, that evicts the least recently used table entries when the cache > size grows beyond the configured size. > Each time a Phoenix connection is created, the client-side metadata cache > maintained by the CQSI object creating this connection is cloned for the > connection. Thus, we have two levels of caches, one at the Phoenix connection > level and the other at the CQSI level. > When a Phoenix client needs to update the client side cache, it updates both > caches (on the connection object and on the CQSI object). The Phoenix client > attempts to retrieve a table from the connection level cache. If this table > is not there, then the Phoenix client does not check the CQSI level cache; > instead it retrieves the object from the server and finally updates both the > connection and CQSI level caches. > PMetaDataCache provides caching for tables, schemas and functions but it > maintains separate caches internally, one cache for each type of metadata. > The cache for the tables is actually a cache of PTableRef objects. PTableRef > holds a reference to the table object as well as the estimated size of the > table object, the create time, last access time, and resolved time. The > create time is set to the last access time value provided when the PTableRef > object is inserted into the cache. The resolved time is also provided when > the PTableRef object is inserted into the cache. 
Both the create time and > resolved time are final fields (i.e., they are not updated). PTableRef > provides a setter method to update the last access time. PMetaDataCache > updates the last access time whenever the table is retrieved from the cache. > The LRU eviction policy is implemented using the last access time. The > eviction policy is not implemented for schemas and functions. The > configuration parameter for the frequency of updating the cache is > phoenix.default.update.cache.frequency. This can be defined at the cluster or > table level. When it is set to zero, the cache is not used. > Obviously, the purpose of cache eviction is to limit the memory consumed by the > cache. The expected behavior is that when a table is removed from the cache, > the table (PTableImpl) object is also garbage collected. However, this does > not really happen because multiple caches hold references to the same object > and each cache maintains its own table refs and thus access times. This means > that the access time for the same table may differ from one cache to another, > and when one cache evicts an object, another cache will hold on to the same > object. > Although individual caches implement the LRU eviction policy, the overall > memory eviction policy for the actual table objects is more like an age-based > cache. If a table is frequently accessed from the connection level caches, > the last access time maintained by the corresponding table ref objects for > this table will be updated. However, these updates on the access times will > not be visible to the CQSI level cache. The table refs in the CQSI level > cache have the same create time and access time. > Since whenever an object is inserted into the local cache of a connection > object, it is also inserted into the cache on the CQSI object, the CQSI level > cache will grow faster than the caches on the connection objects. 
When the > cache reaches its maximum size, the newly inserted tables will result in > evicting one of the existing tables in the cache. Since the access times of > these tables are not updated on the CQSI level cache, it is likely that the > table that has stayed in the cache for the longest period of time will be > evicted (regardless of whether the same table is frequently accessed via the > connection level caches). This obviously defeats the purpose of an LRU cache. > Another problem with the current cache is related to the choice of its > internal data structures and its eviction implementation. The table refs in > the cache are maintained in a hash map which maps a table key (which is a pair > of a tenant id and table name) to a table ref. When the size of a cache (the > total byte size of the table objects referred by the cache) reaches its >
[jira] [Updated] (PHOENIX-6776) Abort scans of closed connections at ScanningResultIterator
[ https://issues.apache.org/jira/browse/PHOENIX-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6776: --- Description: The server side paging feature introduced by PHOENIX-6211 breaks a scan into timed scan operations on the server side and returns an intermediate result for each operation. This intermediate result could be a valid result or a dummy result. The HBase scans are wrapped by ScanningResultIterator in Phoenix. If the next call on a scan returns a dummy or empty result, ScanningResultIterator ignores this result and calls the next method on the scan again. However, if the Phoenix connection is closed, we should abort the scan instead of continuing scanning. This will result in timely abort of scans and release of resources (especially when phoenix.server.page.size.ms is set to a small value, e.g., 5 sec). was: The server side paging feature introduced by Phoenix-6211 breaks a scan into timed scan operations on the server side and returns an intermediate result for each operation. This intermediate result could be a valid result or a dummy result. The HBase scans are wrapped by ScanningResultIterator in Phoenix. If the next call on a scan returns a dummy or empty result, ScanningResultIterator ignores this result and calls the next method on the scan again. However, if the Phoenix connection is closed, we should abort the scan instead of continuing scanning. This will result in timely abort of scans and release of resources (especially when phoenix.server.page.size.ms is set to a small value, e.g., 5 sec). 
> Abort scans of closed connections at ScanningResultIterator > --- > > Key: PHOENIX-6776 > URL: https://issues.apache.org/jira/browse/PHOENIX-6776 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Lokesh Khurana >Priority: Major > > The server side paging feature introduced by PHOENIX-6211 breaks a scan into > timed scan operations on the server side and returns an intermediate result > for each operation. This intermediate result could be a valid result or a > dummy result. The HBase scans are wrapped by ScanningResultIterator in > Phoenix. If the next call on a scan returns a dummy or empty result, > ScanningResultIterator ignores this result and calls the next method on the > scan again. However, if the Phoenix connection is closed, we should abort the > scan instead of continuing scanning. This will result in timely abort of > scans and release of resources (especially when phoenix.server.page.size.ms > is set to a small value, e.g., 5 sec). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
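The retry loop described above can be sketched as follows. This is an illustrative, self-contained sketch, not Phoenix's actual ScanningResultIterator; the names PageScanner and nextValidResult are made up for the example. The point is that dummy/empty intermediate results are skipped, but the loop checks the connection state on every iteration and aborts as soon as the connection is closed instead of spinning on dummy results.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Illustrative sketch (not Phoenix's actual classes) of the paging retry loop. */
public class PagingIteratorSketch {
    /** Stand-in for a paged scanner: null = end of scan, "" = dummy result. */
    interface PageScanner { String next(); }

    static String nextValidResult(PageScanner scanner, BooleanSupplier connectionClosed) {
        while (true) {
            if (connectionClosed.getAsBoolean()) {
                // Abort instead of continuing to scan on a closed connection.
                throw new IllegalStateException("connection closed, aborting scan");
            }
            String r = scanner.next();
            if (r == null || !r.isEmpty()) {
                return r;  // a real row, or the end of the scan
            }
            // dummy result: fall through and call next() again
        }
    }

    public static void main(String[] args) {
        Iterator<String> pages = List.of("", "", "row1").iterator();
        String row = nextValidResult(() -> pages.hasNext() ? pages.next() : null,
                                     () -> false);
        System.out.println(row);  // prints "row1": dummies were skipped

        boolean aborted = false;
        try {
            nextValidResult(() -> "", () -> true);  // connection already closed
        } catch (IllegalStateException e) {
            aborted = true;
        }
        System.out.println(aborted);  // prints "true": scan aborted promptly
    }
}
```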
[jira] [Assigned] (PHOENIX-6776) Abort scans of closed connections at ScanningResultIterator
[ https://issues.apache.org/jira/browse/PHOENIX-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6776: -- Assignee: Lokesh Khurana > Abort scans of closed connections at ScanningResultIterator > --- > > Key: PHOENIX-6776 > URL: https://issues.apache.org/jira/browse/PHOENIX-6776 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Lokesh Khurana >Priority: Major > > The server side paging feature introduced by Phoenix-6211 breaks a scan into > timed scan operations on the server side and returns an intermediate result > for each operation. This intermediate result could be a valid result or a > dummy result. The HBase scans are wrapped by ScanningResultIterator in > Phoenix. If the next call on a scan returns a dummy or empty result, > ScanningResultIterator ignores this result and calls the next method on the > scan again. However, if the Phoenix connection is closed, we should abort the > scan instead of continuing scanning. This will result in timely abort of > scans and release of resources (especially when phoenix.server.page.size.ms > is set to a small value, e.g., 5 sec). > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (PHOENIX-6776) Abort scans of closed connections at ScanningResultIterator
Kadir Ozdemir created PHOENIX-6776: -- Summary: Abort scans of closed connections at ScanningResultIterator Key: PHOENIX-6776 URL: https://issues.apache.org/jira/browse/PHOENIX-6776 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir The server side paging feature introduced by Phoenix-6211 breaks a scan into timed scan operations on the server side and returns an intermediate result for each operation. This intermediate result could be a valid result or a dummy result. The HBase scans are wrapped by ScanningResultIterator in Phoenix. If the next call on a scan returns a dummy or empty result, ScanningResultIterator ignores this result and calls the next method on the scan again. However, if the Phoenix connection is closed, we should abort the scan instead of continuing scanning. This will result in timely abort of scans and release of resources (especially when phoenix.server.page.size.ms is set to a small value, e.g., 5 sec). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (PHOENIX-6761) Phoenix Client Side Metadata Caching Improvement
[ https://issues.apache.org/jira/browse/PHOENIX-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6761: -- Assignee: Kadir Ozdemir > Phoenix Client Side Metadata Caching Improvement > > > Key: PHOENIX-6761 > URL: https://issues.apache.org/jira/browse/PHOENIX-6761 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir Ozdemir >Assignee: Kadir Ozdemir >Priority: Major > > CQSI maintains a client-side metadata cache, i.e., schemas, tables, and > functions, that evicts the least recently used table entries when the cache > size grows beyond the configured size. > Each time a Phoenix connection is created, the client-side metadata cache > maintained by the CQSI object creating this connection is cloned for the > connection. Thus, we have two levels of caches, one at the Phoenix connection > level and the other at the CQSI level. > When a Phoenix client needs to update the client side cache, it updates both > caches (on the connection object and on the CQSI object). The Phoenix client > attempts to retrieve a table from the connection level cache. If this table > is not there, then the Phoenix client does not check the CQSI level cache; > instead, it retrieves the object from the server and finally updates both the > connection and CQSI level cache. > PMetaDataCache provides caching for tables, schemas and functions but it > maintains separate caches internally, one cache for each type of metadata. > The cache for the tables is actually a cache of PTableRef objects. PTableRef > holds a reference to the table object as well as the estimated size of the > table object, the create time, last access time, and resolved time. The > create time is set to the last access time value provided when the PTableRef > object is inserted into the cache. The resolved time is also provided when > the PTableRef object is inserted into the cache.
Both the create time and > resolved time are final fields (i.e., they are not updated). PTableRef > provides a setter method to update the last access time. PMetaDataCache > updates the last access time whenever the table is retrieved from the cache. > The LRU eviction policy is implemented using the last access time. The > eviction policy is not implemented for schemas and functions. The > configuration parameter for the frequency of updating the cache is > phoenix.default.update.cache.frequency. This can be defined at the cluster or > table level. When it is set to zero, the cache is not used. > The purpose of eviction is, of course, to limit the memory consumed by the > cache. The expected behavior is that when a table is removed from the cache, > the table (PTableImpl) object is also garbage collected. However, this does > not really happen because multiple caches make references to the same object > and each cache maintains its own table refs and thus access times. This means > that the access time for the same table may differ from one cache to another; > and when one cache can evict an object, another cache will hold on to the same > object. > Although individual caches implement the LRU eviction policy, the overall > memory eviction policy for the actual table objects is more like an age-based > cache. If a table is frequently accessed from the connection level caches, > the last access time maintained by the corresponding table ref objects for > this table will be updated. However, these updates on the access times will > not be visible to the CQSI level cache. The table refs in the CQSI level > cache have the same create time and access time. > Since whenever an object is inserted into the local cache of a connection > object, it is also inserted into the cache on the CQSI object, the CQSI level > cache will grow faster than the caches on the connection objects.
When the > cache reaches its maximum size, newly inserted tables will result in > evicting one of the existing tables in the cache. Since the access times of > these tables are not updated on the CQSI level cache, it is likely that the > table that has stayed in the cache for the longest period of time will be > evicted (regardless of whether the same table is frequently accessed via the > connection level caches). This obviously defeats the purpose of an LRU cache. > Another problem with the current cache is related to the choice of its > internal data structures and its eviction implementation. The table refs in > the cache are maintained in a hash map which maps a table key (which is a pair > of a tenant id and a table name) to a table ref. When the size of a cache (the > total byte size of the table objects referred to by the cache) reaches its
[jira] [Created] (PHOENIX-6761) Phoenix Client Side Metadata Caching Improvement
Kadir Ozdemir created PHOENIX-6761: -- Summary: Phoenix Client Side Metadata Caching Improvement Key: PHOENIX-6761 URL: https://issues.apache.org/jira/browse/PHOENIX-6761 Project: Phoenix Issue Type: Improvement Reporter: Kadir Ozdemir CQSI maintains a client-side metadata cache, i.e., schemas, tables, and functions, that evicts the least recently used table entries when the cache size grows beyond the configured size. Each time a Phoenix connection is created, the client-side metadata cache maintained by the CQSI object creating this connection is cloned for the connection. Thus, we have two levels of caches, one at the Phoenix connection level and the other at the CQSI level. When a Phoenix client needs to update the client side cache, it updates both caches (on the connection object and on the CQSI object). The Phoenix client attempts to retrieve a table from the connection level cache. If this table is not there, then the Phoenix client does not check the CQSI level cache; instead, it retrieves the object from the server and finally updates both the connection and CQSI level cache. PMetaDataCache provides caching for tables, schemas and functions but it maintains separate caches internally, one cache for each type of metadata. The cache for the tables is actually a cache of PTableRef objects. PTableRef holds a reference to the table object as well as the estimated size of the table object, the create time, last access time, and resolved time. The create time is set to the last access time value provided when the PTableRef object is inserted into the cache. The resolved time is also provided when the PTableRef object is inserted into the cache. Both the create time and resolved time are final fields (i.e., they are not updated). PTableRef provides a setter method to update the last access time. PMetaDataCache updates the last access time whenever the table is retrieved from the cache. The LRU eviction policy is implemented using the last access time.
The eviction policy is not implemented for schemas and functions. The configuration parameter for the frequency of updating the cache is phoenix.default.update.cache.frequency. This can be defined at the cluster or table level. When it is set to zero, the cache is not used. The purpose of eviction is, of course, to limit the memory consumed by the cache. The expected behavior is that when a table is removed from the cache, the table (PTableImpl) object is also garbage collected. However, this does not really happen because multiple caches make references to the same object and each cache maintains its own table refs and thus access times. This means that the access time for the same table may differ from one cache to another; and when one cache can evict an object, another cache will hold on to the same object. Although individual caches implement the LRU eviction policy, the overall memory eviction policy for the actual table objects is more like an age-based cache. If a table is frequently accessed from the connection level caches, the last access time maintained by the corresponding table ref objects for this table will be updated. However, these updates on the access times will not be visible to the CQSI level cache. The table refs in the CQSI level cache have the same create time and access time. Since whenever an object is inserted into the local cache of a connection object, it is also inserted into the cache on the CQSI object, the CQSI level cache will grow faster than the caches on the connection objects. When the cache reaches its maximum size, newly inserted tables will result in evicting one of the existing tables in the cache. Since the access times of these tables are not updated on the CQSI level cache, it is likely that the table that has stayed in the cache for the longest period of time will be evicted (regardless of whether the same table is frequently accessed via the connection level caches). This obviously defeats the purpose of an LRU cache.
Another problem with the current cache is related to the choice of its internal data structures and its eviction implementation. The table refs in the cache are maintained in a hash map which maps a table key (which is a pair of a tenant id and a table name) to a table ref. When the size of a cache (the total byte size of the table objects referred to by the cache) reaches its configured limit, the overage that adding a new table would cause is computed. Then all the table refs in this cache are cloned into a priority queue as well as a new cache. This queue uses the access time to determine the order of its elements (i.e., table refs). The table refs that should not be evicted are removed from the queue, which leaves the table refs to be evicted in the queue. Finally, the table refs left in the queue are removed from the
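For contrast with the age-based behavior described above, a true LRU cache refreshes an entry's recency on every read, so a frequently accessed table survives eviction. In Java this behavior is a few lines with an access-ordered LinkedHashMap. The sketch below is illustrative only: Phoenix's PMetaDataCache evicts by total estimated byte size, not by entry count as done here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal entry-count LRU sketch: get() refreshes recency, so eviction
 *  removes the least recently *used* entry, not merely the oldest one. */
public class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder=true: get() moves entry to the tail
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict the least recently used entry
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("t1", "tableRef1");
        cache.put("t2", "tableRef2");
        cache.get("t1");               // t1 is now the most recently used
        cache.put("t3", "tableRef3");  // evicts t2, not the older-by-insert t1
        System.out.println(cache.containsKey("t1"));  // prints "true"
        System.out.println(cache.containsKey("t2"));  // prints "false"
    }
}
```

An age-based cache, by comparison, would have evicted t1 here simply because it was inserted first, which is exactly the CQSI-level behavior the issue describes.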
[jira] [Updated] (PHOENIX-6448) ConnectionQueryServicesImpl init failure may cause Full GC.
[ https://issues.apache.org/jira/browse/PHOENIX-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6448: --- Attachment: PHOENIX-6448.master.001.patch > ConnectionQueryServicesImpl init failure may cause Full GC. > --- > > Key: PHOENIX-6448 > URL: https://issues.apache.org/jira/browse/PHOENIX-6448 > Project: Phoenix > Issue Type: Bug >Reporter: Chen Feng >Assignee: Kadir Ozdemir >Priority: Major > Attachments: PHOENIX-6448.master.001.patch > > > in ConnectionQueryServicesImpl.init() > In some cases (e.g., the user does not have permissions to create SYSTEM.CATALOG), > there's only LOGGER.warn and return null directly. > {code:java} > // Some comments here > { > ... > if (inspectIfAnyExceptionInChain(e, Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) { > // Pass > LOGGER.warn("Could not check for Phoenix SYSTEM tables," + > " assuming they exist and are properly configured"); > > checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES, > getProps()).getName()); > success = true; > } > ... > return null; > } > ... > scheduleRenewLeaseTasks(); > {code} > Therefore, the following scheduleRenewLeaseTasks will be skipped and no > exception is thrown. > > 1. scheduleRenewLeaseTasks not called > 2. no renew task started > 3. queries will call PhoenixConnection.addIteratorForLeaseRenewal() as usual > 4. the scannerQueue is unbounded, therefore it will keep adding new items. > 5. Full GC. -- This message was sent by Atlassian Jira (v8.20.7#820007)
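The growth problem in step 4 above can be sketched in a few lines: if the consumer (the renew-lease task) is never scheduled, an unbounded queue grows until the heap fills and the JVM spends its time in Full GC, whereas a bounded queue sheds work instead. The class, field, and method names below are illustrative stand-ins, not Phoenix's actual implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch of the failure mode: producers enqueue scanners for lease renewal,
 *  but the consumer task was never started. A bounded queue with offer()
 *  caps memory use by rejecting new entries once full. */
public class LeaseQueueSketch {
    static final int MAX_PENDING = 4;  // illustrative cap
    static final BlockingQueue<String> scannerQueue =
            new ArrayBlockingQueue<>(MAX_PENDING);

    /** Returns false (dropping the entry) instead of growing without bound. */
    static boolean addIteratorForLeaseRenewal(String scannerId) {
        return scannerQueue.offer(scannerId);
    }

    public static void main(String[] args) {
        int accepted = 0;
        for (int i = 0; i < 100; i++) {  // the consumer never runs
            if (addIteratorForLeaseRenewal("scanner-" + i)) {
                accepted++;
            }
        }
        System.out.println(accepted);  // prints "4": growth is capped
    }
}
```

The actual fix in the patch is of course at the source, making init() fail loudly rather than silently skipping scheduleRenewLeaseTasks(); bounding the queue merely limits the blast radius if renewal is ever skipped again.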
[jira] [Assigned] (PHOENIX-6448) ConnectionQueryServicesImpl init failure may cause Full GC.
[ https://issues.apache.org/jira/browse/PHOENIX-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6448: -- Assignee: Kadir Ozdemir > ConnectionQueryServicesImpl init failure may cause Full GC. > --- > > Key: PHOENIX-6448 > URL: https://issues.apache.org/jira/browse/PHOENIX-6448 > Project: Phoenix > Issue Type: Bug >Reporter: Chen Feng >Assignee: Kadir Ozdemir >Priority: Major > > in ConnectionQueryServicesImpl.init() > In some cases (e.g., the user does not have permissions to create SYSTEM.CATALOG), > there's only LOGGER.warn and return null directly. > {code:java} > // Some comments here > { > ... > if (inspectIfAnyExceptionInChain(e, Collections.<Class<? extends Exception>> singletonList(AccessDeniedException.class))) { > // Pass > LOGGER.warn("Could not check for Phoenix SYSTEM tables," + > " assuming they exist and are properly configured"); > > checkClientServerCompatibility(SchemaUtil.getPhysicalName(SYSTEM_CATALOG_NAME_BYTES, > getProps()).getName()); > success = true; > } > ... > return null; > } > ... > scheduleRenewLeaseTasks(); > {code} > Therefore, the following scheduleRenewLeaseTasks will be skipped and no > exception is thrown. > > 1. scheduleRenewLeaseTasks not called > 2. no renew task started > 3. queries will call PhoenixConnection.addIteratorForLeaseRenewal() as usual > 4. the scannerQueue is unbounded, therefore it will keep adding new items. > 5. Full GC. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (PHOENIX-6677) Parallelism within a batch of mutations
[ https://issues.apache.org/jira/browse/PHOENIX-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6677: --- Fix Version/s: (was: 4.17.0) (was: 5.2.0) > Parallelism within a batch of mutations > > > Key: PHOENIX-6677 > URL: https://issues.apache.org/jira/browse/PHOENIX-6677 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir OZDEMIR >Priority: Major > > Currently, the Phoenix client simply passes the batches of row mutations from the > application to the HBase client without any parallelism or intelligent grouping > (except grouping mutations for the same row). > Assume that the application creates large batches of row mutations for a given > table. The Phoenix client divides these rows based on their arrival order into > HBase batches of n (e.g., 100) rows based on the configured batch size, i.e., > the number of rows and bytes. Then, Phoenix calls the HBase batch API, one batch > at a time (i.e., serially). The HBase client further divides a given batch of > rows into smaller batches based on their regions. This means that a large > batch created by the application is divided into many tiny batches and > executed mostly serially. For salted tables, this will result in even smaller > batches. > We can improve the current implementation greatly if we group the rows of the > batch prepared by the application into sub-batches based on table region > boundaries and then execute these batches in parallel. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (PHOENIX-6677) Parallelism within a batch of mutations
[ https://issues.apache.org/jira/browse/PHOENIX-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir resolved PHOENIX-6677. Resolution: Not A Problem > Parallelism within a batch of mutations > > > Key: PHOENIX-6677 > URL: https://issues.apache.org/jira/browse/PHOENIX-6677 > Project: Phoenix > Issue Type: Improvement >Reporter: Kadir OZDEMIR >Priority: Major > Fix For: 4.17.0, 5.2.0 > > Currently, the Phoenix client simply passes the batches of row mutations from the > application to the HBase client without any parallelism or intelligent grouping > (except grouping mutations for the same row). > Assume that the application creates large batches of row mutations for a given > table. The Phoenix client divides these rows based on their arrival order into > HBase batches of n (e.g., 100) rows based on the configured batch size, i.e., > the number of rows and bytes. Then, Phoenix calls the HBase batch API, one batch > at a time (i.e., serially). The HBase client further divides a given batch of > rows into smaller batches based on their regions. This means that a large > batch created by the application is divided into many tiny batches and > executed mostly serially. For salted tables, this will result in even smaller > batches. > We can improve the current implementation greatly if we group the rows of the > batch prepared by the application into sub-batches based on table region > boundaries and then execute these batches in parallel. -- This message was sent by Atlassian Jira (v8.20.7#820007)
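The proposed grouping can be sketched as follows: bucket each row key by the region whose start key is the greatest one less than or equal to it, then flush each bucket in parallel. The region boundaries, string keys, and executor usage here are illustrative assumptions, not Phoenix's actual client code (which works with byte[] keys and HBase region locations).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Sketch of grouping a client batch into per-region sub-batches. */
public class RegionBatchSketch {
    static Map<String, List<String>> groupByRegion(
            NavigableSet<String> regionStartKeys, List<String> rowKeys) {
        Map<String, List<String>> buckets = new TreeMap<>();
        for (String row : rowKeys) {
            // Owning region = greatest start key <= row key.
            // Assumes the first region's start key sorts <= every row key.
            String region = regionStartKeys.floor(row);
            buckets.computeIfAbsent(region, k -> new ArrayList<>()).add(row);
        }
        return buckets;
    }

    public static void main(String[] args) throws Exception {
        NavigableSet<String> regions = new TreeSet<>(List.of("a", "m"));
        Map<String, List<String>> buckets =
                groupByRegion(regions, List.of("apple", "zebra", "kiwi", "pear"));
        System.out.println(buckets);  // prints {a=[apple, kiwi], m=[zebra, pear]}

        // One sub-batch per region, submitted in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(buckets.size());
        for (List<String> batch : buckets.values()) {
            pool.submit(() -> { /* call the HBase batch API for this sub-batch */ });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```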
[jira] [Assigned] (PHOENIX-6702) ConcurrentMutationsExtendedIT and PartialIndexRebuilderIT fail on Hbase 2.4.11+
[ https://issues.apache.org/jira/browse/PHOENIX-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir reassigned PHOENIX-6702: -- Assignee: Kadir Ozdemir (was: Kadir OZDEMIR) > ConcurrentMutationsExtendedIT and PartialIndexRebuilderIT fail on Hbase > 2.4.11+ > --- > > Key: PHOENIX-6702 > URL: https://issues.apache.org/jira/browse/PHOENIX-6702 > Project: Phoenix > Issue Type: Bug > Components: core >Affects Versions: 5.2.0, 5.1.3 >Reporter: Istvan Toth >Assignee: Kadir Ozdemir >Priority: Blocker > Fix For: 5.2.0 > > Attachments: bisect.sh > > > On my local machine > ConcurrentMutationsExtendedIT.testConcurrentUpserts failed 6 out of 10 times > while PartialIndexRebuilderIT.testConcurrentUpsertsWithRebuild failed 10 out > of 10 times with HBase 2.4.11 (the default build). > The same tests succeeded 3 out of 3 times with HBase 2.3.7. > Either HBase 2.4 has a bug, or our compatibility modules need to be fixed. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (PHOENIX-6677) Parallelism within a batch of mutations
Kadir OZDEMIR created PHOENIX-6677: -- Summary: Parallelism within a batch of mutations Key: PHOENIX-6677 URL: https://issues.apache.org/jira/browse/PHOENIX-6677 Project: Phoenix Issue Type: Improvement Reporter: Kadir OZDEMIR Fix For: 4.17.0, 5.2.0 Currently, the Phoenix client simply passes the batches of row mutations from the application to the HBase client without any parallelism or intelligent grouping (except grouping mutations for the same row). Assume that the application creates large batches of row mutations for a given table. The Phoenix client divides these rows based on their arrival order into HBase batches of n (e.g., 100) rows based on the configured batch size, i.e., the number of rows and bytes. Then, Phoenix calls the HBase batch API, one batch at a time (i.e., serially). The HBase client further divides a given batch of rows into smaller batches based on their regions. This means that a large batch created by the application is divided into many tiny batches and executed mostly serially. For salted tables, this will result in even smaller batches. We can improve the current implementation greatly if we group the rows of the batch prepared by the application into sub-batches based on table region boundaries and then execute these batches in parallel. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (PHOENIX-6663) Use batching when joining data table rows with uncovered local index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR reassigned PHOENIX-6663: -- Assignee: Kadir OZDEMIR > Use batching when joining data table rows with uncovered local index rows > - > > Key: PHOENIX-6663 > URL: https://issues.apache.org/jira/browse/PHOENIX-6663 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 4.16.1, 5.1.2 >Reporter: Kadir OZDEMIR >Assignee: Kadir OZDEMIR >Priority: Major > > The current solution uses HBase get operations to join data table rows with > uncovered local index rows on the server side. Issuing a separate get > operation for every data table row can be expensive. Instead, we can buffer > lots of data row keys in memory and use a scan with a skip scan filter. This > will reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (PHOENIX-6501) Use batching when joining data table rows with uncovered global index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-6501: --- Fix Version/s: 5.2.0 4.16.2 5.1.3 > Use batching when joining data table rows with uncovered global index rows > -- > > Key: PHOENIX-6501 > URL: https://issues.apache.org/jira/browse/PHOENIX-6501 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.2 >Reporter: Kadir Ozdemir >Assignee: Kadir OZDEMIR >Priority: Major > Fix For: 5.2.0, 4.16.2, 5.1.3 > > Attachments: PHOENIX-6501.master.001.patch > > > PHOENIX-6458 extends the existing uncovered local index support for global > indexes. The current solution uses HBase get operations to join data table > rows with uncovered index rows on the server side. Doing a separate RPC call > for every data table row can be expensive. Instead, we can buffer lots of > data row keys in memory, use a skip scan filter and even multiple threads to > issue a separate scan for each data table region in parallel. This will > reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (PHOENIX-6501) Use batching when joining data table rows with uncovered global index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR reassigned PHOENIX-6501: -- Assignee: Kadir OZDEMIR (was: Lars Hofhansl) Resolution: Fixed > Use batching when joining data table rows with uncovered global index rows > -- > > Key: PHOENIX-6501 > URL: https://issues.apache.org/jira/browse/PHOENIX-6501 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.2 >Reporter: Kadir Ozdemir >Assignee: Kadir OZDEMIR >Priority: Major > Attachments: PHOENIX-6501.master.001.patch > > > PHOENIX-6458 extends the existing uncovered local index support for global > indexes. The current solution uses HBase get operations to join data table > rows with uncovered index rows on the server side. Doing a separate RPC call > for every data table row can be expensive. Instead, we can buffer lots of > data row keys in memory, use a skip scan filter and even multiple threads to > issue a separate scan for each data table region in parallel. This will > reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (PHOENIX-6663) Use batching when joining data table rows with uncovered local index rows
Kadir OZDEMIR created PHOENIX-6663: -- Summary: Use batching when joining data table rows with uncovered local index rows Key: PHOENIX-6663 URL: https://issues.apache.org/jira/browse/PHOENIX-6663 Project: Phoenix Issue Type: Improvement Affects Versions: 5.1.2, 4.16.1 Reporter: Kadir OZDEMIR The current solution uses HBase get operations to join data table rows with uncovered local index rows on the server side. Issuing a separate get operation for every data table row can be expensive. Instead, we can buffer lots of data row keys in memory and use a scan with a skip scan filter. This will reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
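The batching idea above can be sketched with an in-memory stand-in for the data table: instead of issuing one point get per buffered key, run a single scan over the range spanned by the buffered keys and keep only the wanted rows. This is illustrative only; Phoenix's actual SkipScanFilter is far more efficient because it seeks between key ranges on the region server rather than filtering row by row as done here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Sketch: replace N point gets with one bounded scan plus a key filter. */
public class SkipScanSketch {
    static List<String> batchedLookup(NavigableMap<String, String> table,
                                      SortedSet<String> wantedKeys) {
        List<String> rows = new ArrayList<>();
        // One range scan over [min wanted key, max wanted key] instead of
        // wantedKeys.size() separate gets.
        for (Map.Entry<String, String> e :
                table.subMap(wantedKeys.first(), true, wantedKeys.last(), true)
                     .entrySet()) {
            if (wantedKeys.contains(e.getKey())) {
                rows.add(e.getValue());  // keep only the buffered keys
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>(Map.of(
                "k1", "row1", "k2", "row2", "k3", "row3", "k4", "row4"));
        List<String> rows =
                batchedLookup(table, new TreeSet<>(List.of("k1", "k3")));
        System.out.println(rows);  // prints [row1, row3]
    }
}
```

The trade-off is the same one the issue describes: one scan amortizes the per-operation overhead across many data row keys, at the cost of buffering those keys in memory first.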
[jira] [Updated] (PHOENIX-6501) Use batching when joining data table rows with uncovered global index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-6501: --- Summary: Use batching when joining data table rows with uncovered global index rows (was: Use batching when joining data table rows with uncovered index rows) > Use batching when joining data table rows with uncovered global index rows > -- > > Key: PHOENIX-6501 > URL: https://issues.apache.org/jira/browse/PHOENIX-6501 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.2 >Reporter: Kadir Ozdemir >Assignee: Kadir OZDEMIR >Priority: Major > Attachments: PHOENIX-6501.master.001.patch > > > PHOENIX-6458 extends the existing uncovered local index support for global > indexes. The current solution uses HBase get operations to join data table > rows with uncovered index rows on the server side. Doing a separate RPC call > for every data table row can be expensive. Instead, we can buffer lots of > data row keys in memory, use a skip scan filter and even multiple threads to > issue a separate scan for each data table region in parallel. This will > reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (PHOENIX-6501) Use batching when joining data table rows with uncovered index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-6501: --- Attachment: PHOENIX-6501.master.001.patch > Use batching when joining data table rows with uncovered index rows > --- > > Key: PHOENIX-6501 > URL: https://issues.apache.org/jira/browse/PHOENIX-6501 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.2 >Reporter: Kadir Ozdemir >Assignee: Kadir OZDEMIR >Priority: Major > Attachments: PHOENIX-6501.master.001.patch > > > PHOENIX-6458 extends the existing uncovered local index support for global > indexes. The current solution uses HBase get operations to join data table > rows with uncovered index rows on the server side. Doing a separate RPC call > for every data table row can be expensive. Instead, we can buffer lots of > data row keys in memory, use a skip scan filter and even multiple threads to > issue a separate scan for each data table region in parallel. This will > reduce the cost of join and also improve the performance. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-6458: --- Attachment: PHOENIX-6458.master.addendum.patch > Using global indexes for queries with uncovered columns > --- > > Key: PHOENIX-6458 > URL: https://issues.apache.org/jira/browse/PHOENIX-6458 > Project: Phoenix > Issue Type: Improvement >Affects Versions: 5.1.0 >Reporter: Kadir Ozdemir >Assignee: Kadir OZDEMIR >Priority: Major > Fix For: 4.17.0, 5.2.0, 5.1.3 > > Attachments: PHOENIX-6458.master.001.patch, > PHOENIX-6458.master.002.patch, PHOENIX-6458.master.addendum.patch > > > The Phoenix query optimizer does not use a global index for a query with the > columns that are not covered by the global index if the query does not have > the corresponding index hint for this index. With the index hint, the > optimizer rewrites the query where the index is used within a subquery. With > this subquery, the row keys of the index rows that satisfy the subquery are > retrieved by the Phoenix client and then pushed into the Phoenix server > caches of the data table regions. Finally, on the server side, data table > rows are scanned and joined with the index rows using HashJoin. Based on the > selectivity of the original query, this join operation may still result in > scanning a large amount of data table rows. > Eliminating these data table scans would be a significant improvement. To do > that, instead of rewriting the query, the Phoenix optimizer simply treats the > global index as a covered index for the given query. With this, the Phoenix > query optimizer chooses the index table for the query especially when the > index row key prefix length is greater than the data row key prefix length > for the query. 
On the server side, the index table is scanned using index row > key ranges implied by the query and the index row keys are then mapped to the > data table row keys (please note an index row key includes all the data row > key columns). Finally, the corresponding data table rows are scanned using > server-to-server RPCs. PHOENIX-6458 (this Jira) retrieves the data table > rows one by one using the HBase get operation. PHOENIX-6501 replaces this get > operation with the scan operation to reduce the number of server-to-server > RPC calls. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (PHOENIX-6656) Reindent NonAggregateRegionScannerFactory
Kadir OZDEMIR created PHOENIX-6656: -- Summary: Reindent NonAggregateRegionScannerFactory Key: PHOENIX-6656 URL: https://issues.apache.org/jira/browse/PHOENIX-6656 Project: Phoenix Issue Type: Bug Reporter: Kadir OZDEMIR Assignee: Kadir OZDEMIR The indentation in the NonAggregateRegionScannerFactory.java file is badly broken and results in failures in code style checks whenever we make changes to this file. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (PHOENIX-6501) Use batching when joining data table rows with uncovered index rows
[ https://issues.apache.org/jira/browse/PHOENIX-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR reassigned PHOENIX-6501: -- Assignee: Kadir OZDEMIR > Use batching when joining data table rows with uncovered index rows > --- > > Key: PHOENIX-6501 > URL: https://issues.apache.org/jira/browse/PHOENIX-6501 > Project: Phoenix > Issue Type: Improvement > Affects Versions: 5.1.2 > Reporter: Kadir Ozdemir > Assignee: Kadir OZDEMIR > Priority: Major > > PHOENIX-6458 extends the existing uncovered local index support to global indexes. The current solution uses HBase get operations to join data table rows with uncovered index rows on the server side. Issuing a separate RPC call for every data table row can be expensive. Instead, we can buffer many data row keys in memory, apply a skip scan filter, and use multiple threads to issue a separate scan for each data table region in parallel. This reduces the cost of the join and improves performance.
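The batching idea above amounts to grouping the buffered data row keys by the region that owns them, so each region receives one scan (with a skip scan filter over its batch of keys) instead of one get RPC per row. A hypothetical illustration in plain Java — string row keys and region start keys stand in for HBase's byte[] boundaries, and none of these names are Phoenix APIs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class BatchedIndexJoin {
    // Group buffered data row keys by the region they fall into. Each
    // region's batch can then drive a single skip-scan over that region
    // (optionally one thread per region), replacing per-row get RPCs.
    static Map<String, List<String>> groupByRegion(List<String> rowKeys,
                                                   List<String> regionStartKeys) {
        TreeSet<String> starts = new TreeSet<>(regionStartKeys);
        Map<String, List<String>> batches = new TreeMap<>();
        for (String key : rowKeys) {
            // The owning region is the one with the greatest start key <= row key
            // (the first region conventionally has the empty start key).
            String region = starts.floor(key);
            batches.computeIfAbsent(region, r -> new ArrayList<>()).add(key);
        }
        return batches;
    }
}
```

From here, each batch could be submitted to an executor that issues one scan per region in parallel; with N rows spread over R regions, the RPC count drops from N gets to at most R scans.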
[jira] [Updated] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR updated PHOENIX-6458: --- Fix Version/s: 4.17.0 5.2.0 5.1.3
[jira] [Assigned] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR reassigned PHOENIX-6458: -- Assignee: Kadir OZDEMIR (was: Lars Hofhansl)
[jira] [Assigned] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir OZDEMIR reassigned PHOENIX-6458: -- Assignee: Kadir OZDEMIR
[jira] [Updated] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6458: --- Attachment: PHOENIX-6458.master.002.patch
[jira] [Updated] (PHOENIX-6458) Using global indexes for queries with uncovered columns
[ https://issues.apache.org/jira/browse/PHOENIX-6458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kadir Ozdemir updated PHOENIX-6458: --- Description: (updated to the full description quoted in the first PHOENIX-6458 notification above) was: Phoenix client does not use a global index for queries on columns that are not covered by the global index. However, there are many cases where using the global index to map secondary keys to primary keys and then retrieving the corresponding rows from the data table results in faster queries. Such an improvement is expected when the index row key prefix length is greater than the data row key prefix length for a given query.