[ https://issues.apache.org/jira/browse/PHOENIX-2667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141480#comment-15141480 ]
Enis Soztutar commented on PHOENIX-2667: ---------------------------------------- bq. Does it make a difference that index writes are done at a higher priority than data writes? I think priority will not help here, since we are blocked on the region lock, RPC is not involved. bq. I don't understand why we don't have this problem with HRegion#checkAndMutateRow() since that takes a lock twice by calling startRegionOperation() twice. Region's RWLock is a reentrant lock which means that the same thread can acquire the lock multiple times. Since checkAndMutate() and other ops are all handled by the handler thread, this does not cause an issue. However, for the index updates, the handler thread forks of the work to the IndexBuilder thread pool, and it is another thread that wants to acquire the region's read lock at that time. Thus locks reentrant semantics do not let the re-acquisition of the lock from the other thread. bq. Anyone else who read region state on update form a CP should have this same problem, yes? I think because of re-entrant locks, if you do not do the work in a different thread than the handler thread, CP code should be fine. bq. Do you have a proposed solution? I did not do a deep analysis. We can break up the reads for the index work, and do the "read from region" bits from the handler thread, and do the rest from the thread pool. wdyt? > Race condition between IndexBuilder and Split for region lock > ------------------------------------------------------------- > > Key: PHOENIX-2667 > URL: https://issues.apache.org/jira/browse/PHOENIX-2667 > Project: Phoenix > Issue Type: Bug > Reporter: Enis Soztutar > > In a production cluster, we have seen a condition where the split did not > finish for 30+ minutes. Also due to this, no request was being serviced in > this time frame affectively making the region offline. > The jstack provides 3 types of threads waiting on the regions read or write > locks. > First, the handlers are all blocked on trying to acquire the read lock on the > region in multi(), most of the handlers are like this: > {code} > Thread 2328: (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information > may be imprecise) > - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) > @bci=20, line=226 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, > long) @bci=122, line=1033 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, > long) @bci=25, line=1326 (Compiled frame) > - java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(long, > java.util.concurrent.TimeUnit) @bci=10, line=873 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.lock(java.util.concurrent.locks.Lock, > int) @bci=27, line=7754 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.lock(java.util.concurrent.locks.Lock) > @bci=3, line=7741 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(org.apache.hadoop.hbase.regionserver.Region$Operation) > @bci=211, line=7650 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.regionserver.HRegion$BatchOperationInProgress) > @bci=21, line=2803 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(org.apache.hadoop.hbase.client.Mutation[], > long, long) @bci=12, line=2760 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$RegionActionResult$Builder, > org.apache.hadoop.hbase.regionserver.Region, org.apache. > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(org.apache.hadoop.hbase.regionserver.Region, > org.apache.hadoop.hbase.quotas.OperationQuota, > org.apache.hadoop.hbase.protobuf > - > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(com.google.protobuf.RpcController, > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$MultiRequest) > @bci=407, line=2032 (Compiled frame) > - > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.RpcController, com.google.protobuf.Messa > - > org.apache.hadoop.hbase.ipc.RpcServer.call(com.google.protobuf.BlockingService, > com.google.protobuf.Descriptors$MethodDescriptor, > com.google.protobuf.Message, org.apache.hadoop.hbase.CellScanner, long, > - org.apache.hadoop.hbase.ipc.CallRunner.run() @bci=345, line=101 (Compiled > frame) > - > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(java.util.concurrent.BlockingQueue) > @bci=54, line=130 (Compiled frame) > - org.apache.hadoop.hbase.ipc.RpcExecutor$1.run() @bci=20, line=107 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > second, the IndexBuilder threads from Phoenix index are also blocked waiting > on the region read locks: > {code} > Thread 17566: (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information > may be imprecise) > - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) > @bci=20, line=226 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(int, > long) @bci=122, line=1033 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(int, > long) @bci=25, line=1326 (Compiled frame) > - java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(long, > java.util.concurrent.TimeUnit) @bci=10, line=873 (Compiled frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.lock(java.util.concurrent.locks.Lock, > int) @bci=27, line=7754 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.lock(java.util.concurrent.locks.Lock) > @bci=3, line=7741 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(org.apache.hadoop.hbase.regionserver.Region$Operation) > @bci=211, line=7650 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(org.apache.hadoop.hbase.client.Scan, > java.util.List) @bci=4, line=2484 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(org.apache.hadoop.hbase.client.Scan) > @bci=3, line=2479 (Compiled frame) > - > org.apache.phoenix.hbase.index.covered.data.LocalTable.getCurrentRowState(org.apache.hadoop.hbase.client.Mutation, > java.util.Collection) @bci=43, line=63 (Compiled frame) > - > org.apache.phoenix.hbase.index.covered.LocalTableState.ensureLocalStateInitialized(java.util.Collection) > @bci=29, line=158 (Compiled frame) > - > org.apache.phoenix.hbase.index.covered.LocalTableState.getIndexedColumnsTableState(java.util.Collection) > @bci=2, line=126 (Compiled frame) > - > org.apache.phoenix.index.PhoenixIndexCodec.getIndexUpdates(org.apache.phoenix.hbase.index.covered.TableState, > boolean) @bci=213, line=162 (Compiled frame) > - > org.apache.phoenix.index.PhoenixIndexCodec.getIndexDeletes(org.apache.phoenix.hbase.index.covered.TableState) > @bci=3, line=120 (Compiled frame) > - > org.apache.phoenix.hbase.index.covered.CoveredColumnsIndexBuilder.addDeleteUpdatesToMap(org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager, > org.apache.phoenix.hbase.index.covered.LocalTab > - > org.apache.phoenix.hbase.index.covered.CoveredColumnsIndexBuilder.addCleanupForCurrentBatch(org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager, > long, org.apache.phoenix.hbase.index.covere > - > org.apache.phoenix.hbase.index.covered.CoveredColumnsIndexBuilder.addMutationsForBatch(org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager, > org.apache.phoenix.hbase.index.covered.Batch, or > - > org.apache.phoenix.hbase.index.covered.CoveredColumnsIndexBuilder.batchMutationAndAddUpdates(org.apache.phoenix.hbase.index.covered.update.IndexUpdateManager, > org.apache.hadoop.hbase.client.Mutation) @ > - > org.apache.phoenix.hbase.index.covered.CoveredColumnsIndexBuilder.getIndexUpdate(org.apache.hadoop.hbase.client.Mutation) > @bci=11, line=99 (Compiled frame) > - org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call() @bci=11, > line=133 (Compiled frame) > - org.apache.phoenix.hbase.index.builder.IndexBuildManager$1.call() @bci=1, > line=129 (Compiled frame) > - java.util.concurrent.FutureTask.run() @bci=42, line=262 (Compiled frame) > - > java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) > @bci=95, line=1145 (Compiled frame) > - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > Finally, the splitting of the region is being executed, which is trying to > close the region. This operation acquires the regions write lock, but it is > also blocked: > {code} > Thread 18417: (state = BLOCKED) > - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information > may be imprecise) > - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, > line=186 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() > @bci=1, line=834 (Compiled frame) > - > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node, > int) @bci=67, line=867 (Compiled frame) > - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) > @bci=17, line=1197 (Compiled frame) > - java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock() @bci=5, > line=945 (Interpreted frame) > - org.apache.hadoop.hbase.regionserver.HRegion.doClose(boolean, > org.apache.hadoop.hbase.monitoring.MonitoredTask) @bci=278, line=1401 > (Interpreted frame) > - org.apache.hadoop.hbase.regionserver.HRegion.close(boolean) @bci=62, > line=1349 (Interpreted frame) > - > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(org.apache.hadoop.hbase.Server, > org.apache.hadoop.hbase.regionserver.RegionServerServices, boolean) > @bci=238, line=394 (Interpr > - > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(org.apache.hadoop.hbase.Server, > org.apache.hadoop.hbase.regionserver.RegionServerServices, > org.apache.hadoop.hbase.security.Use > - > org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(org.apache.hadoop.hbase.Server, > org.apache.hadoop.hbase.regionserver.RegionServerServices, > org.apache.hadoop.hbase.security.User) @bci= > - > org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(org.apache.hadoop.hbase.security.User) > @bci=357, line=82 (Interpreted frame) > - org.apache.hadoop.hbase.regionserver.SplitRequest.run() @bci=74, line=154 > (Interpreted frame) > - > java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) > @bci=95, line=1145 (Compiled frame) > - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 > (Interpreted frame) > - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame) > {code} > This looks like a race between the split thread and the global index updates > racing for the region lock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)