[
https://issues.apache.org/jira/browse/PHOENIX-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400125#comment-15400125
]
Enis Soztutar commented on PHOENIX-3111:
----------------------------------------
PHOENIX-2667 is about a regular index update being computed as part of
processing a multi() request. This issue is about a scan reading from and
writing to the same region, so they are different issues.
bq. blocking a split from occurring (which is one way a write lock can be
acquired)
Blocking splits while we are building the local index updates makes sense to
me regardless of this issue. Local indexes are by definition built
once-per-region. If the region wants to split in the meantime, we have to
restart and redo all the work. This can result in multiple splits happening
and the local index build taking a very long time. Instead, in this approach,
we disable splits until the index build is done. Once it is over, if the
region is really larger than the split point, it will be split once by
regular means and the local indexes will be re-written once. We were actually
talking about pre-splitting some regions before the local index build process
starts, so that we don't have to re-write the data once local indexes are
built. Each local index can increase the data size in the region by up to 100%.
bq. Or are you brainstorming to figure out if HBase can help with this issue
Some of these issues are caused by the fact that we are piggy-backing on
operations that are not intended for this purpose. For example,
doMiniBatchMutation() is supposed to be a write-only operation, but we are
doing reads/scans from there. The local index rebuild uses scans, which are
read-only, but we are doing batchMutate() there. For example, the
hbase-client can deal with RegionTooBusyExceptions being thrown from write
RPCs, but bubbles up the exception if scan requests throw it. If we can come
up with a longer-term plan to negotiate these kinds of semantics between
hbase and phoenix, that would be great. Unfortunately I don't have an
immediate proposal; I have to think about it more.
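Until such semantics exist, one workaround on the Phoenix side would be an
application-level retry around the scan path, mirroring what the hbase-client
already does for writes. The sketch below is hypothetical (the class and
method names are made up, and a stand-in exception is used instead of the
real org.apache.hadoop.hbase.RegionTooBusyException so it runs stand-alone):

```java
import java.util.concurrent.Callable;

// Sketch of an application-level retry wrapper for scan calls, since the
// hbase-client retries RegionTooBusyException on writes but bubbles it up
// from scans. RegionTooBusyLikeException is a stand-in for the real
// org.apache.hadoop.hbase.RegionTooBusyException.
public class ScanRetrySketch {
    static class RegionTooBusyLikeException extends RuntimeException {}

    static <T> T scanWithRetry(Callable<T> scan, int maxAttempts, long backoffMillis)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return scan.call();
            } catch (RegionTooBusyLikeException e) {
                // The region is blocked (e.g. memstore above the blocking
                // threshold); back off and retry instead of failing the query.
                if (attempt >= maxAttempts) throw e;
                Thread.sleep(backoffMillis * attempt); // linear backoff
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2}; // fail the first two attempts, then succeed
        String result = scanWithRetry(() -> {
            if (failures[0]-- > 0) throw new RegionTooBusyLikeException();
            return "scan-ok";
        }, 5, 10);
        System.out.println(result);
    }
}
```

This does not fix the underlying lock interplay, it only keeps the retries on
the client side instead of failing the query outright.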
> Possible Deadlock/delay while building index, upsert select, delete rows at
> server
> ----------------------------------------------------------------------------------
>
> Key: PHOENIX-3111
> URL: https://issues.apache.org/jira/browse/PHOENIX-3111
> Project: Phoenix
> Issue Type: Bug
> Reporter: Sergio Peleato
> Assignee: Rajeshbabu Chintaguntla
> Priority: Critical
> Fix For: 4.8.0
>
> Attachments: PHOENIX-3111.patch
>
>
> There is a possible deadlock, or at least a long delay, while building a
> local index or running upsert select or delete at the server. The situation
> can arise as follows.
> These queries scan mutations from a table and write back to the same table.
> While doing so, the memstore may reach the blocking memstore size threshold,
> in which case RegionTooBusyException is thrown back to the client and the
> queries retry their scans.
> Take the local index build case: we first scan the data table, prepare the
> index mutations, and write them back to the same table. The memstore can
> fill up, in which case we try to flush the region. But if a split happens in
> between, the split waits for the region's write lock in order to close it,
> and the flush waits for the read lock because the write lock request is
> ahead of it in the queue; neither can proceed until the local index build
> completes. The local index build in turn cannot complete because writes are
> blocked until there is a flush. This might not be a complete deadlock, but
> the queries can take a very long time to complete in these cases.
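The lock interplay above can be reproduced with a plain
java.util.concurrent.locks.ReentrantReadWriteLock (the same primitive the
stack traces below show HRegion parking on). This is a minimal stand-alone
sketch, with thread roles standing in for the index build, the split, and the
flush; it is not HBase code:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SplitFlushStallSketch {
    public static void main(String[] args) throws Exception {
        ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

        // "Index build": holds the region's read lock while it scans and
        // writes back to the same region.
        regionLock.readLock().lock();

        // "Split": requests the write lock (as HRegion.doClose does) and
        // parks in the lock queue behind the active reader.
        Thread split = new Thread(() -> {
            regionLock.writeLock().lock();
            regionLock.writeLock().unlock();
        });
        split.start();
        while (!regionLock.hasQueuedThreads()) {
            Thread.sleep(10); // wait until the writer is parked in the queue
        }

        // "Flush": a new read-lock request queues behind the waiting writer,
        // so the flush cannot run even though only readers hold the lock.
        boolean flushGotLock =
                regionLock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
        System.out.println("flush acquired read lock: " + flushGotLock);

        regionLock.readLock().unlock(); // index build finishes; queue unwinds
        split.join();
    }
}
```

The flush's read-lock attempt fails while the writer is queued, which is
exactly why the MemStoreFlusher thread below is parked even though the region
is only read-locked.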
> {noformat}
> "regionserver//192.168.0.53:16201-splits-1469165876186" #269 prio=5
> os_prio=31 tid=0x00007f7fb2050800 nid=0x1c033 waiting on condition
> [0x0000000139b68000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000006ede72550> (a
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
> at
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1422)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1370)
> - locked <0x00000006ede69d00> (a java.lang.Object)
> at
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:394)
> at
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:278)
> at
> org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:561)
> at
> org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
> at
> org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:154)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Locked ownable synchronizers:
> - <0x00000006ee132098> (a
> java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
> {noformat}
> "MemStoreFlusher.0" #170 prio=5 os_prio=31 tid=0x00007f7fb6842000 nid=0x19303
> waiting on condition [0x00000001388e9000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00000006ede72550> (a
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
> at
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
> at
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1986)
> at
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1950)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
> at
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> As a fix, we need to block region splits while an index build, upsert
> select, or server-side delete is running.
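Conceptually, the fix amounts to a gate that vetoes splits while any such
server-side operation is in flight. The following is a hypothetical sketch of
that idea only (the class and method names are invented; this is not the code
in PHOENIX-3111.patch):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the split-blocking idea: count in-flight
// server-side operations (index build, upsert select, delete) and veto
// splits while any are running.
public class SplitGateSketch {
    private final AtomicInteger inFlightOps = new AtomicInteger();

    public void operationStarted()  { inFlightOps.incrementAndGet(); }
    public void operationFinished() { inFlightOps.decrementAndGet(); }

    // A split policy would consult this before deciding to split the region;
    // a vetoed split is simply attempted again later by regular means.
    public boolean splitAllowed() { return inFlightOps.get() == 0; }

    public static void main(String[] args) {
        SplitGateSketch gate = new SplitGateSketch();
        System.out.println(gate.splitAllowed()); // nothing running: true
        gate.operationStarted();                 // index build begins
        System.out.println(gate.splitAllowed()); // split is vetoed: false
        gate.operationFinished();                // index build completes
        System.out.println(gate.splitAllowed()); // region may split: true
    }
}
```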
> Thanks [~sergey.soldatov] for the help in understanding and analyzing the
> bug, and [~speleato] for finding it.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)