[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362989#comment-16362989
 ] 

Umesh Agashe commented on HBASE-19988:
--------------------------------------

It was logging the following exception... several times!
{code:java}
2018-02-10 04:24:25,503 WARN [PutThread] regionserver.HRegion(5636): Thread interrupted waiting for lock on row: row0
2018-02-10 04:24:25,503 WARN [PutThread] regionserver.HRegion$BatchOperation(3173): Failed getting lock, row=row0
java.io.InterruptedIOException
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5637)
    at org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lockRowsAndBuildMiniBatch(HRegion.java:3168)
    at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3837)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3810)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3741)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3732)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3746)
    at org.apache.hadoop.hbase.regionserver.HRegion.doBatchMutate(HRegion.java:4074)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2925)
    at org.apache.hadoop.hbase.regionserver.TestHRegion$PutThread.run(TestHRegion.java:3891)
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5621)
    ... 9 more{code}
 

There is a loop in the write batch path:
{code:java}
while (!batchOp.isDone()) {
  doMiniBatchMutate(batchOp);
}{code}
 

Essentially, this loop tries to acquire locks on as many rows in a batch as 
possible and creates a mini-batch of those rows to write. The next iteration 
resumes acquiring locks from the row at which the previous iteration failed to 
acquire a lock, and so on until the entire batch is written.
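
As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. The class MiniBatchSketch, its methods and the tryLock predicate are hypothetical stand-ins and not HBase code; the sketch only models "take rows until the first lock failure, then resume from that row".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the mini-batch loop; not the HRegion implementation.
final class MiniBatchSketch {
  /**
   * Collects consecutive rows starting at 'from' until a lock cannot be taken.
   * Returns the rows of the mini-batch (possibly empty).
   */
  static List<String> nextMiniBatch(List<String> rows, int from, Predicate<String> tryLock) {
    List<String> miniBatch = new ArrayList<>();
    for (int i = from; i < rows.size(); i++) {
      if (!tryLock.test(rows.get(i))) {
        break; // stop here; the next iteration resumes from this row
      }
      miniBatch.add(rows.get(i));
    }
    return miniBatch;
  }

  /** Drives the loop until every row is written; returns the mini-batches in order. */
  static List<List<String>> writeAll(List<String> rows, Predicate<String> tryLock) {
    List<List<String>> written = new ArrayList<>();
    int next = 0;
    while (next < rows.size()) { // plays the role of !batchOp.isDone()
      List<String> miniBatch = nextMiniBatch(rows, next, tryLock);
      next += miniBatch.size();
      if (!miniBatch.isEmpty()) {
        written.add(miniBatch);
      }
    }
    return written;
  }

  public static void main(String[] args) {
    // "row2" is contended: its first lock attempt fails, the retry succeeds.
    int[] attemptsOnRow2 = {0};
    Predicate<String> tryLock = row -> !row.equals("row2") || ++attemptsOnRow2[0] > 1;
    System.out.println(writeAll(List.of("row0", "row1", "row2", "row3"), tryLock));
  }
}
```

With the contended row2 in this example, the batch is written as two mini-batches: [row0, row1] and then [row2, row3]. Note that the sketch, like the loop it models, makes no progress for as long as the lock on the next row keeps failing.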

The operation was aborted only on a timeout exception. All other exceptions 
were logged and ignored, and the loop resumed creating and writing mini-batches 
for the input batch.

In this particular case, getRowLockInternal() was failing with an 
InterruptedIOException caused by surefire (possibly due to a test timeout). 
This exception was ignored so that the write could proceed with the rows locked 
so far. As a result, doMiniBatchMutate() was called continuously in a loop, 
filling up the logs.
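
To make the failure mode concrete, here is a small, hedged sketch. InterruptedLoopSketch, lockRow() and the rethrowInterrupt flag are hypothetical names and not HBase APIs. The sketch assumes the thread's interrupt status stays set across iterations, which would explain why every retry fails immediately; that detail is an assumption, not something stated above. Rethrowing on interrupt is shown only as one possible way to stop the spin and is not a description of the attached patch.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical model of the failure mode; names are illustrative, not HBase APIs.
final class InterruptedLoopSketch {
  /** Stand-in for a row-lock attempt that fails because the thread was interrupted. */
  static void lockRow(String row) throws IOException {
    // isInterrupted() does not clear the flag, so the status stays set (assumption).
    if (Thread.currentThread().isInterrupted()) {
      throw new InterruptedIOException("interrupted waiting for lock on row: " + row);
    }
  }

  /**
   * Runs the retry loop for a single row and returns how many warnings were logged.
   * maxIterations only bounds the demo; the loop described above had no such bound.
   */
  static int run(boolean rethrowInterrupt, int maxIterations) throws IOException {
    int warnings = 0;
    boolean done = false;
    for (int i = 0; i < maxIterations && !done; i++) {
      try {
        lockRow("row0");
        done = true; // lock acquired, the row would be written here
      } catch (InterruptedIOException e) {
        if (rethrowInterrupt) {
          throw e; // give up: in this sketch the flag stays set, so retrying cannot succeed
        }
        warnings++; // ignore-and-retry: count a "Failed getting lock" warning, go around again
      }
    }
    return warnings;
  }

  public static void main(String[] args) throws IOException {
    Thread.currentThread().interrupt(); // simulate the test harness interrupting the thread
    try {
      System.out.println("warnings when ignoring: " + run(false, 1000));
      run(true, 1000);
    } catch (InterruptedIOException e) {
      System.out.println("aborted after first failure: " + e.getMessage());
    } finally {
      Thread.interrupted(); // clear the flag before exiting
    }
  }
}
```

In the ignore-and-retry mode every iteration produces one warning, so the count is limited only by the artificial maxIterations bound; with the rethrow, the loop exits on the first failure.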

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19988
>                 URL: https://issues.apache.org/jira/browse/HBASE-19988
>             Project: HBase
>          Issue Type: Improvement
>          Components: amv2
>    Affects Versions: 2.0.0-beta-1
>            Reporter: Umesh Agashe
>            Assignee: Umesh Agashe
>            Priority: Minor
>             Fix For: 2.0.0-beta-2
>
>         Attachments: hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
