[
https://issues.apache.org/jira/browse/HBASE-13971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629081#comment-14629081
]
Hadoop QA commented on HBASE-13971:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12745550/13971-v1.txt
against master branch at commit 5315f0f11ffa0f750e5615617424baa9271611af.
ATTACHMENT ID: 12745550
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop versions{color}. The patch compiles with all
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0)
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the
total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the
total number of checkstyle errors.
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/14791//testReport/
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/14791//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/14791//artifact/patchprocess/checkstyle-aggregate.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/14791//console
This message is automatically generated.
> Flushes stuck since 6 hours on a regionserver.
> ----------------------------------------------
>
> Key: HBASE-13971
> URL: https://issues.apache.org/jira/browse/HBASE-13971
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 1.3.0
> Environment: Occurred while running IntegrationTestLoadAndVerify for 20M
> rows on a cluster with 32 region servers, each with a max heap size of 24 GB.
> Reporter: Abhilash
> Assignee: Ted Yu
> Priority: Critical
> Attachments: 13971-v1.txt, 13971-v1.txt, jstack.1, jstack.2,
> jstack.3, jstack.4, jstack.5, rsDebugDump.txt, screenshot-1.png
>
>
> One region server is stuck while flushing (possible deadlock). It has been
> trying to flush two regions for the last 6 hours (see the screenshot).
> This occurred while running IntegrationTestLoadAndVerify for 20M rows with 600
> mapper jobs and 100 back references. There have been ~37 million writes on
> each regionserver so far, but no writes have happened on any regionserver for
> the past 6 hours, and their memstore size is zero (I don't know if this is
> related). This particular regionserver, however, has had a memstore size of
> 9 GB for the past 6 hours.
> Relevant snippets from the debug dump:
> Tasks:
> ===========================================================
> Task: Flushing
> IntegrationTestLoadAndVerify,R\x9B\x1B\xBF\xAE\x08\xD1\xA2,1435179555993.8e2d075f94ce7699f416ec4ced9873cd.
> Status: RUNNING:Preparing to flush by snapshotting stores in
> 8e2d075f94ce7699f416ec4ced9873cd
> Running for 22034s
> Task: Flushing
> IntegrationTestLoadAndVerify,\x93\xA385\x81Z\x11\xE6,1435179555993.9f8d0e01a40405b835bf6e5a22a86390.
> Status: RUNNING:Preparing to flush by snapshotting stores in
> 9f8d0e01a40405b835bf6e5a22a86390
> Running for 22033s
> Executors:
> ===========================================================
> ...
> Thread 139 (MemStoreFlusher.1):
> State: WAITING
> Blocked count: 139711
> Waited count: 239212
> Waiting on java.util.concurrent.CountDownLatch$Sync@b9c094a
> Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
> org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
> org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> java.lang.Thread.run(Thread.java:745)
> Thread 137 (MemStoreFlusher.0):
> State: WAITING
> Blocked count: 138931
> Waited count: 237448
> Waiting on java.util.concurrent.CountDownLatch$Sync@53f41f76
> Stack:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> org.apache.hadoop.hbase.wal.WALKey.getSequenceId(WALKey.java:305)
> org.apache.hadoop.hbase.regionserver.HRegion.getNextSequenceId(HRegion.java:2422)
> org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2168)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2047)
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2011)
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1902)
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1828)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$900(MemStoreFlusher.java:75)
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
> java.lang.Thread.run(Thread.java:745)
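
Both flusher threads in the dump above are parked in CountDownLatch.await(), called from WALKey.getSequenceId. The sketch below is not HBase code; it only illustrates, under assumptions, how a thread that waits on a latch without a timeout stays parked for as long as nobody counts the latch down, and how a timed wait lets the caller notice the stall instead. The class SequenceIdLatchSketch, the nested PendingKey, and the -1 "not assigned" marker are all hypothetical names invented for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class SequenceIdLatchSketch {

    /** Hypothetical stand-in for a key whose sequence id is assigned later by another thread. */
    static final class PendingKey {
        private final CountDownLatch assigned = new CountDownLatch(1);
        private volatile long sequenceId = -1L;

        /** Called by the (hypothetical) writer once the id is known. */
        void assign(long id) {
            sequenceId = id;
            assigned.countDown();
        }

        /** Blocks with no time limit; if assign() is never called, the caller stays parked. */
        long getSequenceId() throws InterruptedException {
            assigned.await();
            return sequenceId;
        }

        /** Bounded variant: returns -1 if the id was not assigned within the timeout. */
        long getSequenceId(long timeout, TimeUnit unit) throws InterruptedException {
            return assigned.await(timeout, unit) ? sequenceId : -1L;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PendingKey key = new PendingKey();

        // This thread plays the role of a flusher waiting for the sequence id.
        Thread flusher = new Thread(() -> {
            try {
                System.out.println("flusher got sequence id " + key.getSequenceId());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "MemStoreFlusher.sketch");
        flusher.start();

        // Give the flusher time to park, then inspect it the way a thread dump would.
        Thread.sleep(200);
        System.out.println("flusher state while unassigned: " + flusher.getState());

        // A bounded wait reports the stall to the caller instead of hanging.
        System.out.println("timed wait result: " + key.getSequenceId(100, TimeUnit.MILLISECONDS));

        // Only an explicit countDown() releases the parked thread.
        key.assign(42L);
        flusher.join();
    }
}
```

In this sketch the unbounded wait can only end through assign() or an interrupt. That matches the shape of the stack traces above, but it does not establish why the latch was never released in the reported case.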