[jira] [Commented] (HBASE-10871) Indefinite OPEN/CLOSE wait on busy RegionServers

Hadoop QA (JIRA) Thu, 12 Jun 2014 21:57:18 -0700

    [ https://issues.apache.org/jira/browse/HBASE-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030260#comment-14030260 ]


Hadoop QA commented on HBASE-10871:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12650212/HBASE-10871.v1.patch
  against trunk revision .
  ATTACHMENT ID: 12650212

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100 characters.

    {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:red}-1 core tests{color}.  The patch failed these unit tests:
                        org.apache.hadoop.hbase.TestAcidGuarantees

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/9761//console

This message is automatically generated.

> Indefinite OPEN/CLOSE wait on busy RegionServers
> ------------------------------------------------
>
>                 Key: HBASE-10871
>                 URL: https://issues.apache.org/jira/browse/HBASE-10871
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, master, Region Assignment
>    Affects Versions: 0.94.6
>            Reporter: Harsh J
>            Assignee: Esteban Gutierrez
>         Attachments: HBASE-10871-0.94.v1.patch, HBASE-10871.v0.patch, 
> HBASE-10871.v1.patch
>
>
> We observed a case where a specific RS was bombarded by a large volume of 
> regular requests, which spiked and filled up its RPC queue. The unassigns and 
> assigns that the balancer invoked for regions involving this server then 
> entered an indefinite retry loop.
> The regions began waiting in PENDING_CLOSE/PENDING_OPEN states indefinitely 
> because the HBase client RPC from the ServerManager at the master was running 
> into SocketTimeouts. This made the affected regions on the server 
> unavailable. The timeout monitor retry default of 30m in 0.94's AM widened 
> the waiting gap further (this is now 10m in 0.95+'s new AM, which also has 
> further retries before we get there, which is good).
> Is there a way to improve this situation generally? PENDING_OPENs may be easy 
> to handle: we can switch them out and move them elsewhere. PENDING_CLOSEs may 
> be trickier, but there should at least be a way to "give up" permanently on a 
> movement plan and let things be for a while, hoping the RS recovers on its 
> own (so that clients also have a chance of getting things to work in the 
> meantime).
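
The "give up" idea in the description can be illustrated with a small, self-contained Java sketch. It is not taken from any of the attached patches and does not use HBase APIs: the class BoundedTransitionRetry, the Outcome and Transition enums, and the attempt() method are all hypothetical names chosen for illustration. It assumes the master-side RPC surfaces a java.net.SocketTimeoutException when the RegionServer's RPC queue is full, retries a bounded number of times, and then either hands an OPEN to another server or abandons the plan for a CLOSE.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

// Illustrative only: hypothetical names, no HBase classes involved.
public class BoundedTransitionRetry {

    /** What the caller should do once the attempts are exhausted. */
    public enum Outcome { COMPLETED, REASSIGN_ELSEWHERE, PLAN_ABANDONED }

    /** The two kinds of region movement discussed in the issue. */
    public enum Transition { OPEN, CLOSE }

    /**
     * Runs the given RPC up to maxAttempts times, sleeping a little longer
     * after each timeout. Only SocketTimeoutException is retried; any other
     * exception propagates to the caller.
     */
    public static Outcome attempt(Transition kind, Callable<Void> rpc,
                                  int maxAttempts, long backoffMillis) throws Exception {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                rpc.call();
                return Outcome.COMPLETED;
            } catch (SocketTimeoutException e) {
                if (i < maxAttempts) {
                    Thread.sleep(backoffMillis * i); // linear backoff between attempts
                }
            }
        }
        // Attempts exhausted: a pending OPEN can be moved to another server,
        // while a pending CLOSE is dropped so the region stays where it is.
        return kind == Transition.OPEN ? Outcome.REASSIGN_ELSEWHERE : Outcome.PLAN_ABANDONED;
    }

    public static void main(String[] args) throws Exception {
        // Simulates a RegionServer whose RPC queue never drains.
        Callable<Void> alwaysTimesOut = () -> {
            throw new SocketTimeoutException("RPC queue full");
        };
        System.out.println(attempt(Transition.OPEN, alwaysTimesOut, 3, 10L));  // REASSIGN_ELSEWHERE
        System.out.println(attempt(Transition.CLOSE, alwaysTimesOut, 3, 10L)); // PLAN_ABANDONED
    }
}
```

The point of the sketch is only that the retry loop has an explicit exit with a per-transition fallback, instead of waiting on a timeout monitor indefinitely. How a real AssignmentManager would track and act on those outcomes is outside what this example covers.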



--
This message was sent by Atlassian JIRA
(v6.2#6252)
