[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763212#comment-13763212
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in HBase-TRUNK #4485 (See 
[https://builds.apache.org/job/HBase-TRUNK/4485/])
HBASE-9451  Meta remains unassigned when the meta server crashes with the 
ClusterStatusListener set (nkeywal: rev 1521513)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


 Meta remains unassigned when the meta server crashes with the 
 ClusterStatusListener set
 ---

 Key: HBASE-9451
 URL: https://issues.apache.org/jira/browse/HBASE-9451
 Project: HBase
  Issue Type: Bug
Reporter: Devaraj Das
Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.96.0

 Attachments: 9451.v1.patch


 While running the tests described in HBASE-9338, I ran into this problem. The 
 hbase.status.listener.class was set to 
 org.apache.hadoop.hbase.client.ClusterStatusListener$MultiCastListener.
 1. The server hosting meta went down.
 2. The metaSSH was triggered. The call chain:
 2.1 verifyAndAssignMetaWithRetries
 2.2 verifyMetaRegionLocation
 2.3 waitForMetaServerConnection
 2.4 getMetaServerConnection
 2.5 getCachedConnection
 2.6 HConnectionManager.getAdmin(serverName, false)
 2.7 isDeadServer(serverName) - This is hardcoded to return 'false' when 
 the clusterStatusListener field is null. If clusterStatusListener is not null 
 (as in my test), it can return true in certain cases (and in this case it 
 should indeed return true, since the server is down). I am trying to 
 understand why it is hardcoded to 'false' for the former case.
 3. When isDeadServer returns true, 
 HConnectionManager.getAdmin(ServerName, boolean) throws 
 RegionServerStoppedException.
 4. Finally, once the retries are exhausted, verifyAndAssignMetaWithRetries 
 gives up and the master aborts.
 The methods in the above call chain don't handle 
 RegionServerStoppedException. Maybe something to look at...
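
The failure mode in steps 2 to 4 can be illustrated with a small, self-contained sketch. This is not the HBase code and not the attached patch: ServerStoppedException, MetaVerifySketch, the handleStopped flag and the string results are all hypothetical names invented for the illustration. It only shows why an unhandled "server stopped" exception makes every retry fail the same way, and how treating that exception as "the meta location is invalid" would let reassignment proceed.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in for RegionServerStoppedException; not the HBase class.
class ServerStoppedException extends Exception {
    ServerStoppedException(String msg) { super(msg); }
}

class MetaVerifySketch {
    // Models steps 2.6/2.7: refuse to hand out a connection to a server known to be dead.
    static String getAdmin(String serverName, Set<String> deadServers)
            throws ServerStoppedException {
        if (deadServers.contains(serverName)) {
            throw new ServerStoppedException(serverName + " is dead");
        }
        return "admin@" + serverName;
    }

    // Returns true if meta can be reached at serverName, false if the location is invalid.
    // With handleStopped == false the exception propagates, as in the reported call chain.
    static boolean verifyMetaLocation(String serverName, Set<String> deadServers,
                                      boolean handleStopped) throws ServerStoppedException {
        try {
            getAdmin(serverName, deadServers);
            return true;
        } catch (ServerStoppedException e) {
            if (handleStopped) {
                return false; // a stopped server cannot host meta, so meta must be reassigned
            }
            throw e;
        }
    }

    // Models the retry loop: the dead-server answer never changes, so retries cannot help.
    static String verifyAndAssign(String serverName, Set<String> deadServers,
                                  boolean handleStopped, int retries) {
        for (int attempt = 0; attempt < retries; attempt++) {
            try {
                return verifyMetaLocation(serverName, deadServers, handleStopped)
                        ? "VERIFIED" : "REASSIGN";
            } catch (ServerStoppedException e) {
                // Unhandled case: fall through and retry with the same outcome.
            }
        }
        return "ABORT"; // retries exhausted
    }

    public static void main(String[] args) {
        Set<String> dead = Collections.singleton("rs1");
        System.out.println(verifyAndAssign("rs1", dead, false, 3)); // prints ABORT
        System.out.println(verifyAndAssign("rs1", dead, true, 3));  // prints REASSIGN
    }
}
```

In the sketch the only difference between the two runs is whether the exception is translated into "location invalid" before it reaches the retry loop.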

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763306#comment-13763306
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in hbase-0.96 #29 (See 
[https://builds.apache.org/job/hbase-0.96/29/])
HBASE-9451  Meta remains unassigned when the meta server crashes with the 
ClusterStatusListener set (nkeywal: rev 1521526)
* 
/hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java




[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763591#comment-13763591
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in hbase-0.96-hadoop2 #16 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/16/])
HBASE-9451  Meta remains unassigned when the meta server crashes with the 
ClusterStatusListener set (nkeywal: rev 1521526)
* 
/hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java




[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763841#comment-13763841
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #721 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/721/])
HBASE-9451  Meta remains unassigned when the meta server crashes with the 
ClusterStatusListener set (nkeywal: rev 1521513)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java




[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760115#comment-13760115
 ] 

Nicolas Liochon commented on HBASE-9451:


bq. I am trying to understand why it's hardcoded to 'false' for former case.
That's because without the cluster status we cannot know whether the server 
is dead, so we assume it is up.
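
A minimal sketch of that rule, under stated assumptions: DeadServerCheckSketch and its StatusListener interface are hypothetical names for illustration and are not the HConnectionManager implementation. The point is only that a missing listener means "no information", which is mapped to "not dead".

```java
class DeadServerCheckSketch {
    // Hypothetical listener abstraction; the real ClusterStatusListener API may differ.
    interface StatusListener {
        boolean isDead(String serverName);
    }

    // Null when no status listener class is configured.
    private final StatusListener listener;

    DeadServerCheckSketch(StatusListener listener) {
        this.listener = listener;
    }

    boolean isDeadServer(String serverName) {
        if (listener == null) {
            return false; // no cluster status available: unknown, so assume the server is up
        }
        return listener.isDead(serverName);
    }

    public static void main(String[] args) {
        DeadServerCheckSketch noStatus = new DeadServerCheckSketch(null);
        DeadServerCheckSketch withStatus = new DeadServerCheckSketch(s -> s.equals("rs1"));
        System.out.println(noStatus.isDeadServer("rs1"));   // prints false
        System.out.println(withStatus.isDeadServer("rs1")); // prints true
    }
}
```

With a listener present the answer can be true, which is exactly the case where callers need to be prepared for the resulting exception.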



[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760163#comment-13760163
 ] 

Hadoop QA commented on HBASE-9451:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601807/9451.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}.  The patch failed these unit tests:
    org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//console

This message is automatically generated.


[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760384#comment-13760384
 ] 

Jimmy Xiang commented on HBASE-9451:


+1



[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760373#comment-13760373
 ] 

Hadoop QA commented on HBASE-9451:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12601807/9451.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/7069//console

This message is automatically generated.
