[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763212#comment-13763212
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in HBase-TRUNK #4485 (See [https://builds.apache.org/job/HBase-TRUNK/4485/])
HBASE-9451 Meta remains unassigned when the meta server crashes with the ClusterStatusListener set (nkeywal: rev 1521513)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java


 Meta remains unassigned when the meta server crashes with the
 ClusterStatusListener set
 ---

 Key: HBASE-9451
 URL: https://issues.apache.org/jira/browse/HBASE-9451
 Project: HBase
 Issue Type: Bug
 Reporter: Devaraj Das
 Assignee: Nicolas Liochon
 Fix For: 0.98.0, 0.96.0

 Attachments: 9451.v1.patch


 While running the tests described in HBASE-9338, I ran into this problem. The
 hbase.status.listener.class was set to
 org.apache.hadoop.hbase.client.ClusterStatusListener$MultiCastListener.
 1. I had the meta server coming down.
 2. The metaSSH got triggered. The call chain:
    2.1 verifyAndAssignMetaWithRetries
    2.2 verifyMetaRegionLocation
    2.3 waitForMetaServerConnection
    2.4 getMetaServerConnection
    2.5 getCachedConnection
    2.6 HConnectionManager.getAdmin(serverName, false)
    2.7 isDeadServer(serverName) - This is hardcoded to return 'false' when
    the clusterStatusListener field is null. If clusterStatusListener is not
    null (as in my test), then it could return true in certain cases (and in
    this case, indeed it should return true since the server is down). I am
    trying to understand why it's hardcoded to 'false' for the former case.
 3. When isDeadServer returns true, the method
 HConnectionManager.getAdmin(ServerName, boolean) throws
 RegionServerStoppedException.
 4. Finally, after the retries are over, verifyAndAssignMetaWithRetries gives
 up and the master aborts.
 The methods in the above call chain don't handle
 RegionServerStoppedException. Maybe something to look at...
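
The failure path described above can be sketched in a few lines of Java. This is an illustrative model only, not the HBase source and not the contents of 9451.v1.patch. MetaVerifySketch, ServerStoppedException, StatusListener, Outcome, the handleStopped flag and the actuallyDown set are all hypothetical stand-ins. The sketch also assumes that a plain connection failure was already treated as "meta not reachable", which is inferred from the report rather than confirmed by it.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.Set;

/** Illustrative model only; every name here is a hypothetical stand-in, not an HBase class. */
class MetaVerifySketch {

    /** Stand-in for RegionServerStoppedException. */
    static class ServerStoppedException extends IOException {
        ServerStoppedException(String message) { super(message); }
    }

    /** Stand-in for the cluster status listener: reports servers announced as dead. */
    interface StatusListener {
        boolean isDeadServer(String serverName);
    }

    enum Outcome { VERIFIED, REASSIGNED, ABORTED }

    private final StatusListener listener;   // null when no listener class is configured
    private final Set<String> actuallyDown;  // servers that would refuse a connection

    MetaVerifySketch(StatusListener listener, Set<String> actuallyDown) {
        this.listener = listener;
        this.actuallyDown = actuallyDown;
    }

    /** Step 2.7: with no listener there is no status, so the server is assumed alive. */
    boolean isDeadServer(String serverName) {
        return listener != null && listener.isDeadServer(serverName);
    }

    /** Steps 2.6 and 3: refuse to connect to a server that is known to be dead. */
    String getAdmin(String serverName) throws IOException {
        if (isDeadServer(serverName)) {
            throw new ServerStoppedException(serverName + " is dead");
        }
        if (actuallyDown.contains(serverName)) {
            throw new ConnectException("connection refused: " + serverName);
        }
        return "admin@" + serverName;
    }

    /** Steps 2.2 to 2.5: true if meta is reachable, false if it should be reassigned. */
    boolean verifyMetaRegionLocation(String serverName, boolean handleStopped) throws IOException {
        try {
            getAdmin(serverName);
            return true;
        } catch (ConnectException e) {
            return false; // assumed: connection failures already mean "not reachable"
        } catch (ServerStoppedException e) {
            if (handleStopped) {
                return false; // hypothetical handling: a stopped server is also unreachable
            }
            throw e; // reported behaviour: the exception escapes the call chain
        }
    }

    /** Steps 2.1 and 4: retry on failure, give up (master abort) once retries run out. */
    Outcome verifyAndAssignMetaWithRetries(String serverName, int retries, boolean handleStopped) {
        for (int attempt = 0; attempt < retries; attempt++) {
            try {
                return verifyMetaRegionLocation(serverName, handleStopped)
                        ? Outcome.VERIFIED : Outcome.REASSIGNED;
            } catch (IOException e) {
                // Retrying cannot help here: the server stays dead on every attempt.
            }
        }
        return Outcome.ABORTED;
    }
}
```

In this model, a dead server with no listener ends in REASSIGNED because the connection failure is understood. With the listener set and handleStopped false, every attempt throws and the result is ABORTED, which matches step 4. Setting handleStopped to true is one possible way to get REASSIGNED again. Whether the committed change takes that route is not stated in this thread.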

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763306#comment-13763306
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in hbase-0.96 #29 (See [https://builds.apache.org/job/hbase-0.96/29/])
HBASE-9451 Meta remains unassigned when the meta server crashes with the ClusterStatusListener set (nkeywal: rev 1521526)
* /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java



[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763591#comment-13763591
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in hbase-0.96-hadoop2 #16 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/16/])
HBASE-9451 Meta remains unassigned when the meta server crashes with the ClusterStatusListener set (nkeywal: rev 1521526)
* /hbase/branches/0.96/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java



[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-10 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763841#comment-13763841
 ] 

Hudson commented on HBASE-9451:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #721 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/721/])
HBASE-9451 Meta remains unassigned when the meta server crashes with the ClusterStatusListener set (nkeywal: rev 1521513)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java



[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Nicolas Liochon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760115#comment-13760115
 ] 

Nicolas Liochon commented on HBASE-9451:


bq. I am trying to understand why it's hardcoded to 'false' for former case.
That is because without the cluster status we cannot know whether the server
is dead, so we assume it is up.
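
The policy in this reply can be read as a three-state answer collapsed into a boolean. The snippet below is only a sketch of that reading: DeadServerPolicySketch, Liveness and the nullable deadServers set are hypothetical names used for illustration, not HBase code.

```java
import java.util.Set;

/** Hypothetical illustration of "no status means assume the server is up". */
class DeadServerPolicySketch {

    enum Liveness { ALIVE, DEAD, UNKNOWN }

    /** deadServers is null when no status listener is configured (an assumption of this model). */
    static Liveness liveness(Set<String> deadServers, String serverName) {
        if (deadServers == null) {
            return Liveness.UNKNOWN; // nothing has been published to us
        }
        return deadServers.contains(serverName) ? Liveness.DEAD : Liveness.ALIVE;
    }

    /** Only a positive DEAD report counts as dead; UNKNOWN is treated like ALIVE. */
    static boolean isDeadServer(Set<String> deadServers, String serverName) {
        return liveness(deadServers, serverName) == Liveness.DEAD;
    }
}
```

Treating UNKNOWN as alive means a client without a listener still attempts the connection and learns about the failure from the connection error itself.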


[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760163#comment-13760163
]

Hadoop QA commented on HBASE-9451:
--

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12601807/9451.v1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop
1.0 profile.

{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1
warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:red}-1 core tests{color}. The patch failed these unit tests:

org.apache.hadoop.hbase.master.TestDistributedLogSplitting

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/7067//console

This message is automatically generated.


[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Jimmy Xiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760384#comment-13760384
 ] 

Jimmy Xiang commented on HBASE-9451:


+1


[jira] [Commented] (HBASE-9451) Meta remains unassigned when the meta server crashes with the ClusterStatusListener set

2013-09-06 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760373#comment-13760373
]