[
https://issues.apache.org/jira/browse/HBASE-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13569798#comment-13569798
]
Himanshu Vashishtha commented on HBASE-7607:
--------------------------------------------
Interestingly, with this patch the aborted regionserver is processed
normally, and the test passes its normal phase. The problem is in the cluster
shutdown: sometimes the master fails to process the other regionserver's
death, yet JVMClusterUtil already considers the cluster shut down.
{code}
2013-01-30 19:40:14,048 INFO [RegionServer:0;localhost,49074,1359600001555]
regionserver.HRegionServer(851): stopping server localhost,49074,1359600001555;
zookeeper connection closed.
2013-01-30 19:40:14,048 INFO [RegionServer:0;localhost,49074,1359600001555]
regionserver.HRegionServer(854): RegionServer:0;localhost,49074,1359600001555
exiting
2013-01-30 19:40:14,048 INFO [localhost,35387,1359600001393.timerUpdater]
hbase.Chore(80): localhost,35387,1359600001393.timerUpdater exiting
2013-01-30 19:40:14,048 INFO [Shutdown of
org.apache.hadoop.hbase.fs.HFileSystem@32d35f5f]
hbase.MiniHBaseCluster$SingleFileSystemShutdownThread(182): Hook closing
fs=org.apache.hadoop.hbase.fs.HFileSystem@32d35f5f
2013-01-30 19:40:14,049 INFO [main] util.JVMClusterUtil(262): Shutdown of 1
master(s) and 2 regionserver(s) complete
{code}
{code}
2013-01-30 19:40:14,168 INFO
[RegionServer:0;localhost,49074,1359600001555.leaseChecker]
regionserver.Leases(132):
RegionServer:0;localhost,49074,1359600001555.leaseChecker closed leases
2013-01-30 19:40:14,227 INFO [Master:0;localhost,35387,1359600001393]
master.ServerManager(357): Waiting on regionserver(s) to go down
localhost,49074,1359600001555
{code}
But the master thread is still looping in its
ServerManager#letRegionServersShutdown method, waiting to process the dead
regionserver, and that event never arrives. I am looking into why this happens
only with this patch (the failure frequency is around 1 in 5).
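
To make the hang concrete, here is a minimal, self-contained sketch of this kind of wait loop. It is not the HBase code and not the fix in the attached patch: the class ShutdownWaitSketch, the method waitForServersDown and its timeout/poll parameters are hypothetical names chosen for illustration. It only shows that a loop polling "are any regionservers still online?" spins for as long as the dead server's entry is never removed, and how a deadline would bound that wait.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from HBase's ServerManager.
public class ShutdownWaitSketch {

    // Polls until the set of online servers is empty or the deadline passes.
    // Returns true if every server was seen to go down, false on timeout
    // or if the waiting thread is interrupted.
    static boolean waitForServersDown(Set<String> onlineServers,
                                      long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!onlineServers.isEmpty()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // expiry was never processed; give up instead of spinning
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> online = ConcurrentHashMap.newKeySet();
        online.add("localhost,49074,1359600001555");

        // Nothing ever removes the entry, so an unbounded loop would hang here.
        System.out.println(waitForServersDown(online, 200, 50)); // prints "false"

        // Once the server's death is processed, the wait returns immediately.
        online.clear();
        System.out.println(waitForServersDown(online, 200, 50)); // prints "true"
    }
}
```

Whether a timeout is the right remedy here is a separate question; the sketch only illustrates why the master thread can outlive the "shutdown complete" message from JVMClusterUtil.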
> Fix TestRegionServerCoprocessorExceptionWithAbort flakiness in 0.94
> -------------------------------------------------------------------
>
> Key: HBASE-7607
> URL: https://issues.apache.org/jira/browse/HBASE-7607
> Project: HBase
> Issue Type: Bug
> Components: Client, test
> Affects Versions: 0.94.4
> Reporter: Himanshu Vashishtha
> Assignee: Himanshu Vashishtha
> Fix For: 0.94.6
>
> Attachments: HBASE-7607-v2.patch
>
>
> TestRegionServerCoprocessorExceptionWithAbort sometimes fails on both trunk
> and 0.94.x. The codebase differs between the two.
> In 0.94.x, the client keeps retrying to locate the root region while the
> cluster is down and the /hbase znode is no longer present.
> "Check the value configured in 'zookeeper.znode.parent'. There could be a
> mismatch with the one configured in the master."
> I will file a separate jira for the trunk as the code is different there.