[ https://issues.apache.org/jira/browse/HBASE-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HBASE-421:
--------------------------------
Attachment: patch.txt
Increase memory for tests; 256M may be too little.
> TestRegionServerExit broken
> ---------------------------
>
> Key: HBASE-421
> URL: https://issues.apache.org/jira/browse/HBASE-421
> Project: Hadoop HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.2.0
> Reporter: Jim Kellerman
> Assignee: Jim Kellerman
> Priority: Critical
> Attachments: patch.txt, patch.txt
>
>
> TestRegionServerExit has a couple of problems:
> 1. Region server tries to start http server on a port already in use:
> [junit] 2008-02-07 07:01:06,529 FATAL [RegionServer:2] hbase.HRegionServer(867): Failed init
> [junit] java.io.IOException: Problem starting http server
> [junit] at org.apache.hadoop.hbase.util.InfoServer.start(InfoServer.java:227)
> [junit] at org.apache.hadoop.hbase.HRegionServer.startServiceThreads(HRegionServer.java:928)
> [junit] at org.apache.hadoop.hbase.HRegionServer.init(HRegionServer.java:863)
> [junit] at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:633)
> [junit] at java.lang.Thread.run(Thread.java:595)
> [junit] Caused by: org.mortbay.util.MultiException[java.net.BindException: Address already in use]
> [junit] at org.mortbay.http.HttpServer.doStart(HttpServer.java:731)
> [junit] at org.mortbay.util.Container.start(Container.java:72)
> [junit] at org.apache.hadoop.hbase.util.InfoServer.start(InfoServer.java:205)
> [junit] ... 4 more
> [junit] 2008-02-07 07:01:06,530 FATAL [RegionServer:2] hbase.HRegionServer(772): Unhandled exception. Aborting...
> The region server that died was apparently serving the root region.
> The test case apparently has a long timeout for finding the root region,
> because the log shows many repetitions of:
> [junit] 2008-02-07 07:04:14,813 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
> [junit] 2008-02-07 07:04:14,814 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
> [junit] 2008-02-07 07:04:24,823 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
> [junit] 2008-02-07 07:04:24,827 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
> [junit] 2008-02-07 07:04:34,833 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
> [junit] 2008-02-07 07:04:34,836 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(704): Sleeping. Waiting for root region.
> [junit] 2008-02-07 07:04:44,842 DEBUG [Thread-540] hbase.HConnectionManager$TableServers(708): Wake. Retry finding root region.
> until finally the client gives up:
> [junit] 2008-02-07 07:04:44,843 FATAL [Thread-540] hbase.TestRegionServerExit$1(161): could not re-open meta table because
> [junit] org.apache.hadoop.hbase.NoServerForRegionException: Timed out trying to locate root region
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:718)
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:329)
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:311)
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:476)
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:339)
> [junit] at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:311)
> [junit] at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:114)
> [junit] at org.apache.hadoop.hbase.HTable$ClientScanner.nextScanner(HTable.java:889)
> [junit] at org.apache.hadoop.hbase.HTable$ClientScanner.<init>(HTable.java:817)
> [junit] at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:522)
> [junit] at org.apache.hadoop.hbase.HTable.obtainScanner(HTable.java:411)
> [junit] at org.apache.hadoop.hbase.TestRegionServerExit$1.run(TestRegionServerExit.java:156)
> [junit] at java.lang.Thread.run(Thread.java:595)
> [junit] Exception in thread "Thread-540" junit.framework.AssertionFailedError
> [junit] at junit.framework.Assert.fail(Assert.java:47)
> [junit] at junit.framework.Assert.fail(Assert.java:53)
> [junit] at org.apache.hadoop.hbase.TestRegionServerExit$1.run(TestRegionServerExit.java:162)
> [junit] at java.lang.Thread.run(Thread.java:595)
> This is not how the test is supposed to run at all.
> It appears that when we start multiple region servers in a MiniHBaseCluster,
> they all try to start their http servers on the same port. I believe that in
> the past a failure to start the http server was not fatal, so the test ran.
> We should do one of the following: add a setting for MiniHBaseCluster that
> tells the master and region servers not to start their http servers, provide
> a way to keep multiple servers from starting on the same port, or make http
> startup failure non-fatal.
> Tests like these are good as they (eventually) point out a regression to us.
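
Two of the options listed in the description (keeping servers off the same port, and treating a bind failure as non-fatal) can be illustrated with plain java.net sockets. The following is only a sketch under stated assumptions: it does not use InfoServer, Jetty, or any HBase class, and the names InfoPortSketch, bindEphemeral, and tryBind are hypothetical. It assumes that asking the OS for port 0 is acceptable in a test cluster, which may not hold if a test needs a known info port.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical sketch; none of these names come from HBase. */
public class InfoPortSketch {

  /** Binds to port 0 so the OS assigns a currently free ephemeral port. */
  static ServerSocket bindEphemeral() throws IOException {
    return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
  }

  /**
   * Tries to bind a fixed port. If the port is already in use, logs the
   * failure and returns null instead of propagating the exception, so the
   * caller can keep running without an info server.
   */
  static ServerSocket tryBind(int port) throws IOException {
    try {
      return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
    } catch (BindException e) {
      System.err.println("info server not started on port " + port
          + ": " + e.getMessage());
      return null;
    }
  }

  public static void main(String[] args) throws IOException {
    // Two "servers" in the same JVM each get their own port.
    try (ServerSocket first = bindEphemeral();
         ServerSocket second = bindEphemeral()) {
      System.out.println("first port:  " + first.getLocalPort());
      System.out.println("second port: " + second.getLocalPort());

      // A third "server" asks for a port that is already taken; the
      // failure is reported but does not abort the caller.
      ServerSocket clash = tryBind(first.getLocalPort());
      System.out.println("clash tolerated: " + (clash == null));
      if (clash != null) {
        clash.close();
      }
    }
  }
}
```

Which remedy fits HBASE-421 is a separate question: ephemeral ports avoid the collision entirely, while the non-fatal path only restores what the description believes was the earlier behavior.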