The issue says that it was applied to the branch for 0.90.2. That's a
misstatement. The patch was not applied. Will apply to the branch now.

St.Ack

On Thu, May 12, 2011 at 10:59 AM, Stack <[email protected]> wrote:
> Vidhya:
>
> So it's failing to send close to an explicit server -- see the IP in
> the below -- and the other server is closing down the request
> prematurely so we get the EOFE. Can you see anything in the logs on
> that machine?
>
> Regarding the EOFE crashing the Master, you might want to pick up a TRUNK
> change. See
> http://hbase.apache.org/xref/org/apache/hadoop/hbase/master/AssignmentManager.html#1261
> (This is how TRUNK looks). Notice that it's more generic than what you
> currently have -- or add a catch for the EOFE.
>
> The patch is actually kinda small and targeted explicitly to fix the
> likes of what you are seeing:
>
> + HBASE-3617 NoRouteToHostException during balancing will cause Master abort
> + (Ted Yu via Stack)
>
> Let me know if it works for you. If so, I'll backport it to the branch.
>
> St.Ack
>
>
>
> On Wed, May 11, 2011 at 2:32 PM, Vidhyashankar Venkataraman
> <[email protected]> wrote:
>> The master of my HBase instance (0.90.x) crashes each time it is restarted,
>> with the exceptions shown below. Can you let me know what this is usually
>> due to? (I also saw these exceptions in a JIRA but they were about uncaught
>> EOF exception). Only the master dies while the region servers wait for a
>> master to wake back up.
>>
>> Thank you
>> Vidhya
>>
>> The master log:
>>
>> 2011-05-11 21:19:04,259 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
>> java.io.IOException: Call to /67.195.47.230:44420 failed on local exception: java.io.EOFException
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>     at $Proxy7.closeRegion(Unknown Source)
>>     at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1092)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1039)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1808)
>>     at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:691)
>>     at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:582)
>>     at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> Caused by: java.io.EOFException
>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=WCC.davesch2,r:at#start#www!/Gateway2000!http,1302916227366.b7d206f663282e2a37adb24ba7e4c0de., src=b3110318.yst.yahoo.net,44420,1305073517470, dest=b3110175.yst.yahoo.net,44420,1305073507459
>> 2011-05-11 21:19:04,260 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region WCC.davesch2,r:at#start#www!/Gateway2000!http,1302916227366.b7d206f663282e2a37adb24ba7e4c0de. (offlining)
>> 2011-05-11 21:19:04,260 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
>> java.io.IOException: Call to /67.195.47.230:44420 failed on local exception: java.io.EOFException
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>     at $Proxy7.closeRegion(Unknown Source)
>>     at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1092)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1039)
>>     at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1808)
>>     at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:691)
>>     at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:582)
>>     at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> Caused by: java.io.EOFException
>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
>> 2011-05-11 21:19:04,260 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>>
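
For anyone reading along, here is a rough sketch of the "add a catch for the
EOFE" idea from the quoted reply. It is NOT the HBASE-3617 patch or the actual
AssignmentManager code; the class name UnassignSketch, the RegionCloser
interface, and the unassign() signature are all made up for illustration. The
only point is the shape of the fix: an I/O failure while asking a region server
to close a region gets logged and reported back, instead of propagating up to
the balancer chore and aborting the master. Catching IOException covers both
EOFException and NoRouteToHostException, since both are subclasses of it.

```java
import java.io.EOFException;
import java.io.IOException;

public class UnassignSketch {

    /** Hypothetical stand-in for the RPC that asks a region server to close a region. */
    interface RegionCloser {
        boolean closeRegion(String regionName) throws IOException;
    }

    /**
     * Sends the close request and reports whether it succeeded.
     * An IOException is logged and turned into a false return value;
     * it is not rethrown, so the caller keeps running.
     */
    static boolean unassign(RegionCloser server, String regionName) {
        try {
            return server.closeRegion(regionName);
        } catch (IOException e) {
            // Assumption for this sketch: the caller retries later or the
            // server is eventually treated as dead by some other mechanism.
            System.err.println("Close of " + regionName + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        RegionCloser healthy = name -> true;
        // Simulates the remote side dropping the connection mid-call.
        RegionCloser dropsConnection = name -> {
            throw new IOException("Call failed on local exception", new EOFException());
        };
        System.out.println(unassign(healthy, "region-a"));
        System.out.println(unassign(dropsConnection, "region-b"));
    }
}
```

Whether returning false is the right reaction depends on what the caller does
next; in this sketch it is simply a way to show that the failure no longer
escapes the method.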
