Thanks for your answer.
That's right, the server doesn't really crash, but the client (a map/reduce
task attempt) fails on this.
On the server, other errors appear after that:
12/03/26 15:18:01 INFO mapred.JobClient: Task Id : attempt_201203261505_0002_m_000051_0, Status : FAILED
org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '-7139822043219672768' does not exist
        at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1879)
        at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
I think every error after that was originally caused by the closed
connection. This error appears before the IPC timeout (which has been
multiplied by 10) expires. On the client, I don't see anything weird until
the connection is closed. Any idea what could cause that?
Simon Gilliot
On Monday, 26 March 2012 at 08:03 -0700, Stack wrote:
> On Mon, Mar 26, 2012 at 7:01 AM, Simon Gilliot
> <[email protected]> wrote:
> > Hello,
> >
> > we have a small architecture of 4 servers with 1
> > Namenode/Jobtracker/HbaseMaster, 2 Datanode/Tasktracker, 1 server
> > Failover Namenode/Jobtracker.
> > We often see HBase crash with this error:
> >
>
> > 2012-03-26 15:15:11,043 WARN org.apache.hadoop.ipc.HBaseServer:
> > IPC Server handler 7 on 58237 caught:
> > java.nio.channels.ClosedChannelException
>
> That doesn't look like a crash. It looks like the client was gone -- it
> had shut down the socket -- when we went to respond. Do you see a client
> timeout earlier on the client side? If the server is crashing, maybe later
> logs show why.
>
> St.Ack