I am also noticing the following errors:
2011-03-11 17:52:00,376 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.103.7.3:50010, storageID=DS-824332190-10.103.7.3-50010-1290043658438, infoPort=50075, ipcPort=50020):DataXceiveServer: Exiting due to:java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:597)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:132)
        at java.lang.Thread.run(Thread.java:619)
and this:
nf_conntrack: table full, dropping packet.
nf_conntrack: table full, dropping packet.
nf_conntrack: table full, dropping packet.
nf_conntrack: table full, dropping packet.
nf_conntrack: table full, dropping packet.
nf_conntrack: table full, dropping packet.
net_ratelimit: 10 callbacks suppressed
nf_conntrack: table full, dropping packet.
possible SYN flooding on port 9090. Sending cookies.
Could this be a network stack issue?
So, does the datanode need a heap larger than 1GB? Or is it possible that we
ran out of RAM for other reasons?
-Jack
On Thu, Mar 10, 2011 at 1:29 PM, Ryan Rawson <[email protected]> wrote:
> Looks like a datanode went down. InterruptedException is how Java
> interrupts IO in threads; it's similar to the EINTR errno. That
> means the actual source of the abort is higher up...
>
> So back to how InterruptedException works... at some point a thread in
> the JVM decides that the VM should abort. So it calls
> thread.interrupt() on all the threads it knows/cares about to
> interrupt their IO. That is what you are seeing in the logs. The root
> cause lies above I think.
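
To make the interrupt mechanism described above concrete, here is a minimal
sketch. The class and method names are hypothetical, and Thread.sleep() only
stands in for whatever blocking call the real datanode or regionserver thread
was in; this is not Hadoop code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of Hadoop or HBase.
final class InterruptDemo {

    // Starts a worker that blocks, interrupts it from the calling thread,
    // and reports whether the worker saw an InterruptedException.
    static boolean interruptBlockedWorker() {
        final AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            try {
                // Assumption: sleep() stands in for a blocking operation.
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // This is the symptom that shows up in the logs; the cause
                // is whichever thread called interrupt().
                sawInterrupt.set(true);
            }
        });

        worker.start();
        worker.interrupt(); // the "abort" decision made elsewhere in the JVM
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve our own interrupt status
        }
        return sawInterrupt.get();
    }

    public static void main(String[] args) {
        System.out.println("worker interrupted: " + interruptBlockedWorker());
    }
}
```

The exception is raised in the worker, but the decision to stop was made by
the thread that called interrupt(), which is why the stack trace of the
InterruptedException alone does not show the root cause.
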
>
> Look for the first "Exception" string or any FATAL or ERROR strings in
> the datanode logfiles.
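
A rough sketch of that search, in case it is useful. The class name, method
name, and command-line argument are made up for illustration; it simply
reports the first line of a log file that contains "Exception", "FATAL", or
"ERROR".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of Hadoop or HBase.
final class FirstErrorFinder {

    // Returns the zero-based index of the first line that mentions an
    // exception or a FATAL/ERROR level, or -1 if no such line exists.
    static int firstProblemLine(List<String> lines) {
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.contains("Exception")
                    || line.contains("FATAL")
                    || line.contains("ERROR")) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FirstErrorFinder <datanode-logfile>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        int idx = firstProblemLine(lines);
        if (idx < 0) {
            System.out.println("no Exception/FATAL/ERROR lines found");
        } else {
            // Print a one-based line number, as editors and grep -n do.
            System.out.println((idx + 1) + ": " + lines.get(idx));
        }
    }
}
```

Running it against each datanode log and comparing timestamps should show
which node failed first.
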
>
> -ryan
>
> On Thu, Mar 10, 2011 at 1:03 PM, Jack Levin <[email protected]> wrote:
> > http://pastebin.com/ZmsyvcVc Here is the regionserver log; they all have
> > similar stuff.
> >
> > On Thu, Mar 10, 2011 at 11:34 AM, Stack <[email protected]> wrote:
> >
> >> What's in the regionserver logs? Please put up regionserver and
> >> datanode excerpts.
> >> Thanks Jack,
> >> St.Ack
> >>
> >> On Thu, Mar 10, 2011 at 10:31 AM, Jack Levin <[email protected]> wrote:
> >> > All was well until this happened:
> >> >
> >> > http://pastebin.com/iM1niwrS
> >> >
> >> > and all regionservers went down. Is this an xciever issue?
> >> >
> >> > <property>
> >> > <name>dfs.datanode.max.xcievers</name>
> >> > <value>12047</value>
> >> > </property>
> >> >
> >> > This is what I have. Should I set it higher?
> >> >
> >> > -Jack
> >> >
> >>
> >
>