Sean,

A few things in your messages are not clear to me. Here is what I currently make of them:

1) With the 1k limit, you do see the problem.
2) With the 16k limit - (?) not clear whether you see the problem.
3) With the 8k limit, you don't see the problem.
     3a) Whether that is with or without the patch, I don't know.

But if you do use the patch and things do improve, please let us know.

Raghu.

Sean Knapp wrote:
Raghu,
Thanks for the quick response. I've been beating up on the cluster for a
while now and so far so good. I'm still at 8k... what should I expect to
find with 16k versus 1k? The 8k didn't appear to be affecting things to
begin with.

Regards,
Sean

On Thu, Feb 12, 2009 at 2:07 PM, Raghu Angadi <rang...@yahoo-inc.com> wrote:

You are most likely being hit by
https://issues.apache.org/jira/browse/HADOOP-4346 . I hope it gets
backported. There is a 0.18 patch posted there.

btw, does 16k help in your case?

Ideally 1k should be enough (with a small number of clients). Please try the
above patch with the 1k limit.
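
For example, to cap the datanode at a 1k limit for that test, an entry along
these lines in /etc/security/limits.conf should work (assuming pam_limits is
in use and the daemon runs as a "hadoop" user - adjust the user name for your
setup), followed by restarting the datanode from a fresh login:

hadoop  soft  nofile  1024
hadoop  hard  nofile  1024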

Raghu.


Sean Knapp wrote:

Hi all,
I'm continually running into the "Too many open files" error on 18.3:

DataXceiveServer: java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
        at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:96)
        at org.apache.hadoop.dfs.DataNode$DataXceiveServer.run(DataNode.java:997)
        at java.lang.Thread.run(Thread.java:619)

I'm writing thousands of files over the course of a few minutes, but nothing
that seems too unreasonable, especially given the numbers below. I begin
getting a surge of these warnings right as the DataNode hits 1024 open files:

had...@u10:~$ ps ux | awk '/dfs\.DataNode/ { print $2 }' | xargs -i ls /proc/{}/fd | wc -l
1023

This is a bit unexpected, however, since I've configured my open file
limit
to be 16k:

had...@u10:~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 268288
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 16384
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 268288
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
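
To double-check the limit the running DataNode process actually inherited
(rather than my shell's), something like this should work, assuming the
kernel exposes /proc/<pid>/limits:

had...@u10:~$ ps ux | awk '/dfs\.DataNode/ { print $2 }' | xargs -i grep 'Max open files' /proc/{}/limits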


Note, I've also set dfs.datanode.max.xcievers to 8192 in hadoop-site.xml.
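
For reference, that property looks something like this in my hadoop-site.xml
(name spelled exactly as above):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>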

Thanks in advance,
Sean



