Raghu,
Apologies for the confusion. I was seeing the problem with any setting for
dfs.datanode.max.xcievers... 1k, 2k and 8k. Likewise, I was also seeing the
problem with different open file settings, all the way up to 32k.
Since I installed the patch, HDFS has been performing much better. The
current settings that work for me are 16k max open files with
dfs.datanode.max.xcievers=8k, though under heavy balancer load I do start
to hit the 16k max.

Regards,
Sean

2009/2/13 Raghu Angadi <rang...@yahoo-inc.com>

> Sean,
>
> A few things in your messages are not clear to me. Currently this is
> what I make out of it:
>
> 1) with 1k limit, you do see the problem.
> 2) with 16k limit - (?) not clear if you see the problem
> 3) with 8k you don't see the problem
> 3a) with or without the patch, I don't know.
>
> But if you do use the patch and things do improve, please let us know.
>
> Raghu.
>
>
> Sean Knapp wrote:
>
>> Raghu,
>> Thanks for the quick response. I've been beating up on the cluster for
>> a while now and so far so good. I'm still at 8k... what should I expect
>> to find with 16k versus 1k? The 8k didn't appear to be affecting things
>> to begin with.
>>
>> Regards,
>> Sean
>>
>> On Thu, Feb 12, 2009 at 2:07 PM, Raghu Angadi <rang...@yahoo-inc.com>
>> wrote:
>>
>>> You are most likely hit by
>>> https://issues.apache.org/jira/browse/HADOOP-4346 . I hope it gets
>>> backported. There is a 0.18 patch posted there.
>>>
>>> btw, does 16k help in your case?
>>>
>>> Ideally 1k should be enough (with a small number of clients). Please
>>> try the above patch with the 1k limit.
>>>
>>> Raghu.
>>>
>>>
>>> Sean Knapp wrote:
>>>
>>>> Hi all,
>>>> I'm continually running into the "Too many open files" error on 18.3:
>>>>
>>>> DataXceiveServer: java.io.IOException: Too many open files
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>     at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
>>>>     at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:96)
>>>>     at org.apache.hadoop.dfs.DataNode$DataXceiveServer.run(DataNode.java:997)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>>
>>>> I'm writing thousands of files in the course of a few minutes, but
>>>> nothing that seems too unreasonable, especially given the numbers
>>>> below. I begin getting a surge of these warnings right as I hit 1024
>>>> files open by the DataNode:
>>>>
>>>> had...@u10:~$ ps ux | awk '/dfs\.DataNode/ { print $2 }' | xargs -i ls /proc/{}/fd | wc -l
>>>> 1023
>>>>
>>>> This is a bit unexpected, however, since I've configured my open file
>>>> limit to be 16k:
>>>>
>>>> had...@u10:~$ ulimit -a
>>>> core file size          (blocks, -c) 0
>>>> data seg size           (kbytes, -d) unlimited
>>>> scheduling priority             (-e) 0
>>>> file size               (blocks, -f) unlimited
>>>> pending signals                 (-i) 268288
>>>> max locked memory       (kbytes, -l) 32
>>>> max memory size         (kbytes, -m) unlimited
>>>> open files                      (-n) 16384
>>>> pipe size            (512 bytes, -p) 8
>>>> POSIX message queues     (bytes, -q) 819200
>>>> real-time priority              (-r) 0
>>>> stack size              (kbytes, -s) 8192
>>>> cpu time               (seconds, -t) unlimited
>>>> max user processes              (-u) 268288
>>>> virtual memory          (kbytes, -v) unlimited
>>>> file locks                      (-x) unlimited
>>>>
>>>> Note, I've also set dfs.datanode.max.xcievers to 8192 in
>>>> hadoop-site.xml.
>>>>
>>>> Thanks in advance,
>>>> Sean
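
For reference, the xciever ceiling discussed in this thread is set in
hadoop-site.xml on each datanode. A minimal sketch of the entry, using the
8192 value Sean settled on (the property name really does carry Hadoop's
historical misspelling of "xceivers"):

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>8192</value>
    </property>

The datanode reads its configuration at startup, so it needs a restart to
pick up the new value.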
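Likewise, the 16k open-file limit is usually made persistent in
/etc/security/limits.conf rather than with a one-off ulimit call. A sketch,
assuming the datanode runs under an account named "hadoop" (the account
name here is an assumption, not something stated in the thread):

    # /etc/security/limits.conf
    # raise soft and hard fd ceilings for the datanode's account
    # ("hadoop" is an assumed user name -- substitute your own)
    hadoop  soft  nofile  16384
    hadoop  hard  nofile  16384

After logging in again, ulimit -n should report 16384. Note that
limits.conf is applied through PAM at login, so a daemon started outside a
login session may not inherit the raised limit.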