You need to set ulimit -n <bigger value> for the user that runs the DataNodes and then restart the DataNodes.
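
For example, a minimal sketch assuming the DataNodes run as the hdfs user (the user name and the 32768 value are placeholders, pick whatever fits your cluster): add the following to /etc/security/limits.conf on each DataNode host, then restart the DataNode so the new limit is picked up.

  # /etc/security/limits.conf on each DataNode host (example values only)
  hdfs  soft  nofile  32768
  hdfs  hard  nofile  32768

After the restart you can confirm the limit the running process actually got by looking at cat /proc/<datanode pid>/limits.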

Sent from my iPhone

On Jan 26, 2012, at 6:06 AM, Idris Ali <[email protected]> wrote:

> Hi Mark,
> 
> On a lighter note, what is your xceiver count, i.e. the dfs.datanode.max.xcievers
> property in hdfs-site.xml?
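> 
> For reference, a minimal hdfs-site.xml sketch (4096 is just a commonly used
> starting point, not a recommendation for your cluster; the DataNodes need a
> restart for it to take effect):
> 
>   <property>
>     <name>dfs.datanode.max.xcievers</name>
>     <value>4096</value>
>   </property>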
> 
> Thanks,
> -idris
> 
> On Thu, Jan 26, 2012 at 5:28 PM, Michel Segel
> <[email protected]> wrote:
> 
>> Sorry, going from memory...
>> As the hadoop, mapred, or hdfs user, what do you see when you do a ulimit -a?
>> That should give you the number of open files allowed for a single user...
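>> 
>> For example (assuming the daemons run as the hdfs user; substitute whichever
>> account actually runs your DataNodes):
>> 
>>   sudo -u hdfs bash -c 'ulimit -a'   # all limits for that user
>>   sudo -u hdfs bash -c 'ulimit -n'   # just the open-files limit, often 1024 by default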
>> 
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Jan 26, 2012, at 5:13 AM, Mark question <[email protected]> wrote:
>> 
>>> Hi guys,
>>> 
>>> I get this error from a job trying to process 3 million records.
>>> 
>>> java.io.IOException: Bad connect ack with firstBadLink 192.168.1.20:50010
>>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2903)
>>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
>>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>>> 
>>> When I checked the log file of datanode-20, I see:
>>> 
>>> 2012-01-26 03:00:11,827 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.1.20:50010, storageID=DS-97608578-192.168.1.20-50010-1327575205369, infoPort=50075, ipcPort=50020):DataXceiver
>>> java.io.IOException: Connection reset by peer
>>>   at sun.nio.ch.FileDispatcher.read0(Native Method)
>>>   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>>>   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
>>>   at sun.nio.ch.IOUtil.read(IOUtil.java:175)
>>>   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
>>>   at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>>>   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>>>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>>>   at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>>>   at java.io.DataInputStream.read(DataInputStream.java:132)
>>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:262)
>>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
>>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
>>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
>>>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>>>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>>>   at java.lang.Thread.run(Thread.java:662)
>>> 
>>> 
>>> This is because I'm running 10 maps per TaskTracker on a 20-node cluster,
>>> and each map opens about 300 files, so that should give about 6000 open
>>> files at the same time... why is this a problem? The system-wide maximum
>>> number of open files on one machine is:
>>> 
>>> cat /proc/sys/fs/file-max   ---> 2403545
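>>> 
>>> (Assuming the per-process limit is what matters here rather than fs.file-max,
>>> a check like
>>> 
>>>   cat /proc/$(pgrep -f datanode.DataNode)/limits | grep -i 'open files'
>>> 
>>> on the DataNode host, assuming a single DataNode JVM, should show the
>>> "Max open files" value the running process actually has.)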
>>> 
>>> 
>>> Any suggestions?
>>> 
>>> Thanks,
>>> Mark
>> 
