Hi again,
I've tried:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>1048576</value>
    </property>

but I'm still getting the same error... how high can I go?
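For what it's worth, each xceiver corresponds to a DataXceiver thread on the DataNode, so this setting is really a thread count rather than a hard ceiling to keep raising; values in the low thousands (e.g. 4096) are what is usually suggested, and 1048576 is far more than a single DataNode can realistically service. Before going higher it may be worth confirming that the xceiver limit is actually what is being hit. A minimal check, assuming the DataNode logs live under $HADOOP_HOME/logs and that your version logs the usual message when the limit is exceeded (exact wording can vary between releases):

    # on a datanode host -- log path and file naming are assumptions, adjust to your install
    grep "exceeds the limit of concurrent xcievers" \
        $HADOOP_HOME/logs/hadoop-*-datanode-*.log

If nothing matches, the xceiver limit is probably not what is failing, and the per-process open-file limit discussed further down the thread is the more likely culprit.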

Thanks,
Mark



On Thu, Jan 26, 2012 at 9:29 AM, Mark question <[email protected]> wrote:

> Thanks for the reply... I have nothing about dfs.datanode.max.xceivers in
> my hdfs-site.xml, so hopefully adding it will solve the problem. About
> ulimit -n: I'm running on an NFS cluster, so usually I just start Hadoop
> with a single bin/start-all.sh... Do you think I can add it with
> bin/Datanode -ulimit n ?
>
> Mark
>
>
> On Thu, Jan 26, 2012 at 7:33 AM, Mapred Learn <[email protected]> wrote:
>
>> You need to set ulimit -n <bigger value> on the datanodes and restart
>> the datanodes.
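For a persistent change, the open-file limit is usually raised for the account that runs the DataNode daemons rather than by typing ulimit -n in one shell, since a shell-level ulimit only affects processes launched from that shell. A minimal sketch, assuming Linux datanodes that honour /etc/security/limits.conf and that the daemons run as a user named hadoop (substitute the real account; 32768 is just a commonly used value, not something Hadoop mandates):

    # /etc/security/limits.conf on every datanode ('hadoop' is an assumed account name)
    hadoop  soft  nofile  32768
    hadoop  hard  nofile  32768

    # log in again as that user so the new limit applies, verify, then restart:
    ulimit -n                            # should now report 32768
    bin/stop-all.sh && bin/start-all.sh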
>>
>> Sent from my iPhone
>>
>> On Jan 26, 2012, at 6:06 AM, Idris Ali <[email protected]> wrote:
>>
>> > Hi Mark,
>> >
>> > On a lighter note, what is the count of xceivers, i.e. the value of the
>> > dfs.datanode.max.xceivers property in hdfs-site.xml?
>> >
>> > Thanks,
>> > -idris
>> >
>> > On Thu, Jan 26, 2012 at 5:28 PM, Michel Segel <[email protected]> wrote:
>> >
>> >> Sorry, going from memory...
>> >> As the hadoop, mapred, or hdfs user, what do you see when you do a
>> >> ulimit -a? That should give you the number of open files allowed for a
>> >> single user...
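A quick way to run that check for the daemon account, assuming the DataNode runs as a user named hadoop (adjust to whichever account actually owns the processes):

    # full limits for the daemon user, including "open files (-n)"
    su - hadoop -c 'ulimit -a'
    # just the open-files value; 1024 is a common Linux default
    su - hadoop -c 'ulimit -n'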
>> >>
>> >>
>> >> Sent from a remote device. Please excuse any typos...
>> >>
>> >> Mike Segel
>> >>
>> >> On Jan 26, 2012, at 5:13 AM, Mark question <[email protected]> wrote:
>> >>
>> >>> Hi guys,
>> >>>
>> >>> I get this error from a job trying to process 3 million records.
>> >>>
>> >>> java.io.IOException: Bad connect ack with firstBadLink 192.168.1.20:50010
>> >>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2903)
>> >>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
>> >>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
>> >>>   at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
>> >>>
>> >>> When I checked the log file of datanode-20, I see:
>> >>>
>> >>> 2012-01-26 03:00:11,827 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
>> >>> DatanodeRegistration(192.168.1.20:50010, storageID=DS-97608578-192.168.1.20-50010-1327575205369,
>> >>> infoPort=50075, ipcPort=50020):DataXceiver
>> >>> java.io.IOException: Connection reset by peer
>> >>>   at sun.nio.ch.FileDispatcher.read0(Native Method)
>> >>>   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>> >>>   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
>> >>>   at sun.nio.ch.IOUtil.read(IOUtil.java:175)
>> >>>   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
>> >>>   at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>> >>>   at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>> >>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>> >>>   at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>> >>>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>> >>>   at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>> >>>   at java.io.DataInputStream.read(DataInputStream.java:132)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:262)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:309)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:373)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:525)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
>> >>>   at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>> >>>   at java.lang.Thread.run(Thread.java:662)
>> >>>
>> >>>
>> >>> That is because I'm running 10 maps per TaskTracker on a 20-node
>> >>> cluster, and each map opens about 300 files, so that should give 6000
>> >>> open files at the same time... why is this a problem? The maximum
>> >>> number of files per process on one machine is:
>> >>>
>> >>> cat /proc/sys/fs/file-max   ---> 2403545
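One note on that figure: /proc/sys/fs/file-max is the system-wide ceiling, not the per-process limit. The limit that usually triggers these errors is the per-user "open files" value reported by ulimit -n, which commonly defaults to 1024, and it counts the DFSClient/DataNode sockets as well as regular files. A quick way to compare the two on a datanode, assuming Linux with /proc/<pid>/limits available (kernel 2.6.24+) and that the pgrep pattern below matches your DataNode's command line (both are assumptions):

    cat /proc/sys/fs/file-max        # system-wide ceiling (the 2403545 above)
    ulimit -n                        # per-process limit, often only 1024

    # effective limit and current descriptor count of the running DataNode
    DN_PID=$(pgrep -f 'datanode.DataNode' | head -1)
    grep -i 'open files' /proc/$DN_PID/limits
    ls /proc/$DN_PID/fd | wc -l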
>> >>>
>> >>>
>> >>> Any suggestions?
>> >>>
>> >>> Thanks,
>> >>> Mark
>> >>
>>
>
>
