Have you dug into the dead DN logs as well?

James
Sent from my mobile. Please excuse the typos.

On 2011-05-11, at 7:39 AM, Evert Lammerts <[email protected]> wrote:

> Hi James,
>
> The Hadoop version is 0.20.2 (you'll find that and more on our setup in my first
> mail, under the heading "The cluster").
>
> Below are I) an example stack trace of losing a datanode and II) an example of
> a "Could not obtain block" IOException.
>
> Cheers,
> Evert
>
> 11/05/11 15:06:43 INFO hdfs.DFSClient: Failed to connect to /192.168.28.214:50050, add to deadNodes and continue
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/192.168.28.209:50726 remote=/192.168.28.214:50050]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>         at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>         at java.io.DataInputStream.readShort(DataInputStream.java:295)
>         at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1478)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
>         at java.io.DataInputStream.readFully(DataInputStream.java:178)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1945)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1845)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1891)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
>         at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:1)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>         at nl.liacs.infrawatch.hadoop.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:85)
>         at nl.liacs.infrawatch.hadoop.kmeans.Job.run(Job.java:171)
>         at nl.liacs.infrawatch.hadoop.kmeans.Job.main(Job.java:74)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>
>
> 11/05/10 09:43:47 INFO mapred.JobClient: map 82% reduce 17%
> 11/05/10 09:44:39 INFO mapred.JobClient: Task Id : attempt_201104121440_0122_m_000225_0, Status : FAILED
> java.io.IOException: Could not obtain block: blk_4397122445076815438_4097927 file=/user/joaquin/data/20081201/20081201.039
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1993)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1800)
>         at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
>         at java.io.DataInputStream.read(DataInputStream.java:83)
>         at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
>         at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:97)
>         at nl.liacs.infrawatch.hadoop.kmeans.KeyValueLineRecordReader.nextKeyValue(KeyValueLineRecordReader.java:94)
>         at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:455)
>         at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
>
>> -----Original Message-----
>> From: James Seigel [mailto:[email protected]]
>> Sent: Wednesday 11 May 2011 14:54
>> To: [email protected]
>> Subject: Re: Stability issue - dead DN's
>>
>> Evert,
>>
>> What's the stack trace, and what version of Hadoop do you have installed, Sir?
>>
>> James.
>>
>> On 2011-05-11, at 3:23 AM, Evert Lammerts wrote:
>>
>>> Hi list,
>>>
>>> I notice that whenever our Hadoop installation is put under a heavy
>>> load we lose one or two (out of a total of five) datanodes. This results in
>>> IOExceptions and affects the overall performance of the job being run.
>>> Can anybody give me advice or best practices on a different
>>> configuration to increase stability? Below I've included the specs
>>> of the cluster, the Hadoop-related config, and an example of when and how
>>> things go wrong. Any help is very much appreciated, and if I can
>>> provide any other info please let me know.
>>>
>>> Cheers,
>>> Evert
>>>
>>> == What goes wrong, and when ==
>>>
>>> See attached a screenshot of Ganglia when the cluster is under load
>>> of a single job. This job:
>>> * reads ~1TB from HDFS
>>> * writes ~200GB to HDFS
>>> * runs 288 Mappers and 35 Reducers
>>>
>>> When the job runs it takes all available Map and Reduce slots. The
>>> system starts swapping and there is a short time interval during which
>>> most cores are in WAIT. After that the job really starts running. At
>>> around halfway, one or two datanodes become unreachable and are marked
>>> as dead nodes. The number of under-replicated blocks becomes huge. Then
>>> some "java.io.IOException: Could not obtain block" exceptions are thrown in
>>> Mappers. The job does manage to finish successfully after around 3.5
>>> hours, but my fear is that when we make the input much larger - which
>>> we want - the system will become too unstable to finish the job.
>>>
>>> Maybe worth mentioning - you never know what might help diagnostics: we
>>> notice that memory usage drops when we switch our keys from Text
>>> to LongWritable. Also, the Mappers are done in a fraction of the time.
>>> However, this for some reason results in much more network traffic and
>>> makes the Reducers extremely slow. We're working on figuring out what
>>> causes this.
>>>
>>>
>>> == The cluster ==
>>>
>>> We have a cluster that consists of 6 Sun Thumpers running Hadoop
>>> 0.20.2 on CentOS 5.5. One of them acts as NN and JT, the other 5 run
>>> DN's and TT's.
>>> Each node has:
>>> * 16GB RAM
>>> * 32GB swap space
>>> * 4 cores
>>> * 11 LVM's of 4 x 500GB disks (2TB each) for HDFS
>>> * non-HDFS stuff on separate disks
>>> * a 2x1GE bonded network interface for interconnects
>>> * a 2x1GE bonded network interface for external access
>>>
>>> I realize that this is not a well-balanced system, but it's what we
>>> had available for a prototype environment. We're working on putting
>>> together a specification for a much larger production environment.
>>>
>>>
>>> == Hadoop config ==
>>>
>>> Here are some properties that I think might be relevant:
>>>
>>> __CORE-SITE.XML__
>>>
>>> fs.inmemory.size.mb: 200
>>> mapreduce.task.io.sort.factor: 100
>>> mapreduce.task.io.sort.mb: 200
>>> # 1024*1024*4 bytes (4MB), the blocksize of the LVM's
>>> io.file.buffer.size: 4194304
>>>
>>> __HDFS-SITE.XML__
>>>
>>> # 1024*1024*4*32 bytes (128MB), 32 times the blocksize of the LVM's
>>> dfs.block.size: 134217728
>>> # Only 5 DN's, but this shouldn't hurt
>>> dfs.namenode.handler.count: 40
>>> # This got rid of the occasional "Could not obtain block"'s
>>> dfs.datanode.max.xcievers: 4096
>>>
>>> __MAPRED-SITE.XML__
>>>
>>> mapred.tasktracker.map.tasks.maximum: 4
>>> mapred.tasktracker.reduce.tasks.maximum: 4
>>> mapred.child.java.opts: -Xmx2560m
>>> mapreduce.reduce.shuffle.parallelcopies: 20
>>> mapreduce.map.java.opts: -Xmx512m
>>> mapreduce.reduce.java.opts: -Xmx512m
>>> # Compression codecs are configured and seem to work fine
>>> mapred.compress.map.output: true
>>> mapred.map.output.compression.codec: com.hadoop.compression.lzo.LzoCodec
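
For anyone following along, here is a rough sketch of how a few of the quoted settings would look in the actual hdfs-site.xml and mapred-site.xml files, using the stock Hadoop property/value XML layout. The values are copied from the listing above; the heap note in the comment is only a back-of-the-envelope estimate based on the numbers quoted in this thread, not a measurement from Evert's cluster.

<!-- hdfs-site.xml (sketch, values from the listing above) -->
<configuration>
  <property>
    <name>dfs.block.size</name>
    <value>134217728</value> <!-- 128MB, i.e. 32 x the 4MB LVM blocksize -->
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>40</value>
  </property>
  <property>
    <!-- note: the property name really is spelled "xcievers" in Hadoop -->
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
</configuration>

<!-- mapred-site.xml (sketch) -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <!-- if this heap setting is the one that takes effect, 4 map + 4 reduce
         slots x 2.5GB is roughly 20GB of task heap per node, which is more
         than the 16GB of RAM listed above and would be consistent with the
         swapping described earlier in the thread -->
    <value>-Xmx2560m</value>
  </property>
</configuration>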
