Checked now. It is 0.94.6.1.
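For reference, running "hbase version" on any node prints the bundled release, and the same information can be read programmatically from whichever HBase jars are on the classpath. A minimal sketch (the class name is illustrative):

    import org.apache.hadoop.hbase.util.VersionInfo;

    public class PrintHBaseVersion {
        public static void main(String[] args) {
            // Prints the version of the HBase jars actually loaded,
            // e.g. "0.94.6.1" when the point release is bundled.
            System.out.println(VersionInfo.getVersion());
        }
    }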
----- Original Message -----
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Cc:
Sent: Sunday, July 14, 2013 6:55 AM
Subject: Re: HBase issues since upgrade from 0.92.4 to 0.94.6

Didn't check, but I sincerely hope that CDH 4.3.0 ships with HBase 0.94.6.1 (and not 0.94.6).

________________________________
From: David Koch <[email protected]>
To: [email protected]
Sent: Friday, July 12, 2013 3:09 AM
Subject: HBase issues since upgrade from 0.92.4 to 0.94.6

Hello,

NOTE: I posted the same message in the Cloudera group.

Since upgrading from CDH 4.0.1 (HBase 0.92.4) to 4.3.0 (HBase 0.94.6) we have systematically experienced problems with region servers crashing silently under workloads which used to pass without problems. More specifically, we run about 30 mapper jobs in parallel which read from HDFS and insert into HBase. (A minimal sketch of this write pattern is attached as a P.S. at the end of this message.)

region server log
NOTE: no trace of a crash, but the server is down and shows up as such in Cloudera Manager.

2013-07-12 10:22:12,050 WARN org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: File hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286 might be still open, length is 0
2013-07-12 10:22:12,051 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Recovering file hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286
2013-07-12 10:22:13,064 INFO org.apache.hadoop.hbase.util.FSHDFSUtils: Finished lease recover attempt for hdfs://XXXXXXX:8020/hbase/.logs/XXXXXXX,60020,1373616547696-splitting/XXXXXXX%2C60020%2C1373616547696.1373617004286
2013-07-12 10:22:14,819 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2013-07-12 10:22:14,824 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
...
2013-07-12 10:22:14,850 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2013-07-12 10:22:15,530 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]

< -- last log entry, region server is down here -- >

datanode log, same machine

2013-07-12 10:22:04,811 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: XXXXXXX:50010:DataXceiver error processing WRITE_BLOCK operation src: /YYY.YY.YYY.YY:36024 dest: /XXX.XX.XXX.XX:50010
java.io.IOException: Premature EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:635)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:564)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:103)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:67)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221)
    at java.lang.Thread.run(Thread.java:724)

< -- many repetitions of this -- >

What could have caused this difference in stability? We did not change any configuration settings with respect to the previous CDH 4.0.1 setup.
In particular, we left ulimit and dfs.datanode.max.xcievers at 32k. If need be, I can provide more complete log/configuration information.

Thank you,

/David
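P.S. For completeness, the write pattern mentioned above is the stock MapReduce-into-HBase insert path. A minimal sketch against the 0.94-era API; the table name argument, the column family "cf", and the tab-separated input are illustrative assumptions, not our actual job:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

    public class HdfsToHBase {

        // Map-only job: each tab-separated input line becomes one Put,
        // keyed on the first field.
        static class InsertMapper
                extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t", 2);
                if (fields.length < 2) return;
                Put put = new Put(Bytes.toBytes(fields[0]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("payload"),
                        Bytes.toBytes(fields[1]));
                ctx.write(new ImmutableBytesWritable(put.getRow()), put);
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            Job job = Job.getInstance(conf, "hdfs-to-hbase-insert");
            job.setJarByClass(HdfsToHBase.class);
            job.setMapperClass(InsertMapper.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            // Wires in TableOutputFormat and points it at the target table.
            TableMapReduceUtil.initTableReducerJob(args[1], null, job);
            job.setNumReduceTasks(0); // map-only: Puts go straight to the region servers
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }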
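P.P.S. On the xciever setting: under CDH 4 (Hadoop 2) the historical key dfs.datanode.max.xcievers is deprecated in favor of dfs.datanode.max.transfer.threads, so it is worth confirming which value the datanodes actually resolve. A minimal diagnostic sketch; the config path is an assumption about the cluster layout:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.HdfsConfiguration;

    public class CheckXcieverLimit {
        public static void main(String[] args) {
            // HdfsConfiguration registers the deprecation mapping from the
            // old (misspelled) key to dfs.datanode.max.transfer.threads.
            Configuration conf = new HdfsConfiguration();
            // Assumed config location; adjust to the actual deployment.
            conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
            System.out.println("dfs.datanode.max.xcievers = "
                    + conf.get("dfs.datanode.max.xcievers"));
            System.out.println("dfs.datanode.max.transfer.threads = "
                    + conf.get("dfs.datanode.max.transfer.threads"));
        }
    }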
