What Hadoop version are you using? Btw, the sentence about previous flushes was incomplete.
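
From the stack, every HFile block boundary ends up in CodecPool.getCompressor -> LzoCompressor.reinit -> LzoCodec.getCompressionStrategy -> Configuration.get, and the first get() on a Configuration whose properties haven't been loaded yet triggers loadResources(), i.e. a full XML parse of the config files. If reinit() keeps hitting a fresh (or invalidated) Configuration, a multi-GB flush would re-parse the config once per block. A quick, illustrative sketch of what that first get() costs (class name and property key below are just placeholders, not from your setup):

    import org.apache.hadoop.conf.Configuration;

    // Illustrative micro-check: compares the cost of the first
    // Configuration.get(), which forces loadResources() -> XML parsing
    // as in the stack above, against a second get() that is served
    // from the already-cached properties.
    public class ConfParseCost {
      public static void main(String[] args) {
        Configuration conf = new Configuration(); // resources load lazily

        long t0 = System.nanoTime();
        conf.get("io.file.buffer.size");   // first get(): parses the XML resources
        long firstGet = System.nanoTime() - t0;

        t0 = System.nanoTime();
        conf.get("io.file.buffer.size");   // second get(): cached properties
        long cachedGet = System.nanoTime() - t0;

        System.out.println("first get (XML parse): " + (firstGet / 1000000L) + " ms");
        System.out.println("cached get:            " + (cachedGet / 1000000L) + " ms");
      }
    }

If that first get() is expensive on your boxes and it runs once per compressed block during a flush, that alone could account for a flush crawling for hours.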
Cheers

On Mar 14, 2014, at 12:12 AM, Salabhanjika S <[email protected]> wrote:

> Devs,
>
> We are using hbase version 0.90.6 (please don't complain about the old
> version; we are in the process of upgrading) in production, and we are
> noticing a strange problem arbitrarily every few weeks: a region server
> goes extremely slow. We have to restart the region server once this
> happens. There is no unique pattern to this problem. It happens on
> different region servers, different tables/regions, and at different times.
>
> Here are observations & findings from our analysis.
>
> - We are using LZO compression (0.4.10).
>
> - [RS Dashboard] A flush has been running for more than 6 hours. It has
>   been in "creating writer" status for a long time. Other previous flushes
>   (600MB to 1.5GB) takes
>
> - [Thread dumps] No deadlocks. Flusher thread stack below; even the
>   compactor thread is in the same state (Configuration.loadResource).
>
> "regionserver60020.cacheFlusher" daemon prio=10 tid=0x00007efd016c4800 nid=0x35e9 runnable [0x00007efcad9c5000]
>    java.lang.Thread.State: RUNNABLE
>     at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>     at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>     - locked <0x00007f02ccc2ef78> (a sun.net.www.protocol.file.FileURLConnection)
>     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:653)
>     ... [cutting down some stack to keep the mail compact; all of this stack is in com.sun.org.apache.xerces...]
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1308)
>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1259)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1200)
>     - locked <0x00007f014f1543b8> (a org.apache.hadoop.conf.Configuration)
>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:501)
>     at com.hadoop.compression.lzo.LzoCodec.getCompressionStrategy(LzoCodec.java:205)
>     at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:204)
>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
>     at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:236)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:383)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:354)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:536)
>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>     at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:530)
>     - locked <0x00007efe1b6e7af8> (a java.lang.Object)
>     at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:496)
>     at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:83)
>     at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1576)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1046)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:967)
>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:915)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:394)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:368)
>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:242)
>
> Any leads on this please?
>
> -S
