Thanks Rodionov & Enis for responding. I agree with you that we need to upgrade.
As I mentioned in my first mail, we are in the process of upgrading.

>> >>> We are using hbase version 0.90.6 (please don't complain of old
>> >>> version. we are in process of upgrading)

- The suboptimal (as per me) code snippets I posted in my follow-up mail hold good for trunk as well (a minimal sketch of the fix I have in mind is at the bottom of this mail).
- I strongly feel this issue has something to do with the HBase version. I verified the code paths of the stack I posted, and I don't see any significant changes to this code (Flusher - getCompressor) in the current version.

On Tue, Mar 18, 2014 at 2:30 AM, Enis Söztutar <[email protected]> wrote:
> Hi,
>
> Agreed with Vladimir. I doubt anybody will spend the time to debug the issue; it would be easier if you can upgrade your HBase cluster. You will have to upgrade your Hadoop cluster as well. You should go with 0.96.x/0.98.x and either Hadoop 2.2 or Hadoop 2.3. Check out the HBase book for the upgrade process.
>
> Enis
>
> On Mon, Mar 17, 2014 at 11:19 AM, Vladimir Rodionov <[email protected]> wrote:
>
>> I think 0.90.6 reached EOL a couple of years ago. The best you can do right now is start planning an upgrade to the latest stable 0.94 or 0.96.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
>>
>> ________________________________________
>> From: Salabhanjika S [[email protected]]
>> Sent: Monday, March 17, 2014 2:55 AM
>> To: [email protected]
>> Subject: Re: Region server slowdown
>>
>> @Devs, please respond if you can provide me some hints on this problem.
>>
>> I did some more analysis. While going through the code in the stack trace I noticed something sub-optimal. This may not be the root cause of our slowdown, but I felt it may be worth optimizing/fixing.
>>
>> HBase is requesting a compressor *WITHOUT* a config object. This results in a configuration reload for every call. Should this call pass the existing config object as a parameter so that the configuration reload (discovery & XML parsing) does not happen so frequently?
>>
>> http://svn.apache.org/viewvc/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/compress/Compression.java?view=markup
>> {code}
>> 309   public Compressor getCompressor() {
>> 310     CompressionCodec codec = getCodec(conf);
>> 311     if (codec != null) {
>> 312       Compressor compressor = CodecPool.getCompressor(codec);
>> 313       if (compressor != null) {
>> {code}
>>
>> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CodecPool.java?view=markup
>> {code}
>> 162   public static Compressor getCompressor(CompressionCodec codec) {
>> 163     return getCompressor(codec, null);
>> 164   }
>> {code}
>>
>> On Fri, Mar 14, 2014 at 1:47 PM, Salabhanjika S <[email protected]> wrote:
>> > Thanks for the quick response, Ted.
>> >
>> > - Hadoop version is 0.20.2
>> > - Other previous flushes (600MB to 1.5GB) take around 60 to 300 seconds
>> >
>> > On Fri, Mar 14, 2014 at 1:21 PM, Ted Yu <[email protected]> wrote:
>> >> What Hadoop version are you using?
>> >>
>> >> Btw, the sentence about previous flushes was incomplete.
>> >>
>> >> Cheers
>> >>
>> >> On Mar 14, 2014, at 12:12 AM, Salabhanjika S <[email protected]> wrote:
>> >>
>> >>> Devs,
>> >>>
>> >>> We are using HBase version 0.90.6 (please don't complain of the old version; we are in the process of upgrading) in production, and we are noticing a strange problem arbitrarily every few weeks.
>> >>> The region server goes extremely slow, and we have to restart it once this happens. There is no unique pattern to this problem: it happens on different region servers, different tables/regions and at different times.
>> >>>
>> >>> Here are observations & findings from our analysis:
>> >>>
>> >>> - We are using LZO compression (0.4.10).
>> >>>
>> >>> - [RS Dashboard] The flush has been running for more than 6 hours and has been in "creating writer" status for a long time. Other previous flushes (600MB to 1.5GB) takes
>> >>>
>> >>> - [Thread dumps] No deadlocks. Flusher thread stack below; even the compactor thread is in the same state (Configuration.loadResource).
>> >>>
>> >>> "regionserver60020.cacheFlusher" daemon prio=10 tid=0x00007efd016c4800 nid=0x35e9 runnable [0x00007efcad9c5000]
>> >>>    java.lang.Thread.State: RUNNABLE
>> >>>     at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>> >>>     at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>> >>>     - locked <0x00007f02ccc2ef78> (a sun.net.www.protocol.file.FileURLConnection)
>> >>>     at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:653)
>> >>>     ... [cutting down some of the stack to keep the mail compact; all of it is in com.sun.org.apache.xerces...]
>> >>>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>> >>>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>> >>>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1308)
>> >>>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1259)
>> >>>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1200)
>> >>>     - locked <0x00007f014f1543b8> (a org.apache.hadoop.conf.Configuration)
>> >>>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:501)
>> >>>     at com.hadoop.compression.lzo.LzoCodec.getCompressionStrategy(LzoCodec.java:205)
>> >>>     at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:204)
>> >>>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
>> >>>     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
>> >>>     at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:236)
>> >>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
>> >>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:383)
>> >>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:354)
>> >>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:536)
>> >>>     at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>> >>>     at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>> >>>     at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:530)
>> >>>     - locked <0x00007efe1b6e7af8> (a java.lang.Object)
>> >>>     at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:496)
>> >>>     at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:83)
>> >>>     at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1576)
>> >>>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1046)
>> >>>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:967)
>> >>>     at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:915)
>> >>>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:394)
>> >>>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:368)
>> >>>     at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:242)
>> >>>
>> >>> Any leads on this please?
>> >>>
>> >>> -S
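
For clarity, here is a minimal, self-contained sketch of the kind of change I am suggesting. The class and method names below are only illustrative; the real change would be inside Compression.Algorithm.getCompressor() (line 312 in the trunk snippet above), which already has a Configuration at hand since it calls getCodec(conf). The idea is simply to use the two-argument CodecPool.getCompressor(codec, conf) overload so that the pooled compressor's reinit() sees an already-loaded conf, instead of triggering the Configuration.loadResource() XML parsing visible in the flusher thread dump.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Compressor;

// Illustrative sketch only -- not an actual HBase patch.
public class CompressorConfSketch {

  /**
   * Mirrors what Compression.Algorithm.getCompressor() does today, except that
   * the caller's existing Configuration is passed through to the codec pool.
   * The one-argument CodecPool.getCompressor(codec) overload delegates to
   * getCompressor(codec, null), and with a null conf the LZO compressor's
   * reinit() path appears to end up reading a Configuration whose XML
   * resources have not been loaded yet -- hence the repeated loadResource()
   * work for every block the flusher writes.
   */
  static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
    if (codec == null) {
      return null;
    }
    // Reuse the already-parsed conf instead of forcing a reload per call.
    return CodecPool.getCompressor(codec, conf);
  }
}
{code}

If that reasoning holds, the actual fix in Compression.java would just be changing CodecPool.getCompressor(codec) to CodecPool.getCompressor(codec, conf) on line 312.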
