My bad.

> - I strongly feel this issue has something to do with HBase version. I
> verified the code paths of the stack I posted.
Read this as "I DON'T feel this issue has something to do with HBase version." On Tue, Mar 18, 2014 at 10:12 AM, Salabhanjika S <[email protected]> wrote: > Thanks Rodinov & Enis for responding. I agree with you that we need to > upgrade. > > As I mentioned in my first mail, we are in process of upgrade. >>> >>> We are using hbase version 0.90.6 (please don't complain of old >>> >>> version. we are in process of upgrading) > > - Suboptimal (as per me) code snippets I posted in followup mail holds > good for trunk as well. > > - I strongly feel this issue has something to do with HBase version. I > verified the code paths of the stack I posted. > I don't see any significant changes in current version in this code > (Flusher - getCompressor). > > > On Tue, Mar 18, 2014 at 2:30 AM, Enis Söztutar <[email protected]> wrote: >> Hi >> >> Agreed with Vladimir. I doubt anybody will spend the time to debug the >> issue. It would be easier if you can upgrade your HBase cluster. Also you >> will have to upgrade your Hadoop cluster as well. You should go with >> 0.96.x/0.98.x and either Hadoop-2.2 or Hadoop2.3. Check out the Hbase book >> for the upgrade process. >> >> Enis >> >> >> On Mon, Mar 17, 2014 at 11:19 AM, Vladimir Rodionov <[email protected] >>> wrote: >> >>> I think, 0.90.6 has reached EOL a couple years ago. The best you can do >>> right now is >>> start planning upgrading to the latest stable 0.94 or 0.96. >>> >>> Best regards, >>> Vladimir Rodionov >>> Principal Platform Engineer >>> Carrier IQ, www.carrieriq.com >>> e-mail: [email protected] >>> >>> ________________________________________ >>> From: Salabhanjika S [[email protected]] >>> Sent: Monday, March 17, 2014 2:55 AM >>> To: [email protected] >>> Subject: Re: Region server slowdown >>> >>> @Devs, please respond if you can provide me some hints on this problem. >>> >>> Did some more analysis. While going through the code in stack track I >>> noticed something sub-optimal. >>> This may not be a root cause of our slowdown but I felt it may be some >>> thing worthy to optimize/fix. >>> >>> HBase is making a call to Compressor *WITHOUT* config object. This is >>> resulting in configuration reload for every call. >>> Should this be calling with existing config object as a parameter so >>> that configuration reload (discovery & xml parsing) will not happen so >>> frequently? >>> >>> >>> http://svn.apache.org/viewvc/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/compress/Compression.java?view=markup >>> {code} >>> 309 public Compressor getCompressor() { >>> 310 CompressionCodec codec = getCodec(conf); >>> 311 if (codec != null) { >>> 312 Compressor compressor = CodecPool.getCompressor(codec); >>> 313 if (compressor != null) { >>> {code} >>> >>> >>> http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CodecPool.java?view=markup >>> {code} >>> 162 public static Compressor getCompressor(CompressionCodec codec) { >>> 163 return getCompressor(codec, null); >>> 164 } >>> {code} >>> >>> On Fri, Mar 14, 2014 at 1:47 PM, Salabhanjika S <[email protected]> >>> wrote: >>> > Thanks for quick response Ted. >>> > >>> > - Hadoop version is 0.20.2 >>> > - Other previous flushes (600MB to 1.5GB) takes around 60 to 300 seconds >>> > >>> > On Fri, Mar 14, 2014 at 1:21 PM, Ted Yu <[email protected]> wrote: >>> >> What Hadoop version are you using ? >>> >> >>> >> Btw, the sentence about previous flushes was incomplete. 
>>>
>>> On Fri, Mar 14, 2014 at 1:47 PM, Salabhanjika S <[email protected]> wrote:
>>> > Thanks for the quick response, Ted.
>>> >
>>> > - Hadoop version is 0.20.2
>>> > - Other previous flushes (600MB to 1.5GB) take around 60 to 300 seconds
>>> >
>>> > On Fri, Mar 14, 2014 at 1:21 PM, Ted Yu <[email protected]> wrote:
>>> >> What Hadoop version are you using?
>>> >>
>>> >> Btw, the sentence about previous flushes was incomplete.
>>> >>
>>> >> Cheers
>>> >>
>>> >> On Mar 14, 2014, at 12:12 AM, Salabhanjika S <[email protected]> wrote:
>>> >>
>>> >>> Devs,
>>> >>>
>>> >>> We are using hbase version 0.90.6 (please don't complain of old
>>> >>> version. we are in process of upgrading) in our production, and every
>>> >>> few weeks we notice a strange problem at arbitrary times: a region
>>> >>> server goes extremely slow.
>>> >>> We have to restart the Region Server once this happens. There is no
>>> >>> unique pattern to this problem. It happens on different region servers,
>>> >>> different tables/regions and at different times.
>>> >>>
>>> >>> Here are observations & findings from our analysis.
>>> >>> - We are using LZO compression (0.4.10).
>>> >>>
>>> >>> - [RS Dashboard] Flush has been running for more than 6 hours. It has
>>> >>> been in "creating writer" status for a long time. Other previous
>>> >>> flushes (600MB to 1.5GB) takes
>>> >>>
>>> >>> - [Thread dumps] No deadlocks. Flusher thread stack is below; even the
>>> >>> compactor thread is in the same state (Configuration.loadResource).
>>> >>>
>>> >>> "regionserver60020.cacheFlusher" daemon prio=10 tid=0x00007efd016c4800 nid=0x35e9 runnable [0x00007efcad9c5000]
>>> >>>    java.lang.Thread.State: RUNNABLE
>>> >>>    at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
>>> >>>    at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
>>> >>>    - locked <0x00007f02ccc2ef78> (a sun.net.www.protocol.file.FileURLConnection)
>>> >>>    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:653)
>>> >>>    ... [cutting down some stack to keep the mail compact; all of this is in com.sun.org.apache.xerces...]
>>> >>>    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>>> >>>    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
>>> >>>    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1308)
>>> >>>    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1259)
>>> >>>    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1200)
>>> >>>    - locked <0x00007f014f1543b8> (a org.apache.hadoop.conf.Configuration)
>>> >>>    at org.apache.hadoop.conf.Configuration.get(Configuration.java:501)
>>> >>>    at com.hadoop.compression.lzo.LzoCodec.getCompressionStrategy(LzoCodec.java:205)
>>> >>>    at com.hadoop.compression.lzo.LzoCompressor.reinit(LzoCompressor.java:204)
>>> >>>    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:105)
>>> >>>    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.getCompressor(Compression.java:236)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.getCompressingStream(HFile.java:397)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.newBlock(HFile.java:383)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.checkBlockBoundary(HFile.java:354)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:536)
>>> >>>    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.append(HFile.java:501)
>>> >>>    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:836)
>>> >>>    at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:530)
>>> >>>    - locked <0x00007efe1b6e7af8> (a java.lang.Object)
>>> >>>    at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:496)
>>> >>>    at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:83)
>>> >>>    at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1576)
>>> >>>    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1046)
>>> >>>    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:967)
>>> >>>    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:915)
>>> >>>    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:394)
>>> >>>    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:368)
>>> >>>    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:242)
>>> >>>
>>> >>> Any leads on this please?
>>> >>>
>>> >>> -S
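For anyone who wants to poke at the conf handling outside a region server, here is a small, self-contained sketch of the two CodecPool entry points involved. It uses DefaultCodec because the LZO codec is a separate install, and the class name and structure are illustrative only; whether the 1-arg path actually triggers a configuration re-parse depends on the codec's reinit() (the LZO compressor in the stack trace above does hit that path).

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.DefaultCodec;

// Illustrative sketch (hypothetical class name); mirrors the two call shapes
// from the Compression.java and CodecPool.java snippets quoted above.
public class CompressorConfSketch {
  public static void main(String[] args) {
    // Loaded and parsed once, up front.
    Configuration conf = new Configuration();

    DefaultCodec codec = new DefaultCodec();
    codec.setConf(conf);

    // 1-arg form: CodecPool forwards a null Configuration to the pooled
    // compressor's reinit() (CodecPool.java lines 162-164 above).
    Compressor withoutConf = CodecPool.getCompressor(codec);
    CodecPool.returnCompressor(withoutConf);

    // 2-arg form: the existing conf is handed through, so reinit() can reuse
    // it instead of a codec falling back to building a fresh Configuration.
    Compressor withConf = CodecPool.getCompressor(codec, conf);
    CodecPool.returnCompressor(withConf);
  }
}
{code}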
