Hi Tom, Thanks for this information and the offer. I think we have enough to start looking at this issue. I'm still trying to reproduce it locally. In the meantime, I sent a patch to fix the NullPointerException you faced before.
I will post back here if I'm able to reproduce. Have you tried Sergey's workaround? JM 2013/9/24 Tom Brown <[email protected]> > Yes, it is empty. > > 13/09/24 13:03:03 INFO hfile.CacheConfig: Allocating LruBlockCache with > maximum size 2.9g > 13/09/24 13:03:03 ERROR metrics.SchemaMetrics: Inconsistent configuration. > Previous configuration for using table name in metrics: true, new > configuration: false > 13/09/24 13:03:03 WARN metrics.SchemaConfigured: Could not determine table > and column family of the HFile path /fca0882dc7624342a8f4fce4b89420ff. > Expecting at least 5 path components. > 13/09/24 13:03:03 WARN snappy.LoadSnappy: Snappy native library is > available > 13/09/24 13:03:03 INFO util.NativeCodeLoader: Loaded the native-hadoop > library > 13/09/24 13:03:03 INFO snappy.LoadSnappy: Snappy native library loaded > 13/09/24 13:03:03 INFO compress.CodecPool: Got brand-new decompressor > Stats: > no data available for statistics > Scanned kv count -> 0 > > If you want to examine the actual file, I would be happy to email it to you > directly. > > --Tom > > > On Tue, Sep 24, 2013 at 12:42 PM, Jean-Marc Spaggiari < > [email protected]> wrote: > > > Can you try with less parameters and see if you are able to get something > > from it? This exception is caused by the "printMeta", so if you remove -m > > it should be ok. However, printMeta was what I was looking for ;) > > > > getFirstKey for this file seems to return null. So it might simply be an > > empty file, not necessary a corrupted one. 
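The NullPointerException mentioned above happens because KeyValue.keyToString is handed a null first key when the HFile is empty. A minimal sketch of the kind of defensive guard such a patch might add (hypothetical stand-in methods modeled on the stack trace quoted below; not the actual HBase code):

```java
// Hypothetical sketch: if getFirstKey() returns null (an empty HFile),
// report a placeholder instead of passing null to KeyValue.keyToString
// and triggering a NullPointerException.
public class NullSafeFirstKey {

    // Stand-in for HFile.Reader#getFirstKey(); returns null for an empty file.
    static byte[] getFirstKey() {
        return null;
    }

    // Simplified stand-in for KeyValue.keyToString.
    static String keyToString(byte[] key) {
        return new String(key, java.nio.charset.StandardCharsets.UTF_8);
    }

    static String toStringFirstKey() {
        byte[] firstKey = getFirstKey();
        return firstKey == null ? "firstKey=NONE" : keyToString(firstKey);
    }

    public static void main(String[] args) {
        System.out.println(toStringFirstKey()); // prints a placeholder, no NPE
    }
}
```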
> > > > > > 2013/9/24 Tom Brown <[email protected]> > > > > > /usr/lib/hbase/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s > -v > > -f > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff > > > 13/09/24 12:33:40 INFO util.ChecksumType: Checksum using > > > org.apache.hadoop.util.PureJavaCrc32 > > > Scanning -> > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff > > > 13/09/24 12:33:41 INFO hfile.CacheConfig: Allocating LruBlockCache with > > > maximum size 2.9g > > > 13/09/24 12:33:41 ERROR metrics.SchemaMetrics: Inconsistent > > configuration. > > > Previous configuration for using table name in metrics: true, new > > > configuration: false > > > 13/09/24 12:33:41 WARN snappy.LoadSnappy: Snappy native library is > > > available > > > 13/09/24 12:33:41 INFO util.NativeCodeLoader: Loaded the native-hadoop > > > library > > > 13/09/24 12:33:41 INFO snappy.LoadSnappy: Snappy native library loaded > > > 13/09/24 12:33:41 INFO compress.CodecPool: Got brand-new decompressor > > > Block index size as per heapsize: 336 > > > Exception in thread "main" java.lang.NullPointerException > > > at > > org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:716) > > > at > > > > > > > > > org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toStringFirstKey(AbstractHFileReader.java:138) > > > at > > > > > > > > > org.apache.hadoop.hbase.io.hfile.AbstractHFileReader.toString(AbstractHFileReader.java:149) > > > at > > > > > > > > > org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.printMeta(HFilePrettyPrinter.java:318) > > > at > > > > > > > > > org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:234) > > > at > > > > > > > > > org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:189) > > > at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:756) > > > > > > > > > Does this mean the problem might have 
been caused by a corrupted > file(s)? > > > > > > --Tom > > > > > > > > > On Tue, Sep 24, 2013 at 12:21 PM, Jean-Marc Spaggiari < > > > [email protected]> wrote: > > > > > > > One more Tom, > > > > > > > > When you will have been able capture de HFile locally, please run run > > the > > > > HFile class on it to see the number of keys (is it empty?) and the > > other > > > > specific information. > > > > > > > > bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -s -v -f > HFILENAME > > > > > > > > Thanks, > > > > > > > > JM > > > > > > > > > > > > 2013/9/24 Jean-Marc Spaggiari <[email protected]> > > > > > > > > > We get -1 because of this: > > > > > > > > > > byte [] timerangeBytes = metadataMap.get(TIMERANGE_KEY); > > > > > if (timerangeBytes != null) { > > > > > this.reader.timeRangeTracker = new TimeRangeTracker(); > > > > > Writables.copyWritable(timerangeBytes, > > > > > this.reader.timeRangeTracker); > > > > > } > > > > > this.reader.timeRangeTracker will return -1 for the > maximumTimestamp > > > > > value. So now, we need to figure if it's normal or not to have > > > > > TIMERANGE_KEY not null here. > > > > > > > > > > I have created the same table locally on 0,94.10 with the same > > > attributes > > > > > and I'm not facing this issue. > > > > > > > > > > We need to look at the related HFile, but files are rolled VERY > > > quickly, > > > > > so might be difficult to get one. > > > > > > > > > > Maybe something like > > > > > hadoop fs -get hdfs:// > > > > > > > > > > > > > > > hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/* > > > > > . > > > > > > > > > > might help to get the file? Then we can start to look at it and see > > > what > > > > > exactly trigger this behaviour? > > > > > > > > > > JM > > > > > > > > > > > > > > > 2013/9/24 Sergey Shelukhin <[email protected]> > > > > > > > > > >> Yeah, I think c3580bdb62d64e42a9eeac50f1c582d2 store file is a > good > > > > >> example. 
> > > > >> Can you grep for c3580bdb62d64e42a9eeac50f1c582d2 and post the log > > > just > > > > to > > > > >> be sure? Thanks. > > > > >> It looks like an interaction of deleting expired files and > > > > >> // Create the writer even if no kv(Empty store file is > > also > > > > ok), > > > > >> // because we need record the max seq id for the store > > file, > > > > see > > > > >> // HBASE-6059 > > > > >> in compactor. > > > > >> The newly created file is immediately collected the same way and > > > > replaced > > > > >> by another file, which seems like not an intended behavior, even > > > though > > > > >> both pieces of code are technically correct (the empty file is > > > expired, > > > > >> and > > > > >> the new file is generally needed). > > > > >> > > > > >> I filed HBASE-9648 > > > > >> > > > > >> > > > > >> On Tue, Sep 24, 2013 at 10:55 AM, Sergey Shelukhin > > > > >> <[email protected]>wrote: > > > > >> > > > > >> > To mitigate, you can change hbase.store.delete.expired.storefile > > to > > > > >> false > > > > >> > on one region server, or for entire table, and restart this RS. > > > > >> > This will trigger a different compaction, hopefully. > > > > >> > We'd need to find what the bug is. My gut feeling (which is > known > > to > > > > be > > > > >> > wrong often) is that it has to do with it selecting one file, > > > probably > > > > >> > invalid check somewhere, or interaction with the code that > ensures > > > > that > > > > >> at > > > > >> > least one file needs to be written to preserve metadata, it > might > > be > > > > >> just > > > > >> > cycling thru such files. > > > > >> > > > > > >> > > > > > >> > On Tue, Sep 24, 2013 at 10:20 AM, Jean-Marc Spaggiari < > > > > >> > [email protected]> wrote: > > > > >> > > > > > >> >> So. Looking at the code, this, for me, sound like a bug. > > > > >> >> > > > > >> >> I will try to reproduce it locally. Seems to be related to the > > > > >> combination > > > > >> >> of TTL + BLOOM. 
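The interaction Sergey describes boils down to a comparison against an uninitialized timestamp. A minimal sketch of that election logic (simplified, with assumed names; not the actual CompactSelection code):

```java
// Sketch of the expired-store-file election described above: a store file
// whose maxTimestamp is the uninitialized default of -1 compares as older
// than any expiration cutoff, so it is elected for deletion on every pass.
// The compactor then writes a fresh (empty) file to preserve the max seq id
// (HBASE-6059); that file also carries maxTimestamp = -1, and the cycle repeats.
public class ExpiredFileElection {

    static final long TTL_MS = 8640000L * 1000L; // the table's 100-day TTL

    static boolean electedAsExpired(long maxTimestamp, long nowMs) {
        long maxExpiredTimestamp = nowMs - TTL_MS;
        return maxTimestamp < maxExpiredTimestamp;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(electedAsExpired(-1L, now)); // true: always re-elected
        System.out.println(electedAsExpired(now, now)); // false: fresh data survives
    }
}
```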
> > > > >> >> > > > > >> >> Creating a table for that right now, will keep you posted very > > > > shortly. > > > > >> >> > > > > >> >> JM > > > > >> >> > > > > >> >> > > > > >> >> 2013/9/24 Tom Brown <[email protected]> > > > > >> >> > > > > >> >> > -rw------- 1 hadoop supergroup 2194 2013-09-21 14:32 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/014ead47a9484d67b55205be16802ff1 > > > > >> >> > -rw------- 1 hadoop supergroup 31321 2013-09-24 05:49 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1305d625bd4a4be39a98ae4d91a66140 > > > > >> >> > -rw------- 1 hadoop supergroup 1350 2013-09-24 10:31 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1352e0828f974f08b1f3d7a9dff04abd > > > > >> >> > -rw------- 1 hadoop supergroup 4194 2013-09-21 10:38 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/17a546064bd840619816809ae0fc4c49 > > > > >> >> > -rw------- 1 hadoop supergroup 1061 2013-09-20 22:55 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1cb3df115da244288bd076968ab4ccf6 > > > > >> >> > -rw------- 1 hadoop supergroup 1375 2013-08-24 10:17 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/1e41a96c49fc4e5ab59392d26935978d > > > > >> >> > -rw------- 1 hadoop supergroup 96296 2013-08-26 15:48 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/22d72fd897e34424b5420a96483a571e > > > > >> >> > -rw------- 1 hadoop supergroup 1356 2013-08-26 15:23 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > 
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/25fee1ffadbe42549bd0b7b13d782b72 > > > > >> >> > -rw------- 1 hadoop supergroup 6229 2013-09-21 11:14 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/26289c777ec14dc5b7021b4d6b1050c5 > > > > >> >> > -rw------- 1 hadoop supergroup 1223 2013-09-21 02:42 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2757d7ba9c8448d6a3d5d46bd4d59758 > > > > >> >> > -rw------- 1 hadoop supergroup 5302248 2013-08-24 02:22 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/2ec40943787246ea983608dd6591db24 > > > > >> >> > -rw------- 1 hadoop supergroup 1596 2013-08-24 03:37 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3157fd1cabe4483aaa4d9a21f75e4d88 > > > > >> >> > -rw------- 1 hadoop supergroup 1338 2013-09-22 04:25 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/36b0f80a4a7b492f97358b64d879a2df > > > > >> >> > -rw------- 1 hadoop supergroup 3264 2013-09-21 12:05 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/39e249fcb532400daed73aed6689ceeb > > > > >> >> > -rw------- 1 hadoop supergroup 4549 2013-09-21 08:56 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/3bc9e2a566ad460a9b0ed336b2fb5ed9 > > > > >> >> > -rw------- 1 hadoop supergroup 1630 2013-09-22 03:22 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/48026d08aae748f08aad59e4eea903be > > > > >> >> > -rw------- 1 hadoop supergroup 105395 
2013-09-20 21:12 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/53198825f085401cbbd4322faa0e3aae > > > > >> >> > -rw------- 1 hadoop supergroup 3859 2013-09-21 09:09 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/71c2f9b2a8ff4c049fcc5a9a22af5cfe > > > > >> >> > -rw------- 1 hadoop supergroup 311688 2013-09-20 21:12 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/97ff16d6da974c30835c6e0acc7c737a > > > > >> >> > -rw------- 1 hadoop supergroup 1897 2013-08-24 08:43 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/a172d7577641434d82abcce88a433213 > > > > >> >> > -rw------- 1 hadoop supergroup 3380 2013-09-21 13:04 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/be678e5c60534c65a012a798fbc7e284 > > > > >> >> > -rw------- 1 hadoop supergroup 43710 2013-09-22 02:15 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/e2508a23acf1491f9d38b9a8594e41e8 > > > > >> >> > -rw------- 1 hadoop supergroup 5409 2013-09-21 10:10 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f432846182714b93a1c3df0f5835c09b > > > > >> >> > -rw------- 1 hadoop supergroup 491 2013-09-24 11:18 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/f7d8669cf7a047b98c1d3b13c16cfaec > > > > >> >> > -rw------- 1 hadoop supergroup 491 2013-09-24 11:18 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > 
/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fa1b8f6cc9584eb28365dcd8f10d3f0a > > > > >> >> > -rw------- 1 hadoop supergroup 491 2013-09-13 11:28 > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > /hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/fca0882dc7624342a8f4fce4b89420ff > > > > >> >> > > > > > >> >> > > > > > >> >> > > > > > >> >> > On Tue, Sep 24, 2013 at 11:14 AM, Jean-Marc Spaggiari < > > > > >> >> > [email protected]> wrote: > > > > >> >> > > > > > >> >> > > TTL seems to be fine. > > > > >> >> > > > > > > >> >> > > -1 is the default value for > > TimeRangeTracker.maximumTimestamp. > > > > >> >> > > > > > > >> >> > > Can you run: > > > > >> >> > > hadoop fs -lsr hdfs:// > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/ > > > > >> >> > > > > > > >> >> > > Thanks, > > > > >> >> > > > > > > >> >> > > JM > > > > >> >> > > > > > > >> >> > > > > > > >> >> > > 2013/9/24 Tom Brown <[email protected]> > > > > >> >> > > > > > > >> >> > > > 1. Hadoop version is 1.1.2. > > > > >> >> > > > 2. All servers are synched with NTP. > > > > >> >> > > > 3. Table definition is: 'compound0', { > > > > >> >> > > > NAME => 'd', > > > > >> >> > > > DATA_BLOCK_ENCODING => 'NONE', > > > > >> >> > > > BLOOMFILTER => 'ROW', > > > > >> >> > > > REPLICATION_SCOPE => '0', > > > > >> >> > > > VERSIONS => '1', > > > > >> >> > > > COMPRESSION => 'SNAPPY', > > > > >> >> > > > MIN_VERSIONS => '0', > > > > >> >> > > > TTL => '8640000', > > > > >> >> > > > KEEP_DELETED_CELLS => 'false', > > > > >> >> > > > BLOCKSIZE => '65536', > > > > >> >> > > > IN_MEMORY => 'false', > > > > >> >> > > > ENCODE_ON_DISK => 'true', > > > > >> >> > > > BLOCKCACHE => 'true' > > > > >> >> > > > } > > > > >> >> > > > > > > > >> >> > > > The TTL is supposed to be 100 days. 
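As a quick sanity check on the TTL quoted above (HBase TTLs are expressed in seconds):

```java
// TTL => '8640000' is in seconds; at 86400 seconds per day that is
// exactly 100 days, matching the stated intent.
public class TtlCheck {
    public static void main(String[] args) {
        long ttlSeconds = 8640000L;
        System.out.println(ttlSeconds / 86400L); // 100
    }
}
```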
> > > > >> >> > > > > > > > >> >> > > > --Tom > > > > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > On Tue, Sep 24, 2013 at 10:53 AM, Jean-Marc Spaggiari < > > > > >> >> > > > [email protected]> wrote: > > > > >> >> > > > > > > > >> >> > > > > Another important information why might be the root > cause > > > of > > > > >> this > > > > >> >> > > > issue... > > > > >> >> > > > > > > > > >> >> > > > > Do you have any TTL defines for this table? > > > > >> >> > > > > > > > > >> >> > > > > JM > > > > >> >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > 2013/9/24 Jean-Marc Spaggiari <[email protected] > > > > > > >> >> > > > > > > > > >> >> > > > > > Strange. > > > > >> >> > > > > > > > > > >> >> > > > > > Few questions then. > > > > >> >> > > > > > 1) What is your hadoop version? > > > > >> >> > > > > > 2) Is the clock on all your severs synched with NTP? > > > > >> >> > > > > > 3) What is you table definition? Bloom filters, etc.? > > > > >> >> > > > > > > > > > >> >> > > > > > This is the reason why it keep compacting: > > > > >> >> > > > > > > > > > >> >> > > > > > 2013-09-24 10:04:00,548 INFO > > > > >> >> > > > > > > > > >> org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: > > > > >> >> > > > Deleting > > > > >> >> > > > > the expired store file by compaction: hdfs:// > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> > > > > > > > > > > hdpmgr001.pse.movenetworks.com:8020/hbase/compound3/5ab5fdfcf2aff2633e1d6d5089c96aa2/d/7426f128469242ec8ee09f3965fd5a1a whose maxTimeStamp is -1 while the max expired timestamp is > 1371398640548 > > > > >> >> > > > > > > > > > >> >> > > > > > maxTimeStamp = -1 > > > > >> >> > > > > > > > > > >> >> > > > > > Each time there is a comparison between maxTimeStamp > > for > > > > this > > > > >> >> store > > > > >> >> > > > file > > > > >> >> > > > > > and the configured maxExpiredTimeStamp and since > > > > maxTimeStamp > > > > >> >> > returns 
> > > > >> >> > > > -1, > > > > >> >> > > > > > it's always elected for a compaction. Now, we need to > > > find > > > > >> >> why... > > > > >> >> > > > > > > > > > >> >> > > > > > JM > > > > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > 2013/9/24 Tom Brown <[email protected]> > > > > >> >> > > > > > > > > > >> >> > > > > >> My cluster is fully distributed (2 regionserver > > nodes). > > > > >> >> > > > > >> > > > > >> >> > > > > >> Here is a snippet of log entries that may explain > why > > it > > > > >> >> started: > > > > >> >> > > > > >> http://pastebin.com/wQECif8k. I had to go back 2 > days > > > to > > > > >> find > > > > >> >> > when > > > > >> >> > > it > > > > >> >> > > > > >> started for this region. > > > > >> >> > > > > >> > > > > >> >> > > > > >> This is not the only region experiencing this issue > > (but > > > > >> this > > > > >> >> is > > > > >> >> > the > > > > >> >> > > > > >> smallest one it's happened to). > > > > >> >> > > > > >> > > > > >> >> > > > > >> --Tom > > > > >> >> > > > > >> > > > > >> >> > > > > >> > > > > >> >> > > > > >> On Tue, Sep 24, 2013 at 10:13 AM, Jean-Marc > Spaggiari > > < > > > > >> >> > > > > >> [email protected]> wrote: > > > > >> >> > > > > >> > > > > >> >> > > > > >> > Can you past logs a bit before that? To see if > > > anything > > > > >> >> > triggered > > > > >> >> > > > the > > > > >> >> > > > > >> > compaction? > > > > >> >> > > > > >> > Before the 1M compactions entries. > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > Also, what is your setup? Are you running in > > > Standalone? > > > > >> >> > > > Pseudo-Dist? > > > > >> >> > > > > >> > Fully-Dist? > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > Thanks, > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > JM > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > 2013/9/24 Tom Brown <[email protected]> > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > > There is one column family, d. 
Each row has > about > > 10 > > > > >> >> columns, > > > > >> >> > > and > > > > >> >> > > > > each > > > > >> >> > > > > >> > > row's total data size is less than 2K. > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > Here is a small snippet of logs from the region > > > > server: > > > > >> >> > > > > >> > > http://pastebin.com/S2jE4ZAx > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > --Tom > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > On Tue, Sep 24, 2013 at 9:59 AM, Bharath > > > Vissapragada > > > > < > > > > >> >> > > > > >> > > [email protected] > > > > >> >> > > > > >> > > > wrote: > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > > It would help if you can show your RS log (via > > > > >> >> pastebin?) . > > > > >> >> > > Are > > > > >> >> > > > > >> there > > > > >> >> > > > > >> > > > frequent flushes for this region too? > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > On Tue, Sep 24, 2013 at 9:20 PM, Tom Brown < > > > > >> >> > > > [email protected]> > > > > >> >> > > > > >> > wrote: > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > > I have a region that is very small, only > 5MB. > > > > >> Despite > > > > >> >> it's > > > > >> >> > > > size, > > > > >> >> > > > > >> it > > > > >> >> > > > > >> > has > > > > >> >> > > > > >> > > > 24 > > > > >> >> > > > > >> > > > > store files. The logs show that it's > > compacting > > > > >> (over > > > > >> >> and > > > > >> >> > > over > > > > >> >> > > > > >> > again). > > > > >> >> > > > > >> > > > > > > > > >> >> > > > > >> > > > > The odd thing is that even though there are > 24 > > > > store > > > > >> >> > files, > > > > >> >> > > it > > > > >> >> > > > > >> only > > > > >> >> > > > > >> > > does > > > > >> >> > > > > >> > > > > one at a time. 
Even more strange is that my > > logs > > > > are > > > > >> >> > filling > > > > >> >> > > > up > > > > >> >> > > > > >> with > > > > >> >> > > > > >> > > > > compacting this one region. In the last 10 > > > hours, > > > > >> there > > > > >> >> > have > > > > >> >> > > > > been > > > > >> >> > > > > >> > > > 1,876,200 > > > > >> >> > > > > >> > > > > log entries corresponding to compacting this > > > > region > > > > >> >> alone. > > > > >> >> > > > > >> > > > > > > > > >> >> > > > > >> > > > > My cluster is 0.94.10, and using almost all > > > > default > > > > >> >> > > settings. > > > > >> >> > > > > >> Only a > > > > >> >> > > > > >> > > few > > > > >> >> > > > > >> > > > > are not default: > > > > >> >> > > > > >> > > > > hbase.hregion.max.filesize = 4294967296 > > > > >> >> > > > > >> > > > > hbase.hstore.compaction.min = 6 > > > > >> >> > > > > >> > > > > > > > > >> >> > > > > >> > > > > I am at a total loss as to why this behavior > > is > > > > >> >> occurring. > > > > >> >> > > Any > > > > >> >> > > > > >> help > > > > >> >> > > > > >> > is > > > > >> >> > > > > >> > > > > appreciated. 
> > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > > --Tom > > > > >> >> > > > > >> > > > > > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > -- > > > > >> >> > > > > >> > > > Bharath Vissapragada > > > > >> >> > > > > >> > > > <http://www.cloudera.com> > > > > >> >> > > > > >> > > > > > > > >> >> > > > > >> > > > > > > >> >> > > > > >> > > > > > >> >> > > > > >> > > > > >> >> > > > > > > > > > >> >> > > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> > > > > > >> > > > > > >> > > > > > > > > > > > > > > > > > > > >
