So, seems like in 0.20.6, we're not doing compression right.
St.Ack
On Fri, Jan 28, 2011 at 11:23 AM, Nanheng Wu <[email protected]> wrote:
> Ah, sorry, I should've read the usage. I ran it just now and the
> metadata dump threw the same error: "Not in GZIP format"
>
> On Fri, Jan 28, 2011 at 10:51 AM, Stack <[email protected]> wrote:
>> hfile metadata, the -m option?
>> St.Ack
>>
>> On Fri, Jan 28, 2011 at 10:41 AM, Nanheng Wu <[email protected]> wrote:
>>> Sorry, by dumping the metadata did you mean running the same HFile
>>> tool on the ".region" file in each region?
>>>
>>> On Fri, Jan 28, 2011 at 10:25 AM, Stack <[email protected]> wrote:
>>>> If you dump the metadata, does it claim the GZIP compressor? If so,
>>>> yeah, there seems to be a mismatch between what the data is and what
>>>> the metadata says.
>>>> St.Ack
>>>>
>>>> On Fri, Jan 28, 2011 at 9:58 AM, Nanheng Wu <[email protected]> wrote:
>>>>> Awesome. I ran it on one of the hfiles and got this:
>>>>>
>>>>> 11/01/28 09:57:15 INFO compress.CodecPool: Got brand-new decompressor
>>>>> java.io.IOException: Not in GZIP format
>>>>>         at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:137)
>>>>>         at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
>>>>>         at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:68)
>>>>>         at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
>>>>>         at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
>>>>>         at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:169)
>>>>>         at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:179)
>>>>>         at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createDecompressionStream(Compression.java:168)
>>>>>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1013)
>>>>>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
>>>>>         at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1291)
>>>>>         at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:1740)
>>>>>
>>>>> So the problem could be that the HFile writer is not writing properly
>>>>> gzipped output?
>>>>>
>>>>> On Fri, Jan 28, 2011 at 9:41 AM, Stack <[email protected]> wrote:
>>>>>> The section in the 0.90 book on the hfile tool should apply to 0.20.6:
>>>>>> http://hbase.apache.org/ch08s02.html#hfile_tool It might help you w/
>>>>>> your explorations.
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>> On Fri, Jan 28, 2011 at 9:38 AM, Nanheng Wu <[email protected]> wrote:
>>>>>>> Hi Stack,
>>>>>>>
>>>>>>> Get doesn't work either. It was a fresh table created by
>>>>>>> loadtable.rb. Finally, the uncompressed version had the same number of
>>>>>>> regions (8 total). I totally understand you guys shouldn't be patching
>>>>>>> the older version; upgrading is an option for me but will be pretty
>>>>>>> painful. I wonder if I can figure something out by comparing the two
>>>>>>> versions' HFiles. Thanks again!
>>>>>>>
>>>>>>> On Fri, Jan 28, 2011 at 9:14 AM, Stack <[email protected]> wrote:
>>>>>>>> On Thu, Jan 27, 2011 at 9:35 PM, Nanheng Wu <[email protected]> wrote:
>>>>>>>>> In the compressed case, there are 8 regions and the region start/end
>>>>>>>>> keys do line up. Which actually is confusing to me: how can HBase read
>>>>>>>>> the files if they are compressed? Does each hfile have some metadata
>>>>>>>>> in it that has compression info?
>>>>>>>>
>>>>>>>> You got it.
>>>>>>>>
>>>>>>>>> Anyway, the regions are the same
>>>>>>>>> (numbers and boundaries are the same) in both the compressed and
>>>>>>>>> uncompressed versions. So what else should I look into to fix this?
>>>>>>>>> Thanks again!
>>>>>>>>
>>>>>>>> You can't scan. Can you Get from the table at all? Try getting the
>>>>>>>> start key from a few of the regions you see in .META.
>>>>>>>>
>>>>>>>> Did this table preexist or was this a fresh creation?
>>>>>>>>
>>>>>>>> When you created this table uncompressed, how many regions did it have?
>>>>>>>>
>>>>>>>> How about just running uncompressed while you are on 0.20.6? We'd
>>>>>>>> rather be fixing bugs in the new stuff, not the version that we are
>>>>>>>> leaving behind.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> St.Ack
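For reference, the HFile tool invocations discussed above look roughly like this on 0.20.x/0.90.x; the HDFS path, table, and family names are placeholders, so substitute a real hfile path from your own table:

# Dump the hfile's metadata, including the compression codec it claims (-m).
${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f hdfs://namenode:9000/hbase/TESTTABLE/1418428042/family/1234567890123456789
# Walk and print the key/values (-v -p); this is the read path that threw
# "Not in GZIP format" in the stack trace above.
${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -p -f hdfs://namenode:9000/hbase/TESTTABLE/1418428042/family/1234567890123456789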

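And a minimal sketch of the other checks suggested in the thread, driven through the hbase shell; 'testtable' and the row key are hypothetical, and the row key should be a region start key taken from .META.:

# Does the column family descriptor actually claim COMPRESSION => 'GZ'?
echo "describe 'testtable'" | ${HBASE_HOME}/bin/hbase shell
# List the table's regions and their start keys.
echo "scan '.META.'" | ${HBASE_HOME}/bin/hbase shell
# Try a single Get against one region's start key, since full scans fail.
echo "get 'testtable', 'a-region-start-key'" | ${HBASE_HOME}/bin/hbase shell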