If you dump the metadata, does it claim the GZIP compressor?  If so, yeah,
it seems to be a mismatch between what the data is and what the metadata
says it is.
St.Ack
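
For reference, the hfile tool mentioned further down the thread is the
quickest way to do that dump. A minimal sketch of the invocation (the HDFS
path is a placeholder; point it at one of your own hfiles):

    ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -m -f \
        hdfs://namenode:9000/hbase/TABLE/REGION/FAMILY/HFILE

The -m flag prints the file's metadata, including which compression codec
it claims, and -f names the hfile to inspect.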

On Fri, Jan 28, 2011 at 9:58 AM, Nanheng Wu <[email protected]> wrote:
> Awesome. I ran it on one of the hfiles and got this:
> 11/01/28 09:57:15 INFO compress.CodecPool: Got brand-new decompressor
> java.io.IOException: Not in GZIP format
>        at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:137)
>        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
>        at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:68)
>        at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:92)
>        at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:101)
>        at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:169)
>        at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:179)
>        at org.apache.hadoop.hbase.io.hfile.Compression$Algorithm.createDecompressionStream(Compression.java:168)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1013)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
>        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1291)
>        at org.apache.hadoop.hbase.io.hfile.HFile.main(HFile.java:1740)
>
> So could the problem be that the HFile writer is not writing properly
> gzipped output?
>
>
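
One way to test that hypothesis directly: a gzip stream always begins with
the magic bytes 0x1f 0x8b, which is exactly what GZIPInputStream.readHeader()
is rejecting in the trace above. Assuming the 0.20-era HFile (v1) layout,
where the first data block's compressed stream starts at offset 0 of the
file (that layout detail is my assumption), a rough check could look like
this (the class name is mine; pass the hfile path on the command line):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HFileGzipCheck {
        public static void main(String[] args) throws Exception {
            // args[0] is an hfile path, e.g. an hdfs:// URI (placeholder).
            Path path = new Path(args[0]);
            FileSystem fs = path.getFileSystem(new Configuration());
            FSDataInputStream in = fs.open(path);
            int b0 = in.read();  // gzip magic, first byte: expect 0x1f
            int b1 = in.read();  // gzip magic, second byte: expect 0x8b
            in.close();
            System.out.println((b0 == 0x1f && b1 == 0x8b)
                ? "first block starts with the gzip magic"
                : "first block is NOT gzip-framed");
        }
    }

If the magic bytes are missing, the blocks were not written as gzip even
though the metadata claims GZ, i.e. the data/metadata mismatch St.Ack
describes above.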
> On Fri, Jan 28, 2011 at 9:41 AM, Stack <[email protected]> wrote:
>> The section in the 0.90 book on the hfile tool should apply to 0.20.6:
>> http://hbase.apache.org/ch08s02.html#hfile_tool  It might help you w/
>> your explorations.
>>
>> St.Ack
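
For readers following along, the invocation from that section of the book
looks roughly like this (the path is a placeholder); note that the
seekTo/HFile.main frames in the stack trace above are this same tool
failing to decompress a block:

    ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -p -f \
        hdfs://namenode:9000/hbase/TABLE/REGION/FAMILY/HFILE

Here -v asks for a verbose summary and -p prints the key/values, which
should force every block to be read and decompressed.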
>>
>> On Fri, Jan 28, 2011 at 9:38 AM, Nanheng Wu <[email protected]> wrote:
>>> Hi Stack,
>>>
>>>  Get doesn't work either. It was a fresh table created by
>>> loadtable.rb. Finally, the uncompressed version had the same number of
>>> regions (8 total). I totally understand you guys shouldn't be patching
>>> the older version; upgrading is an option for me but will be pretty
>>> painful. I wonder if I can figure something out by comparing the two
>>> versions' HFiles. Thanks again!
>>>
>>> On Fri, Jan 28, 2011 at 9:14 AM, Stack <[email protected]> wrote:
>>>> On Thu, Jan 27, 2011 at 9:35 PM, Nanheng Wu <[email protected]> wrote:
>>>>> In the compressed case, there are 8 regions and the region start/end
>>>>> keys do line up. That's actually confusing to me: how can HBase read
>>>>> the files if they are compressed? Does each HFile have some metadata
>>>>> in it that carries compression info?
>>>>
>>>> You got it.
>>>>
>>>>> Anyway, the regions are the same
>>>>> (counts and boundaries are the same) in both the compressed and
>>>>> uncompressed versions. So what else should I look into to fix this? Thanks again!
>>>>
>>>> You can't scan. Can you Get from the table at all?  Try getting the
>>>> start key from a few of the regions you see in .META. (see the shell
>>>> sketch at the end of this thread).
>>>>
>>>> Did this table preexist or was this a fresh creation?
>>>>
>>>> When you created this table uncompressed, how many regions was it?
>>>>
>>>> How about just running uncompressed while you are on 0.20.6?  We'd
>>>> rather be fixing bugs in the new stuff, not in the version that we
>>>> are leaving behind.
>>>>
>>>> Thanks,
>>>> St.Ack
>>>>
>>>
>>
>
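
For the Get experiment St.Ack suggests above, the HBase shell is the
quickest route. A sketch, with the table name and row key as placeholders
(take real start keys from the .META. scan):

    $ hbase shell
    hbase> scan '.META.'
    hbase> get 'mytable', 'a-region-start-key'

If Gets against several region start keys fail the same way the scans do,
that points at the hfiles themselves rather than at the scanner.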
