I have attached the file to the issue; hopefully it makes it through.

Regarding the context: I first encountered this problem in a MapReduce job
that bulk imports kvs into HBase 0.90.3. While tracking it down, I found
that the errors disappear when I use version 0.20.6. Originally I was
getting an exception saying that the keys are in the wrong order during the
reduce part of the bulk import. So I dug deeper, and now I can reproduce the
problem at any time without the bulky MapReduce framework, just by executing
the Java class containing the code I previously pasted. (I did this because
I suspected I was doing something wrong in the preparation phase of the kvs
bulk import, which turned out not to be true: with the workaround mentioned
in my earlier post, the bulk import works fine.)
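
For anyone who wants to check the numbers in the quoted dump by hand, here
is a minimal sketch in plain Java (no HBase jar needed) that walks the
serialized layout of one kv: a 4-byte key length, a 4-byte value length,
then the key (2-byte row length, row, 1-byte family length, family,
qualifier, 8-byte timestamp, 1-byte type), followed by the value. The class
and method names (KeyValueLayout, parse, keyLength, firstRecord) are made up
for illustration and are not part of HBase; firstRecord() only rebuilds the
39 bytes printed for the first kv in the dump.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KeyValueLayout {

    /** Offsets and lengths of one serialized kv, relative to the buffer start. */
    static final class Layout {
        int keyOffset, keyLength, valueOffset, valueLength;
        int rowOffset, rowLength, familyOffset, familyLength;
        int qualifierOffset, qualifierLength;
        long timestamp;
        int type;
    }

    /** Key size: 2-byte row length + row + 1-byte family length + family
     *  + qualifier + 8-byte timestamp + 1-byte type. */
    static int keyLength(int rowLen, int familyLen, int qualifierLen) {
        return 2 + rowLen + 1 + familyLen + qualifierLen + 8 + 1;
    }

    static Layout parse(byte[] buf) {
        ByteBuffer bb = ByteBuffer.wrap(buf); // big-endian by default
        Layout l = new Layout();
        l.keyLength = bb.getInt(0);
        l.valueLength = bb.getInt(4);
        l.keyOffset = 8;
        l.rowLength = bb.getShort(l.keyOffset);
        l.rowOffset = l.keyOffset + 2;
        l.familyLength = bb.get(l.rowOffset + l.rowLength) & 0xff;
        l.familyOffset = l.rowOffset + l.rowLength + 1;
        l.qualifierOffset = l.familyOffset + l.familyLength;
        // The qualifier has no length prefix; it is whatever is left of the key.
        l.qualifierLength = l.keyLength - keyLength(l.rowLength, l.familyLength, 0);
        l.timestamp = bb.getLong(l.keyOffset + l.keyLength - 9);
        l.type = bb.get(l.keyOffset + l.keyLength - 1) & 0xff;
        l.valueOffset = l.keyOffset + l.keyLength;
        return l;
    }

    /** The 39 bytes shown for the first kv (url/content:content) in the dump. */
    static byte[] firstRecord() {
        ByteBuffer bb = ByteBuffer.allocate(39);
        bb.putInt(29).putInt(2);
        bb.putShort((short) 3).put("url".getBytes(StandardCharsets.US_ASCII));
        bb.put((byte) 7).put("content".getBytes(StandardCharsets.US_ASCII));
        bb.put("content".getBytes(StandardCharsets.US_ASCII));
        bb.putLong(1264692453000L).put((byte) 4); // type code 4 is Put
        bb.put((byte) 0).put((byte) 0);           // the 2-byte value
        return bb.array();
    }

    public static void main(String[] args) {
        Layout l = parse(firstRecord());
        // Same offset-length notation as the dump.
        System.out.println("r:" + l.rowOffset + "-" + l.rowLength);
        System.out.println("k:" + l.keyOffset + "-" + l.keyLength);
        System.out.println("v:" + l.valueOffset + "-" + l.valueLength);
        System.out.println("f:" + l.familyOffset + "-" + l.familyLength);
        System.out.println("q:" + l.qualifierOffset + "-" + l.qualifierLength);
        // Third kv: row "url", family "meta", qualifier "length".
        System.out.println("expected key length: " + keyLength(3, 4, 6));
    }
}
```

For the third kv (row url, family meta, qualifier length) the formula gives
a key length of 25, and 8 + 25 + 8 = 41 matches the record length that both
versions report, whereas a key length of 29 would need 45 bytes. The k and q
values that 0.90.3 prints for the third kv are also identical to those of
the second kv, which may or may not be a coincidence.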

Stan


stack-5 wrote:
> 
> This could be a pretty serious issue, Stan. Does it happen in context only?
> Can you stick the sequence file up in an issue so one of us could take a look?
> 
> Stack
> 
> 
> 
> On Sep 5, 2011, at 1:18, Stan Barton <[email protected]> wrote:
> 
>> 
>> 
>> 
>> stack-3 wrote:
>>> 
>>> On Thu, Aug 4, 2011 at 6:54 AM, Stan Barton <[email protected]> wrote:
>>>> I have spent a lot of time in order to track down the bug and found out
>>>> that
>>>> when I write the SequenceFile of KeyValues with HBase 0.90.3 I cannot
>>>> read
>>>> the content back using the same HBase version jar, however I am able to
>>>> read
>>>> it without any problems with HBase 0.20.* versions. It is easily
>>>> reproducible with this unit test.
>>>> 
>>> 
>>> Stan:
>>> 
>>> You are writing kvs with 0.90 and they are readable with 0.20 but not
>>> w/ the jar that wrote them?
>>> 
>>> Where is the unit test you refer to?  Attachments usually don't make
>>> it across so you might have to pastebin it.
>>> 
>>> St.Ack
>>> 
>>> 
>> 
>> Exactly, I create the kvs with any of the jars from v0.90 on and am not
>> able to read them back. By digging deeper, I have found a workaround that
>> solves the problem:
>> 
>> KeyValue kv2 = new KeyValue(kvOrig.getBuffer());
>> 
>> which means that the buffer is read properly by all jars, but somehow the
>> new versions parse it wrongly. I have compared the length and offset
>> values that the KeyValue class reads in the particular HBase versions:
>> 
>> I took a simple sequence file stored in HDFS containing Longs and kvs. I
>> then printed the offsets and lengths (as offset-length pairs) of the row,
>> key, value, family and qualifier respectively (plus some other kv-related
>> info; the whole procedure can be found at http://pastebin.com/kxC5GrtM ):
>> 
>> version 0.20.6:
>> 1-url/content:content/1264692453000/Put/vlen=2-0-39
>> r:10-3
>> k:8-29
>> v:37-2
>> f:14-7
>> q:21-7
>> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
>> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
>> r:10-3
>> k:8-29
>> v:37-3
>> f:14-4
>> q:18-10
>> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
>> 3-url/meta:length/1264692453000/Put/vlen=8-0-41
>> r:10-3
>> k:8-25
>> v:33-8
>> f:14-4
>> q:18-6
>> 
>> 
>> 
>> version 0.90.3:
>> 
>> 1-url/content:content/1264692453000/Put/vlen=2-0-39
>> r:10-3
>> k:8-29
>> v:37-2
>> f:14-7
>> q:21-7
>> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
>> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
>> r:10-3
>> k:8-29
>> v:37-3
>> f:14-4
>> q:18-10
>> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
>> 3-url/meta:length\x00\x00\x01&/8469967462476021760/Minimum/vlen=8-0-41
>> r:10-3
>> k:8-29
>> v:37-8
>> f:14-4
>> q:18-10
>> 
>> 
>> You can see the discrepancy in the third kv read in, namely in the key
>> length as parsed by v0.20.6 (25) and by v0.90 (29). This garbles the
>> stream being read. However, I have not found out why this is happening.
>> 
>> Stan
>> -- 
>> View this message in context:
>> http://old.nabble.com/Possible-bug-in-reading-KeyValues-from-sequence-files-in-HBase-0.90-tp32194680p32399356.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>> 
> 
> 
> 
http://old.nabble.com/file/p32409228/myTestFile.seq myTestFile.seq 
-- 
View this message in context: 
http://old.nabble.com/Possible-bug-in-reading-KeyValues-from-sequence-files-in-HBase-0.90-tp32194680p32409228.html
Sent from the HBase User mailing list archive at Nabble.com.
