I have added the file to the issue; hopefully it will make it through. As for the context: I first encountered this problem in a MapReduce job that bulk-imports the kvs into HBase 0.90.3. Originally I was getting an exception saying that the keys were in the wrong order during the reduce part of the bulk import. When I started tracking the problem down, I found that the errors disappear when I use version 0.20.6. I dug deeper (because I suspected that I was doing something wrong in the preparation phase of the kvs bulk import, which turned out not to be true), and I can now reproduce the problem at any time without the bulky MapReduce framework, just by executing the Java class containing the code I pasted previously. With the workaround mentioned in my earlier post, the bulk import works fine.
Stan

stack-5 wrote:
>
> This could be a pretty serious issue, Stan. It happens in context only?
> Can you stick the sequence file up in an issue so one of us could take a look?
>
> Stack
>
> On Sep 5, 2011, at 1:18, Stan Barton <[email protected]> wrote:
>
>> stack-3 wrote:
>>>
>>> On Thu, Aug 4, 2011 at 6:54 AM, Stan Barton <[email protected]> wrote:
>>>> I have spent a lot of time tracking down the bug and found out that
>>>> when I write the SequenceFile of KeyValues with HBase 0.90.3 I cannot
>>>> read the content back using the same HBase version jar; however, I am
>>>> able to read it without any problems with HBase 0.20.* versions. It is
>>>> easily reproducible with this unit test.
>>>
>>> Stan:
>>>
>>> You are writing kvs with 0.90 and they are readable with 0.20 but not
>>> w/ the jar that wrote them?
>>>
>>> Where is the unit test you refer to? Attachments usually don't make
>>> it across so you might have to pastebin it.
>>>
>>> St.Ack
>>
>> Exactly, I create the kvs with any of the > v0.90 jars and am not able to
>> read them back. By digging deeper, I have found a workaround that solves
>> the problem:
>>
>> KeyValue kv2 = new KeyValue(kvOrig.getBuffer());
>>
>> which means that the buffer is read properly by all jars, but somehow in
>> the new versions it is parsed wrongly. I have compared the length and
>> offset values that are read in by the KeyValue class in the particular
>> HBase versions:
>>
>> I took a simple sequence file stored in HDFS containing Long and kvs. I
>> then output the offsets and lengths of row, key, value, family and
>> qualifier respectively (plus some other kv-related info; the whole
>> procedure can be found here: http://pastebin.com/kxC5GrtM ):
>>
>> version 0.20.6:
>> 1-url/content:content/1264692453000/Put/vlen=2-0-39
>> r:10-3
>> k:8-29
>> v:37-2
>> f:14-7
>> q:21-7
>> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
>> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
>> r:10-3
>> k:8-29
>> v:37-3
>> f:14-4
>> q:18-10
>> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
>> 3-url/meta:length/1264692453000/Put/vlen=8-0-41
>> r:10-3
>> k:8-25
>> v:33-8
>> f:14-4
>> q:18-6
>>
>> version 0.90.3:
>>
>> 1-url/content:content/1264692453000/Put/vlen=2-0-39
>> r:10-3
>> k:8-29
>> v:37-2
>> f:14-7
>> q:21-7
>> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
>> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
>> r:10-3
>> k:8-29
>> v:37-3
>> f:14-4
>> q:18-10
>> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
>> 3-url/meta:length\x00\x00\x01&/8469967462476021760/Minimum/vlen=8-0-41
>> r:10-3
>> k:8-29
>> v:37-8
>> f:14-4
>> q:18-10
>>
>> You can see the discrepancy in the third kv read in, namely in the length
>> of the key as parsed by v0.20.6 (25) and by v0.90 (29). This garbles the
>> read-in stream. However, I have not found out why this is happening.
>>
>> Stan
>> --
>> View this message in context:
>> http://old.nabble.com/Possible-bug-in-reading-KeyValues-from-sequence-files-in-HBase-0.90-tp32194680p32399356.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>

http://old.nabble.com/file/p32409228/myTestFile.seq myTestFile.seq
--
View this message in context: http://old.nabble.com/Possible-bug-in-reading-KeyValues-from-sequence-files-in-HBase-0.90-tp32194680p32409228.html
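
For reference, the offsets and lengths in the dumps quoted above can be re-derived in plain Java, without any HBase jar. This is only a sketch under stated assumptions: KeyValueLayout and its methods are hypothetical names and not part of HBase, the byte layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value) is inferred from the 39- and 40-byte dumps, and the 8-byte value of the third record is assumed to be all zeros because its dump is not shown. It computes the layout of the third kv twice: once with the key length stored in the buffer (25) and once with a key length of 29 forced in.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone helper, NOT an HBase class. */
final class KeyValueLayout {
    static final int KEY_OFFSET = 8;    // 4-byte key length + 4-byte value length
    static final int TS_TYPE_SIZE = 9;  // 8-byte timestamp + 1-byte type

    final int keyLength, valueLength, rowOffset, rowLength;
    final int familyOffset, familyLength, qualifierOffset, qualifierLength, valueOffset;
    final long timestamp;

    /** Derives every offset from the key length passed in (which may be wrong). */
    KeyValueLayout(byte[] buf, int keyLength) {
        ByteBuffer bb = ByteBuffer.wrap(buf);  // big-endian by default
        this.keyLength = keyLength;
        valueLength = bb.getInt(4);
        rowLength = bb.getShort(KEY_OFFSET);
        rowOffset = KEY_OFFSET + 2;
        familyLength = bb.get(rowOffset + rowLength) & 0xFF;
        familyOffset = rowOffset + rowLength + 1;
        qualifierOffset = familyOffset + familyLength;
        valueOffset = KEY_OFFSET + keyLength;
        // The qualifier is whatever is left of the key before timestamp and type.
        qualifierLength = valueOffset - TS_TYPE_SIZE - qualifierOffset;
        timestamp = bb.getLong(valueOffset - TS_TYPE_SIZE);
    }

    /** Normal case: the key length is taken from the first four bytes. */
    static KeyValueLayout parse(byte[] buf) {
        return new KeyValueLayout(buf, ByteBuffer.wrap(buf).getInt(0));
    }

    /** Builds a record in the layout assumed above. */
    static byte[] record(String row, String family, String qualifier,
                         long ts, byte type, byte[] value) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        int keyLen = 2 + r.length + 1 + f.length + q.length + TS_TYPE_SIZE;
        ByteBuffer bb = ByteBuffer.allocate(KEY_OFFSET + keyLen + value.length);
        bb.putInt(keyLen).putInt(value.length);
        bb.putShort((short) r.length).put(r);
        bb.put((byte) f.length).put(f).put(q);
        bb.putLong(ts).put(type).put(value);
        return bb.array();
    }

    /** Same "offset-length" notation as in the dumps. */
    String describe() {
        return "r:" + rowOffset + "-" + rowLength
             + " k:" + KEY_OFFSET + "-" + keyLength
             + " v:" + valueOffset + "-" + valueLength
             + " f:" + familyOffset + "-" + familyLength
             + " q:" + qualifierOffset + "-" + qualifierLength;
    }

    public static void main(String[] args) {
        // Third kv from the dumps; the value bytes are an assumption (all zeros).
        byte[] third = record("url", "meta", "length", 1264692453000L, (byte) 4, new byte[8]);
        System.out.println(parse(third).describe());
        KeyValueLayout forced = new KeyValueLayout(third, 29); // key length of the previous kv
        System.out.println(forced.describe() + " ts=" + forced.timestamp);
    }
}
```

The first line printed matches what 0.20.6 reports for the third kv (k:8-25, v:33-8, q:18-6) and the second matches the 0.90.3 output (k:8-29, v:37-8, q:18-10, timestamp 8469967462476021760). The timestamp only matches if the first three value bytes are zero, which is part of the assumption above. This shows that the 0.90.3 numbers are consistent with a key length of 29 being applied to a 25-byte key; it does not show where the 29 comes from.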
