This could be a pretty serious issue, Stan. Does it happen in this context only? Can you stick the sequence file up in an issue so one of us could take a look?
Stack

On Sep 5, 2011, at 1:18, Stan Barton <[email protected]> wrote:

> stack-3 wrote:
>>
>> On Thu, Aug 4, 2011 at 6:54 AM, Stan Barton <[email protected]> wrote:
>>> I have spent a lot of time in order to track down the bug and found out
>>> that when I write the SequenceFile of KeyValues with HBase 0.90.3 I
>>> cannot read the content back using the same HBase version jar, however
>>> I am able to read it without any problems with HBase 0.20.* versions.
>>> It is easily reproducible with this unit test.
>>
>> Stan:
>>
>> You are writing kvs with 0.90 and they are readable with 0.20 but not
>> w/ the jar that wrote them?
>>
>> Where is the unit test you refer to? Attachments usually don't make
>> it across so you might have to pastebin it.
>>
>> St.Ack
>
> Exactly, I create the kvs with any of the > v0.90 jar and am not able to
> read it back. By digging deeper, I have found a work-around that solves the
> problem:
>
> KeyValue kv2 = new KeyValue(kvOrig.getBuffer());
>
> which means that the buffer is read properly by all jars, but somehow in the
> new versions it is parsed wrong. I have compared the length and offset
> values that are read in by class KV in the particular hbase versions:
>
> I took a simple sequence file stored in HDFS containing Long and kvs. I have
> then output the lengths and offsets of row, key, value, family and qualifier
> respectively (plus some other kv-related info; the whole procedure can be
> found here http://pastebin.com/kxC5GrtM ):
>
> version 0.20.6:
> 1-url/content:content/1264692453000/Put/vlen=2-0-39
> r:10-3
> k:8-29
> v:37-2
> f:14-7
> q:21-7
> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
> r:10-3
> k:8-29
> v:37-3
> f:14-4
> q:18-10
> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
> 3-url/meta:length/1264692453000/Put/vlen=8-0-41
> r:10-3
> k:8-25
> v:33-8
> f:14-4
> q:18-6
>
> version 0.90.3:
> 1-url/content:content/1264692453000/Put/vlen=2-0-39
> r:10-3
> k:8-29
> v:37-2
> f:14-7
> q:21-7
> 39:\x00\x00\x00\x1D\x00\x00\x00\x02\x00\x03url\x07contentcontent\x00\x00\x01&u\x8B^\x88\x04\x00\x00
> 2-url/meta:statusCode/1264692453000/Put/vlen=3-0-40
> r:10-3
> k:8-29
> v:37-3
> f:14-4
> q:18-10
> 40:\x00\x00\x00\x1D\x00\x00\x00\x03\x00\x03url\x04metastatusCode\x00\x00\x01&u\x8B^\x88\x04200
> 3-url/meta:length\x00\x00\x01&/8469967462476021760/Minimum/vlen=8-0-41
> r:10-3
> k:8-29
> v:37-8
> f:14-4
> q:18-10
>
> You can see the discrepancy in the third kv read in, namely in the length of
> the key as parsed by v0.20.6 (25) and v0.90 (29). This garbles the
> read-in stream. However, I have not found why this is happening.
>
> Stan
> --
> View this message in context:
> http://old.nabble.com/Possible-bug-in-reading-KeyValues-from-sequence-files-in-HBase-0.90-tp32194680p32399356.html
> Sent from the HBase User mailing list archive at Nabble.com.
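
For anyone following along, here is a rough standalone sketch of what the third dump seems to show. It does not use the HBase jars. The byte layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value) is inferred from the offsets and byte dumps above, and the class name, helper methods and the 8 value bytes are all made up for illustration. It only demonstrates that parsing the third kv with a key length of 29 instead of the 25 in its own header would yield the garbled qualifier and timestamp that 0.90.3 printed. It says nothing about where a 29 would come from inside HBase.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase.
class KeyValueLayoutSketch {
    // Header: 4-byte key length + 4-byte value length, so the key starts at 8.
    static final int KEY_OFFSET = 8;
    // The key ends with an 8-byte timestamp and a 1-byte type.
    static final int TS_AND_TYPE = 9;

    /** Serializes one cell in the layout inferred from the dumps above. */
    static byte[] serialize(String row, String family, String qualifier,
                            long timestamp, byte type, byte[] value) {
        byte[] r = row.getBytes(StandardCharsets.ISO_8859_1);
        byte[] f = family.getBytes(StandardCharsets.ISO_8859_1);
        byte[] q = qualifier.getBytes(StandardCharsets.ISO_8859_1);
        int keyLength = 2 + r.length + 1 + f.length + q.length + TS_AND_TYPE;
        ByteBuffer b = ByteBuffer.allocate(KEY_OFFSET + keyLength + value.length);
        b.putInt(keyLength).putInt(value.length);
        b.putShort((short) r.length).put(r);
        b.put((byte) f.length).put(f).put(q);
        b.putLong(timestamp).put(type).put(value);
        return b.array();
    }

    /** Third cell from the dump; the value bytes are an assumed small long. */
    static byte[] thirdKeyValue() {
        byte[] value = ByteBuffer.allocate(8).putLong(12345L).array(); // assumption
        // Type byte 4 is the \x04 that precedes the value in the dumps (Put).
        return serialize("url", "meta", "length", 1264692453000L, (byte) 4, value);
    }

    static int keyLengthFromHeader(byte[] buf) {
        return ByteBuffer.wrap(buf).getInt(0);
    }

    /** Qualifier length is whatever remains of the key after the fixed parts. */
    static String qualifier(byte[] buf, int keyLength) {
        ByteBuffer b = ByteBuffer.wrap(buf);
        int rowLength = b.getShort(KEY_OFFSET);
        int familyLengthOffset = KEY_OFFSET + 2 + rowLength;
        int familyLength = b.get(familyLengthOffset);
        int qualifierOffset = familyLengthOffset + 1 + familyLength;
        int qualifierLength =
            keyLength - (2 + rowLength + 1 + familyLength + TS_AND_TYPE);
        return new String(buf, qualifierOffset, qualifierLength,
                          StandardCharsets.ISO_8859_1);
    }

    /** The timestamp sits 9 bytes before the end of the key. */
    static long timestamp(byte[] buf, int keyLength) {
        return ByteBuffer.wrap(buf).getLong(KEY_OFFSET + keyLength - TS_AND_TYPE);
    }

    public static void main(String[] args) {
        byte[] kv = thirdKeyValue();
        int fromHeader = keyLengthFromHeader(kv); // key length stored in this kv
        int fromDump = 29;                        // key length 0.90.3 reported
        for (int keyLength : new int[] {fromHeader, fromDump}) {
            System.out.println("keyLength=" + keyLength
                + " qualifierLength=" + qualifier(kv, keyLength).length()
                + " timestamp=" + timestamp(kv, keyLength));
        }
    }
}
```

With the key length taken from the header, the qualifier is 6 bytes long and the timestamp is 1264692453000, as in the 0.20.6 output. With 29, the qualifier grows to 10 bytes by swallowing the first four timestamp bytes, and the timestamp is read from the rest of the real timestamp, the type byte and the start of the value. Since 29 is also the key length of the two preceding kvs, it might be worth checking whether a length is being carried over between reads, but that is only a guess from the numbers.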
