Hi Lars,

What I am trying to do is run an internal scan inside a coprocessor and then stream the KV buffer as a byte array to a separate process for processing. I hit a snag reconstructing the KV in the separate process from the byte array, since I do not know which offsets to use. The underlying buffer of a KV from an internal scan is very different from that of a client scan, as you already stated.
Is there somewhere in the HBase code that does something similar?

Thanks.

Kim

On Wed, Sep 25, 2013 at 9:30 PM, lars hofhansl <[email protected]> wrote:

> Are you on the client or the server?
> In the server the KeyValue objects are created in
> HFileReaderV2.ScannerV2.getKeyValue(). There you will see that a KeyValue
> object is really just a "pointer" into a larger byte[] loaded from an HFile.
>
> On the client the KeyValue is typically deserialized from an RPC; in that
> case the backing array only holds one KeyValue (and the buffer size and the
> KeyValue length should match).
>
> Does that make sense? I know this can be a bit confusing.
>
> -- Lars
>
> ----- Original Message -----
> From: Kim Chew <[email protected]>
> To: [email protected]; lars hofhansl <[email protected]>
> Cc:
> Sent: Wednesday, September 25, 2013 5:40 PM
> Subject: Re: KeyValue.getLength() question
>
> On Wed, Sep 25, 2013 at 7:52 AM, lars hofhansl <[email protected]> wrote:
>
> > myKV.getLength() is always <= myKV.getBuffer().length.
> >
> > The buffer here is typically an HFile block.
>
> Lars, I don't quite understand this, could you please elaborate a bit
> more? Also if the KV's buffer size is bigger than the one returned by
> "readLength()", what would be those extra bytes in the buffer?
>
> It seems to me that the Scanner and InternalScanner pack different
> numbers of extra bytes into the buffer. I tried to pinpoint where in the
> scanner code the KV objects are created, but without much luck. Could you
> show me where it is done?
>
> Thanks a lot.
>
> Kim
>
> > We use that buffer and pass it up the chain without making any further
> > copy of the KV.
> >
> > -- Lars
> >
> > ----- Original Message -----
> > From: Kim Chew <[email protected]>
> > To: [email protected]
> > Cc:
> > Sent: Wednesday, September 25, 2013 12:06 AM
> > Subject: KeyValue.getLength() question
> >
> > Hello,
> >
> > I have a "strange" situation that I can't wrap my head around. Say, for
> > example, I have a KeyValue instance, shouldn't
> >
> > myKV.getLength() == myKV.getBuffer().length ?
> >
> > Given that "getLength()" returns "Length of bytes this KeyValue occupies
> > in getBuffer()<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/KeyValue.html#getBuffer%28%29>."
> >
> > In my case the value returned by "myKV.getBuffer().length" is greater than
> > "myKV.getLength()". What possibly went wrong?
> >
> > TIA
> >
> > Kim.
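
For anyone following the thread, here is a minimal sketch of the offset arithmetic being discussed. It uses only the Java standard library and does not call HBase. It assumes the outer framing of a serialized KeyValue is a 4-byte key length, a 4-byte value length, then the key bytes and the value bytes. The class and method names (KvSliceSketch, encode, slice, decode) are made up for illustration. The point it tries to show: when the backing array is a shared block, only the range starting at the record's offset and running for its length belongs to that record, so that range is what should be streamed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the thread's scenario; not HBase code.
class KvSliceSketch {

    // Assumed framing: [int keyLength][int valueLength][key bytes][value bytes].
    static byte[] encode(byte[] key, byte[] value) {
        ByteBuffer bb = ByteBuffer.allocate(8 + key.length + value.length);
        bb.putInt(key.length).putInt(value.length).put(key).put(value);
        return bb.array();
    }

    // Sending side: copy only the bytes of one record out of the shared
    // buffer, so the copy starts at offset 0 and has exactly `length` bytes.
    static byte[] slice(byte[] buffer, int offset, int length) {
        return Arrays.copyOfRange(buffer, offset, offset + length);
    }

    // Receiving side: the record begins at offset 0 of the received array,
    // and the two leading ints say where the key and the value end.
    static String[] decode(byte[] record) {
        ByteBuffer bb = ByteBuffer.wrap(record);
        byte[] key = new byte[bb.getInt()];
        byte[] value = new byte[bb.getInt()];
        bb.get(key);
        bb.get(value);
        return new String[] {
            new String(key, StandardCharsets.UTF_8),
            new String(value, StandardCharsets.UTF_8)
        };
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Model a "block": two records stored back to back in one array.
        byte[] first = encode(utf8("row1"), utf8("a"));
        byte[] second = encode(utf8("row2"), utf8("bcd"));
        byte[] block = new byte[first.length + second.length];
        System.arraycopy(first, 0, block, 0, first.length);
        System.arraycopy(second, 0, block, first.length, second.length);

        // The second record is a (buffer, offset, length) view into the block.
        byte[] wire = slice(block, first.length, second.length);
        String[] kv = decode(wire);
        System.out.println(kv[0] + " -> " + kv[1]);
    }
}
```

In HBase terms, the sending side would presumably copy from kv.getBuffer(), starting at kv.getOffset(), for kv.getLength() bytes, and the receiving side could then build a KeyValue over the received array with offset 0. KeyValue.getOffset() and the KeyValue(byte[], int, int) constructor are not mentioned in the thread, so check both against the Javadoc of your HBase version.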
