[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882067#comment-13882067
 ] 

stack commented on HBASE-7320:
--

I already gave +1 on commit.

 Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, 
 getFamilyArray(), getQualifierArray, and getValueArray
 

 Key: HBASE-7320
 URL: https://issues.apache.org/jira/browse/HBASE-7320
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: stack
 Fix For: 0.99.0

 Attachments: 7320-simple.txt


 In many places this is a simple task of just replacing the method name.
 There are, however, quite a few places where we assume that:
 # the entire KV is backed by a single byte array
 # the KVs key portion is backed by a single byte array
 Some of those can easily be fixed, others will need their own jiras.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882066#comment-13882066
 ] 

stack commented on HBASE-7320:
--

bq. All calls to getBuffer() left fall into two categories: access to a KV's 
complete key, and access to the KV as a whole. Now we need to fix those.

How do you think we fix these, Lars?  Would the best approach now be to try and 
implement a new Cell type altogether?  That would shake out any reliance on KV.

bq. The next part is to get rid of the timestamp array API completely.

Can we change the Cell Interface ex post facto?  Cell was published in 0.96.0?  
Or could we deprecate the array accesses on timestamp and behind the scenes 
return an array if asked but not use it in core codebase?

bq. After that, I think, is to writeTo/readFrom files. The writing side might 
be simple as long as we keep the format; in that case the reading side might not 
have to change at all (it is still OK to back a Cell by a single byte[]).

What is to be done here?  Adding Cell Interfaces and moving over to use those?

Encoders?

bq. Big parts are: ScanQueryMatcher, which dissects the KV itself, and all the 
comparators in KeyValue itself.

We would move to use Cell-based comparators rather than KV#comparators?



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-25 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882068#comment-13882068
 ] 

stack commented on HBASE-7320:
--

bq. As for this patch, should I put it into a subtask?

Yes.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882073#comment-13882073
 ] 

Lars Hofhansl commented on HBASE-7320:
--

bq. Can we change the Cell Interface ex post facto?
Cell has only {{long getTimestamp()}}, so that's clean. I meant removing calls 
to getTimestampOffset, etc.

bq. Would the best approach now be to try and implement a new Cell type 
altogether?
Maybe, but then we'd have to replace KeyValue everywhere, and the KeyValue type 
leaks to the client, so not sure we can do that.
In the end it'd come to the same, either we make a new type, or we fix KeyValue 
to be the new type. Either way seems fine.

bq. How you think we fix these Lars?
We need to look at each case individually to see how to fix it. Some only 
print out the key, for example, and in that case we can just print the 
row/family/qualifier/timestamp separately.
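
A minimal sketch of the "print the parts separately" idea, assuming a trimmed-down, hypothetical `KeyParts` interface that only mirrors the accessor names discussed in this thread (it is not the real HBase `Cell`, and the `KeyPrinter` helper and its output format are illustrative only):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the per-component accessors; not the real HBase Cell.
interface KeyParts {
  byte[] getRowArray();       int getRowOffset();       int getRowLength();
  byte[] getFamilyArray();    int getFamilyOffset();    int getFamilyLength();
  byte[] getQualifierArray(); int getQualifierOffset(); int getQualifierLength();
  long getTimestamp();
}

final class KeyPrinter {
  private static String text(byte[] buf, int offset, int length) {
    return new String(buf, offset, length, StandardCharsets.UTF_8);
  }

  // Formats the key from its parts; it never asks for a whole-key buffer.
  static String format(KeyParts c) {
    return text(c.getRowArray(), c.getRowOffset(), c.getRowLength())
        + "/" + text(c.getFamilyArray(), c.getFamilyOffset(), c.getFamilyLength())
        + ":" + text(c.getQualifierArray(), c.getQualifierOffset(), c.getQualifierLength())
        + "/" + c.getTimestamp();
  }

  // Pads one byte in front so that every component starts at offset 1.
  private static byte[] pad(String s) {
    byte[] b = s.getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[b.length + 1];
    System.arraycopy(b, 0, out, 1, b.length);
    return out;
  }

  // Demo factory: each component lives in its own, unrelated array.
  static KeyParts of(String row, String family, String qualifier, final long ts) {
    final byte[] r = pad(row), f = pad(family), q = pad(qualifier);
    return new KeyParts() {
      public byte[] getRowArray() { return r; }
      public int getRowOffset() { return 1; }
      public int getRowLength() { return r.length - 1; }
      public byte[] getFamilyArray() { return f; }
      public int getFamilyOffset() { return 1; }
      public int getFamilyLength() { return f.length - 1; }
      public byte[] getQualifierArray() { return q; }
      public int getQualifierOffset() { return 1; }
      public int getQualifierLength() { return q.length - 1; }
      public long getTimestamp() { return ts; }
    };
  }
}
```

With this sketch, `KeyPrinter.format(KeyPrinter.of("row1", "cf", "q", 42L))` yields `row1/cf:q/42` even though the three components sit in three different arrays.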



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13882075#comment-13882075
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Filed HBASE-10420. Let's leave this one unassigned as umbrella issue.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881353#comment-13881353
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Looks like jenkins is taking a vacation.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881419#comment-13881419
 ] 

stack commented on HBASE-7320:
--

I went through trying to be careful.  Looks great to me.  If I had done it I 
would have missed loads of these... especially the likes of this:

{code}
-System.arraycopy(c.getTagsArray(), c.getTagsOffset(), newKV.getBuffer(),
+System.arraycopy(c.getTagsArray(), c.getTagsOffset(), newKV.getTagsArray(),
     newKV.getTagsOffset(), oldCellTagsLen);
{code}

Excellent [~lhofhansl]



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881472#comment-13881472
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Even better, when we're done many of these array copies can go away, as we can 
just make a new Cell and point it to the existing array if we do not change them.

The next part is to get rid of the timestamp array API completely.

After that, I think, is to writeTo/readFrom files. The writing side might be 
simple as long as we keep the format; in that case the reading side might not 
have to change at all (it is still OK to back a Cell by a single byte[]).

Big parts are: ScanQueryMatcher, which dissects the KV itself, and all the 
comparators in KeyValue itself.




[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881473#comment-13881473
 ] 

Lars Hofhansl commented on HBASE-7320:
--

As for this patch, should I put it into a subtask?



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881728#comment-13881728
 ] 

Hadoop QA commented on HBASE-7320:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12625095/7320-simple.txt
  against trunk revision .
  ATTACHMENT ID: 12625095

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 24 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  System.arraycopy(q, 0, newKV.getQualifierArray(), newKV.getQualifierOffset(), q.length);
+KeyValue rewriteKv = new KeyValue(newKv.getRowArray(), newKv.getRowOffset(), newKv.getRowLength(),
+KeyValue rewriteKv = new KeyValue(newKv.getRowArray(), newKv.getRowOffset(), newKv.getRowLength(),

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8502//console

This message is automatically generated.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881730#comment-13881730
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Cool. Passes all tests. I'll fix the long lines.
Then I think this is good to commit.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-22 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878927#comment-13878927
 ] 

Nick Dimiduk commented on HBASE-7320:
-

[~lhofhansl] if I follow your intentions, this means:
# KeyValue#getBuffer goes away entirely -- there's no API assumption that a 
KeyValue is backed by a single buffer object of any type (byte[], ByteBuffer, 
etc.). A KeyValue instance /could/ be backed by a single buffer object, at the 
option of its creator, but this is an implementation detail.
# A KeyValue object, by API design, is now backed by 5 buffer objects -- one 
each for rowkey, cf, qualifier, ts, and value.
# The previous point does not restrict a producer of KeyValue instances from 
using its own encoding of multiple instances, but it does require that producer 
to generate instances that conform to this API. For example, say I wanted to 
store KeyValues in batches of 100 where all rowkeys are stored together, then 
all cfs, then quals, then ts, then values, and make optimizations therein. The 
requirement is that I can produce a KeyValue instance from the block that 
implements the getXXXArray methods AND I no longer must materialize a buffer 
object to support getBuffer.

Did I get that right?

Do we have any thought on what an appropriate buffer object should be? Is 
that for another ticket?
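
As a rough illustration of the batched layout described in the third point above, here is a sketch assuming a hypothetical `ColumnarBatch` that packs all row keys into one array and all values into another (family, qualifier, and timestamp are left out for brevity; none of these names come from HBase). Each `CellView` answers the getXXXArray/Offset/Length calls by pointing into the shared arrays:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical columnar batch: all row keys packed into one array, all values into another.
final class ColumnarBatch {
  private final byte[] rows;
  private final int[] rowEnds;     // rowEnds[i] = end offset (exclusive) of row i in rows
  private final byte[] values;
  private final int[] valueEnds;   // same scheme for values

  ColumnarBatch(String[] rowKeys, String[] vals) {
    rowEnds = new int[rowKeys.length];
    rows = pack(rowKeys, rowEnds);
    valueEnds = new int[vals.length];
    values = pack(vals, valueEnds);
  }

  // Concatenates the items and records where each one ends.
  private static byte[] pack(String[] items, int[] ends) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int i = 0; i < items.length; i++) {
      byte[] b = items[i].getBytes(StandardCharsets.UTF_8);
      out.write(b, 0, b.length);
      ends[i] = out.size();
    }
    return out.toByteArray();
  }

  private static int start(int[] ends, int i) {
    return i == 0 ? 0 : ends[i - 1];
  }

  // Returns a view of entry i; no bytes are copied and no single-buffer KV is materialized.
  CellView cell(int i) {
    return new CellView(i);
  }

  final class CellView {
    private final int i;
    CellView(int i) { this.i = i; }
    byte[] getRowArray()   { return rows; }
    int getRowOffset()     { return start(rowEnds, i); }
    int getRowLength()     { return rowEnds[i] - start(rowEnds, i); }
    byte[] getValueArray() { return values; }
    int getValueOffset()   { return start(valueEnds, i); }
    int getValueLength()   { return valueEnds[i] - start(valueEnds, i); }
  }
}
```

The point of the sketch is that a caller using only the per-component accessors cannot tell whether the row and value of a cell share a buffer, which is what frees the producer from supporting getBuffer.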

 Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, 
 getFamilyArray(), getQualifierArray, and getValueArray
 

 Key: HBASE-7320
 URL: https://issues.apache.org/jira/browse/HBASE-7320
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: stack
 Fix For: 0.98.0


 In many places this is a simple task of just replacing the method name.
 There are, however, quite a few places where we assume that:
 # the entire KV is backed by a single byte array
 # the KVs key portion is backed by a single byte array
 Some of those can easily be fixed, others will need their own jiras.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-22 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879080#comment-13879080
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Exactly.

Would have to think a bit more about the TS. Do we want this to be backed by a 
byte[] as well, or do we get rid of the array/offset/length API for TS 
completely and just have a getTimestamp method?





[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-22 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879120#comment-13879120
 ] 

Matt Corgan commented on HBASE-7320:


It would be nice to move towards callers of getTimestamp() expecting the long 
value returned by the Cell interface.  I'm guessing there is little to negative 
performance gain operating on the bytes directly.
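
A small sketch of what "callers expect the long" could look like, assuming the timestamp is serialized as an 8-byte big-endian long (an assumption of this example) and using a hypothetical `TimestampedCell` class. The bytes are decoded once at construction; afterwards callers only see `long getTimestamp()` and need no array/offset/length accessors for the timestamp:

```java
import java.nio.ByteBuffer;

// Hypothetical cell fragment that exposes the timestamp only as a long.
final class TimestampedCell {
  private final long timestamp;

  private TimestampedCell(long timestamp) {
    this.timestamp = timestamp;
  }

  // Decode once from an 8-byte big-endian field (encoding assumed for this sketch).
  static TimestampedCell fromBytes(byte[] buf, int offset) {
    return new TimestampedCell(ByteBuffer.wrap(buf, offset, 8).getLong());
  }

  // A producer that already holds a long never needs to serialize it.
  static TimestampedCell of(long timestamp) {
    return new TimestampedCell(timestamp);
  }

  long getTimestamp() {
    return timestamp;
  }
}
```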



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877637#comment-13877637
 ] 

Andrew Purtell commented on HBASE-7320:
---

+1 for the deprecation. It already went in on the subtask anyhow.

bq. This is going to be a fun project. I started to look at all the times we 
get the family array. Its a bunch just to check if the KV family is legit 
client-side in the individual types 

We get both family and qualifier arrays in the access controller because we 
have to look in descending order at perms for global, namespace, table, cf, cf 
+ qualifier, cell.



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877745#comment-13877745
 ] 

Lars Hofhansl commented on HBASE-7320:
--

bq.  I started to look at all the times we get the family array.

That's legit, I think. Getting the row, family, qualifier and value via their 
own xyzArray method is OK.
We can assume that the row, family, qualifier, and value are each laid out in 
ram in at least a byte[] (or bytebuffer). What we cannot assume is that there is 
any layout relationship between them.

What is not OK are, at least, these KeyValue methods:
* getBuffer
* getOffset
* getLength
* getKeyOffset
* getKeyLength
* getKey/getKeyString

These are not OK because we should not assume that row/family/qualifier are 
laid out together, nor that the entire KV is laid out together.
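
To make the distinction concrete, here is a sketch of a row comparison that relies only on (array, offset, length) triples, the kind of input the getRowArray/getRowOffset/getRowLength style of accessor provides. `RowCompare` is a hypothetical helper written for this illustration, and the unsigned lexicographic ordering is an assumption of the example:

```java
// Hypothetical helper: compares two rows given only their own (array, offset, length).
final class RowCompare {
  // Unsigned lexicographic comparison; the two rows may live in unrelated arrays.
  static int compare(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
    int n = Math.min(aLen, bLen);
    for (int i = 0; i < n; i++) {
      int x = a[aOff + i] & 0xff;
      int y = b[bOff + i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    // Equal on the common prefix: the shorter row sorts first.
    return aLen - bLen;
  }
}
```

Nothing here depends on where the family, qualifier, or value of either cell is stored, so it keeps working whether a cell is backed by one buffer or by several.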





[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878242#comment-13878242
 ] 

ramkrishna.s.vasudevan commented on HBASE-7320:
---

Would love to help out in this area.  While doing the HFileV3 with cells 
combined with the encoders, we faced a lot of blockades because the KV format 
is tightly coupled. 
Interesting places will be where we try to reseek/seek by creating a fake KV 
and do a compare to see if we have reached this KV or something greater than 
that. 



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878255#comment-13878255
 ] 

Lars Hofhansl commented on HBASE-7320:
--

If we all agree that it is OK to store row, family, qualifier, and value 
continuously in ram, I can knock off all the simple cases next week.
Note that that would help for block encoding, where we would no longer need to 
rematerialize the KV just so that the entire KV is stored continuously; but it 
would not help with the prefixtries, as there we'd even want to store partial 
rows, families, etc. (Correct [~mcorgan]?)




[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878266#comment-13878266
 ] 

stack commented on HBASE-7320:
--

Mighty [~lhofhansl]   I am a little confused now.  You have "If we all agree 
that is OK to store row, family, qualifier, and value continuously in ram..." 
but before this you have "... What we cannot assume that there is any layout 
relationship between them."  I think I know what you are saying... but maybe 
clarify?



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878279#comment-13878279
 ] 

Matt Corgan commented on HBASE-7320:


{quote}If we all agree that it is OK to store row, family, qualifier, and value 
contiguously in ram{quote}
Sounds like a step in the right direction.  So you would basically wrap naked 
byte[]s in KeyValue objects rather than operating on them directly?

{quote}but it would not help with prefix tries.{quote}
Right.  PrefixTreeCell goes further and breaks out each Cell field into its own 
primitive value/array.  For example, PrefixTreeCell.getFamilyLength() just 
returns a value it has already extracted rather than calling a few methods to 
calculate the length each time it's called (like KeyValue).

{quote}as there we'd even want to store partial rows, families, etc{quote}
I'm probably reading too closely here.  The PrefixTreeCell has a separate array 
for row/family/etc, but a single row/family/etc is not split further into 
fragments.  (It is while encoded, but the decoder assembles each field into 
consecutive bytes.)
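
The difference between a length that is derived on every call and one that was 
extracted up front can be sketched roughly as below. This is only an 
illustration: SingleBufferCell, FieldPerArrayCell, and CellLayoutDemo are 
hypothetical stand-ins, not HBase classes, and the one-byte-length layout is a 
toy layout assumed for the sketch, not the real KeyValue format.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins for this sketch only; these are not HBase classes.

// Keeps the whole cell in one buffer and derives the family length on each call.
// Assumed toy layout: [1-byte row length][row][1-byte family length][family].
final class SingleBufferCell {
    private final byte[] buf;

    SingleBufferCell(byte[] row, byte[] family) {
        buf = new byte[2 + row.length + family.length];
        buf[0] = (byte) row.length;
        System.arraycopy(row, 0, buf, 1, row.length);
        buf[1 + row.length] = (byte) family.length;
        System.arraycopy(family, 0, buf, 2 + row.length, family.length);
    }

    int getRowLength() { return buf[0] & 0xff; }
    byte[] getFamilyArray() { return buf; }
    int getFamilyOffset() { return 2 + getRowLength(); }
    // Reads the row length first, then the family length, every time it is called.
    int getFamilyLength() { return buf[1 + getRowLength()] & 0xff; }
}

// Keeps each field in its own array and stores the length it has already extracted.
final class FieldPerArrayCell {
    private final byte[] row;
    private final byte[] family;
    private final int familyLength;

    FieldPerArrayCell(byte[] row, byte[] family) {
        this.row = row;
        this.family = family;
        this.familyLength = family.length;
    }

    byte[] getRowArray() { return row; }
    byte[] getFamilyArray() { return family; }
    int getFamilyOffset() { return 0; }
    // Trivial pass-through to the pre-extracted value.
    int getFamilyLength() { return familyLength; }
}

final class CellLayoutDemo {
    // Callers only ever see (array, offset, length), whatever the layout is.
    static String family(byte[] array, int offset, int length) {
        return new String(array, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] row = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] cf = "cf".getBytes(StandardCharsets.UTF_8);
        SingleBufferCell a = new SingleBufferCell(row, cf);
        FieldPerArrayCell b = new FieldPerArrayCell(row, cf);
        System.out.println(family(a.getFamilyArray(), a.getFamilyOffset(), a.getFamilyLength()));
        System.out.println(family(b.getFamilyArray(), b.getFamilyOffset(), b.getFamilyLength()));
    }
}
```

Both classes answer the same three questions (array, offset, length), so code 
written against those accessors would not need to know which layout it is 
reading.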


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878285#comment-13878285
 ] 

Lars Hofhansl commented on HBASE-7320:
--

[~stack], sorry I am having a hard time phrasing this concisely.

The problem is that we currently assume two things:
# the entire KV is stored in one contiguous area in ram
# the key portion (row/family/qual/ts) is stored contiguously in ram

It is OK that the row key is stored contiguously in ram, same for the family, 
qualifier, and value. I.e. we do not break up row-key, family, qualifier, or 
value, but we no longer assume the entire KV or the key-portion are in one 
piece.
With that in mind things like getFamily, getFamilyArray, getFamilyOffset, 
getFamilyLength (and same for row, qualifier, and value) are OK. But 
getKeyLength, getLength, getBuffer, etc, are not OK.

Or in other words: To a caller it should not matter whether the KV is stored in 
one byte[] or whether there are separate byte[] for some or all of rowkey, 
family, qualifier, and value.
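
A caller that sticks to the array/offset/length accessors might look roughly 
like the sketch below. FamilyView and FamilyViews are hypothetical names made 
up for this illustration (a cut-down stand-in for the family accessors on 
Cell), not HBase API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical cut-down view of a cell's family accessors; not the HBase Cell interface.
interface FamilyView {
    byte[] getFamilyArray();
    int getFamilyOffset();
    int getFamilyLength();
}

final class FamilyViews {
    // Wraps any (array, offset, length) triple, wherever the bytes actually live.
    static FamilyView of(final byte[] array, final int offset, final int length) {
        return new FamilyView() {
            public byte[] getFamilyArray() { return array; }
            public int getFamilyOffset() { return offset; }
            public int getFamilyLength() { return length; }
        };
    }

    // Compares two families without assuming anything about the surrounding layout.
    static boolean sameFamily(FamilyView a, FamilyView b) {
        if (a.getFamilyLength() != b.getFamilyLength()) {
            return false;
        }
        byte[] x = a.getFamilyArray();
        byte[] y = b.getFamilyArray();
        for (int i = 0; i < a.getFamilyLength(); i++) {
            if (x[a.getFamilyOffset() + i] != y[b.getFamilyOffset() + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Family embedded in one large buffer (as in a single-array KV).
        byte[] whole = "row1cfqual".getBytes(StandardCharsets.US_ASCII);
        FamilyView embedded = of(whole, 4, 2);
        // Family held in its own array.
        FamilyView standalone = of("cf".getBytes(StandardCharsets.US_ASCII), 0, 2);
        System.out.println(sameFamily(embedded, standalone));
    }
}
```

The comparison never touches bytes outside the family's own range, which is 
the property the paragraph above asks callers to rely on.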

[~mcorgan], cool, so this *would* help with prefix trees, if families, 
row-keys, and qualifiers are not broken down into fragments. How about the 
values, can we still assume they are in one piece?

I also have to go back and think about timestamp.


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878292#comment-13878292
 ] 

Matt Corgan commented on HBASE-7320:


{quote}How about the values, can we assume them in one piece, still?{quote}
Yes, an individual value is contiguous too.  The CellScanner.advance() call 
decodes all these fields into an optimal format for reading, and then the 
CellScanner.current() method returns a reference to a PrefixTreeCell, which 
implements Cell.  When you call cell.getWhatever(), it's a trivial pass-through 
call to each pre-extracted field.  For example, timestamp has already been 
converted from bytes to a long before you call cell.getTimestamp().
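
The advance()/current() pattern described here can be sketched as follows. 
This is a simplified illustration, not the prefix-tree decoder: DecodedCell and 
DecodingScanner are hypothetical classes, and the encoding 
([8-byte timestamp][4-byte value length][value]) is assumed for the sketch 
only.

```java
import java.nio.ByteBuffer;

// Hypothetical decoded cell: every field is already in its read-friendly form.
final class DecodedCell {
    private final long timestamp;
    private final byte[] value;

    DecodedCell(long timestamp, byte[] value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    // Pass-through getters; no decoding happens here.
    long getTimestamp() { return timestamp; }
    byte[] getValueArray() { return value; }
    int getValueOffset() { return 0; }
    int getValueLength() { return value.length; }
}

// Hypothetical scanner: all decoding work is done in advance().
final class DecodingScanner {
    private final ByteBuffer encoded;
    private DecodedCell current;

    DecodingScanner(byte[] encoded) {
        this.encoded = ByteBuffer.wrap(encoded);
    }

    // Decodes the next cell; returns false when the input is exhausted.
    boolean advance() {
        if (!encoded.hasRemaining()) {
            current = null;
            return false;
        }
        long timestamp = encoded.getLong();        // bytes -> long happens here
        byte[] value = new byte[encoded.getInt()];
        encoded.get(value);                        // value ends up contiguous
        current = new DecodedCell(timestamp, value);
        return true;
    }

    // Returns the cell decoded by the last successful advance().
    DecodedCell current() { return current; }

    // Helper that writes one cell in the toy encoding assumed above.
    static byte[] encode(long timestamp, byte[] value) {
        return ByteBuffer.allocate(12 + value.length)
                .putLong(timestamp).putInt(value.length).put(value).array();
    }

    public static void main(String[] args) {
        DecodingScanner scanner = new DecodingScanner(encode(42L, new byte[] {1, 2, 3}));
        while (scanner.advance()) {
            DecodedCell cell = scanner.current();
            System.out.println(cell.getTimestamp() + " " + cell.getValueLength());
        }
    }
}
```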


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-21 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878305#comment-13878305
 ] 

Andrew Purtell commented on HBASE-7320:
---

bq. It is OK that the row key is stored contiguously in ram, same for the 
family, qualifier, and value. I.e. we do not break up row-key, family, 
qualifier, or value, but we no longer assume the entire KV or the 
key-portion are in one piece.

That would be great; that's the first step to HBASE-9794.


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877235#comment-13877235
 ] 

Lars Hofhansl commented on HBASE-7320:
--

Yes. I was thinking we start with:
# deprecate KeyValue.getBuffer(). Nobody should use this going forward
# fix all obvious cases with one large patch (i.e. in places where we only 
need the row, use getRowArray() instead of getBuffer())
# for all non-trivial cases (like the readers and writers) file individual 
jiras to fix them

#1 and #2 should be simple (albeit repetitive) tasks. #3 are the interesting 
issues.
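
Steps #1 and #2 could look roughly like this. LegacyKv and RowAccess are 
hypothetical stand-ins written for this sketch; they only borrow the accessor 
names discussed in this issue and are not the real KeyValue class.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a KeyValue-like class; not the real HBase class.
final class LegacyKv {
    private final byte[] buf;
    private final int rowOffset;
    private final int rowLength;

    LegacyKv(byte[] buf, int rowOffset, int rowLength) {
        this.buf = buf;
        this.rowOffset = rowOffset;
        this.rowLength = rowLength;
    }

    /** @deprecated exposes the single backing array; use the per-field accessors. */
    @Deprecated
    byte[] getBuffer() { return buf; }

    // In this sketch the row happens to live in the same array, but callers
    // of getRowArray() no longer depend on that.
    byte[] getRowArray() { return buf; }
    int getRowOffset() { return rowOffset; }
    int getRowLength() { return rowLength; }
}

final class RowAccess {
    // Before: relies on the row being inside the one big buffer.
    @SuppressWarnings("deprecation")
    static String rowViaBuffer(LegacyKv kv) {
        return new String(kv.getBuffer(), kv.getRowOffset(), kv.getRowLength(),
                StandardCharsets.UTF_8);
    }

    // After: the mechanical replacement, getBuffer() -> getRowArray().
    static String rowViaRowArray(LegacyKv kv) {
        return new String(kv.getRowArray(), kv.getRowOffset(), kv.getRowLength(),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        LegacyKv kv = new LegacyKv("xxrow1yy".getBytes(StandardCharsets.UTF_8), 2, 4);
        System.out.println(rowViaBuffer(kv));
        System.out.println(rowViaRowArray(kv));
    }
}
```

The two methods return the same result today; the point of the rename is that 
only the second keeps working if the row later moves into its own array.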



[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877239#comment-13877239
 ] 

ramkrishna.s.vasudevan commented on HBASE-7320:
---

bq.for all non-trivial cases (like the readers and writers) file individual 
jiras to fix them
+1


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2014-01-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877301#comment-13877301
 ] 

stack commented on HBASE-7320:
--

Instead I made a subissue to get #1 into 0.98.  Will leave this issue open as 
an umbrella issue.


[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

2013-08-22 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748277#comment-13748277
 ] 

ramkrishna.s.vasudevan commented on HBASE-7320:
---

bq.others will need their own jiras.
In the sense that the Readers/Writers need changes, right?  We tried working 
with Cells in these areas, and that required changes in the read path, such as 
how we do seek() and reseek() and the way the reader creates the KVs from the 
internal buffer.
The readers are tightly coupled with the KV format.
