[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882067#comment-13882067 ]

stack commented on HBASE-7320:

I already gave +1 on commit.

Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

Key: HBASE-7320
URL: https://issues.apache.org/jira/browse/HBASE-7320
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: stack
Fix For: 0.99.0
Attachments: 7320-simple.txt

In many places this is a simple task of just replacing the method name. There are, however, quite a few places where we assume that:
# the entire KV is backed by a single byte array
# the KV's key portion is backed by a single byte array
Some of those can easily be fixed, others will need their own jiras.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882066#comment-13882066 ]

stack commented on HBASE-7320:

bq. All calls to getBuffer() left are of two categories: access a KVs complete key, access the KV as a whole, Now we need to fix those.

How do you think we fix these, Lars? Would the best approach now be to try and implement a new Cell type altogether? That would shake out any reliance on KV?

bq. The next part is to get rid of the timestamp array API completely.

Can we change the Cell Interface ex post facto? Cell was published in 0.96.0? Or could we deprecate the array accesses on timestamp and, behind the scenes, return an array if asked but not use it in the core codebase?

bq. After that, I think, is to writeTo/readFrom files. The writing side might be simple as long as we keep the format; in that case the reading side might not have to change at all (it is still OK to back a Cell by a single byte[])

What is to be done here? Adding Cell Interfaces and moving over to use those? Encoders?

bq. Big parts are: ScanQueryMatcher, which dissects the KV itself, and all the comparators in KeyValue itself.

We would move to use Cell-based comparators rather than KV#comparators?
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882068#comment-13882068 ]

stack commented on HBASE-7320:

bq. As for this patch, should I put it into a subtask?

Yes.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882073#comment-13882073 ]

Lars Hofhansl commented on HBASE-7320:

bq. Can we change the Cell Interface ex post facto?

Cell has only {{long getTimestamp()}}, so that's clean. I meant removing calls to getTimestampOffset, etc.

bq. Would the best approach now be to try and implement a new Cell type altogether?

Maybe, but then we'd have to replace KeyValue everywhere, and the KeyValue type leaks to the client, so I'm not sure we can do that. In the end it comes to the same thing: either we make a new type, or we fix KeyValue to be the new type. Either way seems fine.

bq. How you think we fix these Lars?

We need to look at each case individually to see how to fix it. Some only print the key, for example, and in that case we just print the row/family/qualifier/timestamp separately.
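To illustrate the "print the row/family/qualifier/timestamp separately" case, here is a minimal sketch. The `KeyFields` interface, `SeparateFields` class and `KeyPrinter` helper are hypothetical stand-ins invented for this example; only the accessor names are taken from the thread, and the output format is an arbitrary choice.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, trimmed-down stand-in for the per-field accessors named in
// this thread. It is NOT the real org.apache.hadoop.hbase.Cell interface.
interface KeyFields {
    byte[] getRowArray();
    int getRowOffset();
    int getRowLength();
    byte[] getFamilyArray();
    int getFamilyOffset();
    int getFamilyLength();
    byte[] getQualifierArray();
    int getQualifierOffset();
    int getQualifierLength();
    long getTimestamp();
}

// Toy implementation (an assumption of this sketch): each field has its own array.
final class SeparateFields implements KeyFields {
    private final byte[] row;
    private final byte[] family;
    private final byte[] qualifier;
    private final long timestamp;

    SeparateFields(String row, String family, String qualifier, long timestamp) {
        this.row = row.getBytes(StandardCharsets.UTF_8);
        this.family = family.getBytes(StandardCharsets.UTF_8);
        this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
        this.timestamp = timestamp;
    }

    public byte[] getRowArray() { return row; }
    public int getRowOffset() { return 0; }
    public int getRowLength() { return row.length; }
    public byte[] getFamilyArray() { return family; }
    public int getFamilyOffset() { return 0; }
    public int getFamilyLength() { return family.length; }
    public byte[] getQualifierArray() { return qualifier; }
    public int getQualifierOffset() { return 0; }
    public int getQualifierLength() { return qualifier.length; }
    public long getTimestamp() { return timestamp; }
}

final class KeyPrinter {
    private static String text(byte[] array, int offset, int length) {
        return new String(array, offset, length, StandardCharsets.UTF_8);
    }

    // Formats the key one field at a time; no whole-key buffer is needed.
    static String format(KeyFields c) {
        return text(c.getRowArray(), c.getRowOffset(), c.getRowLength())
            + "/" + text(c.getFamilyArray(), c.getFamilyOffset(), c.getFamilyLength())
            + ":" + text(c.getQualifierArray(), c.getQualifierOffset(), c.getQualifierLength())
            + "/" + c.getTimestamp();
    }
}
```

Because `format` only reads each field through its own array/offset/length triple, it would keep working if an implementation moved the fields into separate buffers.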
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882075#comment-13882075 ]

Lars Hofhansl commented on HBASE-7320:

Filed HBASE-10420. Let's leave this one unassigned as an umbrella issue.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881353#comment-13881353 ]

Lars Hofhansl commented on HBASE-7320:

Looks like Jenkins is taking a vacation.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881419#comment-13881419 ]

stack commented on HBASE-7320:

I went through trying to be careful. Looks great to me. If I had done it I would have missed loads of these... especially the likes of this:

{code}
-System.arraycopy(c.getTagsArray(), c.getTagsOffset(), newKV.getBuffer(),
+System.arraycopy(c.getTagsArray(), c.getTagsOffset(), newKV.getTagsArray(),
 newKV.getTagsOffset(), oldCellTagsLen);
{code}

Excellent [~lhofhansl]
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881472#comment-13881472 ]

Lars Hofhansl commented on HBASE-7320:

Even better, when we're done many of these array copies can go away, as we can just make a new Cell and point it to the existing arrays if we do not change them.

The next part is to get rid of the timestamp array API completely.

After that, I think, is to writeTo/readFrom files. The writing side might be simple as long as we keep the format; in that case the reading side might not have to change at all (it is still OK to back a Cell by a single byte[]).

Big parts are: ScanQueryMatcher, which dissects the KV itself, and all the comparators in KeyValue itself.
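The idea of making a new Cell that points at existing arrays instead of copying them can be sketched as follows. `FieldRef` and `RefCell` are hypothetical classes made up for this illustration and cover only a row and a value; they are not HBase types.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: names a field as an (array, offset, length) triple.
final class FieldRef {
    final byte[] array;
    final int offset;
    final int length;

    FieldRef(byte[] array, int offset, int length) {
        this.array = array;
        this.offset = offset;
        this.length = length;
    }

    String asString() {
        return new String(array, offset, length, StandardCharsets.UTF_8);
    }
}

// Toy cell (not an HBase class) holding only a row and a value.
final class RefCell {
    final FieldRef row;
    final FieldRef value;

    RefCell(FieldRef row, FieldRef value) {
        this.row = row;
        this.value = value;
    }

    // Returns a new cell with a different value. The row is shared by
    // reference with this cell, so no System.arraycopy is needed for it.
    RefCell withValue(byte[] newValue) {
        return new RefCell(row, new FieldRef(newValue, 0, newValue.length));
    }
}
```

Sharing is only safe while nobody mutates the shared array, which matches the "if we do not change them" condition in the comment.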
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881473#comment-13881473 ]

Lars Hofhansl commented on HBASE-7320:

As for this patch, should I put it into a subtask?
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881728#comment-13881728 ]

Hadoop QA commented on HBASE-7320:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12625095/7320-simple.txt
against trunk revision .
ATTACHMENT ID: 12625095

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 24 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ System.arraycopy(q, 0, newKV.getQualifierArray(), newKV.getQualifierOffset(), q.length);
+KeyValue rewriteKv = new KeyValue(newKv.getRowArray(), newKv.getRowOffset(), newKv.getRowLength(),
+KeyValue rewriteKv = new KeyValue(newKv.getRowArray(), newKv.getRowOffset(), newKv.getRowLength(),
{color:red}-1 site{color}. The patch appears to cause mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8502//console

This message is automatically generated.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881730#comment-13881730 ]

Lars Hofhansl commented on HBASE-7320:

Cool. Passes all tests. I'll fix the long lines. Then I think this is good to commit.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878927#comment-13878927 ]

Nick Dimiduk commented on HBASE-7320:

[~lhofhansl] if I follow your intentions, this means:
# KeyValue#getBuffer goes away entirely -- there's no API assumption that a KeyValue is backed by a single buffer object of any type (byte[], ByteBuffer, etc.). A KeyValue instance /could/ be backed by a single buffer object, at the option of its creator, but this is an implementation detail.
# A KeyValue object is, by API design, now backed by 5 buffer objects -- one each for rowkey, cf, qualifier, ts, and value.
# The previous point does not restrict a producer of KeyValue instances from using its own encoding of multiple instances, but it does require that producer to generate instances that conform to this API. For example, say I wanted to store KeyValues in batches of 100 where all rowkeys are stored together, then all cf, then quals, then ts, then values, and make optimizations therein. The requirement is that I can produce a KeyValue instance from the block that implements the getXXXArray methods AND I no longer must materialize a buffer object to support getBuffer.

Did I get that right? Do we have any thought on what an appropriate buffer object should be? Is that for another ticket?

Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray

Key: HBASE-7320
URL: https://issues.apache.org/jira/browse/HBASE-7320
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: stack
Fix For: 0.98.0

In many places this is a simple task of just replacing the method name. There are, however, quite a few places where we assume that:
# the entire KV is backed by a single byte array
# the KV's key portion is backed by a single byte array
Some of those can easily be fixed, others will need their own jiras.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
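The batch example in this comment (all rowkeys stored together, then all values) can be sketched roughly as below. `ColumnarBlock` and `CellView` are hypothetical names invented here, and families, qualifiers and timestamps are omitted for brevity. The point is only that a producer can hand out per-cell views that answer the getXXXArray-style calls without ever materializing one buffer per cell.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical columnar block: all row keys packed together, then all values.
// Families, qualifiers and timestamps are left out to keep the sketch short.
final class ColumnarBlock {
    private final byte[] rows;
    private final int[] rowEnds;    // rowEnds[i] is the exclusive end of row i
    private final byte[] values;
    private final int[] valueEnds;  // valueEnds[i] is the exclusive end of value i

    ColumnarBlock(String[] rowKeys, String[] vals) {
        rowEnds = new int[rowKeys.length];
        valueEnds = new int[vals.length];
        rows = pack(rowKeys, rowEnds);
        values = pack(vals, valueEnds);
    }

    // Concatenates the items and records where each one ends.
    private static byte[] pack(String[] items, int[] ends) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < items.length; i++) {
            byte[] b = items[i].getBytes(StandardCharsets.UTF_8);
            out.write(b, 0, b.length);
            ends[i] = out.size();
        }
        return out.toByteArray();
    }

    // View over entry i. Nothing is copied and no single per-cell buffer exists.
    final class CellView {
        private final int i;

        CellView(int i) { this.i = i; }

        byte[] getRowArray() { return rows; }
        int getRowOffset() { return i == 0 ? 0 : rowEnds[i - 1]; }
        int getRowLength() { return rowEnds[i] - getRowOffset(); }
        byte[] getValueArray() { return values; }
        int getValueOffset() { return i == 0 ? 0 : valueEnds[i - 1]; }
        int getValueLength() { return valueEnds[i] - getValueOffset(); }
    }

    CellView cell(int i) { return new CellView(i); }
}
```

A caller that only uses the array/offset/length accessors cannot tell this layout apart from one where each cell owns its own buffer.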
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879080#comment-13879080 ]

Lars Hofhansl commented on HBASE-7320:

Exactly. Would have to think a bit more about the TS. Do we want this to be backed by a byte[] as well, or do we completely get rid of the array/offset/length API for TS and keep just a getTimeStamp method?
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879120#comment-13879120 ]

Matt Corgan commented on HBASE-7320:

It would be nice to move towards callers of getTimestamp() expecting the long value returned by the Cell interface. I'm guessing there is little to negative performance gain from operating on the bytes directly.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877637#comment-13877637 ]

Andrew Purtell commented on HBASE-7320:

+1 for the deprecation. It already went in on the subtask anyhow.

bq. This is going to be a fun project. I started to look at all the times we get the family array. It's a bunch just to check if the KV family is legit client-side in the individual types

We get both family and qualifier arrays in the access controller because we have to look in descending order at perms for global, namespace, table, cf, cf + qualifier, cell.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877745#comment-13877745 ]

Lars Hofhansl commented on HBASE-7320:

bq. I started to look at all the times we get the family array.

That's legit, I think. Getting the row, family, qualifier and value via their own xyzArray method is OK. We can assume that the row, family, qualifier, and value are each laid out in ram in at least a byte[] (or bytebuffer). What we cannot assume is that there is any layout relationship between them.

What is not OK are at least these KeyValue methods:
* getBuffer
* getOffset
* getLength
* getKeyOffset
* getKeyLength
* getKey/getKeyString

This is because we should not assume that row/family/qualifier are laid out together, nor that the entire KV is laid out together.
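One consequence of the list above is that key comparison can no longer run over a single key buffer and has to go field by field. The sketch below shows that shape under stated assumptions: `Slice`, `ToyKey` and `FieldwiseComparator` are hypothetical names, the ordering shown (row, family, qualifier, then newest timestamp first) is a simplification, and the key type byte is ignored.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: a field named by (array, offset, length).
final class Slice {
    final byte[] array;
    final int offset;
    final int length;

    Slice(byte[] array, int offset, int length) {
        this.array = array;
        this.offset = offset;
        this.length = length;
    }

    static Slice of(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        return new Slice(b, 0, b.length);
    }
}

// Toy key (not an HBase class): three independent slices plus a timestamp.
final class ToyKey {
    final Slice row;
    final Slice family;
    final Slice qualifier;
    final long timestamp;

    ToyKey(Slice row, Slice family, Slice qualifier, long timestamp) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
    }
}

final class FieldwiseComparator {
    // Unsigned lexicographic comparison of two byte ranges.
    static int compare(Slice a, Slice b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a.array[a.offset + i] & 0xff;
            int y = b.array[b.offset + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Row, then family, then qualifier, then newest timestamp first.
    // No step assumes that two fields are adjacent in memory.
    static int compare(ToyKey a, ToyKey b) {
        int d = compare(a.row, b.row);
        if (d != 0) return d;
        d = compare(a.family, b.family);
        if (d != 0) return d;
        d = compare(a.qualifier, b.qualifier);
        if (d != 0) return d;
        return Long.compare(b.timestamp, a.timestamp);
    }
}
```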
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878242#comment-13878242 ]

ramkrishna.s.vasudevan commented on HBASE-7320:

Would love to help out in this area. While doing HFileV3 with cells combined with the encoders, I faced a lot of blockades because the KV format is so tightly coupled. Interesting places will be where we reseek/seek by creating a fake KV and do a compare to see if we have reached this KV or something greater than it.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878255#comment-13878255 ]

Lars Hofhansl commented on HBASE-7320:

If we all agree that it is OK to store row, family, qualifier, and value contiguously in ram, I can knock off all the simple cases next week.

Note that that would help for block encoding, where we would no longer need to rematerialize the KV just so that the entire KV is stored contiguously; but it would not help with the prefix tries, as there we'd even want to store partial rows, families, etc. (Correct, [~mcorgan]?)
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878266#comment-13878266 ]

stack commented on HBASE-7320:

Mighty [~lhofhansl] I am a little confused now. You have "If we all agree that it is OK to store row, family, qualifier, and value contiguously in ram..." but before this you have "... What we cannot assume is that there is any layout relationship between them."

I think I know what you are saying... but maybe clarify?
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878279#comment-13878279 ]

Matt Corgan commented on HBASE-7320:

{quote}If we all agree that it is OK to store row, family, qualifier, and value contiguously in ram{quote}
Sounds like a step in the right direction. So you would basically wrap naked byte[]'s in KeyValue objects rather than operating on them directly?

{quote}but it would not help with prefix tries.{quote}
Right. PrefixTreeCell goes further and breaks out each Cell field into its own primitive value/array. For example, PrefixTreeCell.getFamilyLength() just returns a value it has already extracted rather than calling a few methods to calculate the length each time it's called (like KeyValue).

{quote}as there we'd even want to store partial rows, families, etc{quote}
I'm probably reading too closely here. The PrefixTreeCell has a separate array for row/family/etc, but a single row/family/etc is not split further into fragments. (It is while encoded, but the decoder assembles each field into consecutive bytes.)
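The contrast between recomputing a length on every call and returning a pre-extracted value can be sketched like this. `FlatKey` and `ExtractedKey` are hypothetical classes; the flat layout used here (a 2-byte row length, the row, a 1-byte family length, the family) is an assumption of the sketch, loosely modelled on a length-prefixed key, and is not taken from this thread.

```java
import java.nio.charset.StandardCharsets;

// Toy flat key (assumed layout): [2-byte row length][row][1-byte family length][family].
final class FlatKey {
    private final byte[] buf;

    FlatKey(String row, String family) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        buf = new byte[2 + r.length + 1 + f.length];
        buf[0] = (byte) (r.length >> 8);
        buf[1] = (byte) r.length;
        System.arraycopy(r, 0, buf, 2, r.length);
        buf[2 + r.length] = (byte) f.length;
        System.arraycopy(f, 0, buf, 3 + r.length, f.length);
    }

    int getRowLength() {
        return ((buf[0] & 0xff) << 8) | (buf[1] & 0xff);
    }

    // Recomputed on every call: read the row length, skip the row, read one byte.
    int getFamilyLength() {
        return buf[2 + getRowLength()] & 0xff;
    }
}

// Pre-extracted variant: the length is decoded once and then simply returned.
final class ExtractedKey {
    private final int familyLength;

    ExtractedKey(FlatKey source) {
        this.familyLength = source.getFamilyLength();
    }

    int getFamilyLength() {
        return familyLength;
    }
}
```

Both classes return the same answer; the difference is only where the decoding work happens.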
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878285#comment-13878285 ]

Lars Hofhansl commented on HBASE-7320:

[~stack], sorry, I am having a hard time phrasing this concisely. The problem is that we currently assume two things:
# the entire KV is stored in one contiguous area in ram
# the key portion (row/family/qual/ts) is stored contiguously in ram

It is OK that the row key is stored contiguously in ram, same for the family, qualifier, and value. I.e. we do not break up row-key, family, qualifier, or value, but we no longer assume the entire KV or the key-portion are in one piece.

With that in mind things like getFamily, getFamilyArray, getFamilyOffset, getFamilyLength (and same for row, qualifier, and value) are OK. But getKeyLength, getLength, getBuffer, etc, are not OK.

Or in other words: to a caller it should not matter whether the KV is stored in one byte[] or whether there are separate byte[]s for some or all of rowkey, family, qualifier, and value.

[~mcorgan], cool, so this *would* help with prefix trees, if families, row-keys, and qualifiers are not broken down into fragments. How about the values, can we assume them in one piece, still?

I also have to go back and think about timestamp.
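A small sketch of "to a caller it should not matter": the same logical cell in two layouts, read through the same per-field accessors. `RowFamily`, `SingleBufferCell`, `SplitCell` and `Layouts` are hypothetical names introduced for illustration, restricted to row and family to stay short.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical two-field interface; the real Cell interface has more methods.
interface RowFamily {
    byte[] getRowArray();
    int getRowOffset();
    int getRowLength();
    byte[] getFamilyArray();
    int getFamilyOffset();
    int getFamilyLength();
}

// Layout 1: row and family share one backing array.
final class SingleBufferCell implements RowFamily {
    private final byte[] buf;
    private final int rowLen;

    SingleBufferCell(String row, String family) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        buf = new byte[r.length + f.length];
        System.arraycopy(r, 0, buf, 0, r.length);
        System.arraycopy(f, 0, buf, r.length, f.length);
        rowLen = r.length;
    }

    public byte[] getRowArray() { return buf; }
    public int getRowOffset() { return 0; }
    public int getRowLength() { return rowLen; }
    public byte[] getFamilyArray() { return buf; }
    public int getFamilyOffset() { return rowLen; }
    public int getFamilyLength() { return buf.length - rowLen; }
}

// Layout 2: row and family live in unrelated arrays.
final class SplitCell implements RowFamily {
    private final byte[] row;
    private final byte[] family;

    SplitCell(String row, String family) {
        this.row = row.getBytes(StandardCharsets.UTF_8);
        this.family = family.getBytes(StandardCharsets.UTF_8);
    }

    public byte[] getRowArray() { return row; }
    public int getRowOffset() { return 0; }
    public int getRowLength() { return row.length; }
    public byte[] getFamilyArray() { return family; }
    public int getFamilyOffset() { return 0; }
    public int getFamilyLength() { return family.length; }
}

final class Layouts {
    private static boolean sameBytes(byte[] a, int ao, int al, byte[] b, int bo, int bl) {
        if (al != bl) return false;
        for (int i = 0; i < al; i++) {
            if (a[ao + i] != b[bo + i]) return false;
        }
        return true;
    }

    // Works for any mix of layouts because it only uses per-field accessors.
    static boolean matchingRowAndFamily(RowFamily x, RowFamily y) {
        return sameBytes(x.getRowArray(), x.getRowOffset(), x.getRowLength(),
                         y.getRowArray(), y.getRowOffset(), y.getRowLength())
            && sameBytes(x.getFamilyArray(), x.getFamilyOffset(), x.getFamilyLength(),
                         y.getFamilyArray(), y.getFamilyOffset(), y.getFamilyLength());
    }
}
```

Code written against getBuffer, getLength or getKeyLength would only work with the first layout, which is the coupling this issue sets out to remove.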
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878292#comment-13878292 ] Matt Corgan commented on HBASE-7320: {quote}How about the values, can we assume them in one piece, still?{quote} Yes, an individual value is contiguous too. The CellScanner.advance() call decodes all these fields into an optimal format for reading, and then the CellScanner.current() method returns a reference to a PrefixTreeCell, which implements Cell. When you call cell.getWhatever(), it's a trivial pass-through call to each pre-extracted field. For example, the timestamp has already been converted from bytes to a long before you call cell.getTimestamp().
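The decode-on-advance pattern described here can be sketched roughly as follows. This is not the prefix-tree implementation: `DecodingScanner` and `DecodedCell` are hypothetical classes, and the record layout (8-byte timestamp, 4-byte value length, value bytes) is invented purely for illustration. Only the advance()/current() shape and the idea of pass-through getters come from the comment.

```java
import java.nio.ByteBuffer;

// Hypothetical cell holding fields that were extracted ahead of time.
final class DecodedCell {
    long timestamp;
    byte[] valueArray;
    int valueOffset;
    int valueLength;

    // Getters are plain pass-throughs; no decoding happens here.
    long getTimestamp() { return timestamp; }
    byte[] getValueArray() { return valueArray; }
    int getValueOffset() { return valueOffset; }
    int getValueLength() { return valueLength; }
}

// Hypothetical scanner over records laid out as
// [8-byte timestamp][4-byte value length][value bytes] (invented layout).
final class DecodingScanner {
    private final ByteBuffer in;
    private final DecodedCell cell = new DecodedCell(); // reused across calls
    private boolean valid;

    DecodingScanner(byte[] encoded) {
        this.in = ByteBuffer.wrap(encoded);
    }

    // Decodes the next record into the reusable cell; false when exhausted.
    boolean advance() {
        if (in.remaining() < 12) {
            valid = false;
            return false;
        }
        cell.timestamp = in.getLong();          // bytes -> long happens here
        int len = in.getInt();
        if (len < 0 || len > in.remaining()) {
            throw new IllegalStateException("corrupt value length: " + len);
        }
        cell.valueArray = in.array();           // the value stays contiguous
        cell.valueOffset = in.arrayOffset() + in.position();
        cell.valueLength = len;
        in.position(in.position() + len);       // skip over the value bytes
        valid = true;
        return true;
    }

    // Returns the cell populated by the last successful advance(), else null.
    DecodedCell current() {
        return valid ? cell : null;
    }
}
```

Because the sketch reuses one cell object, a reference returned by current() is overwritten by the next advance(); whether the real scanner behaves that way is not stated in the thread.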
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878305#comment-13878305 ] Andrew Purtell commented on HBASE-7320: --- bq. It is OK that the row key is stored contiguously in RAM, same for the family, qualifier, and value. I.e. we do not break up row-key, family, qualifier, or value, but we no longer assume the entire KV or the key-portion are in one piece. That would be great; that's the first step to HBASE-9794.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877235#comment-13877235 ] Lars Hofhansl commented on HBASE-7320: -- Yes. I was thinking we start with: # deprecate KeyValue.getBuffer(). Nobody should use this going forward # fix all obvious cases with one large patch (i.e. at places where we need only the row, use getRowArray() instead of getBuffer()) # for all non-trivial cases (like the readers and writers) file individual jiras to fix them #1 and #2 should be simple (albeit repetitive) tasks. #3 covers the interesting issues.
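Steps #1 and #2 of this plan might look roughly like the sketch below. `LegacyKeyValue` and `RowCopy` are hypothetical classes, and the layout (2-byte row length, row, value) is a simplification made up for the example, not the actual KeyValue format. The point is only the shape of the change: mark getBuffer() deprecated, then switch call sites that need just the row to getRowArray().

```java
import java.util.Arrays;

// Hypothetical single-array record: [2-byte row length][row][value].
final class LegacyKeyValue {
    private final byte[] bytes;

    LegacyKeyValue(byte[] row, byte[] value) {
        bytes = new byte[2 + row.length + value.length];
        bytes[0] = (byte) (row.length >>> 8);
        bytes[1] = (byte) row.length;
        System.arraycopy(row, 0, bytes, 2, row.length);
        System.arraycopy(value, 0, bytes, 2 + row.length, value.length);
    }

    // Step #1: keep the method but flag it so new callers are warned off.
    @Deprecated
    byte[] getBuffer() { return bytes; }

    // Per-component accessor; here it happens to return the same array,
    // but callers must not rely on that.
    byte[] getRowArray() { return bytes; }
    int getRowOffset() { return 2; }
    int getRowLength() { return ((bytes[0] & 0xff) << 8) | (bytes[1] & 0xff); }
}

final class RowCopy {
    // Before: reaches into the whole backing buffer to get at the row.
    @SuppressWarnings("deprecation")
    static byte[] copyRowOld(LegacyKeyValue kv) {
        int start = kv.getRowOffset();
        return Arrays.copyOfRange(kv.getBuffer(), start, start + kv.getRowLength());
    }

    // After (step #2): the same logic, but only the method name changes.
    static byte[] copyRowNew(LegacyKeyValue kv) {
        int start = kv.getRowOffset();
        return Arrays.copyOfRange(kv.getRowArray(), start, start + kv.getRowLength());
    }
}
```

In this toy class both methods return identical bytes, which is why the trivial cases are a rename. The cases in #3 are the ones where code indexes across component boundaries and no single accessor can replace getBuffer().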
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877239#comment-13877239 ] ramkrishna.s.vasudevan commented on HBASE-7320: --- bq. for all non-trivial cases (like the readers and writers) file individual jiras to fix them +1
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877301#comment-13877301 ] stack commented on HBASE-7320: -- Instead I made a subissue to get #1 into 0.98. Will leave this issue open as an umbrella issue.
[jira] [Commented] (HBASE-7320) Replace calls to KeyValue.getBuffer with appropropriate calls to getRowArray, getFamilyArray(), getQualifierArray, and getValueArray
[ https://issues.apache.org/jira/browse/HBASE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748277#comment-13748277 ] ramkrishna.s.vasudevan commented on HBASE-7320: --- bq. others will need their own jiras. In the sense that the Readers/Writers need changes, right? We tried working with Cells in these areas, and that required changes in the read path, such as how we do seek() and reseek() and the way the reader creates the KVs from the internal buffer. The readers are tightly coupled with the KV format.