[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v6.txt Implemented coding style comments. Added separate unit tests for all modified or added methods. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Fix For: 0.96.0 Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt, 5625v6.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v5.txt @Zhihong: Thanks for the review request. I actually had to make my own in order to upload the diff: https://reviews.apache.org/r/4607/ The performance actually depends on the system capabilities. It's hard to write a microbenchmark test for an issue that manifests itself on large I/O intensive jobs that put a lot of gc pressure. I implemented a few of Cosmin's suggestions. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt, 5625v5.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v4.txt @Zhihong: I added a new method, 'TestResult.testPerformance()'. This must be of course removed if the patch is accepted. It prints the results, so it doesn't require an integrated profiler. It takes a few minutes to run... Mentioned that the buffer isn't cleared or flipped in the 'KeyValue.loadValue()' method comment as well. Is that what you meant? Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Assignee: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt, 5625v4.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v3.txt @Zhihong: I initially submitted a minimalistic version so I wouldn't have to make extensive modifications after reviews. Now the method is called from a unit test class. The variable names were copied from existing methods and I wanted them to have a uniform naming scheme. I renamed the variables, but not the method parameters. The same for the method comment. Implemented code style comments. @stack: Implemented code style comments. Added check for buffer size. Created 'KeyValue.checkParameters()' method. Should 'createEmptyByteArray()' call it as well? 'containsNonEmptyColumn()' checks if the value exists is not empty; 'containsEmptyColumn()' checks if the value exists is empty. If you would have only one, for the other case you would have to actually read the value and check it. Moved most 'loadValue()' functionality to 'KeyValue'. This raises a problem: how do we elegantly treat the case when the buffer (provided from 'Result') isn't big enough? Refactored 'Result.getSearchTerm()' as another 'KeyValue.createFirstOnRow()'. The new 'binarySearch()' method avoids allocating a byte array. We run incremental jobs that update values; we also have to read different values form the same row in different places. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt, 5625v3.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Description: When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. was: When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. On our use case - with large map-reduce jobs - we have noticed a reduction of read time of up to 40%. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Labels: patch When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625.txt The current patch introduces a method (Result.loadValue()) that uses pre-allocated buffers. Momentarily the new methods duplicate code from the original ones so as not to impact the current architecture. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Labels: patch Attachments: 5625.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5625) Avoid byte buffer allocations when reading a value from a Result object
[ https://issues.apache.org/jira/browse/HBASE-5625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tudor Scurtu updated HBASE-5625: Attachment: 5625v2.txt Correct trunk diff. Avoid byte buffer allocations when reading a value from a Result object --- Key: HBASE-5625 URL: https://issues.apache.org/jira/browse/HBASE-5625 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.92.1 Reporter: Tudor Scurtu Labels: patch Attachments: 5625.txt, 5625v2.txt When calling Result.getValue(), an extra dummy KeyValue and its associated underlying byte array are allocated, as well as a persistent buffer that will contain the returned value. These can be avoided by reusing a static array for the dummy object and by passing a ByteBuffer object as a value destination buffer to the read method. The current functionality is maintained, and we have added a separate method call stack that employs the described changes. I will provide more details with the patch. Running tests with a profiler, the reduction of read time seems to be of up to 40%. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira