[jira] Commented: (HBASE-3165) some performance things i did

ryan rawson (JIRA) Thu, 28 Oct 2010 13:35:42 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925941#action_12925941
 ]


ryan rawson commented on HBASE-3165:
------------------------------------

The problem is that the code is pretty ugly and creates a 2nd body of 
serialization code.  I tried a lot of things here, and this is just a dump of 
what I did.  I need to change my measurement strategy, and test to see which 
one of the 2-3 approaches works best with minimal icky-code addition.  For 
example, the final attempt made it so that Result used the ByteBuffer 
interface directly, thus ending up with 2 implementations of the 
serialization (but only 1 of the deserialization). I also have a 
ByteBufferOutputStream which translates OutputStream writes into BB writes; 
that would probably be better for code maintainability, and it might be as 
fast as using BB directly.  I want proof of this instead of guessing. Sound 
reasonable?
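
For readers without the patch handy, here is a rough sketch of what an 
OutputStream-to-ByteBuffer adapter of that kind might look like. This is not 
the code from the attachments: the class name ByteBufferOutputStreamSketch, 
the doubling growth policy, and the toByteBuffer() accessor are all 
assumptions made for illustration.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch (not the class from the HBASE-3165 patch):
 * an OutputStream whose writes land in a growable heap ByteBuffer,
 * so existing stream-based serialization code can be reused as-is.
 */
class ByteBufferOutputStreamSketch extends OutputStream {
    private ByteBuffer buf;

    ByteBufferOutputStreamSketch(int initialCapacity) {
        buf = ByteBuffer.allocate(Math.max(1, initialCapacity));
    }

    /** Grow the backing buffer (assumed policy: at least double) when needed. */
    private void ensureRemaining(int extra) {
        if (buf.remaining() >= extra) {
            return;
        }
        int newCapacity = Math.max(buf.capacity() * 2, buf.position() + extra);
        ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
        buf.flip();          // switch to reading what has been written so far
        bigger.put(buf);     // copy it into the larger buffer
        buf = bigger;
    }

    @Override
    public void write(int b) {
        ensureRemaining(1);
        buf.put((byte) b);
    }

    @Override
    public void write(byte[] src, int off, int len) {
        ensureRemaining(len);
        buf.put(src, off, len);   // bulk put, no per-byte loop
    }

    /** Read-only view of the bytes written so far; no copy is made. */
    ByteBuffer toByteBuffer() {
        ByteBuffer view = buf.duplicate();
        view.flip();
        return view.asReadOnlyBuffer();
    }
}
```

Wrapping this in a java.io.DataOutputStream would let a single 
stream-based write path produce a ByteBuffer, which is the maintainability 
argument above. Whether it is as fast as writing to the ByteBuffer directly 
is exactly the open question and would need to be measured.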

> some performance things i did
> -----------------------------
>
>                 Key: HBASE-3165
>                 URL: https://issues.apache.org/jira/browse/HBASE-3165
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>         Attachments: HBASE-2165-2.txt, HBASE-2165.txt
>
>
> In an attempt to improve the profile of result serialization on the 
> regionserver side, I did a large number of things to reduce buffer copies, 
> improve the API usage efficiency (using the BB API directly) and so on.
> Using a YCSB config like so:
> recordcount=10000
> #recordcount=5
> operationcount=1000
> workload=com.yahoo.ycsb.workloads.CoreWorkload
> readallfields=true
> readproportion=0
> updateproportion=0
> scanproportion=1
> insertproportion=0
> fieldlength=10
> fieldcount=100
> requestdistribution=zipfian
> scanlength=300
> scanlengthdistribution=zipfian
> threadcount=1
> columnfamily=data
> Doing a medium sized scan of 1-300 rows.
> Top line performance was at about 67ms, but these micro improvements didn't 
> budge that needle, and they didn't change the scale of the CPU profile, 
> i.e. CPU time spent in serialization was the same.
> Since then I also made an improvement to HBase-YCSB which may have been 
> masking the performance gains.  I have suspended this work in favor of 0.90 
> pre-release work for now.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
