[ 
https://issues.apache.org/jira/browse/HBASE-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580743#comment-13580743
 ] 

Matt Corgan commented on HBASE-7868:
------------------------------------

I have a decent start on a benchmark that tests many different combinations of 
inputs such as blockSize, encoding, compression, keyLength, commonPrefixLength, 
and valueLength.  You can either generate fake test data or provide an existing 
HFile.  It tests scans and seeks, then outputs a summary of performance and 
memory/disk usage at the end so you can find the best settings for your use 
case.
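
To make "many different combinations of inputs" concrete, here is a rough 
sketch of how such a parameter grid could be expanded.  This is not the 
benchmark from my repo: the class name BenchmarkGrid, the method names, and 
every axis value below are assumptions chosen purely for illustration, and 
the loop only prints each configuration instead of running scans and seeks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: expands benchmark parameter axes into every combination.
public class BenchmarkGrid {

    /** Returns the Cartesian product of the named axes, one map per configuration. */
    public static List<Map<String, Object>> combinations(Map<String, List<Object>> axes) {
        List<Map<String, Object>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start from a single empty configuration
        for (Map.Entry<String, List<Object>> axis : axes.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : result) {
                for (Object value : axis.getValue()) {
                    // Copy the partial configuration and add one value for this axis.
                    Map<String, Object> extended = new LinkedHashMap<>(partial);
                    extended.put(axis.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    /** Illustrative axis values only (assumptions, not the real benchmark's settings). */
    public static Map<String, List<Object>> exampleAxes() {
        Map<String, List<Object>> axes = new LinkedHashMap<>();
        axes.put("blockSize", Arrays.<Object>asList(16 * 1024, 64 * 1024));
        axes.put("encoding", Arrays.<Object>asList("NONE", "PREFIX"));
        axes.put("compression", Arrays.<Object>asList("NONE", "GZ"));
        axes.put("keyLength", Arrays.<Object>asList(32));
        axes.put("commonPrefixLength", Arrays.<Object>asList(0, 16));
        axes.put("valueLength", Arrays.<Object>asList(100));
        return axes;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> configs = combinations(exampleAxes());
        System.out.println(configs.size() + " configurations");
        for (Map<String, Object> config : configs) {
            // A real benchmark would write or open an HFile with these settings,
            // run scans and seeks, and record timing and memory/disk usage here.
            System.out.println(config);
        }
    }
}
```

The grid grows multiplicatively with each axis, so in practice it helps to 
keep most axes short and vary only the settings you actually care about.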

It's lurking somewhere in my git repo.  I was planning to dig it up at the 
meetup tomorrow and get it working again.  Maybe we can combine all these 
benchmarks somehow.
                
> HFile performance regression between 0.92 and 0.94
> --------------------------------------------------
>
>                 Key: HBASE-7868
>                 URL: https://issues.apache.org/jira/browse/HBASE-7868
>             Project: HBase
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.94.5
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>             Fix For: 0.94.6
>
>         Attachments: FilteredScan.png, hfileperf-graphs.png, 
> performances.pdf, performances.pdf
>
>
> According to HFilePerformanceEvaluation, 0.94 seems to be slower than 0.92.
> Looking at the profiler for the Scan path, it seems that most of the time, 
> compared to 0.92, is spent in the metrics dictionary lookup. [~eclark] pointed 
> out the new per family/block metrics.
> Commenting out the metrics call in HFileReaderV2 seems to improve 
> performance, but metrics may not be the only problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
