[jira] Commented: (HBASE-2251) PE defaults to 1k rows - uncommon use case, and easy to hit benchmarks

Todd Lipcon (JIRA) Tue, 23 Feb 2010 09:10:49 -0800

    [ https://issues.apache.org/jira/browse/HBASE-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837317#action_12837317 ]


Todd Lipcon commented on HBASE-2251:
------------------------------------

bq. Then I can write a Hudson plugin that fails a build if performance is out 
of line beyond some threshold. What do you think?

Even without automatically failing builds, Hudson can easily generate a graph 
with build number on the x axis and arbitrary data on the y axis; you just 
have to emit the data in .properties format for each build. At a web company 
I worked for in the past, we had graphs of the number of db queries, number of 
cache queries, page load time, etc., for each of the important pages on the 
site. It was very easy to spot bad commits, but also easy to see if we were 
inching up slowly over time (even more insidious than a bad commit, imo).
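
As a rough illustration of the per-build data file, here is a minimal Python 
sketch. It assumes the graphing is done by Hudson's Plot plugin and that the 
plugin reads a YVALUE key from the .properties file; that key, the file name 
pe_elapsed.properties, the 10% tolerance, and both function names are 
assumptions for illustration, not something taken from this issue.

```python
import sys
from pathlib import Path


def write_plot_properties(path, value):
    """Write a single data point for this build in .properties format."""
    # Assumption: the plotting plugin picks up the number from a YVALUE key.
    Path(path).write_text(f"YVALUE={value}\n", encoding="utf-8")


def exceeds_threshold(current, baseline, tolerance=0.10):
    """Return True if `current` is more than `tolerance` above `baseline`."""
    return current > baseline * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical usage: python record_perf.py <elapsed_ms> <baseline_ms>
    elapsed_ms = float(sys.argv[1])
    baseline_ms = float(sys.argv[2])
    write_plot_properties("pe_elapsed.properties", elapsed_ms)
    # A non-zero exit status is one simple way to make a build step fail.
    sys.exit(1 if exceeds_threshold(elapsed_ms, baseline_ms) else 0)
```

The threshold check only catches a sudden jump against a fixed baseline; the 
slow drift described above still needs the graph, or a baseline that is 
deliberately not updated on every build.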

> PE defaults to 1k rows - uncommon use case, and easy to hit benchmarks
> ----------------------------------------------------------------------
>
>                 Key: HBASE-2251
>                 URL: https://issues.apache.org/jira/browse/HBASE-2251
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>             Fix For: 0.20.4, 0.21.0
>
>
> The PerformanceEvaluation uses 1k rows, which I would argue is uncommon, and 
> also provides an easy-to-hit performance goal.  Most of the harder 
> performance issues happen at the low and high ends of cell size.  In our own 
> application, our key sizes range from 4 bytes to maybe 100 bytes, and very 
> rarely 1000 bytes.  If we have large values, they are VERY large, like 
> multiple kilobytes.
> Recently a change went into HBase that ran well with PE because the overhead 
> of 1k rows is very low in memory, but with small rows performance would be 
> expected to take a much bigger hit.  This is because the per-value overhead 
> (e.g. node objects of the skip list/memstore) is amortized more with 1k 
> values.
> We should make this a tunable setting, and have a low default.  I would argue 
> for a 10-30 byte default.
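
The amortization argument in the description can be made concrete with a bit 
of arithmetic. The sketch below assumes a fixed bookkeeping cost per stored 
value; the 64-byte figure is purely hypothetical (the real per-entry cost of 
the memstore is not stated in this issue), and overhead_fraction is an 
illustrative helper, not an HBase API.

```python
def overhead_fraction(value_bytes, overhead_bytes):
    """Fraction of memory per entry that is bookkeeping rather than payload."""
    return overhead_bytes / (overhead_bytes + value_bytes)


# Hypothetical fixed per-entry cost (e.g. skip list node plus object headers).
ASSUMED_OVERHEAD_BYTES = 64

if __name__ == "__main__":
    for size in (10, 30, 1000):
        share = overhead_fraction(size, ASSUMED_OVERHEAD_BYTES)
        print(f"{size:>5} byte value: {share:.0%} of each entry is overhead")
```

Under that assumption, overhead is about 86% of each entry at 10 bytes but 
only about 6% at 1000 bytes, which is why a change that adds per-entry cost 
can look harmless in a 1k benchmark and still hurt small-value workloads.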

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
