Thanks for your findings.
The second finding is more interesting. According to the javadoc:
// if this isnt the row we are interested in, then bail:
firstKv should be replaced by kv.
Please file a JIRA.
On Wed, Jul 20, 2011 at 7:49 AM, N Keywal <[email protected]> wrote:
> Hello,
>
> Some words on the context: We're thinking about using HBase for a product
> we're developping. For this reason, I am currently looking at HBase source
> code to understand how to debug & modify HBase. To start with something
> simple but useful, I am looking for performance improvement by profiling
> hbase during the execution of the unit tests. I expect that many of the
> hotspots found on the unit tests are also hotspots in real production. I
> plan to spend around 10 m.d on this until september.
>
>
> The method regionserver.MemStore/updateColumnValue seems quite used, and is
> ultimatly responsible of 30% of the time in the test subsets I am using.
>
>
> There is bit of it that can be optimized easily by changing the conditions
> order:
>
> if (firstKv.matchingQualifier(kv)) {
> if (kv.getType() == KeyValue.Type.Put.getCode()) {
> now = Math.max(now, kv.getTimestamp());
> }
> }
>
> becomes:
> if (kv.getType() == KeyValue.Type.Put.getCode() &&
> kv.getTimestamp() > now &&
> firstKv.matchingQualifier(kv)) {
> now = kv.getTimestamp();
> }
>
> As comparing the qualifier is much more expensive, we put it at the end.
> It improve the performances by 3% (i.e: total execution time lowered by
> 3%).
>
>
> So first question: would you be interested by a patch for this kind of
> stuff?
>
>
>
> Second question (more technical...): in this method
> (regionserver.MemStore/updateColumnValue), I see:
>
> KeyValue firstKv = KeyValue.createFirstOnRow(
> row, family, qualifier);
>
> [...]
> while (it.hasNext()) {
> KeyValue kv = it.next();
>
> // if this isnt the row we are interested in, then bail:
> if (!firstKv.matchingColumn(family, qualifier) ||
> !firstKv.matchingRow(kv)) {
> break; // rows dont match, bail.
> }
>
> [...]
> }
>
> For the test "firstKv.matchingColumn(family, qualifier)", I don't see:
> 1) Why it is tested in the loop, as firstKv is not modified, the result
> won't change.
> 2) How the result can be 'false', as firstKv is inialized with the family
> and the parameters.
>
> Or is it shared for update a way or another?
>
> If we can remove it, we gain another 2%...
>
>
> N.
>