Author: stack
Date: Wed Apr 13 04:33:58 2011
New Revision: 1091644
URL: http://svn.apache.org/viewvc?rev=1091644&view=rev
Log:
HBASE-3768 Add best practice to book for loading row key only
Modified:
hbase/trunk/CHANGES.txt
hbase/trunk/src/docbkx/performance.xml
Modified: hbase/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc/hbase/trunk/CHANGES.txt?rev=1091644&r1=1091643&r2=1091644&view=diff
==============================================================================
--- hbase/trunk/CHANGES.txt (original)
+++ hbase/trunk/CHANGES.txt Wed Apr 13 04:33:58 2011
@@ -153,6 +153,8 @@ Release 0.91.0 - Unreleased
as a convenience (Erik Onnen via Stack)
HBASE-3769 TableMapReduceUtil is inconsistent with other table-related
classes that accept byte[] as a table name (Erik Onnen via
Stack)
+ HBASE-3768 Add best practice to book for loading row key only
+ (Erik Onnen via Stack)
TASKS
HBASE-3559 Move report of split to master OFF the heartbeat channel
Modified: hbase/trunk/src/docbkx/performance.xml
URL:
http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1091644&r1=1091643&r2=1091644&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Wed Apr 13 04:33:58 2011
@@ -199,5 +199,16 @@ htable.close();</programlisting></para>
<varname>false</varname>. For frequently accessed rows, it is advisable
to use the block
cache.</para>
</section>
+ <section xml:id="perf.hbase.client.rowkeyonly">
+ <title>Optimal Loading of Row Keys</title>
+ <para>When performing a table <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
+ where only the row keys are needed (no families, qualifiers,
values or timestamps), add a FilterList with a
+ <varname>MUST_PASS_ALL</varname> operator to the scan using
<methodname>setFilter</methodname>. The filter list
+ should include both a <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
+ and a <link
xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
+ With this filter combination, the worst case for a single row is
that a region server reads a single value from disk
+ and sends minimal network traffic to the client.
+ </para>
+ </section>
</section>
</chapter>
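
For illustration only (not part of the committed change): a minimal sketch of the filter combination the new section describes, assuming the HBase client library is on the classpath. The class name RowKeyOnlyScan and the method name rowKeyOnlyScan are hypothetical; the HBase classes and methods are the ones named in the section text.

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

// Hypothetical helper class, named for this example only.
public class RowKeyOnlyScan {

    // Builds a Scan carrying the row-key-only filter combination.
    static Scan rowKeyOnlyScan() {
        // MUST_PASS_ALL: a KeyValue is returned only if every filter accepts it.
        FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        // Keep only the first KeyValue of each row.
        filters.addFilter(new FirstKeyOnlyFilter());
        // Strip the value from each returned KeyValue, leaving the key.
        filters.addFilter(new KeyOnlyFilter());

        Scan scan = new Scan();
        scan.setFilter(filters);
        return scan;
    }
}
```

The resulting Scan would then be passed to the table's getScanner method, and each row key read from the returned results with Result.getRow(). Table setup and scanner cleanup are omitted here because they depend on the client version in use.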