With autoflush disabled (setAutoFlush(false)), the client-side write buffer works as intended, but it may not yield better performance for random inserts: org.apache.hadoop.hbase.client.HConnectionManager.TableServers.processBatchOfRows() still dispatches the buffered random rows to their respective region servers.
From the book O'Reilly Hadoop: The Definitive Guide (June 2009): "By default, each HTable.commit(BatchUpdate) actually performs the insert without any buffering. You can disable the HTable auto-flush feature using HTable.setAutoFlush(false) and then set the size of the configurable write buffer. When the inserts committed fill the write buffer, it is then flushed. Remember though, you must call a manual HTable.flushCommits() at the end of each task to ensure that nothing is left unflushed in the buffer. You could do this in an override of the mapper's close() method."
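A minimal sketch of the buffering pattern the book describes, written against the old (pre-0.90) HTable client API. The table name, column family, qualifier, and buffer size are illustrative assumptions, and the snippet needs a running HBase cluster plus the HBase client jars on the classpath, so it is not standalone-runnable:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedInsertSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // "testtable" is a hypothetical table name
        HTable table = new HTable(conf, "testtable");

        table.setAutoFlush(false);                  // enable client-side write buffering
        table.setWriteBufferSize(12 * 1024 * 1024); // e.g. a 12 MB buffer (assumed size)

        for (int i = 0; i < 100000; i++) {
            Put put = new Put(Bytes.toBytes("row-" + i));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
            table.put(put); // buffered locally; flushed automatically when the buffer fills
        }

        // Mandatory manual flush at the end of the task (e.g. in the mapper's
        // close() method), so nothing is left unflushed in the buffer.
        table.flushCommits();
        table.close();
    }
}
```

Note that even with buffering enabled, the flush groups edits per region server, so randomly distributed row keys still fan out to many servers, which is why the buffer alone may not speed up random inserts.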
