That is right.
If this is the case, and you are not under tight memory constraints
client-side, you can significantly increase the size of your write buffer.
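For example, a minimal sketch using the 0.19/0.20-era BatchUpdate client API quoted
below; the table name, the column, and the 12 MB buffer size are illustrative
assumptions, not recommendations:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.util.Bytes;

  public class BufferedInsertExample {
    public static void main(String[] args) throws IOException {
      HTable table = new HTable(new HBaseConfiguration(), "mytable"); // hypothetical table

      // Buffer commits client-side instead of sending each one immediately.
      table.setAutoFlush(false);
      table.setWriteBufferSize(12 * 1024 * 1024); // e.g. a 12 MB buffer

      for (int i = 0; i < 100000; i++) {
        BatchUpdate update = new BatchUpdate("row-" + i);       // hypothetical row key
        update.put("data:value", Bytes.toBytes("value-" + i));  // hypothetical family:qualifier
        table.commit(update); // goes into the write buffer, flushed when it fills
      }

      // Flush whatever is still sitting in the buffer before exiting.
      table.flushCommits();
    }
  }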
JG
Schubert Zhang wrote:
When autoflush is set to false, the client-side write buffer works as expected,
but we may not get better performance for random inserts, since
org.apache.hadoop.hbase.client.HConnectionManager.TableServers.processBatchOfRows()
will still send the random rows to their respective (different) servers.
In the book, O'Reilly's Hadoop: The Definitive Guide (June 2009):
<--
By default, each HTable.commit(BatchUpdate) actually performs the insert without
any buffering. You can disable HTable auto-flush feature using
HTable.setAutoFlush(false) and then set the size of configurable write buffer.
When the inserts committed fill the write buffer, it is then flushed. Remember
though, you must call a manual HTable.flushCommits() at the end of each task to
ensure that nothing is left unflushed in the buffer. You could do this in an
override of the mapper's close() method.
-->
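
In MapReduce terms, that last point could look roughly like the sketch below
(old org.apache.hadoop.mapred API; the table and column names are hypothetical),
with the flush done in an overridden close():

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.JobConf;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.Mapper;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  public class BufferedUploadMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private HTable table;

    @Override
    public void configure(JobConf job) {
      try {
        table = new HTable(new HBaseConfiguration(job), "mytable"); // hypothetical table
        table.setAutoFlush(false);
        table.setWriteBufferSize(12 * 1024 * 1024); // illustrative 12 MB buffer
      } catch (IOException e) {
        throw new RuntimeException(e);
      }
    }

    public void map(LongWritable key, Text line,
                    OutputCollector<NullWritable, NullWritable> output,
                    Reporter reporter) throws IOException {
      // Each commit lands in the client-side write buffer, not straight on a region server.
      BatchUpdate update = new BatchUpdate(line.toString());   // row key from input, for illustration
      update.put("data:raw", Bytes.toBytes(line.toString()));  // hypothetical family:qualifier
      table.commit(update);
    }

    @Override
    public void close() throws IOException {
      // Without this, anything still buffered when the task ends is never sent.
      table.flushCommits();
    }
  }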