The current writeBuffer is an ArrayList; could using a sorted list improve performance?
On Thu, Aug 13, 2009 at 5:06 AM, Jonathan Gray <[email protected]> wrote:

> That is right.
>
> If this is the case, and you are not under tight memory constraints
> client-side, you can significantly increase the size of your write buffer.
>
> JG
>
> Schubert Zhang wrote:
>
>> When autoflush=false is set, the client-side write buffer works, but we
>> may not get better performance for random inserts, since
>> org.apache.hadoop.hbase.client.HConnectionManager.TableServers.processBatchOfRows()
>> will still send the random rows to each different server separately.
>>
>> In the book OReilly.Hadoop.The.Definitive.Guide.June.2009:
>> <--
>> By default, each HTable.commit(BatchUpdate) actually performs the insert
>> without any buffering. You can disable the HTable auto-flush feature
>> using HTable.setAutoFlush(false) and then set the size of the
>> configurable write buffer. When the inserts committed fill the write
>> buffer, it is then flushed. Remember though, you must call a manual
>> HTable.flushCommits() at the end of each task to ensure that nothing is
>> left unflushed in the buffer. You could do this in an override of the
>> mapper's close() method.
>> -->
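For reference, a minimal sketch of the buffered-write pattern the quoted book passage describes, written against the 0.20-era client API (Put rather than the deprecated BatchUpdate); the table name, column family, and buffer size below are illustrative assumptions, not values from this thread:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWriteSketch {
  public static void main(String[] args) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");  // table name is hypothetical

    // Disable auto-flush so each put() accumulates in the client-side
    // write buffer instead of triggering an immediate RPC.
    table.setAutoFlush(false);

    // Enlarge the write buffer (hbase.client.write.buffer defaults to 2MB);
    // 8MB here is just an example value.
    table.setWriteBufferSize(8 * 1024 * 1024);

    for (int i = 0; i < 100000; i++) {
      Put put = new Put(Bytes.toBytes(String.format("row-%08d", i)));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
      table.put(put);  // buffered; flushed automatically when the buffer fills
    }

    // Flush whatever is still buffered at the end of the task, e.g. in an
    // override of the mapper's close() method.
    table.flushCommits();
  }
}

Note that, as Schubert points out above, buffering alone may not help much for random inserts, since the buffered rows still fan out to each target region server when the buffer is flushed.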
