The current writeBuffer is an ArrayList; could using a sorted list improve performance?
On Thu, Aug 13, 2009 at 5:06 AM, Jonathan Gray <[email protected]> wrote:

> That is right.
>
> If this is the case, and you are not under tight memory constraints
> client-side, you can significantly increase the size of your write buffer.
>
> JG
>
> Schubert Zhang wrote:
>
>> When autoflush=false is set, the client-side write buffer works, but we
>> may not get better performance for random inserts, since
>> org.apache.hadoop.hbase.client.HConnectionManager.TableServers.processBatchOfRows()
>> will still send the random rows to each different server separately.
>>
>> In the book OReilly.Hadoop.The.Definitive.Guide.June.2009:
>> <--
>> By default, each HTable.commit(BatchUpdate) actually performs the insert
>> without any buffering. You can disable the HTable auto-flush feature
>> using HTable.setAutoFlush(false) and then set the size of the
>> configurable write buffer. When the inserts committed fill the write
>> buffer, it is then flushed. Remember though, you must call a manual
>> HTable.flushCommits() at the end of each task to ensure that nothing is
>> left unflushed in the buffer. You could do this in an override of the
>> mapper's close() method.
>> -->
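For reference, a minimal sketch of the buffered-write pattern the quoted book passage describes, written against the 0.20-era client API (Put rather than the deprecated BatchUpdate); the table name, column family, and buffer size below are illustrative assumptions, not values from this thread:

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWriteSketch {
  public static void main(String[] args) throws IOException {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "mytable");  // table name is hypothetical

    // Disable auto-flush so each put() accumulates in the client-side
    // write buffer instead of triggering an immediate RPC.
    table.setAutoFlush(false);

    // Enlarge the write buffer (hbase.client.write.buffer defaults to 2MB);
    // 8MB here is just an example value.
    table.setWriteBufferSize(8 * 1024 * 1024);

    for (int i = 0; i < 100000; i++) {
      Put put = new Put(Bytes.toBytes(String.format("row-%08d", i)));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
      table.put(put);  // buffered; flushed automatically when the buffer fills
    }

    // Flush whatever is still buffered at the end of the task, e.g. in an
    // override of the mapper's close() method.
    table.flushCommits();
  }
}

Note that, as Schubert points out above, buffering alone may not help much for random inserts, since the buffered rows still fan out to each target region server when the buffer is flushed.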
