That explains it. Thanks!

On Thu, Jan 15, 2009 at 2:11 PM, Jean-Daniel Cryans <[email protected]> wrote:

> Larry,
>
> This feature was done for 0.19.0, for which a release candidate is on the
> way.
>
> J-D
>
> On Thu, Jan 15, 2009 at 2:03 PM, Larry Compton <[email protected]> wrote:
>
> > I'm interested in trying this, but I'm not seeing "setAutoFlush()" and
> > "setWriteBufferSize()" in the "HTable" API (I'm using HBase 0.18.1).
> >
> > Larry
> >
> > On Sun, Jan 11, 2009 at 5:11 PM, Ryan Rawson <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > New user of hbase here. I've been trolling about in IRC for a few
> > > days, and been getting great help all around so far.
> > >
> > > The topic turns to importing data into hbase - I have largeish
> > > datasets I want to evaluate hbase performance on, so I've been
> > > working at importing said data. I've managed to get some impressive
> > > performance speedups, and I chronicled them here:
> > >
> > > http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
> > >
> > > To summarize:
> > > - Use the native HBase API in Java or Jython (or presumably any JVM
> > > language)
> > > - Disable table auto-flush and set a large write buffer (12 MB for me)
> > >
> > > At this point I can import an 18 GB, 440M-row comma-separated flat
> > > file in about 72 minutes using map-reduce. This is on a 3-node
> > > cluster, all running hdfs, hbase, and mapred, with 12 map tasks
> > > (4 per node). This hardware is loaner DB hardware, so once I get my
> > > real cluster I'll revise/publish new data.
> > >
> > > I look forward to meeting some of you next week at the hbase meetup
> > > at powerset!
> > >
> > > -ryan

--
Larry Compton
SRA International
240.373.5312 (APL)
443.742.2762 (cell)
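[The buffered-write setup Ryan describes — disable auto-flush, set a 12 MB write buffer, then commit rows in a loop — can be sketched roughly as below. This is a minimal sketch against the 0.19-era client API: setAutoFlush() and setWriteBufferSize() are the methods named in the thread, while the BatchUpdate/commit/flushCommits calls, the table name "mytable", and the column "data:value" are assumptions for illustration (later HBase releases replaced BatchUpdate with Put).]

```java
// Sketch of a buffered bulk-import loop, per the thread's advice:
// disable auto-flush so edits accumulate client-side, and use a large
// (12 MB) write buffer so round-trips to the region servers are batched.
// Assumes the HBase 0.19-era client API; names below are illustrative.
import java.io.BufferedReader;
import java.io.FileReader;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class BufferedImport {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "mytable");
    table.setAutoFlush(false);                  // buffer edits client-side
    table.setWriteBufferSize(12 * 1024 * 1024); // 12 MB, as in the post

    try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
      String line;
      while ((line = in.readLine()) != null) {
        // First CSV field is the row key, the rest is the value.
        String[] fields = line.split(",", 2);
        BatchUpdate update = new BatchUpdate(fields[0]);
        update.put("data:value", fields[1].getBytes());
        table.commit(update);                   // queued until buffer fills
      }
    }
    table.flushCommits();                       // push any remaining edits
  }
}
```

The point of the buffer is that each commit() no longer costs a network round-trip; edits are shipped in bulk whenever the 12 MB buffer fills, with a final flushCommits() to drain what is left.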
