I'm interested in trying this, but I'm not seeing "setAutoFlush()" and "setWriteBufferSize()" in the "HTable" API (I'm using HBase 0.18.1).
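For reference, here's roughly what I was hoping to write based on the blog post. This is just a sketch against what I assume is a newer (0.19-era) client API -- the table/column names are placeholders, and the two buffer-related calls are exactly the ones I can't find in my 0.18.1 HTable, so maybe they only exist in trunk?

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class BufferedImport {
    public static void main(String[] args) throws IOException {
        HBaseConfiguration conf = new HBaseConfiguration();

        // "mytable" and "data:value" are placeholders for whatever table and
        // column the import actually targets.
        HTable table = new HTable(conf, "mytable");

        // These are the two methods I don't see in 0.18.1 -- assuming a newer client:
        table.setAutoFlush(false);                    // don't flush on every commit
        table.setWriteBufferSize(12 * 1024 * 1024);   // ~12 MB client-side write buffer

        for (int i = 0; i < 1000000; i++) {
            BatchUpdate update = new BatchUpdate("row-" + i);
            update.put("data:value", ("value-" + i).getBytes());
            table.commit(update);                     // buffered, not sent per row
        }

        table.flushCommits();                         // push anything still buffered
    }
}

Is that the intended usage, or is there another way to get the same batching behavior on 0.18.1?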
Larry

On Sun, Jan 11, 2009 at 5:11 PM, Ryan Rawson <[email protected]> wrote:
> Hi all,
>
> New user of hbase here. I've been trolling about in IRC for a few days, and
> been getting great help all around so far.
>
> The topic turns to importing data into hbase - I have largeish datasets I
> want to evaluate hbase performance on, so I've been working at importing
> said data. I've managed to get some impressive performance speedups, and I
> chronicled them here:
>
> http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
>
> To summarize:
> - Use the native HBase API in Java or Jython (or presumably any JVM
> language)
> - Disable table auto flush, and set the write buffer large (12 MB for me)
>
> At this point I can import an 18 GB, 440M-row comma-separated flat file in
> about 72 minutes using map-reduce. This is on a 3-node cluster all running
> hdfs, hbase, and mapred with 12 map tasks (4 per node). This hardware is
> loaner DB hardware, so once I get my real cluster I'll revise/publish new
> data.
>
> I look forward to meeting some of you next week at the hbase meetup at
> Powerset!
>
> -ryan
