I am not an HBase expert, so you might get better results asking on their 
mailing lists than on the MR mailing lists.

My first step with any performance problem would be to look for the resource 
bottlenecks. What type of networking are you using?  How many spindles (disks) 
per box do you have configured?  How much RAM is on each box, and how much is 
configured for HBase?  How much of each of these resources is being used on the 
various boxes when running your job?  How large are your batch updates?
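
As a rough point of reference for those questions, the target in the original 
message can be turned into a required write rate. The sketch below is only 
back-of-the-envelope arithmetic: it assumes 1 GB = 1024 MB and a perfectly even 
spread across the 12 nodes, and the function name is made up for illustration. 
HBase also writes a write-ahead log and HDFS replicates blocks, so the real 
disk and network traffic will be higher than this raw figure.

```python
# Rough throughput target implied by the question (2 GB in 80 s on 12 nodes).
# Assumptions: 1 GB = 1024 MB, and the load spreads evenly across all nodes.

def required_throughput_mb_s(total_gb, seconds, nodes):
    """Return (aggregate MB/s, per-node MB/s) for the given load target."""
    aggregate = total_gb * 1024 / seconds
    return aggregate, aggregate / nodes

if __name__ == "__main__":
    # Both targets from the question describe the same sustained rate.
    for total_gb, seconds in [(2, 80), (6, 240)]:
        aggregate, per_node = required_throughput_mb_s(total_gb, seconds, nodes=12)
        print(f"{total_gb} GB in {seconds} s: {aggregate:.1f} MB/s total, "
              f"{per_node:.2f} MB/s per node")
```

Comparing these numbers with the network, disk, and memory usage actually 
measured on each box should show which resource runs out first.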

--Bobby Evans

From: Farrokh Shahriari <mohandes.zebeleh...@gmail.com>
Reply-To: hdfs-user@hadoop.apache.org
Date: Saturday, January 5, 2013 11:20 PM
To: cdh-u...@cloudera.org, hdfs-user@hadoop.apache.org
Subject: Tune MapReduce over HBase to insert data

Hi there,
I have a cluster of 12 nodes, each with 2 CPU cores. I want to insert a large 
amount of data, about 2 GB in 80 sec (or 6 GB in 240 sec). I've used MapReduce 
over HBase, but I can't achieve that result.
I'd be glad if you could tell me what I can do to get better results, or which 
parameters I should configure or tune to improve MapReduce/HBase performance.

Thanks
