I am not an HBase expert, so you might get better results asking on their mailing lists than on the MR mailing lists.
With any performance problem, my first step would be to look for the resource bottlenecks. What type of networking are you using? How many spindles (disks) per box do you have configured? How much RAM is on each box, and how much is configured for HBase? How much of each of these resources is being used on the various boxes when running your job? How large are your batch updates?

--Bobby Evans

From: Farrokh Shahriari <mohandes.zebeleh...@gmail.com>
Reply-To: "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
Date: Saturday, January 5, 2013 11:20 PM
To: "cdh-u...@cloudera.org" <cdh-u...@cloudera.org>, "hdfs-user@hadoop.apache.org" <hdfs-user@hadoop.apache.org>
Subject: Tune MapReduce over HBase to insert data

Hi there,

I have a cluster with 12 nodes, each of which has 2 CPU cores. Now I want to insert a large amount of data, about 2Gb in 80 sec (or 6Gb in 240 sec). I've used MapReduce over HBase, but I can't achieve the result I need. I'd be glad if you could tell me what I can do to get a better result, or which parameters I should configure or tune to improve MapReduce/HBase performance.

Tnx
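
For reference when checking the resource questions above, here is a rough sketch of the write rate the stated target implies. It assumes that "Gb" in the question means gigabytes, that 1 GB = 1024 MB, and that writes are spread evenly across all 12 nodes; none of this is confirmed in the thread, and the function names are only illustrative.

```python
# Back-of-the-envelope throughput for the target in the question.
# Assumptions (not stated in the thread): "Gb" means gigabytes,
# 1 GB = 1024 MB, and the load is spread evenly over all 12 nodes.

NODES = 12  # cluster size given in the question


def required_mb_per_sec(total_gb, seconds):
    """Aggregate ingest rate (MB/s) needed to load total_gb in seconds."""
    return total_gb * 1024 / seconds


def per_node_mb_per_sec(total_gb, seconds, nodes=NODES):
    """Share of the aggregate rate each node must absorb (MB/s)."""
    return required_mb_per_sec(total_gb, seconds) / nodes


if __name__ == "__main__":
    for total_gb, seconds in [(2, 80), (6, 240)]:
        aggregate = required_mb_per_sec(total_gb, seconds)
        per_node = per_node_mb_per_sec(total_gb, seconds)
        print(f"{total_gb} GB in {seconds} s: "
              f"{aggregate:.1f} MB/s aggregate, {per_node:.2f} MB/s per node")
```

Both targets come out to the same aggregate rate, so the question is about sustaining one throughput level rather than two different ones. The figure covers raw payload only; the write-ahead log and HDFS replication add to the actual disk and network traffic, which is why the per-box utilization numbers matter.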