William Clay Moody wrote:

stack wrote:
--------------------
> Are you using TableOutputFormat to populate hbase? If so, have you seen
> this document, http://wiki.apache.org/hadoop/Hbase/MapReduce, and in
> particular the discussion in the last paragraph of the section 'Hbase as
> MapReduce job data source and sink'?  You might be able to lighten the
> general load doing the hbase insert in the MR Map.


I am using TableOutputFormat and HBase is the data sink. Our data source is a set of TextInputFormat files in HDFS. From my reading of the discussion, it seems you can only write to HBase from the map if you are using TableInputFormat.

See Andrew's nice note just previous to this one (I've added it to our MR wiki page).

> I agree with Andrew's assessment. You might try running just one
> or two TTs on nodes that do not also run regionservers and datanodes.

I will give that a shot. I have tripled the lease times on both the master and HRS side and begun importing data such that the target regions are different, which has lightened the load on the HRS. I also decreased the memcache flush size from 64 MB to 8 MB, since we were using 900 MB of swap. The jobs now run to completion and swap only gets to 150 MB. Is there any other memory setting for HBase that I can adjust to overcome my lack of RAM (1 GB only)?
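For anyone following along, a flush-size override like the one described above goes in conf/hbase-site.xml. The property name below is what I believe contemporary releases use (the value is bytes), but check the hbase-default.xml shipped with your version before copying it; this is a sketch, not a verified config:

```xml
<!-- conf/hbase-site.xml -->
<!-- Assumed property name; confirm against your release's hbase-default.xml. -->
<property>
  <name>hbase.hregion.memcache.flush.size</name>
  <!-- 8 MB in bytes, down from the 64 MB default -->
  <value>8388608</value>
</property>
```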

There is the general heap size. It defaults to 1G. You set it in $HBASE_HOME/conf/hbase-env.sh. Try setting it to 850M or 768M. You might OOME, but if you can avoid swapping, your sailing will be smoother.
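Concretely, the heap is controlled by the HBASE_HEAPSIZE variable in that file (the value is in megabytes), so something like the following; the specific number is just the suggestion above, pick what leaves your 1 GB node headroom:

```shell
# $HBASE_HOME/conf/hbase-env.sh
# HBASE_HEAPSIZE is in megabytes; 768 leaves ~256 MB for the OS and
# other daemons on a 1 GB node instead of the 1000 MB default.
export HBASE_HEAPSIZE=768
```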

St.Ack
