Re: HBase not scaling well

Cosmin Lehene Fri, 29 Oct 2010 01:06:10 -0700

Hi Hari, 

Could you do some real-time monitoring (htop, iptraf, iostat) and report the 
results? You could also add some timers to the map-reduce operations and measure 
average operation times to figure out what's taking so long.
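
As a rough illustration of the timer idea, here is a minimal sketch in plain 
Java. The OpTimer class and the operation names ("parse", "put") are 
hypothetical, not part of Hadoop or HBase; it only accumulates elapsed times 
per operation so the averages can be logged when a map task finishes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop or HBase class): accumulates elapsed
// time per named operation so that averages can be reported later.
class OpTimer {
    // For each operation name: [0] = total nanoseconds, [1] = sample count.
    private final Map<String, long[]> stats = new HashMap<>();

    // Record one measured duration for the given operation.
    void record(String op, long elapsedNanos) {
        long[] s = stats.computeIfAbsent(op, k -> new long[2]);
        s[0] += elapsedNanos;
        s[1]++;
    }

    // Number of samples recorded for the operation (0 if never seen).
    long count(String op) {
        long[] s = stats.get(op);
        return s == null ? 0 : s[1];
    }

    // Average duration in milliseconds (0.0 if never seen).
    double averageMillis(String op) {
        long[] s = stats.get(op);
        return (s == null || s[1] == 0) ? 0.0 : (s[0] / 1e6) / s[1];
    }

    // One line per operation, suitable for printing to the task log.
    String report() {
        StringBuilder sb = new StringBuilder();
        for (String op : stats.keySet()) {
            sb.append(op).append(": n=").append(count(op))
              .append(" avg_ms=").append(averageMillis(op)).append('\n');
        }
        return sb.toString();
    }
}

// Assumed usage inside a map() method, around the call being measured:
//   long t0 = System.nanoTime();
//   table.put(put);                       // the HBase write
//   timer.record("put", System.nanoTime() - t0);
```

Printing timer.report() at the end of each map task would show whether the 
time goes into parsing the input or into the writes to HBase, which can then 
be compared with what iostat and iptraf show on each node.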


Cosmin
On Oct 29, 2010, at 9:55 AM, Hari Shankar wrote:

> Hi,
> 
>     We are currently doing a POC for HBase in our system. We have
> written a bulk upload job to upload our data from a text file into
> HBase. We are using a 3-node cluster: one master, which also works as a
> slave (running namenode, jobtracker, HMaster, datanode,
> tasktracker, HQuorumpeer and HRegionServer), and 2 slaves (running datanode,
> tasktracker, HQuorumpeer and HRegionServer). The problem is
> that we are getting lower performance from the distributed cluster than
> we were getting from a single-node pseudo-distributed setup. The
> upload takes about 30 minutes on an individual machine, whereas
> it takes 2 hrs on the cluster. We have replication set to 3, so
> all parts should ideally be available on all nodes, and we doubt that the
> problem is network latency. scp of files between nodes gives a speed
> of about 12 MB/s, which I believe should be good enough for this to
> function. Please correct me if I am wrong here. The nodes are all 4-core
> machines with 8 GB RAM. We are spawning 4 simultaneous map tasks
> on each node, and the job does not have any reduce phase. Any help is
> greatly appreciated.
> 
> Thanks,
> Hari Shankar
