Hi Hari,

Could you do some real-time monitoring (htop, iptraf, iostat) and report the results? You could also add some timers to the map-reduce operations and measure average operation times to figure out what's taking so long.
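For the timers, something along these lines might be enough. It is only a rough sketch in plain Java with no Hadoop or HBase dependencies; the class name OpTimer and the operation labels ("parse", "put") are made up for illustration and are not part of any library. The idea is to accumulate elapsed time per named operation and print the averages once the task finishes.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: accumulates wall-clock time per named operation. */
final class OpTimer {
    /** Running total and call count for one operation name. */
    private static final class Stat {
        long totalNanos;
        long count;
    }

    // TreeMap keeps the report sorted by operation name.
    private final Map<String, Stat> stats = new TreeMap<>();

    /** Adds one measured duration (in nanoseconds) under the given name. */
    void record(String op, long elapsedNanos) {
        Stat s = stats.computeIfAbsent(op, k -> new Stat());
        s.totalNanos += elapsedNanos;
        s.count++;
    }

    /** Runs the body and records how long it took, even if it throws. */
    void time(String op, Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            record(op, System.nanoTime() - start);
        }
    }

    /** Number of samples recorded for the operation (0 if never seen). */
    long count(String op) {
        Stat s = stats.get(op);
        return s == null ? 0 : s.count;
    }

    /** Average duration in milliseconds (0.0 if never seen). */
    double averageMillis(String op) {
        Stat s = stats.get(op);
        return (s == null || s.count == 0) ? 0.0 : s.totalNanos / 1e6 / s.count;
    }

    /** One line per operation: name, sample count, average in ms. */
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Stat> e : stats.entrySet()) {
            sb.append(String.format("%s: n=%d avg=%.3f ms%n",
                    e.getKey(), e.getValue().count, averageMillis(e.getKey())));
        }
        return sb.toString();
    }
}
```

In the map task you would wrap the line parsing and the HBase write in separate time(...) calls (or call record(...) with your own System.nanoTime() deltas), then log report() when the task is closing. Comparing the averages between the single-node run and the cluster run should show whether the extra time goes into the writes or somewhere else.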
Cosmin

On Oct 29, 2010, at 9:55 AM, Hari Shankar wrote:

> Hi,
>
> We are currently doing a POC for HBase in our system. We have
> written a bulk upload job to upload our data from a text file into
> HBase. We are using a 3-node cluster, one master which also works as
> slave (running as namenode, jobtracker, HMaster, datanode,
> tasktracker, HQuorumpeer and HRegionServer) and 2 slaves (datanode,
> tasktracker, HQuorumpeer and HRegionServer running). The problem is
> that we are getting lower performance from distributed cluster than
> what we were getting from single-node pseudo distributed node. The
> upload is taking about 30 minutes on an individual machine, whereas
> it is taking 2 hrs on the cluster. We have replication set to 3, so
> all parts should ideally be available on all nodes, so we doubt if the
> problem is network latency. scp of files between nodes gives a speed
> of about 12 MB/s, which I believe should be good enough for this to
> function. Please correct me if I am wrong here. The nodes are all 4
> core machines with 8 GB RAM. We are spawning 4 simultaneous map tasks
> on each node, and the job does not have any reduce phase. Any help is
> greatly appreciated.
>
> Thanks,
> Hari Shankar
