St.Ack you are always helping us!
Thank you very much!!!
The cluster has an NFS share where the home directories of all users are
stored (when I log in, my working directory is on the NFS).
I have Hadoop and HBase installed on the local filesystem of each node.
However, is there any way to make HBase use the NFS? Should I use any
parameters other than those below? (I sketch what I have in mind right
after the configuration.)
hbase-site.xml is the following:
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://clone11:9000/hbase</value>
  <description>The directory shared by RegionServers.
  </description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>clone11</value>
  <description>A
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>A
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/local/panton/hadoop-0.20.2-cdh3u0/dfs/zoo</value>
  <description>A
  </description>
</property>
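
To make the question more concrete, here is a minimal sketch of what I
have in mind: pointing hbase.rootdir at the NFS-mounted directory with a
file:// URI instead of HDFS. The mount point /nfs/home/panton is a
hypothetical path, and I am not sure whether this is advisable.

<property>
  <name>hbase.rootdir</name>
  <value>file:///nfs/home/panton/hbase</value>
  <description>Hypothetical: HBase root on the NFS-mounted home directory
    instead of HDFS.
  </description>
</property>

Would something like this work in distributed mode, or is there a better
way to make HBase use the NFS?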
>> You mean heap used?
Yes, you are totally right. I got mixed up.
>>Are all maps in flight when some complete in 2 minutes?
Yes. There are always 48 maps running.
>>What is happening with i/o as we go from 2-15 minutes? Is it going up as
>>time progresses? What about the network? Does iowait go up as job
>>progresses?
How can I monitor the I/O and the network? Can you please suggest a tool?
I am not the admin of the cluster, so I may have to ask the admins to
install it.
I/O wait (from the top command) is generally steady at around 0-15%, even
when the tasks take much longer, going up only for a couple of moments.
The idle percentage is always very high.
>>What is the map doing? A get only? Or is it also populating the cluster so
>>more data in the system when maps are taking longer to complete.
Each map does a GET to load a string from HBase and compares it with a
string that comes as input.
When I run the GET against an empty table, the CPU usage is really high,
the I/O wait is low, the map tasks complete much faster, and the time is
steady for all map tasks.
On the other hand, if I keep the GET against the normal table (which has
many rows) and remove all context.write() calls, the problem remains.
However, it gets a bit smaller: the first tasks need 2-3 minutes and the
following ones need about 6-7 minutes.
This is why I believe it has to do with HBase and the GET. Do you think this is
a correct assumption?
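
For reference, here is a minimal sketch of the kind of map function I am
running. It is not the actual code: the table name "mytable", the column
family "cf", the qualifier "q", and the tab-separated input format are
placeholders.

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CompareMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

  private HTable table;

  @Override
  protected void setup(Context context) throws IOException {
    // One HTable per task, reused for every map() call.
    table = new HTable(HBaseConfiguration.create(context.getConfiguration()), "mytable");
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // The input line carries the row key and the string to compare, tab-separated.
    String[] parts = value.toString().split("\t");
    if (parts.length < 2) {
      return; // skip malformed lines
    }
    Get get = new Get(Bytes.toBytes(parts[0]));
    get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"));
    Result result = table.get(get);
    String stored = Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q")));
    // Compare the stored string with the one that comes as input.
    if (stored != null && stored.equals(parts[1])) {
      context.write(new Text(parts[0]), new IntWritable(1));
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.close();
  }
}

The HTable is opened once in setup() and reused, so each map() call only
pays for the Get itself.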
>> Do you have many regions? Are they evenly distributed, etc.
Yes, I always take care to pre-split the table and keep the regions evenly
distributed.
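
In case it is useful, this is roughly how I pre-split the table. It is
only a sketch: the table name, column family, and split keys are
hypothetical, chosen to match the expected row-key distribution.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    HTableDescriptor desc = new HTableDescriptor("mytable");
    desc.addFamily(new HColumnDescriptor("cf"));
    // Split keys spread over the expected row-key range so the regions
    // start out evenly distributed across the RegionServers.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("2"), Bytes.toBytes("4"), Bytes.toBytes("6"), Bytes.toBytes("8")
    };
    admin.createTable(desc, splits);
  }
}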
> Date: Wed, 7 Sep 2011 08:32:29 -0700
> Subject: Re: HBase slowdown while running MR job with GET
> From: [email protected]
> To: [email protected]
>
> 2011/9/7 Panagiotis Antonopoulos <[email protected]>:
> > Although the map tasks which run first complete fast (in 2 minutes for
> > example), the next map tasks need much more time to complete (4 mins),
> > and even later the following map tasks need more than 15 mins to
> > complete.
> >
>
> Are all maps in flight when some complete in 2 minutes? What is
> happening with i/o as we go from 2-15 minutes? Is it going up as time
> progresses? What about the network? What is the map doing? A get
> only? Or is it also populating the cluster so more data in the
> system when maps are taking longer to complete. Do you have many
> regions? Are they evenly distributed, etc.
>
> > It seems like HBase overloads and cannot respond fast enough.
> >
> > While the MR job is running I have noticed the following:
> >
> > 1) The cpu usage of the map tasks is high at the beginning and then goes
> > down to 4-5%. I think that this means that the results of the GET command
> > take long to be returned.
> >
>
> This could be. Does iowait go up as job progresses?
>
> > 2) The used stack of the RegionServers (as shown in the web GUI) increases
> > and it doesn't decrease even when the job is completed.
> >
>
> You mean heap used? Yeah, that's the general tendency of Java apps. There
> is no 'shrink the allocated heap when done' facility.
>
>
> > 3) Using the "top" command, I see that the memory used by the regionserver
> > increases up to the stack limit I have selected (2GB) and it doesn't go
> > down even when the job is completed.
> >
>
> See above.
> St.Ack