2011/9/7 Panagiotis Antonopoulos <[email protected]>:
> Although the map tasks which run first complete fast (in 2 minutes for 
> example) then the next map tasks need much more time to complete (4mins) and 
> even later the following map tasks need more than 15 mins to complete.
>

Are all maps in flight when some complete in 2 minutes?  What is
happening with i/o as we go from 2 to 15 minutes?  Is it going up as
time progresses?  What about the network?  What is the map doing?  A
get only?  Or is it also populating the cluster, so that there is more
data in the system by the time the maps are taking longer to complete?
Do you have many regions?  Are they evenly distributed?

> It seems like HBase overloads and cannot respond fast enough.
>
> While the MR job is running I have noticed the following:
>
> 1) The cpu usage of the map tasks is high at the beginning and then goes down 
> to 4-5%. I think that this means that the results of the GET command take 
> long to be returned.
>

This could be.  Does iowait go up as the job progresses?
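
If you do not already have something like iostat or sar running on the
nodes, a rough way to watch iowait over the life of the job is to sample
/proc/stat yourself.  The below is only a sketch and assumes Linux; the
function names and the 5 second interval are made up for illustration.

```python
import time


def parse_cpu_line(line):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Field order: user nice system idle iowait irq softirq steal ...
    # Guest time is already counted inside user, so sum the first 8 only.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_counters(path="/proc/stat"):
    """Read the first (aggregate) line of /proc/stat."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    # Hypothetical 5 second sampling interval; stop with Ctrl-C.
    prev = read_cpu_counters()
    while True:
        time.sleep(5)
        cur = read_cpu_counters()
        print("iowait: %.1f%%" % iowait_percent(prev, cur))
        prev = cur
```

Run it on a regionserver node for the length of the job and see whether
the number climbs as the maps slow down.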

> 2) The used stack of the RegionServers (as shown in the web GUI) increases 
> and it doesn't decrease even when the job is completed.
>

You mean heap used?  Yeah, that's the general tendency of Java apps.
There is no facility to 'shrink' the allocated heap when done.


> 3) Using the "top" command, I see that the memory used by the regionserver 
> increases up to the stack limit I have selected (2GB) and it doesn't go down 
> even when the job is completed.
>

See above.
St.Ack
