Thanks for confirming the order of magnitude of the slowdown. IO and network are doing OK. The main problem appears to be GC, as J-D pointed out.

Thanks,
Friso

On 16 dec 2010, at 18:26, tsuna wrote:

> On Wed, Dec 15, 2010 at 6:44 AM, Friso van Vollenhoven
> <[email protected]> wrote:
>> On the master UI HBase shows doing between 10K and 50K requests per second
>> with quite some drops to almost zero for some amount of time, while without
>> WAL for the same job it easily reaches over 100K sustained.
>
> Those numbers are in line with my experience. Disabling the WAL gives
> you a performance boost of about an order of magnitude. It's expected
> because no WAL = memory accesses only vs. WAL = HDFS latency.
>
> If your cluster is grinding to a halt past a certain point, you might
> wanna take a look at the IO utilization on your boxes and maybe also
> network utilization. How many disks per box do you have? What type
> of disks are they? Have you looked at the output of "iostat -xkd 1"
> while the cluster was performing poorly? Do you see a lot of iowait
> ("wa" column in "top")?
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
