dan...@zoltak.com:
> I downgraded the kernel to 2.6.31-gentoo-r6, applied the AUFS patch
> and compiled the kernel and module.
>
> I then booted a node without AUFS loaded and ran it in an 8 node
> cluster where the other nodes were running on the pre-upgrade image.
>
> The rootfs was a tmpfs in both configurations, i.e. the configuration
> is identical once the nodes are booted. The only major difference is
> the boot method: the new image used Dracut instead of nash/mkinitrd.

Unfortunately I don't fully understand your environment. But I suppose
that is not so important, as long as you say "there is no difference
except using aufs."


> Comparing the nodes the load graphs are almost identical:
>
>   Refer to old_setup.png and root_tmpfs_no_aufs.png.

Were all the graphs you attached shrunk? All the text labels are
unreadable.


> old_setup load average 1.68, 1.71, 1.74. Also note the load is quite smooth.

I still don't understand why you focus on the load average so strongly.
As you might know, the load average indicates the number of runnable
tasks. Since your system has 8 cores, when the load average reaches 8,
each cpu has a task to run. As long as the load average is under 8,
some of your cpus must be idle, i.e. they have no task to run.
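
To make that comparison concrete, here is a minimal sketch in Python
(my choice of language; nothing in this thread depends on it). It reads
the load averages with os.getloadavg() and compares the 1-minute value
with the cpu count. The helper name idle_cpu_estimate is only
illustrative, and the result is a rough estimate, not a measurement.

```python
import os


def idle_cpu_estimate(load_1min, ncpu):
    """Rough number of cpus with no task to run for a given load average."""
    # A load average at or above the cpu count leaves no cpu idle.
    return max(0.0, ncpu - load_1min)


if __name__ == "__main__":
    load1, load5, load15 = os.getloadavg()
    ncpu = os.cpu_count() or 1  # cpu_count() may return None
    print(f"load averages: {load1:.2f} {load5:.2f} {load15:.2f}, cpus: {ncpu}")
    print(f"roughly {idle_cpu_estimate(load1, ncpu):.2f} cpus idle")
```

With the numbers from this thread, a load average of 1.68 on 8 cores
leaves roughly 6 cpus with nothing to run.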

You might be confusing performance with the load average.
As I wrote before, I'd strongly suggest that you measure the turn-around
time on your http client. It should be the first performance indicator
of your http server.
If it becomes worse (longer time) when you use aufs on your http server,
then we should investigate why, by looking at cpu usage, network i/o,
local disk i/o, memory usage, lock contention or something else.
I agree that a high load average means a longer turn-around time, i.e.
bad performance. But you have never shown the "performance," just the
load average and the fact that the apache process keeps running. If
there is something strange in the behaviour of apache, then we should
investigate it. But we could not find such a problem, could we?
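
As a rough illustration of measuring the turn-around time on the client
side, here is a minimal Python sketch that uses only the standard
library. The function names, the request count and the URL in the usage
note are my assumptions. A real benchmark tool would also issue
concurrent requests, which this sketch does not.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return the seconds taken to fetch url, including reading the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.perf_counter() - start


def summarize(samples):
    """Return (min, median, max) of a non-empty list of timings."""
    return min(samples), statistics.median(samples), max(samples)


def measure(url, count=100):
    """Fetch url count times, one after another, and summarize the timings."""
    return summarize([time_request(url) for _ in range(count)])
```

For example, measure("http://node1.example/") would be run once against
a node with aufs and once against a node without it, and the two
results compared. That host name is a placeholder.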


> Note: Apache runs within the /chroot folder.

Is it same to the /proc/mounts you posted previously?


> root_Aufs load average spiked after a few minutes to 600 and stayed  
> there. The box was still responsive however dmesg showed the following  
> error:

If the number is correct while there are not so many processes, then the
load average is simply bogus.
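
One way to check whether such a number is plausible is to count the
tasks by state. On Linux the load average counts tasks in state R
(runnable) and also those in state D (uninterruptible sleep). A minimal
Python sketch follows, assuming /proc is mounted; the function names
are mine.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so the state is taken after the last ')'.
    """
    return stat_line[stat_line.rfind(")") + 1:].split()[0]


def count_task_states():
    """Count every thread on the system by its state letter."""
    counts = Counter()
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as f:
                counts[parse_state(f.read())] += 1
        except OSError:
            continue  # the task exited while we were scanning
    return counts


if __name__ == "__main__":
    counts = count_task_states()
    print("R (runnable):", counts["R"], " D (uninterruptible):", counts["D"])
```

If R plus D stays far below the reported load average for several
minutes, the two numbers do not agree.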


> Call Trace:
>   [<ffffffff810afe5f>] ? 0xffffffff810afe5f
>   [<ffffffff8109c024>] ? 0xffffffff8109c024
        :::

You should convert these addresses into kernel function names with
ksymoops or something similar. But I don't know whether gentoo provides
such a tool.
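
If no such tool is at hand, the lookup itself is simple. A minimal
Python sketch, assuming you have the System.map of exactly the kernel
build that produced the trace (or /proc/kallsyms from the running
kernel): find the nearest symbol at or below each address. The function
names are mine, and the file location in the usage note is an
assumption.

```python
import bisect


def load_symbols(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the nearest symbol at or below address."""
    addresses = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addresses, address) - 1
    if i < 0:
        return None  # address is below the first known symbol
    base, name = symbols[i]
    return f"{name}+{address - base:#x}"
```

For example, with symbols = load_symbols(open("System.map")), calling
resolve(symbols, 0xffffffff810afe5f) gives the function name and the
offset for the first address in your trace. The result is meaningless
if the map does not match the kernel that printed the trace.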


J. R. Okajima
