Hi gang,

   Elapsed Time 1 hour 2 mins 24 secs

vs

   Elapsed Time 54 mins 40 secs

Can you tell which is which? The bottom one is vmgump [1]. Both times are quite short since xml-xerces has been failing, so relatively more time is spent in io-wait. For a fuller run, brutus seems to be up to 15% faster.

So I got gump set up and running in its own vm.

A preliminary conclusion seems to be that gump-on-vmware performance is quite acceptable. Gump switches between waiting for network io (where the other end is the bottleneck 99% of the time) and doing heavy compile stuff (java processes taking 100% of a CPU if they can). Memory usage rarely seems to get above 500 megs, though /proc/meminfo gives a

HighTotal:     1179584 kB

meaning gump will most likely spend quite a bit of time swapping stuff in and out at some point during its run if you give it less than a GB of memory. It would be interesting to collect memory usage data over time. Might be a good excuse to learn DTrace and use a solaris zone :-)
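
Collecting that data could be as simple as sampling /proc/meminfo every
so often during a run and graphing the result afterwards. Here's a
minimal python sketch (the script and output file names are just made
up, nothing gump-specific):

#!/usr/bin/env python
# mem-sample.py -- hypothetical helper: append one CSV line of
# /proc/meminfo values every INTERVAL seconds, so memory and swap
# pressure during a gump run can be plotted later.
import time

FIELDS = ["MemTotal", "MemFree", "Buffers", "Cached", "SwapTotal", "SwapFree"]
INTERVAL = 30  # seconds between samples

def read_meminfo():
    """Return /proc/meminfo as a dict of values in kB."""
    info = {}
    for line in open("/proc/meminfo"):
        parts = line.split()
        if len(parts) >= 2 and parts[0].endswith(":"):
            info[parts[0][:-1]] = int(parts[1])
    return info

if __name__ == "__main__":
    out = open("meminfo.csv", "a")
    out.write("time," + ",".join(FIELDS) + "\n")
    while True:
        info = read_meminfo()
        row = [str(int(time.time()))] + [str(info.get(f, 0)) for f in FIELDS]
        out.write(",".join(row) + "\n")
        out.flush()
        time.sleep(INTERVAL)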

Disk usage atm (one profile):

[EMAIL PROTECTED]:~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             133M   53M   73M  43% /
tmpfs                1011M  4.0K 1011M   1% /dev/shm
/dev/sda7             361M  8.1M  334M   3% /tmp
/dev/sda5             4.6G  423M  4.0G  10% /usr
/dev/sda6             2.8G  377M  2.3G  15% /var
/dev/sda9              21G  7.5G   13G  39% /x1

so a single gump profile still fits within 10GB. Contrast with brutus though:

[EMAIL PROTECTED]:~]$ df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.4G  5.7G  1.3G  82% /
tmpfs                1011M     0 1011M   0% /dev/shm
/dev/sda6              25G   14G  9.4G  59% /home
/dev/sdb6              32G   18G   13G  60% /usr

Clearly, disk space management on brutus could be a little tighter, and just as clearly, gump expands over time, filling up with lots of logs and the like.

Actually watching gump churn, there are some stretches where it nicely utilizes both CPUs, but most of the time there's just one process running, completely saturating a single CPU.

So, I think atm these are our absolute minimum requirements for gump running one "profile" on a dedicated machine, building all of ASF software:

 * 1 GHz CPU
 * 512MB ram
 * 12 GB of disk
 * Excellent network connectivity (100MBit to cvs and svn services
   please!)

And the recommended setup is something like this:

 * dual 3+ GHz CPU
 * 4GB ram
 * 100GB of fast disk (I'm guessing esp. access time matters)
 * internal gigabit connection to cvs and svn server

on which you could comfortably run all our profiles on top of vmware and have some room for growth. More hardware than that could only be used to run the profiles more often, which would be nice, but really isn't necessary.

However, that's not what we have available to us.

Gump on loki setup
==================
The basic idea is that we get 1 VM running httpd and mysql (and maybe mod_python or whatever), and 2 VMs running 2 of the gump "profiles" currently on brutus (leaving 2 more on brutus). In addition, we get some extra disk space which we share between these VMs.


I think a realistic and usable config of gump on loki could look like this:

Virtual Machines
----------------
 * vmgump.apache.org
   * Debian Stable
   * 3GB / disk
   * 3GB /x2 disk
   * memory
     * minimum 128MB
     * maximum 256MB
   * CPU #0
     * minimum 10
     * maximum 80

 Runs our mysql and httpd and possibly webapps. Has a public IP.

 * vm1.gump
   * Debian Unstable
   * 3GB / disk
   * memory
     * minimum 256MB
     * maximum 1024MB
   * CPU #0
     * minimum 0
     * maximum 90

 Runs our "public" gump profile.

 * vm2.gump
   * Debian Unstable
   * 3GB / disk
   * memory
     * minimum 256MB
     * maximum 1024MB
   * CPU #0
     * minimum 0
     * maximum 90

 Runs our "testing" gump profile.

Other resources
---------------
 * .gump internal virtual network
   * shared between all gump instances
   * vmgump.apache.org needs to act as http reverse proxy to the other
     VMs (see the config sketch after this list)

 * network bridge
   * allow access for all instances to the internet

 * /x1 disk shared between vmgump, vm1 and vm2
   * "as big as possible", ie 100+ GB would be nice
   * realistically, chuck us 15GB for now
   * potential for offloading onto our 1TB disk array or
     onto minotaur via NFS or whatever
   * /home and all gump data can be shared between all gump
     instances including vmgump, potentially /usr as well.

 * more disk space!
   * room for backing up 3GB-sized snapshots. I.e. a "hotspare" virtual
     disk.
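
 For the reverse proxy bit above, a minimal httpd config sketch for
 vmgump could look something like the following. The vm1.gump/vm2.gump
 hostnames are the internal names from the list above; the /gump/test
 path is just an assumed example, and mod_proxy is assumed to be loaded:

 # hypothetical fragment for vmgump's httpd.conf
 ProxyRequests Off
 ProxyPass        /gump/public/  http://vm1.gump/gump/public/
 ProxyPassReverse /gump/public/  http://vm1.gump/gump/public/
 ProxyPass        /gump/test/    http://vm2.gump/gump/test/
 ProxyPassReverse /gump/test/    http://vm2.gump/gump/test/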

Notes
-----
 * overcommit!
   * gump only needs lots of network or lots of CPU or lots of memory
     at certain relatively short stages in its lifecycle; most of the
     time it needs a lot less than the peak amount. Clever scheduling
     among the gump VMs means we can overcommit rather a lot for CPU and
     memory and still ensure snappy runs.

 * keep basic web presence
   * reserve CPU and memory for vmgump so it stays available if the
     other vms make a mess

 * play nice with other services
   * put Gump on one CPU, keeping the other free for other stuff.
   * allow gump to scale back quite a bit if there's resource
     contention. It'll likely be swapping like hell and some builds may
     be problematic but that is acceptable.
   * we really do need to use a lot of that memory.

 * it doesn't make sense to run all our brutus profiles on loki
   * no resources left for anything but gump
   * we can migrate the other profiles from brutus to the dual G5 we're
     getting and into solaris zones

 * this leaves completely free for other stuff:
   * nearly 1 full CPU (console takes up ~10%?)
   * 2GB of uncontested memory
   * +/- 15GB of disk?
     * we're at 27GB in the scheme above; keeping some room free for
       admin and vmware itself, there's simply not a whole lot to spare
     * potential for 15GB more when gump gets space for its data
       elsewhere

 * scaling back much further than this probably isn't a very good idea.
   We really do seem to need the memory and the disk. Scaling back
   further on CPU is an option, but doesn't make all that much sense
   since there's just no more memory left.


Gump on Solaris setup
=====================
If we get a similar amount of resources dedicated to us on helios, we can move the remaining "profiles" to run on top of that. That would free up brutus for other tasks (e.g. mail server backup and/or freebsd build box).


Gump on Dual G5 setup
=====================
Gump3 anyone? :-)


What do you guys think? Comments? Suggestions? I'd like to gather a round of input here, then discuss it with Berin and the rest of [EMAIL PROTECTED]



cheers,


Leo

[1] http://vmgump.apache.org/gump/public/

