OK, I am fairly new here (to OpenStack). Maybe I am missing something.

Have a DevStack, running in a VM (VirtualBox), backed by a single flash
drive (on my current generation MacBook). Could be I have something off in
my setup.

Testing nova backup - first the existing implementation, then my (much
changed) replacement.

Simple scripts for testing. Create images. Create instances (five). Run
backup on all instances.

Currently found in:

First time I started backups of all (five) instances, load on the DevStack
VM went insane, and all but one backup failed. Seems that all of the
backups were performed immediately (or attempted), without any sort of
queuing or load management. Huh. Well, maybe just the backup implementation
is naive...

I will write on this at greater length, but backup should interfere as
little as possible with foreground processing. Overloading a host is
entirely unacceptable.

Replaced the backup implementation so it does proper queuing (among other
things). Iterating forward - implementing and testing.

Fired off snapshots on five Cinder volumes (attached to five instances).
Again the load shot very high. Huh. Well, in a full-scale OpenStack setup,
maybe storage can handle that much I/O more gracefully ... or not. Again,
should taking snapshots interfere with foreground activity? I would say,
most often not. Queuing and serializing snapshots would strictly limit the
interference with foreground. Also, very high-end storage can perform
snapshots *very* quickly, so serialized snapshots will not be slow. My take
is that the default behavior should be to queue and serialize all heavy I/O
operations, with non-default allowances for limited concurrency.
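
To make that default concrete, here is a minimal sketch in Python. It is
not OpenStack code and does not reflect how Nova or Cinder schedule work;
the names HeavyIOQueue and fake_snapshot are made up for illustration, and
the "snapshot" is just a short sleep. The point is only the shape: one
worker by default, so heavy jobs run one at a time, with an explicit
setting to allow limited concurrency.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class HeavyIOQueue:
    """Hypothetical helper: queue heavy I/O jobs behind a bounded worker pool."""

    def __init__(self, max_concurrent=1):
        # max_concurrent=1 serializes jobs; later submissions wait their turn.
        self._pool = ThreadPoolExecutor(max_workers=max_concurrent)
        self._lock = threading.Lock()
        self._running = 0
        self.peak_running = 0  # highest number of jobs seen running at once

    def submit(self, job, *args):
        # Returns a Future immediately; the job starts when a worker is free.
        return self._pool.submit(self._run, job, args)

    def _run(self, job, args):
        with self._lock:
            self._running += 1
            self.peak_running = max(self.peak_running, self._running)
        try:
            return job(*args)
        finally:
            with self._lock:
                self._running -= 1

    def shutdown(self):
        # Block until every queued job has finished.
        self._pool.shutdown(wait=True)


def fake_snapshot(volume_id):
    # Stand-in for a real snapshot; sleeps briefly instead of doing I/O.
    time.sleep(0.01)
    return f"snap-{volume_id}"


# Five requests arrive at once, but only one runs at any moment.
io_queue = HeavyIOQueue()
futures = [io_queue.submit(fake_snapshot, i) for i in range(5)]
results = [f.result() for f in futures]
io_queue.shutdown()
```

Passing max_concurrent=2 (or more) would be the non-default allowance for
storage that can take it. A real implementation would also need to be
per-host or per-backend, which this sketch ignores.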

Cleaned up (which required reboot/unstack/stack and more). Tried again.

Ran two test backups (which in the current iteration create Cinder volume
snapshots). Asked Cinder to delete the snapshots. Again, very high load
factors, and in "top" I can see two long-running "dd" processes. (Given I
have a single disk, more than one "dd" is not good.)

Running too many heavyweight operations against storage can lead to
thrashing. Queuing can strictly limit that load, and ensure better and
more reliable performance. I am not seeing evidence of this thought in my
OpenStack testing.

So far it looks like there is no thought to managing the impact of disk
intensive management operations. Am I missing something?
OpenStack-dev mailing list
