Bayard Bell wrote:
CPU isn't given to the VM quite as easily as, say, RAM pages. VirtualBox
internally needs to run a few threads doing disk/network I/O, and the host OS is in
the same situation, so essentially some experimentation is the best way to figure out
how many vCPUs it is reasonable to give the guest for the best performance.
Any suggestions as to how to go about that methodically?
Well, just try a representative subset (10-20 minutes of compilation)
with 1, 2, 3, 4... vCPUs and see the result :).
This could easily be automated with vboxshell and the guest command execution
facility.
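For illustration, a sweep like that could also be driven from the host with
VBoxManage; a minimal sketch in Python, assuming a VM named "builder", ssh access
into the guest, and that a representative slice of the build can be started with a
single command (all of those names and paths are placeholders):

#!/usr/bin/env python
# Sweep vCPU counts and time a fixed slice of the build in the guest.
# Placeholders/assumptions: the VM is called "builder", the guest is
# reachable as user@guest over ssh, and 10-20 minutes of compilation
# can be started with a single command.
import subprocess
import time

VM = "builder"
GUEST = "user@guest"
BUILD_CMD = "cd /path/to/project && make partial-target"
BOOT_WAIT = 120   # seconds to let the guest finish booting

def vbox(*args):
    subprocess.check_call(["VBoxManage"] + list(args))

for cpus in (1, 2, 3, 4):
    vbox("modifyvm", VM, "--cpus", str(cpus))   # VM must be powered off here
    vbox("startvm", VM, "--type", "headless")
    time.sleep(BOOT_WAIT)

    start = time.time()
    subprocess.check_call(["ssh", GUEST, BUILD_CMD])
    print("%d vCPU(s): %.1f s" % (cpus, time.time() - start))

    vbox("controlvm", VM, "acpipowerbutton")    # ask the guest to shut down
    time.sleep(60)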
What I know is that the run queue seems to back up to the point of crushing the
host if I provide only two vCPUs, while with 4 vCPUs, I only seem to get
consumption of 2 actual CPUs. I've got a slight further wrinkle, insofar as the
default behaviour of the build environment is to look at the number of CPUs and
amount of memory and decide for itself what the appropriate level of
parallelism is, although I can work around this by setting a fixed value before
experimenting with CPU count. Just to give this a bottom line, if I haven't
mentioned this previously: I've got a compile job that normally takes at most a
few hours on comparable bare metal, and it's taking several days under VBox.
Resolving this is the difference between being able to get acceptably slower
performance under VBox and needing to sort myself out with a separate system.
Is the project you're compiling open source? That could make the analysis simpler.
With compilation, especially if you compile a lot of small files, a significant
part of the load is fork/exec performance (and thus the VMM in the guest), and of
course I/O matters too.
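A crude way to see how big that part is would be to time a spawn-heavy loop inside
the guest and on comparable bare metal; a rough sketch (the choice of /bin/true and
the iteration count are arbitrary):

#!/usr/bin/env python
# Rough fork/exec microbenchmark: spawn /bin/true repeatedly and report
# the rate. Run it inside the guest and on comparable bare metal and
# compare the two numbers.
import subprocess
import time

N = 1000
start = time.time()
for _ in range(N):
    subprocess.call(["/bin/true"])
elapsed = time.time() - start
print("%d fork/execs in %.2f s (%.1f per second)" % (N, elapsed, N / elapsed))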
The I/O is trivial, but what I'm gathering is that the CPU overhead of the
system calls is increased considerably. I don't see a lot of fork and exec
load, but what I'm wondering is whether time spent in the kernel would actually
be relatively longer, such that relatively lightweight system calls on a normal
host would add up to a considerably higher percentage of CPU time in a virtual
environment.
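One way I could try to put a number on that would be to compare the user/sys split
of the same build step in the guest and on bare metal; a sketch along these lines,
with the build command as a placeholder:

#!/usr/bin/env python
# Report how much of a build's CPU time is spent in the kernel. If
# syscalls really are relatively more expensive under virtualization,
# the sys share should be noticeably higher in the guest than on bare
# metal for the same build step.
import resource
import subprocess

BUILD_CMD = ["make", "some-target"]   # placeholder build step

subprocess.call(BUILD_CMD)
usage = resource.getrusage(resource.RUSAGE_CHILDREN)
total = usage.ru_utime + usage.ru_stime
pct = 100.0 * usage.ru_stime / total if total else 0.0
print("user %.1f s, sys %.1f s (%.0f%% in kernel)" %
      (usage.ru_utime, usage.ru_stime, pct))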
Syscalls per se aren't affected much by virtualization, but the privileged
operations they perform sometimes are.
Generally, this needs deeper analysis, and you may want to try running the
same guest on a different host OS (ideally on
the same hardware) to see whether any host-specific effects show up.
Also, I'm not sure OS X is the best OS for running an SMP load in general.
I don't think you really need that. As VBox doesn't do explicit gang scheduling,
some assistance from the host scheduler would be helpful here, rather than explicit
assignment of CPU affinity. In theory, a good scheduler should gang-schedule
threads sharing the same address space even without additional hints, as this will
likely increase performance. Not sure whether OS X does that, though.
Thanks for that info. I'll see if there's any documentation or source to
satisfy my curiosity on this point. It might also be useful to see what DTrace
can tell me. Does VBox have its own DTrace probes to help with these kinds of
problems?
I don't think VBox has many probes of its own, but even OS-level traces
could be quite useful.
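For example, on a DTrace-capable system (the OS X host, or a Solaris guest) a plain
syscall count against the process of interest already tells a lot; a sketch that
just wraps the usual one-liner, with the pid as a placeholder and root privileges
assumed:

#!/usr/bin/env python
# Count syscalls made by one process (e.g. the VM process on the host,
# or the compiler in a Solaris guest) using DTrace. Needs root; stop it
# with Ctrl-C to see the per-syscall counts.
import subprocess
import sys

pid = sys.argv[1]   # pid of the process to watch (placeholder)
d_script = 'syscall:::entry /pid == $target/ { @[probefunc] = count(); }'

subprocess.call(["dtrace", "-p", pid, "-n", d_script])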
Nikolay
_______________________________________________
vbox-dev mailing list
[email protected]
http://vbox.innotek.de/mailman/listinfo/vbox-dev