On 02/15/2016 01:30 AM, Benjamin Bergia wrote:
> Hi,
>
> I recently noticed that all the packages used by SDC zones are using some strange settings. All of them, even database ones, are using 1 vCPU with a cap of 400. I can picture in my head the "meaning" of the CPU cap when the cap is lower than the sum of the CPU percentages. So something like 4 vCPUs and a cap of 200 makes sense to me.
>
> Can somebody explain to me what happens with a setting of, let's say, 1 vCPU and a cap of 200?
>
> In the SDC case, what was the idea behind this kind of setting? Does it give any performance/portability/other improvement?
>
> I am rethinking my packages, and where I previously used settings like I would on vSphere, I am now wondering whether I am doing it totally wrong.

There are 3 important concepts: vcpus, cpu_shares, and cpu_cap. From the vmadm man page (my further comments are below that):

       vcpus:

           For KVM VMs this parameter defines the number of virtual CPUs
           the guest will see. Generally recommended to be a multiple of 2.

           type: integer (number of CPUs)
           vmtype: KVM
           listable: yes
           create: KVM only
           update: KVM only (requires VM reboot to take effect)
           default: 1

       cpu_shares:

           Sets a limit on the number of fair share scheduler (FSS) CPU
           shares for a VM. This value is relative to all other VMs on the
           system, so a value only has meaning in relation to other VMs. If
           you have one VM with a value of 10 and another with a value of
           50, the VM with 50 will get 5x as much time from the scheduler
           as the one with 10 when there is contention.

           type: integer (number of shares)
           vmtype: OS,KVM
           listable: yes
           create: yes
           update: yes (live update)
           default: 100

       cpu_cap:

           Sets a limit on the amount of CPU time that can be used by a VM.
           The unit used is the percentage of a single CPU that can be used
           by the VM. Eg. a value of 300 means up to 3 full CPUs.

           type: integer (percentage of single CPUs)
           vmtype: OS,KVM
           listable: yes
           create: yes
           update: yes (live update)


First, note that "vcpus" from a SmartOS perspective only applies to KVM VMs. That setting determines how many processors the VM can see and thus can use to schedule its processes. We'll come back to that.
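
You can see this split on a SmartOS box directly; a quick check (a sketch, assuming a box with a mix of OS zones and KVM VMs) would be:

    # vcpus only shows a value for KVM VMs; OS zones leave it blank
    vmadm list -o uuid,type,brand,vcpus,cpu_shares,cpu_cap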

Zones (both the kind containing the QEMU process for KVM VMs and regular LX- and joyent-branded ones) can see all of the physical processors, and the OS can schedule processes on any of them. If you imagine yourself as a multi-tenant cloud provider, you'll quickly realize that you need two things:

1. Fairness (preventing noisy neighbors) when the system is fully loaded. This is what cpu_shares does. If you give every zone the appropriate number of shares, they will all get a proportional amount of system CPU when the system is fully loaded.

2. Paying for what you get. On a system that is *not* fully loaded, a zone could in theory use lots and lots of CPUs. Customers would be incentivized to create and destroy zones until they found one that could use lots of free CPU. This is where CPU caps come in: they ensure that on a system that is *not* fully loaded, the zone can only burst up to a reasonable amount relative to what the customer is paying.

This also helps manage expectations. Setting a CPU cap reasonably close to the amount of CPU the customer gets when the system *is* fully loaded means people are less likely to *think* they are suffering from noisy neighbors (when the delta between how much CPU you get on a fully loaded vs. a fully unloaded system is small, you see more consistent performance).
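
To make the fairness arithmetic concrete: with three zones holding 300, 100, and 100 shares, the 300-share zone gets 300/(300+100+100) = 60% of the machine under full contention. A sketch of setting that up (the UUID is a placeholder, and this assumes the other zones keep the default of 100 shares):

    # weight one zone at 3x each of its neighbors; cpu_shares is a live
    # update and only matters when the system is under contention
    vmadm update 01234567-89ab-cdef-0123-456789abcdef cpu_shares=300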

I haven't looked at the details of the SDC packages, but I can confidently say that "vcpus" in the context of a joyent-branded zone is an approximation of what to expect based on the shares and the cap (as opposed to a KVM VM, where it's literally the number of processors the VM will see).

So if you have an SDC service zone with "1 vCPU and a cap of 200", it's getting shares such that when the system is fully loaded it should get approximately 1 CPU's worth of CPU time from the scheduler, but when the system is not fully loaded it should be able to get up to 2 CPUs' worth of CPU time, and no more. The difference between those two is what the Joyent cloud advertises as "bursting".
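
Since cpu_cap is a live update, you can adjust that burst ceiling without a reboot. A hypothetical example (the UUID is a placeholder):

    # raise the burst ceiling from 2 CPUs' worth to 4 CPUs' worth
    vmadm update 01234567-89ab-cdef-0123-456789abcdef cpu_cap=400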

Coming back for one last moment to KVM VMs, remember that the QEMU process is running in a zone that can have shares and caps. Additionally, when the VM does I/O, QEMU threads need to be scheduled to do some (overhead) work to make that happen. So in theory you might want your shares and caps to be slightly more than what the number of vCPUs alone would suggest (e.g., for something performance-critical that *has* to live in a VM, you could imagine setting vcpus to 8 but wanting cpu_cap to be 900, with shares that give you some extra CPU time when the system is fully loaded).
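
A sketch of that sizing applied to an existing KVM VM (the UUID is a placeholder, and the share value is an assumption that only means something relative to the other zones on your box):

    # cap at 9 CPUs' worth so QEMU's I/O threads have headroom beyond
    # the 8 guest vCPUs; both of these are live updates (changing vcpus
    # itself would require a VM reboot, so we leave it alone here)
    vmadm update 01234567-89ab-cdef-0123-456789abcdef cpu_cap=900
    vmadm update 01234567-89ab-cdef-0123-456789abcdef cpu_shares=800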

Finally let's see if I can answer your questions:

> All of them, even database ones, are using 1 vCPU with a cap of 400.

They are configured so that when the system is fully loaded they should still get about 1 CPU's worth of CPU time, but if the system isn't fully loaded they can "burst" up to 4 CPUs' worth, and no more.

> I can picture in my head the "meaning" of the CPU cap when the cap is lower than the sum of the CPU percentages. So something like 4 vCPUs and a cap of 200 makes sense to me.

I find that confusing both for KVM VMs and for regular zones. The cap would ensure that you never get more than 2 CPUs' worth of compute time, and a KVM VM that thinks it has 4 processors but can never get more than 2 processors' worth of work done seems like a bad idea.

> Can somebody explain to me what happens with a setting of, let's say, 1 vCPU and a cap of 200?

For a KVM VM the guest would see 1 processor but would still have headroom for the I/O overhead from QEMU. For a regular zone it's as before: you get 1 CPU's worth when the system is fully loaded, but can burst up to 2 when there are spare cycles (and no more than that).
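
If you want to see what the kernel is actually enforcing for a given zone, the underlying resource controls are visible with prctl (the zone name is a placeholder):

    # zone.cpu-cap is in percent of a single CPU (200 = 2 CPUs' worth);
    # zone.cpu-shares is the FSS weight
    prctl -n zone.cpu-cap -i zone myzone
    prctl -n zone.cpu-shares -i zone myzone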

> In the SDC case, what was the idea behind this kind of setting? Does it give any performance/portability/other improvement?

The punchline comes down to "bursting". If you think your workloads are bursty then you want to leave some extra space in the caps so that the zones can take advantage of otherwise wasted cycles when they need them, but you also want to ensure fairness under load.
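
One way to sanity-check whether you've left enough burst headroom: the kernel exports per-zone cap statistics through kstat (a sketch; zone ID 5 is a placeholder, find yours with "zoneadm list -v"):

    # compare "usage" against "value" (the cap); if above_sec keeps
    # climbing, the zone is spending real time pinned at its cap
    kstat -m caps -n cpucaps_zone_5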

Hopefully this was helpful.

-Nahum

