Thank you, it does make more sense now. It also looks like I have been doing it wrong from the beginning.
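Just to check that I am now reading these settings the way they were explained: under full load, FSS should give each zone CPU time in proportion to its cpu_shares, so with (made-up numbers, just to check the logic) 100 shares out of 800 total on a 16-core compute node a zone would get roughly 100/800 * 16 = 2 cores, and cpu_cap only comes into play when the node is idle enough for the zone to burst above that, up to cap/100 cores.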
Three points are still bothering me and I would like to get a confirmation if anybody has any idea.

First, the PAPI documentation <https://github.com/joyent/sdc-papi/blob/master/docs/index.md#formulas> contains some rather obscure calculations regarding the CPU cap. Are these formulas up to date? If so, what is the "burst ratio" used to calculate cpu_burst_ratio?

Second, the same documentation talks about cpu_burst_ratio, fss and ram_ratio, but after creating a new package these values are not returned by PAPI. This makes me think the documentation might not be up to date, but I would still be interested in a confirmation.

And finally, the Triton documentation <https://docs.joyent.com/private-cloud/packages/configuring#packagedefinitions> specifies that vCPU is only used by KVM and is not required for SmartOS zones. I run exclusively SmartOS zones, which is why I am wondering whether, in my case, there is really some calculation running behind the scenes to derive a cpu_shares value from a given number of vCPUs. Or is the vCPU field wrongly marked as "required" in the AdminUI, with the SDC packages containing 1 vCPU as a simple placeholder?
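For reference, this is roughly how I have been checking it, in case I am simply looking in the wrong place. The commands are typed from memory on our headnode and a compute node, so treat them as a sketch rather than exact syntax, and <vm_uuid> is just a placeholder:

    # On the headnode: list packages with the CPU-related fields I expect.
    # cpu_burst_ratio, fss and ram_ratio never show up in the output for me.
    sdc-papi /packages | json -Ha uuid name vcpus cpu_cap fss cpu_burst_ratio ram_ratio

    # On the compute node: what a zone provisioned with the package actually got.
    vmadm get <vm_uuid> | json vcpus cpu_shares cpu_cap
    prctl -n zone.cpu-shares -i zone <vm_uuid>
    prctl -n zone.cpu-cap -i zone <vm_uuid>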
On Mon, Feb 15, 2016 at 10:51 PM, Jorge Schrauwen <[email protected]> wrote:

> Excellent explanation, Nahum!
>
> On 2016-02-15 15:39, Nahum Shalman wrote:
>
>> On 02/15/2016 01:30 AM, Benjamin Bergia wrote:
>>
>>> Hi,
>>>
>>> I recently noticed that all the packages used by SDC zones are using some strange settings. All of them, even database ones, are using 1 vCPU with a cap of 400. I can picture in my head the "meaning" of the CPU cap when the cap is lower than the sum of the CPU percentages. So something like 4 vCPU and a cap of 200 makes sense to me.
>>>
>>> Can somebody explain to me what happens with a setting of, let's say, 1 vCPU and a cap of 200?
>>>
>>> In the SDC case, what was the idea behind having this kind of setting? Does it give any performance/portability/other improvement?
>>>
>>> I am rethinking my packages, and where I previously used settings like you would use on vSphere, I am now wondering if I have been doing it totally wrong.
>>
>> There are 3 important concepts: vCPUs, cpu_shares, and cpu_cap. From the vmadm man page (my further comments below that):
>>
>> vcpus:
>>
>>     For KVM VMs this parameter defines the number of virtual CPUs the guest will see. Generally recommended to be a multiple of 2.
>>
>>     type: integer (number of CPUs)
>>     vmtype: KVM
>>     listable: yes
>>     create: KVM only
>>     update: KVM only (requires VM reboot to take effect)
>>     default: 1
>>
>> cpu_shares:
>>
>>     Sets a limit on the number of fair share scheduler (FSS) CPU shares for a VM. This value is relative to all other VMs on the system, so a value only has meaning in relation to other VMs. If you have one VM with a value of 10 and another with a value of 50, the VM with 50 will get 5x as much time from the scheduler as the one with 10 when there is contention.
>>
>>     type: integer (number of shares)
>>     vmtype: OS,KVM
>>     listable: yes
>>     create: yes
>>     update: yes (live update)
>>     default: 100
>>
>> cpu_cap:
>>
>>     Sets a limit on the amount of CPU time that can be used by a VM. The unit used is the percentage of a single CPU that can be used by the VM. Eg. a value of 300 means up to 3 full CPUs.
>>
>>     type: integer (percentage of single CPUs)
>>     vmtype: OS,KVM
>>     listable: yes
>>     create: yes
>>     update: yes (live update)
>>
>> First, note that "vcpus" from a SmartOS perspective only applies to KVM VMs. That setting determines how many processors the VM can see and thus can use to schedule its processes. We'll come back to that.
>>
>> Zones (both the kind containing the QEMU process for KVM VMs and regular LX and joyent branded ones) can see all of the physical processors and the OS can schedule processes on any of them. If you imagine yourself as a multi-tenant cloud provider you'll quickly realize that you need two things:
>>
>> 1. Fairness (preventing noisy neighbors) when the system is fully loaded. This is what cpu_shares does. If you give every zone the appropriate number of shares they will all get the proportional amount of system CPU when the system is fully loaded.
>>
>> 2. Paying for what you get. On a system that is *not* fully loaded, in theory a zone could use lots and lots of CPUs. Customers would be incentivized to create and destroy zones until they found one that could use lots of free CPU. This is where CPU caps come in. They ensure that on a system that is *not* fully loaded the zone can only burst up to a reasonable amount relative to what the customer is paying. This also helps manage expectations. Setting a CPU cap reasonably close to the amount of CPU that the customer gets when the system *is* fully loaded means that people are less likely to *think* that they are suffering from noisy neighbors (when the delta of how much CPU you get on a fully loaded vs. a fully unloaded system is small, you see more consistent performance).
>>
>> I haven't looked at the details of the SDC packages, but I can confidently say that "vcpu" in the context of a joyent branded zone is an approximation of what to expect based on the shares and the cap (as opposed to a KVM VM where that's literally the number that the VM will see).
>>
>> So if you have an SDC service zone with "1 vCPU and a cap of 200" then it's getting shares such that when the system is fully loaded it should get approximately 1 CPU's worth of CPU time from the scheduler, but when the system is not fully loaded it should be able to get up to 2 CPUs worth of CPU time from the scheduler but no more. The difference between those two is what the Joyent cloud advertises as "bursting".
>>
>> Coming back for one last moment to KVM VMs, remember that the QEMU process is running in a zone that can have shares and caps. Additionally, when the VM does I/O, QEMU threads need to be scheduled to do some (overhead) work to make that happen. So in theory you might need your shares and caps to be slightly more than just what the number of vCPUs might otherwise suggest (e.g. for something performance critical that *has* to live in a VM you could imagine having vCPUs be 8, but wanting cpu_cap to be 900 and having shares that give you some extra CPU time when the system is fully loaded).
>>
>> Finally, let's see if I can answer your questions:
>>
>>> All of them, even database ones, are using 1 vCPU with a cap of 400.
>>
>> They are configured so that when the system is fully loaded they should still get about 1 CPU's worth of CPU time, but if the system isn't fully loaded they can "burst" up to using 4 CPUs worth but no more.
>>> I can picture in my head the "meaning" of the CPU cap when the cap is lower than the sum of the CPU percentages. So something like 4 vCPU and a cap of 200 makes sense to me.
>>
>> I find that confusing both for KVM VMs and for regular zones. The cap would ensure that you never get more than 2 CPUs worth of compute time, and a KVM VM that thinks it has 4 processors but can never get more than 2 processors worth of work done seems like a bad idea.
>>
>>> Can somebody explain to me what happens with a setting of, let's say, 1 vCPU and a cap of 200?
>>
>> For a KVM VM the guest would see 1 processor, but would still have headroom for the I/O overhead from QEMU. For a regular zone it's like before: you get 1 CPU when the system is fully loaded, but can burst up to 2 when there are spare cycles (but no more than that).
>>
>>> In the SDC case, what was the idea behind having this kind of setting? Does it give any performance/portability/other improvement?
>>
>> The punchline comes down to "bursting". If you think your workloads are bursty then you want to leave some extra space in the caps so that the zones can take advantage of otherwise wasted cycles when they need them, but you also want to ensure fairness under load.
>>
>> Hopefully this was helpful.
>>
>> -Nahum
