Trent Mick pointed me to these two links on IRC:
https://smartos.org/bugview/PAPI-127
https://github.com/joyent/sdc-vmapi/blob/master/lib/common/validation.js#L948-L964

Looks like "fss" gets turned into "cpu_shares", and it also looks like if you don't provide it when you define the package, it gets calculated based on memory sizing...
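
If it helps, here's a rough TypeScript sketch of what I imagine that fallback looks like (the function name and the RAM divisor are made up for illustration; the real logic is in the validation.js file linked above):

    // Illustrative guess only; not the actual sdc-vmapi code.
    interface PackageParams {
        fss?: number;                  // FSS shares, if the package defines them
        max_physical_memory: number;   // RAM in MiB
    }

    function deriveCpuShares(pkg: PackageParams): number {
        if (pkg.fss !== undefined) {
            // "fss" in the package becomes cpu_shares on the VM.
            return Math.floor(pkg.fss);
        }
        // No fss given: fall back to something proportional to RAM,
        // with a floor so small zones still get at least one share.
        return Math.max(1, Math.floor(pkg.max_physical_memory / 8));
    }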

Still no idea on the rest.

-Nahum

On 02/16/2016 12:36 PM, Nahum Shalman wrote:
At this point you may want to CC the sdc-discuss mailing list; I don't know the innards of SDC like I know SmartOS...

I have a vague recollection that in general Joyent (and thus SDC by default?) assigns cpu_shares relative to max_physical_memory. I can't find anything definitive from a little bit of spelunking in the code though, so don't quote me on that one...

I did notice that PAPI has no notion of cpu_shares, but VMAPI and vm-agent do. Somewhere a translation happens but I haven't found it yet and need to stop looking for now.

Again, maybe someone on the sdc-discuss mailing list would know where to look...

I'd love to know what the answer is though, so please follow up on this thread if/when you find out.

-Nahum

On 02/16/2016 07:43 AM, Benjamin Bergia wrote:
Thank you, it does make more sense now.
It also looks like I have been doing it all wrong since the beginning.

Three points are still bothering me and I would like to get confirmation if anybody has any idea.

First, in the PAPI documentation <https://github.com/joyent/sdc-papi/blob/master/docs/index.md#formulas>, there are some obscure calculations regarding the CPU cap. Are these formulas up to date? If so, what is the "burst ratio" used to calculate cpu_burst_ratio?

Second, the same documentation talks about cpu_burst_ratio, fss, and ram_ratio. After creating a new package, these values are not returned by PAPI, which makes me think that the documentation might not be up to date. But I would still be interested in getting a confirmation.

And finally, the Triton documentation <https://docs.joyent.com/private-cloud/packages/configuring#packagedefinitions> specifies that vCPU is only used by KVM and is not required for SmartOS zones.

I run exclusively SmartOS zones, which is why I am wondering whether, in my case, there is really some calculation running behind the scenes to generate a cpu_shares value from a given number of vCPUs, or whether the vCPU field is wrongly marked as "required" in the AdminUI and SDC's packages contain 1 vCPU as a simple placeholder.


On Mon, Feb 15, 2016 at 10:51 PM, Jorge Schrauwen <[email protected]> wrote:

    Excellent explanation, Nahum!




    On 2016-02-15 15:39, Nahum Shalman wrote:

        On 02/15/2016 01:30 AM, Benjamin Bergia wrote:

            Hi,

            I recently noticed that all the packages used by SDC zones
            are using some strange settings. All of them, even database
            ones, are using 1 vCPU with a cap of 400. I can picture in
            my head the "meaning" of the CPU cap when the cap is lower
            than the sum of the CPU percentages. So something like
            4 vCPUs and a cap of 200 makes sense to me.

            Can somebody explain to me what happens with a setting of,
            let's say, 1 vCPU and a cap of 200?

            In the SDC case, what was the idea behind having this kind
            of setting? Does it give any performance/portability/other
            improvement?

            I am rethinking my packages, and where I previously used
            settings like you would use on vSphere, I am now wondering
            if I am not doing it totally wrong.


        There are three important concepts: vcpus, cpu_shares, and
        cpu_cap. From the vmadm man page (my further comments below):

               vcpus:

                   For KVM VMs this parameter defines the number of
                   virtual CPUs the guest will see. Generally
                   recommended to be a multiple of 2.

                   type: integer (number of CPUs)
                   vmtype: KVM
                   listable: yes
                   create: KVM only
                   update: KVM only (requires VM reboot to take effect)
                   default: 1

               cpu_shares:

                   Sets a limit on the number of fair share scheduler
                   (FSS) CPU shares for a VM. This value is relative to
                   all other VMs on the system, so a value only has
                   meaning in relation to other VMs. If you have one VM
                   with a value of 10 and another with a value of 50,
                   the VM with 50 will get 5x as much time from the
                   scheduler as the one with 10 when there is
                   contention.

                   type: integer (number of shares)
                   vmtype: OS,KVM
                   listable: yes
                   create: yes
                   update: yes (live update)
                   default: 100

               cpu_cap:

                   Sets a limit on the amount of CPU time that can be
                   used by a VM. The unit used is the percentage of a
                   single CPU that can be used by the VM. Eg. a value
                   of 300 means up to 3 full CPUs.

                   type: integer (percentage of single CPUs)
                   vmtype: OS,KVM
                   listable: yes
                   create: yes
                   update: yes (live update)


        First, note that "vcpus" from a SmartOS perspective only applies
        to KVM VMs. That setting determines how many processors the VM
        can see and thus can use to schedule its processes. We'll come
        back to that.

        Zones (both the kind containing the QEMU process for KVM VMs and
        regular LX and joyent branded ones) can see all of the physical
        processors, and the OS can schedule processes on any of them.
        If you imagine yourself as a multi-tenant cloud provider you'll
        quickly realize that you need two things:
        1. Fairness (preventing noisy neighbors) when the system is
        fully loaded. This is what cpu_shares does. If you give every
        zone the appropriate number of shares they will all get the
        proportional amount of system CPU when the system is fully
        loaded.
        2. Paying for what you get. On a system that is *not* fully
        loaded, in theory a zone could use lots and lots of CPUs.
        Customers would be incentivized to create and destroy zones
        until they found one that could use lots of free CPU. This is
        where CPU caps come in. They ensure that on a system that is
        *not* fully loaded the zone can only burst up to a reasonable
        amount relative to what the customer is paying. This also helps
        manage expectations. Setting a CPU cap reasonably close to the
        amount of CPU that the customer gets when the system *is* fully
        loaded means that people are less likely to *think* that they
        are suffering from noisy neighbors (when the delta of how much
        CPU you get on a fully loaded vs fully unloaded system is
        small, you see more consistent performance). There's a little
        arithmetic sketch of both ideas just below.

        I haven't looked at the details of the SDC packages, but I can
        confidently say that "vcpu" in the context of a joyent branded
        zone is an approximation of what to expect based on the shares
        and the cap (as opposed to a KVM VM where that's literally the
        number that the VM will see).

        So if you have an SDC service zone with "1 vCPU and a cap of
        200" then it's getting shares such that when the system is
        fully loaded it should get approximately 1 CPU's worth of CPU
        time from the scheduler, but when the system is not fully
        loaded it should be able to get up to 2 CPUs' worth of CPU time
        from the scheduler but no more. The difference between those
        two is what the Joyent cloud advertises as "bursting".

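        Plugging that example into the sketch above (assuming, purely
        for illustration, a 32-CPU box where every zone got 100 shares
        per advertised vCPU):

            const zone: Zone = { name: "sdc-service", cpu_shares: 100, cpu_cap: 200 };
            const neighbors: Zone[] = Array.from({ length: 32 }, (_, i) =>
                ({ name: "tenant" + i, cpu_shares: 100, cpu_cap: 200 }));

            console.log(cpusUnderContention(zone, [zone, ...neighbors], 32)); // ~0.97 CPUs
            console.log(cpusWhenIdle(zone));                                  // 2 CPUs of "bursting"
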
        Coming back for one last moment to KVM VMs, remember that the
        QEMU process is running in a zone that can have shares and
        caps. Additionally, when the VM does I/O, QEMU threads need to
        be scheduled to do some (overhead) work to make that happen.
        So in theory you might need your shares and caps to be slightly
        more than just what the number of vCPUs might otherwise suggest
        (e.g. for something performance critical that *has* to live in
        a VM you could imagine having vCPUs be 8, but wanting cpu_cap
        to be 900 and having shares that give you some extra CPU time
        when the system is fully loaded.)
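
        As a purely illustrative sketch of that kind of sizing (the
        property names are the vmadm ones quoted above, but the numbers
        are mine, not a recommendation):

            const perfCriticalKvmVm = {
                brand: "kvm",
                vcpus: 8,          // what the guest sees
                cpu_cap: 900,      // 8 CPUs for the guest plus headroom for QEMU I/O
                cpu_shares: 900,   // sized to still land near 8 CPUs under contention;
                                   // the exact value depends on how shares are scaled
                                   // on your systems
            };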

        Finally let's see if I can answer your questions:

            All of them, even database ones, are using 1 vCPU with a
            cap of 400.

        They are configured so that when the system is fully loaded
        they should still get about 1 CPU's worth of CPU time, but if
        the system isn't fully loaded they can "burst" up to using 4
        CPUs' worth but no more.

            I can picture in my head the "meaning" of the CPU cap when
            the cap is lower than the sum of the CPU percentages. So
            something like 4 vCPUs and a cap of 200 makes sense to me.

        I find that confusing both for KVM VMs and for regular zones.
        The cap would ensure that you never get more than 2 CPUs' worth
        of compute time, and a KVM VM that thinks it has 4 processors
        but can never get more than 2 processors' worth of work done
        seems like a bad idea.

            Can somebody explain to me what happens with a setting of,
            let's say, 1 vCPU and a cap of 200?

        For a KVM VM the guest would see 1 processor, but would still
        have headroom for the I/O overhead from QEMU. For a regular
        zone it's like before: you get 1 CPU when the system is fully
        loaded, but can burst up to 2 when there are spare cycles (but
        no more than that).

            In the SDC case, what was the idea behind having this kind
            of setting? Does it give any performance/portability/other
            improvement?

        The punchline comes down to "bursting". If you think your
        workloads are bursty then you want to leave some extra space in
        the caps so that the zones can take advantage of otherwise
        wasted cycles when they need them, but you also want to ensure
        fairness under load.

        Hopefully this was helpful.

        -Nahum


