On 05/19/2017 03:05 PM, Zane Bitter wrote:
On 19/05/17 15:06, Kevin Benton wrote:
Don't even get me started on Neutron.[2]

It seems to me the conclusion to that thread was that the majority of
your issues stemmed from the fact that we had poor documentation at the
time.  A major component of the complaints resulted from you
misunderstanding the difference between networks/subnets in Neutron.

It's true that I was completely off base as to what the various
primitives in Neutron actually do. (Thanks for educating me!) The
implications for orchestration are largely unchanged though. It's a
giant pain that we have to infer implicit dependencies between stuff to
get them to create/delete in the right order, pretty much independently
of what that stuff does.

So knowing now that a Network is a layer-2 network segment and a Subnet
is... effectively a glorified DHCP address pool, I understand better why
it probably seemed like a good idea to hook stuff up magically. But at
the end of the day, I still can't create a Port until a Subnet exists, I
still don't know what Subnet a Port will be attached to (unless the user
specifies it explicitly using the --fixed-ip option... regardless of
whether they actually specify a fixed IP), and I have no way in general
of telling which Subnets can be deleted before a given Port is and which
will fail to delete until the Port disappears.
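The kind of implicit-dependency inference being described can be sketched in a few lines of plain Python over dicts shaped like Neutron API responses (the sample data here is made up for illustration):

```python
# Minimal sketch of the implicit-dependency inference an orchestrator is
# forced to do: given records shaped like Neutron API responses (sample
# data is hypothetical), work out which Subnets a Port depends on, i.e.
# which Subnets must exist before the Port is created and must not be
# deleted until the Port is gone.

subnets = [
    {"id": "subnet-a", "network_id": "net-1", "cidr": "10.0.0.0/24"},
    {"id": "subnet-b", "network_id": "net-1", "cidr": "10.0.1.0/24"},
]

ports = [
    # A port only records its subnet(s) inside fixed_ips entries.
    {"id": "port-1", "network_id": "net-1",
     "fixed_ips": [{"subnet_id": "subnet-a", "ip_address": "10.0.0.5"}]},
]

def port_subnet_dependencies(ports, subnets):
    """Map each port id to the subnet ids it implicitly depends on."""
    known = {s["id"] for s in subnets}
    deps = {}
    for port in ports:
        deps[port["id"]] = sorted(
            fip["subnet_id"] for fip in port["fixed_ips"]
            if fip["subnet_id"] in known)
    return deps

print(port_subnet_dependencies(ports, subnets))
```

Note that the map can only be built when fixed_ips is populated; when it isn't, the port-to-subnet relationship is invisible to the client, which is exactly the ordering problem described above.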

There are some legitimate issues in there about the extra routes
extension being replace-only and the routers API not accepting a list of
interfaces in POST.  However, it hardly seems that those are worthy of
"Don't even get me started on Neutron."

https://launchpad.net/bugs/1626607
https://launchpad.net/bugs/1442121
https://launchpad.net/bugs/1626619
https://launchpad.net/bugs/1626630
https://launchpad.net/bugs/1626634

It would be nice if you could write up something about current gaps that
would make Heat's life easier, because a large chunk of that initial
email is incorrect and linking to it as a big list of "issues" is
counter-productive.

I used to have angst about the Neutron API, but I have come to like it more and more over time.

I think the main thing I run into is that Neutron's API models a pile of data to allow power users to do very flexible things. What it's missing most of the time is an easy button.

I'll give some examples:

My favorite for-instance, which I mentioned in a different thread this week and have mentioned in almost every talk I've given over the last 3 years, is that there is no way to find out whether a given network can provide connectivity to a resource from outside of the cloud.

There are _many_ reasons why it's hard to express a completely accurate answer to this problem: "What does external mean?" "What if there are multiple external networks?" etc. Those are all valid, and all speak to real workloads and real user scenarios ...

But there's also:

As a user I want to boot a VM on this cloud and have my users, who are not necessarily on this cloud, be able to connect to a service I'm going to run on it. (aka, I want to run a wordpress)

and

As a user I want to boot a VM on this cloud and I do not want anyone who is not another resource on this cloud to be able to connect to anything it's running. (aka, I want to run a mysql)

Unless you already know things about the cloud from somewhere other than the API, it is impossible to perform those two tasks consistently.

We've done a great job empowering the power users to do a bunch of really cool things. But we missed booting a wordpress as a basic use case.
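The closest thing the API offers today is the `router:external` flag on networks, which is a heuristic rather than an answer to "can my users reach this?". A sketch of the guess a client is forced to make, over made-up network records:

```python
# Sketch of the heuristic a client uses today: filter networks on the
# router:external flag. The sample records are hypothetical, and the
# flag does not actually answer "can users outside the cloud reach a
# server on this network?", which is the gap described above.

networks = [
    {"id": "net-pub", "name": "public", "router:external": True},
    {"id": "net-priv", "name": "private", "router:external": False},
]

def candidate_external_networks(networks):
    """Return ids of networks flagged external; a guess, not a guarantee."""
    return [n["id"] for n in networks if n.get("router:external")]

print(candidate_external_networks(networks))  # ['net-pub']
```

On one cloud the flagged network is directly attachable; on another it can only be used for floating ips; the flag alone can't tell you which, so the wordpress and mysql use cases above still can't be expressed portably.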

Other issues exist, but they aren't really anyone's fault. We still can't, as a community, agree on a consistent worldview related to fixed ips, neutron ports and floating ips. Neutron amazingly supports ALL of the use-case combinations for those topics ... it just doesn't always do so in all of the clouds.

Heck - while I'm on floating ips ... if you have some pre-existing floating ips and you want to boot servers on them in parallel, you can't. You can boot a server with a floating ip that did not pre-exist, if you get the port id of the fixed ip of the server and then pass that id to the floating ip create call. Of course, the server doesn't return the port id in the server record, so at the very least you need to make a GET /ports.json?device_id={server_id} call. And what you REALLY need to find is the port_id of the ip of the server that came from a subnet that has 'gateway_ip' defined, which is even more fun, since ips are associated with _networks_ on the server record and not with subnets.
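The client-side joins just described can be sketched over dicts shaped like GET /ports?device_id=... and GET /subnets responses (the sample data is hypothetical):

```python
# Sketch of the lookup described above: among the server's ports, find
# the one whose fixed ip came from a subnet with a gateway_ip defined.
# That is the port_id to pass to the floating ip create call. Data is
# shaped like Neutron API responses; the ids are made up.

ports = [
    {"id": "port-1", "device_id": "server-1", "network_id": "net-1",
     "fixed_ips": [{"subnet_id": "subnet-isolated",
                    "ip_address": "192.168.0.5"}]},
    {"id": "port-2", "device_id": "server-1", "network_id": "net-1",
     "fixed_ips": [{"subnet_id": "subnet-routed",
                    "ip_address": "10.0.0.5"}]},
]

subnets = {
    "subnet-isolated": {"id": "subnet-isolated", "gateway_ip": None},
    "subnet-routed": {"id": "subnet-routed", "gateway_ip": "10.0.0.1"},
}

def routable_port_id(server_id, ports, subnets):
    """Pick the server's port whose subnet has a gateway_ip defined."""
    for port in ports:
        if port["device_id"] != server_id:
            continue
        for fip in port["fixed_ips"]:
            if subnets[fip["subnet_id"]].get("gateway_ip"):
                return port["id"]
    return None

print(routable_port_id("server-1", ports, subnets))  # port-2
```

Two extra API calls and a two-table join, just to learn which port to hand to the floating ip create call.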

Possibly to Zane's point, you basically have to recreate a multi-table data model client-side and introspect relationships between objects to figure out how to correctly get a floating ip onto a server. Now - as opposed to the external network bit - it IS possible to do this correctly and have it work every time.

But if you want to re-use an existing floating ip, you either have to keep a client-side database of them so you can allocate one to a server, or you have to do a try/fail/try/fail loop, because the only way to claim one is to try to attach it.
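The try/fail loop looks roughly like this; a toy in-memory stand-in for Neutron is used here (the class and exception names are hypothetical, standing in for a real client and its HTTP 409 error):

```python
# Sketch of the try/fail claim loop described above. FakeNeutron and
# Conflict are stand-ins for a real client and its HTTP 409 response:
# the only way to claim a pre-existing floating ip is to attempt the
# attach and treat a conflict as "someone else got there first".

class Conflict(Exception):
    """Stands in for the HTTP 409 a real client would raise."""

class FakeNeutron:
    def __init__(self, floating_ips):
        # fip id -> attached port_id (None means unattached)
        self.floating_ips = dict(floating_ips)

    def attach(self, fip_id, port_id):
        if self.floating_ips[fip_id] is not None:
            raise Conflict(fip_id)
        self.floating_ips[fip_id] = port_id

def claim_any_floating_ip(client, fip_ids, port_id):
    """Walk the pool, attempting each attach until one sticks."""
    for fip_id in fip_ids:
        try:
            client.attach(fip_id, port_id)
            return fip_id
        except Conflict:
            continue  # lost the race; try the next one
    return None

cloud = FakeNeutron({"fip-1": "port-x", "fip-2": None})
print(claim_any_floating_ip(cloud, ["fip-1", "fip-2"], "port-9"))  # fip-2
```

The attach attempt is the allocation; there is no way to reserve first and attach second, which is why parallel boots against a shared pool of floating ips race each other.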

In any case - I apologize that I have not been able to describe these issues crisply enough for people to deal with them. I truly believe there is a point-of-view issue, and the conversations can fail because the consumer side and the producer side have different context. I think we made several big steps forward related to Keystone at the Boston Summit. Maybe next time we should try to do a similar thing for nova/neutron?

Yes, agreed. I wish I had a clean thread to link to. It's a huge amount
of work to research it all though.

cheers,
Zane.

On Fri, May 19, 2017 at 7:36 AM, Zane Bitter <zbit...@redhat.com
<mailto:zbit...@redhat.com>> wrote:

    On 18/05/17 20:19, Matt Riedemann wrote:

        I just wanted to blurt this out since it hit me a few times at
the
        summit, and see if I'm misreading the rooms.

        For the last few years, Nova has pushed back on adding
        orchestration to
        the compute API, and even define a policy for it since it comes
        up so
        much [1]. The stance is that the compute API should expose
        capabilities
        that a higher-level orchestration service can stitch together
        for a more
        fluid end user experience.


    I think this is a wise policy.

        One simple example that comes up time and again is allowing a
        user to
        pass volume type to the compute API when booting from volume
        such that
        when nova creates the backing volume in Cinder, it passes
        through the
        volume type. If you need a non-default volume type for boot from
        volume,
        the way you do this today is first create the volume with said
        type in
        Cinder and then provide that volume to the compute API when
        creating the
        server. However, people claim that is bad UX or hard for users to
        understand, something like that (at least from a command line, I
        assume
        Horizon hides this, and basic users should probably be using
Horizon
        anyway right?).


    As always, there's a trade-off between simplicity and flexibility. I
    can certainly understand the logic in wanting to make the simple
    stuff simple. But users also need to be able to progress from simple
    stuff to more complex stuff without having to give up and start
    over. There's a danger of leading them down the garden path.

        While talking about claims in the scheduler and a top-level
        conductor
        for cells v2 deployments, we've talked about the desire to
eliminate
        "up-calls" from the compute service to the top-level controller
        services
        (nova-api, nova-conductor and nova-scheduler). Build retries is
        one such
        up-call. CERN disables build retries, but others rely on them,
        because
        of how racy claims in the computes are (that's another story
and why
        we're working on fixing it). While talking about this, we asked,
        "why
        not just do away with build retries in nova altogether? If the
        scheduler
        picks a host and the build fails, it fails, and you have to
        retry/rebuild/delete/recreate from a top-level service."


    (FWIW Heat does this for you already.)

        But during several different Forum sessions, like user API
        improvements
        [2] but also the cells v2 and claims in the scheduler sessions,
        I was
        hearing about how operators only wanted to expose the base IaaS
        services
        and APIs and end API users wanted to only use those, which
means any
        improvements in those APIs would have to be in the base APIs
(nova,
        cinder, etc). To me, that generally means any orchestration
        would have
        to be baked into the compute API if you're not using Heat or
        something
        similar.


    The problem is that orchestration done inside APIs is very easy to
    do badly in ways that cause lots of downstream pain for users and
    external orchestrators. For example, Nova already does some
    orchestration: it creates a Neutron port for a server if you don't
    specify one. (And then promptly forgets that it has done so.) There
    is literally an entire inner platform, an orchestrator within an
    orchestrator, inside Heat to try to manage the fallout from this.
    And the inner platform shares none of the elegance, such as it is,
    of Heat itself, but is rather a collection of cobbled-together hacks
    to deal with the seemingly infinite explosion of edge cases that we
    kept running into over a period of at least 5 releases.

    The get-me-a-network thing is... better, but there's no provision
    for changes after the server is created, which means we have to
    copy-paste the Nova implementation into Heat to deal with update.[1]
    Which sounds like a maintenance nightmare in the making. That seems
    to be a common mistake: to assume that once users create something
    they'll never need to touch it again, except to delete it when
    they're done.

    Don't even get me started on Neutron.[2]

    Any orchestration that is done behind-the-scenes needs to be done
    superbly well, provide transparency for external orchestration tools
    that need to hook in to the data flow, and should be developed in
    consultation with potential consumers like Shade and Heat.

        Am I missing the point, or is the pendulum really swinging
away from
        PaaS layer services which abstract the dirty details of the
        lower-level
        IaaS APIs? Or was this always something people wanted and I've
just
        never made the connection until now?


    (Aside: can we stop using the term 'PaaS' to refer to "everything
    that Nova doesn't do"? This habit is not helping us to communicate
    clearly.)

    cheers,
    Zane.

    [1] https://review.openstack.org/#/c/407328/
    [2] http://lists.openstack.org/pipermail/openstack-dev/2014-April/032098.html




__________________________________________________________________________

    OpenStack Development Mailing List (not for usage questions)
    Unsubscribe:
    openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

    http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



