On 10/22/2014 03:42 AM, Vineet Menon wrote:
On 22 October 2014 06:24, Tom Fifield <t...@openstack.org> wrote:
On 22/10/14 03:07, Andrew Laski wrote:
> On 10/21/2014 04:31 AM, Nikola Đipanov wrote:
>> On 10/20/2014 08:00 PM, Andrew Laski wrote:
>>> One of the big goals for the Kilo cycle by users and developers of the
>>> cells functionality within Nova is to get it to a point where it can be
>>> considered a first class citizen of Nova. Ultimately I think it comes
>>> down to getting it tested by default in Nova jobs, and making it easier
>>> for developers to work with. But there's a lot of work to get there.
>>> In order to raise awareness of this effort, and get the conversation
>>> started on a few things, I've summarized a little bit about
>>> this effort below.
>>>
>>> Testing of a single cell setup in the gate.
>>> Feature parity.
>>> Make cells the default implementation. Developers write code and
>>> it works for cells.
>>>
>>> Ultimately the goal is to improve maintainability of a large feature
>>> within the Nova code base.
>> Thanks for the write-up Andrew! Some thoughts/questions below. Looking
>> forward to the discussion on some of these topics, and would be happy to
>> review the code once we get to that point.
>>> Feature gaps:
>>> Host aggregates
>>> Security groups
>>> Server groups
>>> Flavor syncing
>>> This needs to be addressed now.
>>> Cells scheduling/rescheduling
>>> Instances can not currently move between cells
>>> These two won't affect the default one cell setup so they can be
>>> addressed later.
>>> What does cells do:
>>> Schedule an instance to a cell based on flavor slots available.
>>> Proxy API requests to the proper cell.
>>> Keep a copy of instance data at the global level for quick retrieval.
>>> Sync data up from a child cell to keep the global level up to date.
>>> Simplifying assumptions:
>>> Cells will be treated as a two level tree structure.
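To make the summary above concrete, here is a rough Python sketch of the two-level idea. None of these classes exist in Nova; the names, the flavor-slot bookkeeping, and the instance cache are invented purely for illustration:

# Illustrative sketch only -- not Nova code. Models the two-level
# parent/child structure and "schedule by flavor slots" idea.


class ChildCell:
    def __init__(self, name, flavor_slots):
        self.name = name
        # e.g. {"m1.small": 10} -- free slots per flavor in this cell
        self.flavor_slots = flavor_slots
        self.instances = {}          # local, authoritative instance data

    def build_instance(self, uuid, flavor):
        self.flavor_slots[flavor] -= 1
        self.instances[uuid] = {"uuid": uuid, "flavor": flavor,
                                "cell": self.name}
        return self.instances[uuid]


class ApiCell:
    """Parent cell: schedules to a child and keeps a read cache."""

    def __init__(self, children):
        self.children = children     # two levels only: parent -> children
        self.instance_cache = {}     # copy of instance data for fast reads

    def schedule(self, uuid, flavor):
        # Pick the child cell with the most free slots for this flavor.
        target = max(self.children,
                     key=lambda c: c.flavor_slots.get(flavor, 0))
        if target.flavor_slots.get(flavor, 0) <= 0:
            raise RuntimeError("no cell has capacity for %s" % flavor)
        instance = target.build_instance(uuid, flavor)
        # "Sync up": keep a global copy so API reads don't hit the child.
        self.instance_cache[uuid] = dict(instance)
        return instance


api = ApiCell([ChildCell("cell1", {"m1.small": 10}),
               ChildCell("cell2", {"m1.small": 3})])
api.schedule("abc-123", "m1.small")
print(api.instance_cache["abc-123"]["cell"])   # -> cell1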
>> Are we thinking of making this official by removing code that
>> allows cells to be an actual tree of depth N? I am not sure if that
>> would be a win, although it does complicate the code
>> a bit, but if it's not being used, even though a nice generalization,
>> why keep it around?
> My preference would be to remove that code since I don't see us
> writing tests to ensure that functionality works and/or doesn't
> regress. But there's the challenge of not knowing if anyone is
> relying on that behavior. So initially I'm not creating a work
> item to remove it. But I think it needs to be made clear that it's not
> officially supported and may get removed unless a case is made for
> keeping it and work is put into testing it.
While I agree that N is a bit interesting, I have seen N=3 in production:
[central API]-->[state/region1]-->[state/region DC1]
What are the use cases for this deployment? Agreed, the root node runs
n-api along with horizon, key management etc. What components are
deployed in tier 2 and tier 3?
And AFAIK, currently, an openstack cell deployment isn't even a tree but a
DAG, since one cell can have multiple parents. Has anyone come up with any
use case for that?
While there's nothing to prevent a cell from having multiple parents I
would be curious to know if this would actually work in practice, since
I can imagine a number of cases that might cause problems. And is there
a practical use for this?
Maybe we should start logging a warning when this is set up, stating that
this is an unsupported (i.e. untested) configuration, to start to codify
the design as that of a tree. At least for the initial scope of work I
think this makes sense, and if a case is made for a DAG setup that can
be done independently.
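A minimal sketch of what that warning could look like, assuming a hypothetical view of the topology as a child-to-parents mapping (this is not the actual cells configuration format):

import logging

logging.basicConfig(level=logging.WARNING)
LOG = logging.getLogger(__name__)

# Hypothetical view of the cell topology: child name -> list of parents.
# In a real deployment this would come from the cells config/DB.
cell_parents = {
    "cell1": ["api"],
    "cell2": ["api", "api-backup"],   # two parents -> a DAG, not a tree
}


def warn_on_non_tree(topology):
    for child, parents in topology.items():
        if len(parents) > 1:
            LOG.warning("Cell %s has multiple parents %s; this is an "
                        "unsupported (untested) configuration, cells are "
                        "expected to form a two level tree.",
                        child, parents)


warn_on_non_tree(cell_parents)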
>>> Fix flavor breakage in child cell which causes boot tests to fail.
>>> Currently the libvirt driver needs flavor.extra_specs which is not
>>> synced to the child cell. Some options are to sync flavor and extra
>>> specs to the child cell db, or pass full data with the request.
>>> https://review.openstack.org/#/c/126620/1 offers a means of passing the
>>> data with the request.
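For illustration only, the "pass full data with the request" option might look roughly like this; the dict layout is made up and is not the payload used in that review:

# Sketch of the "pass full data with the request" option: serialize the
# flavor, including extra_specs, into the build request so the child cell
# never needs to look it up in its own DB. Field names are illustrative.

flavor = {
    "id": 42,
    "name": "m1.small.pinned",
    "vcpus": 1,
    "memory_mb": 2048,
    "extra_specs": {"hw:cpu_policy": "dedicated"},   # needed by libvirt
}

build_request = {
    "instance_uuid": "abc-123",
    "image_ref": "cirros",
    "flavor": flavor,            # full flavor travels with the request
}


def child_cell_build(request):
    # The child uses the embedded flavor instead of a local flavors table,
    # so extra_specs are available even if flavors were never synced down.
    extra_specs = request["flavor"]["extra_specs"]
    return "booting %s with cpu_policy=%s" % (
        request["instance_uuid"], extra_specs.get("hw:cpu_policy"))


print(child_cell_build(build_request))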
>>> Determine proper switches to turn off Tempest tests for features that
>>> don't work with the goal of getting a voting job. Once this is in place
>>> we can move towards feature parity and work on internal refactoring.
>>> Work towards adding parity for host aggregates, security groups, and
>>> server groups. They should be made to work in a single cell setup, but
>>> the solution should not preclude them from being used in multiple
>>> cells. There needs to be some discussion as to whether a host aggregate
>>> or server group is a global concept or per cell concept.
>> Have there been any previous discussions on this topic? If so I'd
>> like to read up on those to make sure I understand the pros and cons
>> before the summit session.
> The only discussion I'm aware of is some comments on
> https://review.openstack.org/#/c/59101/ , though they mention a
> discussion at the Utah mid-cycle.
> The main con I'm aware of for defining these as global concepts is that
> there is no rescheduling capability in the cells scheduler. So if a
> build is sent to a cell with a host aggregate that can't fit that
> instance the build will fail even though there may be space in that
> aggregate from a global perspective. That should be somewhat
> straightforward to address though.
> I think it makes sense to define these as global concepts. But these
> are features that aren't used with cells yet so I haven't put a lot of
> thought into potential arguments or cases for doing this one way or
> another.
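As a sketch of how the global-concept-plus-retry idea could address that, assuming an invented structure that records which cells hold hosts from an aggregate (this is not a schema proposal):

# Sketch of treating a host aggregate as a global concept: the API cell
# knows which cells contain hosts from the aggregate and can try another
# cell if the first has no room. All names are invented for illustration.

aggregate = {
    "name": "ssd-hosts",
    # cell name -> free instance slots on that aggregate's hosts there
    "cells": {"cell1": 0, "cell2": 5},
}


def build_on_aggregate(agg, instance_uuid):
    # Try each cell that has hosts in this aggregate; because the concept
    # is global, a full cell is not a hard failure as long as another
    # member cell still has space.
    for cell, free in agg["cells"].items():
        if free > 0:
            agg["cells"][cell] -= 1
            return "scheduled %s to %s via aggregate %s" % (
                instance_uuid, cell, agg["name"])
    raise RuntimeError("aggregate %s has no capacity anywhere" % agg["name"])


print(build_on_aggregate(aggregate, "abc-123"))   # lands in cell2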
Keeping aggregates local also poses a problem when cells are
temporarily dead (out of the system), since the top level doesn't have any
idea about local features, including whom to contact for deletion of
a particular aggregate.
>>> Work towards merging compute/api.py and compute/cells_api.py so that
>>> developers only need to make changes/additions in one place. The goal
>>> is for as much as possible to be hidden by the RPC layer, which will
>>> determine whether a call goes to a compute/conductor/cell.
>>> For syncing data between cells, look at using objects to handle the
>>> logic of writing data to the cell/parent and then syncing the data to
>>> the other.
>> Some of that work has been done already, although in a somewhat ad-hoc
>> fashion, were you thinking of extending objects to support this natively
>> (whatever that means), or do we continue to inline the code in the
>> existing object methods?
> I would prefer to have some native support for this. In general data will
> be considered authoritative at the global level or the cell level. For
> example, instance data is synced down from the global level to a
> cell (except for a few fields which are synced up) but other data would
> be synced up. I could imagine decorators that would specify how data
> should be synced and handle that as transparently as possible.
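Something along these lines, perhaps; this is purely a hypothetical sketch of the decorator idea, and none of these names exist in the objects framework:

# Purely hypothetical sketch of "decorators declare sync direction".
import functools


def synced(direction):
    """Mark a save-style method as needing a sync down to the child cell
    or up to the API cell after the local write succeeds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            result = func(self, *args, **kwargs)
            # In real code this would be an RPC cast to the other cell;
            # here we just record what would have been sent.
            self.pending_syncs.append((direction, func.__name__))
            return result
        return wrapper
    return decorator


class Instance:
    def __init__(self):
        self.pending_syncs = []

    @synced("down")     # authoritative at the API cell, pushed to the child
    def save(self):
        return "written locally"

    @synced("up")       # e.g. compute-owned fields flow back to the API cell
    def update_host(self):
        return "host updated"


inst = Instance()
inst.save()
inst.update_host()
print(inst.pending_syncs)   # [('down', 'save'), ('up', 'update_host')]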
>>> A potential migration scenario is to consider a non cells setup to be a
>>> child cell and converting to cells will mean setting up a parent cell
>>> and linking them. There are periodic tasks in place to sync data up
>>> from a child already, but a manual kick off mechanism will need to be
>>> added.
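The manual kick off might end up as simple as the sketch below; the names are invented, and only the "walk the child's data and push it up once" idea comes from the paragraph above:

# Sketch of a manual "sync everything up" kick-off for the migration case:
# an existing non-cells deployment becomes the child, a new parent is
# linked, and an operator-triggered pass pushes existing data up once.

child_db = {
    "abc-123": {"uuid": "abc-123", "host": "compute1"},
    "def-456": {"uuid": "def-456", "host": "compute2"},
}

parent_cache = {}


def sync_all_instances_up(child, parent):
    """One-shot version of what the periodic sync task does continuously."""
    count = 0
    for uuid, instance in child.items():
        parent[uuid] = dict(instance)   # copy up to the global view
        count += 1
    return count


print("synced %d instances up to the parent cell"
      % sync_all_instances_up(child_db, parent_cache))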
>>> Future plans:
>>> Something that has been considered, but is out of scope for now, is that
>>> the parent/api cell doesn't need the same data model as the child cell.
>>> Since the majority of what it does is act as a cache for API requests,
>>> it does not need all the data that a cell needs and what data it does
>>> need could be stored in a form that's optimized for reads.
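A toy illustration of that read-optimized view, with an invented field selection, just to show the shape of a denormalized per-instance document:

# Sketch of the "API cell stores a read-optimized view" idea: instead of
# the full instances schema, the parent keeps one denormalized document
# per instance, shaped like the API response. Field choice is illustrative.
import json

instance_view = {
    "uuid": "abc-123",
    "name": "web-1",
    "status": "ACTIVE",
    "flavor": "m1.small",
    "addresses": {"private": ["10.0.0.5"]},
    "cell": "cell2",            # where the authoritative record lives
}

# One key -> one pre-joined blob; reads are a single lookup, and only the
# child cell's sync path ever writes it.
api_cell_cache = {instance_view["uuid"]: json.dumps(instance_view)}

print(json.loads(api_cell_cache["abc-123"])["status"])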