Re: [openstack-dev] [TripleO] Is Swift a good choice of database for the TripleO API?

Jiri Tomasek Wed, 06 Jan 2016 03:32:19 -0800

On 01/06/2016 11:48 AM, Dougal Matthews wrote:

On 5 January 2016 at 17:09, Jiri Tomasek <[email protected]<mailto:[email protected]>> wrote:


    On 12/23/2015 07:40 PM, Steven Hardy wrote:

        On Wed, Dec 23, 2015 at 11:05:05AM -0600, Ben Nemec wrote:

            On 12/23/2015 10:26 AM, Steven Hardy wrote:

                On Wed, Dec 23, 2015 at 09:28:59AM -0600, Ben Nemec wrote:

                    On 12/23/2015 03:19 AM, Dougal Matthews wrote:


                        On 22 December 2015 at 17:59, Ben Nemec
                        <[email protected]
                        <mailto:[email protected]>
                        <mailto:[email protected]
                        <mailto:[email protected]>>> wrote:

                             Can we just do git like I've been
                        suggesting all along? ;-)

                             More serious discussion inline. :-)

                             On 12/22/2015 09:36 AM, Dougal Matthews
                        wrote:
                             > Hi all,
                             >
                             > This topic came up in the 2015-12-15
                        meeting[1], and again briefly
                             today.
                             > After working with the code that came
                        out of the deployment library
                             > spec[2] I
                             > had some concerns with how we are
                        storing the templates.
                             >
                             > Simply put, when we are dealing with
                        100+ files from
                             tripleo-heat-templates
                             > how can we ensure consistency in Swift
                        without any atomicity or
                             > transactions.
                             > I think this is best explained with a
                        couple of examples.
                             >
                             >  - When we create a new deployment plan
                        (upload all the templates
                             to swift)
                             >    how do we handle the case where
                        there is an error? For example,
                             if we are
                             >    uploading 10 files - what do we do
                        if the 5th one fails for
                             some reason?
                             >    There is a patch to do a manual
                        rollback[3], but I have
                             concerns about
                             >    doing this in Python. If Swift is
                        completely inaccessible for a
                             short
                             >    period the rollback wont work either.
                             >
                             >  - When deploying to Heat, we need to
                        download all the YAML files from
                             > Swift.
                             >    This can take a couple of seconds.
                        What happens if somebody
                             starts to
                             >    upload a new version of the plan in
                        the middle? We could end up
                             trying to
                             >    deploy half old and half new files.
                        We wouldn't have a
                             consistent view of
                             >    the database.
                             >
                             > We had a few suggestions in the meeting:
                             >
                             >  - Add a locking mechanism. I would be
                        concerned about deadlocks or
                             > having to
                             >    lock for the full duration of a deploy.

                             There should be no need to lock the plan
                        for the entire deploy.  It's
                             not like we're re-reading the templates
                        at the end of the deploy today.
                              It's a one-shot read and then the plan
                        could be unlocked, at least as
                             far as I know.


                        Good point. That would be holding the lock for
                        longer than we need.

                             The only option where we wouldn't need
                        locking at all is the
                             read-copy-update model Clint mentions,
                        which might be a valid option as
                             well.  Whatever we do, there are going to
                        be concurrency issues though.
                              For example, what happens if two users
                        try to make updates to the plan
                             at the same time?  If you don't either
                        merge the changes or disallow one
                             of them completely then one user's
                        changes might be lost.

                             TBH, this is further convincing me that
                        we should just make this git
                             backed and let git handle the merging and
                        conflict resolution (never
                             mind the fact that it gets us a
                        well-understood version control system
                             for "free").  For updates that don't
                        conflict with other changes, git
                             can merge them automatically, but for
                        merge conflicts you just return a
                             rebase error to the user and make them
                        resolve it.  I have a feeling
                             this is the behavior we'll converge on
                        eventually anyway, and rather
                             than reimplement git, let's just use the
                        real thing.


                        I'd be curious to hear more how you would go
                        about doing this with git. I've
                        never automated git to this level, so I am
                        concerned about what issues we
                        might hit.

                    TBH I haven't thought it through to that extent
                    yet.  I'm mostly
                    suggesting it because it seems like a fit for the
                    template storage
                    requirements - we know we want version control, we
                    want to be able to
                    merge changes from multiple sources, and we want
                    some way to handle
                    merge conflicts.  Git does all of this already.

                    That said, I'm not sure about everything here.  For
                    example, how would you expose merge conflicts to the
                    user?  I don't know that I would want
                    to force a user to learn git in order to use
                    TripleO (although that
                    would be the devops-y thing to do), but maybe just
                    passing them back the
                    files with the merge conflict markers and having
                    them resolve those
                    locally and retry the update would work.  I'm not
                    sure how that would
                    map to the current version of the API though. Do
                    we provide any way to
                    pass templates back to the user?  I feel like that
                    was kind of a one-way
                    street.

                What part of the deployment API workflow could result
                in merge conflicts?

                My understanding was that it's something like:

                1. Take copy of reference templates tree
                2. Introspect templates, expose required parameters
                so user can be
                prompted for them
                3. Create environment file(s) derived from the user input
                4. Validate the combination of (1) and (3)
                5. Deploy the templates+environments

                On update, (1) would be "overwrite existing version of
                templates"

            This update policy means you may have just blown away
            someone else's
            work, unless you rebase on the plan's templates
            immediately before
            updating (and even then there's a race if two people
            submit updates at
            the same time).

        What has been proposed to date is somewhat more limited in
        scope than what
        you're hinting at (which I think is more of a
        collaborate-on-templates
        requirement?)

        
        https://github.com/openstack/tripleo-specs/blob/master/specs/mitaka/tripleo-overcloud-deployment-library.rst

        Here, you would expect any template collaboration to happen
        outside of the
        scope of the actual deployment workflow, so e.g step 1 above
        consumes
        either a packaged version of tripleo-heat-templates (which we
        don't expect
        to be routinely modified), or another location on the local
        filesystem
        (such as a repository managed by e.g git, outside of the
        deployment
        workflow).

        The "plan" then takes a copy of the golden tree, prompts for
        additional
        inputs, validates and deploys it.

        You are right though, if we allow concurrent update of the
        plan, it's
        possible that environments added to two versions of the plan
        would have to
        be merged, which could mean either conflicts or validation
        errors (if two
        operators select mutually exclusive configurations for example).

            Possible example: Two operators are working on enabling
            separate
            features in their cloud, and need to make configuration
            changes to the
            plan to do so.  Let's say one decides they need to enable
            the Storage
            network, while the other decides to enable the Tenant
            network.  The
            first operator makes their changes, sends the update and
            thinks their
            work is done.  The second operator, working from the same
            base set of
            templates as the first, makes their changes and sends the
            update.  Using
            the "overwrite" method of conflict resolution the first
            operator's
            changes have just been silently destroyed with no
            indication to either
            user that anything bad happened.
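A minimal sketch of how the "silent overwrite" in the example above could be detected, assuming each plan carries a version number that the updater must present. The `PlanStore` class and its methods are hypothetical and are not part of any TripleO or Swift API; a real implementation would also have to make the check-and-write step atomic.

```python
class StalePlanError(Exception):
    """Raised when an update is based on an outdated plan version."""


class PlanStore:
    """In-memory stand-in for plan storage (hypothetical, not a TripleO API)."""

    def __init__(self, files):
        self.version = 1
        self.files = dict(files)

    def read(self):
        # Return the current version together with a copy of the files.
        return self.version, dict(self.files)

    def update(self, base_version, new_files):
        # Reject the write if the plan changed since base_version was read.
        if base_version != self.version:
            raise StalePlanError(
                "plan is at version %d, update was based on %d"
                % (self.version, base_version)
            )
        self.files = dict(new_files)
        self.version += 1
        return self.version


store = PlanStore({"network.yaml": "storage: false\ntenant: false\n"})

# Both operators start from the same base version.
v_a, files_a = store.read()
v_b, files_b = store.read()

# The first operator enables the Storage network; the update succeeds.
files_a["network.yaml"] = "storage: true\ntenant: false\n"
store.update(v_a, files_a)

# The second operator's update is based on a stale version and is refused,
# so they have to re-read the plan and retry instead of overwriting.
files_b["network.yaml"] = "storage: false\ntenant: true\n"
try:
    store.update(v_b, files_b)
    second_update_rejected = False
except StalePlanError:
    second_update_rejected = True
```

This only disallows the second update; it does not merge the two changes, which is the part git would handle.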

        Ok, so separating the two requirements alluded to here may
        help improve
        clarity:

        1. Multiple users collaborating on the t-h-t tree as a whole.

        2. Enabling multiple features via updates and avoiding
        mid-air-collisions

        I think (2) may be the simpler problem to consider,
        particularly if a lock
        of some sort is considered acceptable, e.g we explicitly do not
        allow multiple
        operators actively modifying the cloud concurrently.

        That would also be consistent with the current heat behavior,
        e.g even if
        you did allow multiple operators to concurrently change a
        plan, they cannot
        concurrently update the overcloud via heat anyway (this will
        change
        eventually with convergence).

        (1) is a much harder problem, and I can't help thinking it'd
        be better
        solved with existing tools (e.g document how to use git,
        gerrit, jenkins &
        CI test your own t-h-t tree, potentially allowing for
        semi-automated
        promotion of things between environments, a staging workflow).

            I guess you could tell users "don't do that", but unless
            you have
            exactly one person making updates to the templates there's
            going to be
            the possibility of conflicts, and in the Swift case all it
            takes is two
            people editing the same file, even in completely different
            areas, for
            someone's changes to be lost.

        Ok, good point, I think I'd been assuming more of a serialized
        workflow as
        a given, so it's definitely something to consider, thanks for
        clarifying.

        Steve

        
__________________________________________________________________________
        OpenStack Development Mailing List (not for usage questions)
        Unsubscribe:
        [email protected]?subject:unsubscribe
        <http://[email protected]?subject:unsubscribe>
        http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


    To add some information here and maybe (hopefully) clear things up
    a bit, the current workflow does not manipulate the content of the
    templates and environments.
    We only set the metadata about certain templates/environments and
    create a single temporary environment file:

    1. Upload files (using git, this means providing a git URL),
    identify the capabilities-map file (capabilities_map.yaml) and set
    its 'type' metadata to 'capabilities-map'

I think we have multiple ideas related to git floating around - using git as an external input source, or using git as a data store that we update and manage and store on the undercloud. Both seem valid.


    2. based on the capabilities-map information, identify
    'root-template' (overcloud.yaml), 'root-environment'
    (overcloud-resource-registry-puppet.yaml), 'environment'
    (environments/*.yaml) and store this information in those files'
    'type' metadata.

I don't think we need to set this metadata. We can use the capabilities-map as an index and look up that file each time we need this information.


Good point, that saves us from having to store those.

    3. Let user select from optional environments ('type' is
    'environment') based on the constraints defined in
    capabilities-map. Store the information about selected
    environments in 'enabled' meta.

The metadata for enabled environments is important, but I'll come back to this below.


    4. Generate a list of parameters by sending templates,
    root-environment and _enabled_ optional environments to
    heat-validate (nested). Let user set values for those parameters
    and store the parameter values in newly created temporary
    environment's parameter_defaults block. Upload this template to
    Swift and set its 'type' meta to 'temp-environment'.
    5. Deploy - take everything from Swift, process templates (to
    resolve the urls in get_file etc.) and merge environments in
    order: root environment < enabled optional environments <
    temporary environment. And send this to Heat API's Stack Create.
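The merge order in step 5 can be sketched roughly as follows, assuming each environment has already been parsed into a dict. The `merge_environments` helper is hypothetical and does a simple per-section override; the parameter and resource names are only illustrative, and the real merging rules applied by Heat and its clients are more involved.

```python
def merge_environments(*environments):
    """Merge environment dicts in order; later ones override earlier ones."""
    sections = ("parameters", "parameter_defaults", "resource_registry")
    merged = {section: {} for section in sections}
    for env in environments:
        for section in sections:
            # A missing or empty section contributes nothing.
            merged[section].update(env.get(section) or {})
    return merged


# Illustrative content only, not taken from tripleo-heat-templates.
root_env = {
    "resource_registry": {"OS::TripleO::Network::Storage": "OS::Heat::None"},
    "parameter_defaults": {"ControllerCount": 1},
}
enabled_optional_env = {
    "resource_registry": {"OS::TripleO::Network::Storage": "network/storage.yaml"},
}
temp_env = {"parameter_defaults": {"ControllerCount": 3}}

# root environment < enabled optional environments < temporary environment
merged = merge_environments(root_env, enabled_optional_env, temp_env)
```

Because the temporary environment is merged last, the values the user entered always win over the defaults shipped with the templates.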

    So you can see that we don't really manipulate the template
    files, we just add metadata and create a single temporary
    environment that holds the parameter values,

Don't we allow users to upload new template files or update them? If users need to delete a plan and create a new one for each version, that sounds painful.

    although this is not really necessary and can be replaced by
    storing the parameter values in the DB and then sending them as
    the 'parameters' param to Heat.  I think that storing files in Git
    is a good idea as it is what we already have (t-h-t), but we
    probably need to use a DB to store the metadata because the
    metadata are plan-specific, whereas the Git repository is not (or
    is it meant to be? That would mean creating a separate git repo
    for every deployment attempt.)

I think we need to be careful how we store any metadata. The key advantage (AFAICT) of storing the files in git is that operators can easily access and deploy them manually. However, if they need to understand our bespoke metadata or extract it from a database to understand the deploy, then that advantage is lost. Maybe rather than metadata we can update a file (or users can add this file) that defines the deployment; this would be similar to one that has been proposed to python-tripleoclient[1]. If we can then support this file in python-heatclient, it would mean a deploy could easily be understood from the API, python-tripleoclient and python-heatclient. Even without heatclient supporting this file, it is easy to look at and see how you would call heatclient.

[1]: https://review.openstack.org/#/c/249222/

When we make a deploy, we will want to store the sha that we have deployed. I am not sure where we want to store this information.

Ok, so this approach involves branching the git repo on Plan creation, and the Plan metadata would get stored in the answers file that gets committed to that branch. Sounds good.

In regards to uploading/updating new templates, this sounds somewhat counterproductive to me. Is there a use case for adding/changing a template as part of Plan design? IMO if we want to add a template, it is usually done globally in t-h-t and not in a Plan-specific branch. I don't see when we would need to do this. Adding an environment is probably more valid, but that would also involve updating the capabilities map. We currently have the feature to add additional files to a plan because we use Swift and we have this step of uploading files as part of plan creation. Using Git, Plan creation is just a matter of pointing to a git repo.

This is why I tend to not touch the files and just store the metadata. Tying the metadata to the git repo (using an answers file and branching the repo on Plan creation) is a totally valid point.



    To make sure that the Plan is in sync with the Git repo (t-h-t), we
    can tie the Plan not just to a specific repository, but also to a
    specific tag or commit. This way, if the user updates the
    templates repository with changes he wants to use, he needs to
    create a new Plan and start the deployment process over.

    Correct me if I am wrong, but I think this approach resolves the
    problems with merge conflicts. The Files and the Plan (Deployment)
    are separate things - Files are stored in Git, and the Plan is
    stored in the DB, holds the files' metadata and is tied to a Git
    repo commit/tag.
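As a rough illustration of that separation, a Plan record pinned to a commit might look like the sketch below. The `Plan` class, its field names, the repository URL and the commit ids are all hypothetical placeholders, not an existing TripleO data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Plan:
    """Plan metadata kept in the DB; the template files themselves stay in Git."""

    name: str
    repo_url: str
    commit: str  # commit or tag the Plan is tied to
    enabled_environments: tuple = ()
    parameters: dict = field(default_factory=dict)


def needs_new_plan(plan, repo_head):
    # If the templates repository has moved on, the existing Plan is not
    # updated in place; the user creates a new Plan for the new commit.
    return plan.commit != repo_head


plan = Plan(
    name="overcloud",
    repo_url="https://example.com/tripleo-heat-templates.git",  # placeholder
    commit="1111111",  # placeholder commit id
    enabled_environments=("environments/network-isolation.yaml",),
    parameters={"ControllerCount": 3},
)

unchanged = needs_new_plan(plan, "1111111")
moved_on = needs_new_plan(plan, "2222222")
```

Since the Plan never modifies the files it points at, there is nothing to merge; a changed repository simply means a new Plan.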

    Any changes to the templates themselves should be done in the Git
    repo, and I am not convinced that we want to introduce anything
    like that in the GUI/CLI deployment workflow since, as was agreed
    before, Git is the best tool for doing/tracking such changes.

    Jirka









Jirka

