Re: Let’s discuss database upgrades

Daan Hoogland Tue, 29 Dec 2015 04:04:05 -0800

Rafael,

On Tue, Dec 29, 2015 at 12:22 PM, Rafael Weingärtner <
[email protected]> wrote:


> Thanks, Daan and Wido for your contributions, I will discuss them as
> follows.
>
> Daan, about the idea of per commit upgrades. Do you mean that we separate
> each change in the database that is introduced by PRs/Commits in a
> different file (routine upgrade) per ACS version?
> So we would have, V_480_A.sql (for a PR),V_480_B.sql (for another PR) and
> so forth
>
> If that is the case, we can achieve that using a simple naming convention
> as I suggested. Each developer when she/he needs to change or add something
> in the database creates an upgrade routine separately and gives it an
> execution order to be taken by Flywaydb. I think that could help RMs to
> track and isolate the problem, right?
>
Yes, with one little caveat. We do not know in what version a feature/PR
will end up at the time of implementing, so a name containing the version
would not be ideal.
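
The convention itself, as Rafael describes it (V_<version>_<sequence>
routines, run in order for every version after the current one up to and
including the goal version), can be sketched roughly as below. This is only
an illustration under stated assumptions: the class UpgradePlanner and its
methods are hypothetical names, nothing here comes from the ACS code base or
from Flywaydb, and versions are assumed to compare as plain integers
(410 < 420 < 421 < 430).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of ACS or Flywaydb.
final class UpgradePlanner {

    // Matches V_<numberOfVersion>_<sequence>.<ext>, e.g. V_410_A.sql or V_410_C.java
    private static final Pattern NAME =
            Pattern.compile("V_(\\d+)_([A-Za-z]+)\\.(sql|java)");

    // Assumption: versions compare as plain integers (410 < 420 < 421 < 430).
    static int versionOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an upgrade routine: " + fileName);
        }
        return Integer.parseInt(m.group(1));
    }

    // Sequence letters are compared case-insensitively, so "b" sorts like "B".
    static String sequenceOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an upgrade routine: " + fileName);
        }
        return m.group(2).toUpperCase(Locale.ROOT);
    }

    // Routines to run, in order, to go from currentVersion up to and
    // including goalVersion.
    static List<String> plan(List<String> fileNames, int currentVersion, int goalVersion) {
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            if (!NAME.matcher(name).matches()) {
                continue; // skip files that do not follow the convention
            }
            int version = versionOf(name);
            if (version > currentVersion && version <= goalVersion) {
                selected.add(name);
            }
        }
        // Order by version first, then by sequence letter within a version.
        selected.sort((a, b) -> {
            int byVersion = Integer.compare(versionOf(a), versionOf(b));
            return byVersion != 0 ? byVersion : sequenceOf(a).compareTo(sequenceOf(b));
        });
        return selected;
    }

    public static void main(String[] args) {
        List<String> routines = Arrays.asList(
                "V_430_B.sql", "V_410_A.sql", "V_410_C.java", "V_410_B.sql",
                "V_420_A.sql", "V_420_B.sql", "V_421_A.sql", "V_430_A.sql");
        // Schema at 4.2.0, goal 4.3.0: only the 4.2.1 and 4.3.0 routines are selected.
        System.out.println(plan(routines, 420, 430));
    }
}
```

Note that the selection depends entirely on the version number embedded in
the file name, which is exactly where the caveat above applies.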



>
> Hi Wido, now I understand your example.
> I understand your worry about upgrade paths, and that is the point I want
> to discuss and solve. In your example, if we release a 4.6.0 and later a
> 4.5.3. You said that there would be no upgrade path from 4.5.3 to 4.6.0.
> Well, today that is what happens. However, if we change the technology we
> use to upgrade the database (using a tool such as Flywaydb) and if we
> define a standard to create upgrade routines that would not be a problem.
>
> As I have written in my first email, to go from a version to another we
> should be able to run all of the upgrade routines in between them
> (including the upgrade routine of the goal version). Therefore, if we
> release a version 4.6.0, and then 4.5.3, if someone upgrades to 4.5.3 from
> any other version, and then wants to upgrade to 4.6.0, that would not be a
> problem, it would be a matter of running only the routine upgrade of 4.6.0
> version. We do not need to explicitly create upgrade paths. They should be
> implicit by our upgrade conventions.
>
> About creating versions of the code that rely on some version of the
> database. I do not like it much because of compatibility issues that might
> arise. For instance, let’s say version X of ACS depends on version >=Y of
> the database. If I upgrade the database to version Y + 1 or +2, the same
> ACS version has to keep running nice and shiny. My worry is that may bring
> some complications, such as to remove columns that cease to be used or data
> structure that we might want to improve.
>
> I normally see that the database version and the code base are tied in a
> mapping 1 to 1. Maybe I am having trouble identifying the benefits of that
> change.
>
> Thanks for your time ;)
>
> On Tue, Dec 29, 2015 at 8:15 AM, Wido den Hollander <[email protected]>
> wrote:
>
> >
> >
> > On 28-12-15 21:34, Rafael Weingärtner wrote:
> > > Hi Wido, Rohit,
> > > I have just read the feature suggestion.
> > >
> > > Wido, I am not trying to complicate things, quite the opposite, I just
> > > illustrate a simple thing that can happen and is happening; I just
> > pointed
> > > how it can be easily solved.
> > >
> > > About the release of .Z, releases more constant and others, I do not
> want
> > > to mix topics. Let’s keep this thread strict to discuss database
> > upgrades.
> > >
> >
> > I do not want to start the release discussion, but what I meant is that
> > we try to find a technical solution to something which might be solved
> > easier by just changing the way we release.
> >
> > 4.6.0 is released and afterwards 4.5.3 is released. How does somebody
> > upgrade from 4.5.3 to 4.6.0? He can't, since the 4.6.0 code doesn't
> > support that path.
> >
> > So my idea is to split the database version from the code version.
> >
> > The code requires database version >= X and during boot it simply checks
> > that.
> >
> > The database migration tool can indeed do the DB migration, it doesn't
> > have to be the mgmt server who does the upgrade.
> >
> > > Now, about the FS. I agree with Rohit that we should have only one way
> of
> > > managing database upgrades and creation. I just do not like the idea of
> > > creating a tool that works as a wrapper around frameworks/tools such as
> > > flywaydb. I think that those frameworks already work pretty well as
> they
> > > are; and, I would rather maintain configurations than some wrapper
> code.
> > >
> > > I personally like the way ACS works during upgrades (I just do not like
> > the
> > > code itself and how things are structured), as a system administrator I
> > > like to change the version in the
> > “/etc/apt/sources.list.d/cloudstack.list”
> > > and use the "apt-get" "update" and "install" from the command line. I
> do
> > > not see the need to add another tool that is just a wrapper to the mix.
> > If
> > > I update ACS code to 4.7.0, why would I let the database schema in an
> > older
> > > version? And if we want version DB schemas and application code
> > separately
> > > maintaining somehow compatibility between them, which would bring a
> whole
> > > other level of complexity to the code; I think we should avoid that.
> > >
> > > The flywaydb can be easily integrated with everything we have now; we
> > could
> > > have a maven profile for developers and integrate it in ACS bootstrap
> > using
> > > its API as a Spring bean. Therefore, we could remove the current
> > > “DatabaseUpgradeChecker “, “DbUpgrade” and other classes that aim to do
> > > that. We could even add the creation of the schema into the first time
> it
> > > boots using flywaydb and retire the “cloudstack-setup-database” script,
> > or
> > > at least make it less complicated, using it just to configure the
> > database
> > > URL and users.
> > >
> > > The point is that to use Flywaydb we would have to agree upon a
> > convention
> > > on creating routines (java and SQL) to execute upgrades. Moreover,
> using
> > a
> > > tool such as Flywaydb we do not need to worry about upgrade paths. As I
> > > wrote in the email I used to start this thread, the upgrade has to be
> > > straightforward, to go to a version we have to run all of the upgrade
> > > routines between the current version until the desired one. Our job is
> to
> > > create upgrade routines that work and name them properly, the job of
> the
> > > tool is to check the current version, the desired one, the upgrades
> that
> > it
> > > needs to run and execute everything properly.
> > >
> >
> > Yes, indeed. I just wanted to start the discussion if we shouldn't
> > version the database differently from the code.
> >
> > > Additionally, I do not see the need to break compatibility as Rohit
> > > suggested in the FS; in my opinion, everything we have up to today can be
> > > migrated to the new structure I proposed. If we use a tool such as
> > > Flywaydb, I even volunteered for that. The only thing we have to
> discuss
> > > and agree upon is the naming conventions for upgrades routines, where
> to
> > > put them and the configurations for flywaydb.
> > >
> > > Thanks for your contribution and time.
> > >
> > >
> > > On Mon, Dec 28, 2015 at 2:10 PM, Rohit Yadav <
> [email protected]>
> > > wrote:
> > >
> > >> Hi Rafael and Wido,
> > >>
> > >> Thanks for starting a conversation in this regard, I could not pursue
> > the
> > >> Chimp tool due to other $dayjob work though it’s good to see some
> > >> discussion has started again. Hope we’ll solve this in 2016.
> > >>
> > >> In my opinion, we will need to first separate the database
> > init/migration
> > >> tooling away from mgmt server (right now the mgmt server does db
> > migrations
> > >> when it starts and there is a code/db version mismatch) and secondly
> > make
> > >> sure that we’re using the same code/tool to deploy database (right
> now,
> > >> users use the cloudstack-setup-database python tool while developers
> use
> > the
> > >> maven/java DatabaseCreator activated by the -Ddeploydb flag).
> > >>
> > >> After we’ve addressed these two issues we can look into how we can
> > support
> > >> minor releases workflow (or decide to do something else, like not
> > support
> > >> .Z releases like Wido mentioned), and see if we can or want to use any
> > >> existing migration tool or write a wrapper tool “chimp” that uses
> > existing
> > >> tools (some of those are mentioned in the Chimp FS like flywaydb etc).
> > For
> > >> allowing users to go back and forth from a db schema/version, we’ll
> also
> > >> need some new DB migration
> conventions/versioning/rules/static-checking,
> > >> and how developers need to write such paths (forward and reverse) etc.
> > >>
> > >> The best approach I figured at the time was to decide that we’ll use
> the
> > >> previous db upgrade path mechanism till a certain CloudStack version
> > (say
> > >> 4.8.0) and after that we’ll use the new approach or tooling to
> > >> upgrade/downgrade DB schemas (thereby retiring away from the old DB
> > upgrade
> > >> path mess).
> > >>
> > >>>
> > >>
> > >>
> > >>
> > >> On 28-Dec-2015, at 9:10 PM, Wido den Hollander <[email protected]>
> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On 28-12-15 16:21, Rafael Weingärtner wrote:
> > >>>> Thanks for your contribution Wido,
> > >>>> I have not seen Rohit’s email; I will take a look at it.
> > >>>>
> > >>>
> > >>> Ok, he has a FS here:
> > >>>
> > https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Chimp
> > >>>
> > >>>> About database schema changes happening only in X.Y, I also agree
> with
> > >> you
> > >>>> (that is a convention we all could agree on and, as with coding and
> > >>>> release procedures, we could have a wiki page for that). However, I
> > >> think we
> > >>>> still might have scripts in versions X.Y.Z to add data to a table
> such
> > >> as
> > >>>> “guest_os_hypervisor”.
> > >>>>
> > >>>
> > >>> Yes, that is true. A bugfix could be an addition to the database,
> but
> > >>> we have to prevent it as much as possible.
> > >>>
> > >>>> The point to manage such scripts is that, if we are in version such
> as
> > >>>> 4.7.0 and a new script emerges in version 4.5.3, we would have to
> > >> decide to
> > >>>> run or not to run it. I would rather not run them, since if they add
> > >>>> something to the code base; those changes should also be applied
> into
> > >>>> master and as a consequence it will be available in a future update.
> > >>>>
> > >>>
> > >>> I understand, but this is where our release cycle becomes the
> problem.
> > >>> It is because we release a X.Y.Z release we run into these kind of
> > >> problems.
> > >>>
> > >>> If we as a project simply do not release the .Z releases we would be
> > >>> fine as well ;)
> > >>>
> > >>> You can try to complicate things with technical things, or if we
> > release
> > >>> every two / three weeks we don't run into these kind of situations :)
> > >>>
> > >>> We might even cut the database version loose from the code version.
> > >>>
> > >>> Database version is simply 100, 101, 102, 103, 104, 105. And a code
> > >>> version requires a certain version of the database.
> > >>>
> > >>> Wido
> > >>>
> > >>>> On Mon, Dec 28, 2015 at 12:50 PM, Wido den Hollander <
> [email protected]>
> > >> wrote:
> > >>>>
> > >>>>>
> > >>>>>
> > >>>>> On 28-12-15 14:16, Rafael Weingärtner wrote:
> > >>>>>> Hi all devs,
> > >>>>>> First of all, sorry for the long text, but I hope we can start a
> > >> discussion
> > >>>>>> here and improve that part of ACS.
> > >>>>>>
> > >>>>>> A while ago I have faced the code that Apache CloudStack (ACS)
> uses
> > to
> > >>>>>> upgrade from a version to newer one and that did not seem to be a
> > good
> > >>>>> way
> > >>>>>> to execute our upgrades. Therefore, I decided to use some time to
> > >> search
> > >>>>>> for alternatives.
> > >>>>>>
> > >>>>>
> > >>>>> I think we all saw that happen once or more :)
> > >>>>>
> > >>>>>> I have read some material about versioning of scripts used to
> > upgrade
> > >> a
> > >>>>>> database (DB) of a system and went through some frameworks that
> > could
> > >>>>> help
> > >>>>>> us.
> > >>>>>>
> > >>>>>> In the literature of software engineering, it is firmly stated
> that
> > we
> > >>>>> have
> > >>>>>> to version DB scripts as we do with the source code of the
> > >> application,
> > >>>>>> using the baseline approach. Gladly, we were not that bad at this
> > >> point,
> > >>>>> we
> > >>>>>> already versioned our routines for DB upgrade (.sql and .java).
> > >>>>> Therefore,
> > >>>>>> it seemed that we just had not used a practical approach to
> > help
> > >> us
> > >>>>>> during DB upgrades.
> > >>>>>>
> > >>>>>> From my readings and looking at the ACS source code I raised the
> > >>>>> following
> > >>>>>> requirement:
> > >>>>>> • We should be able to write more than one routine to upgrade to a
> > >>>>>> version; those routines can be written in Java and SQL. We might
> > have
> > >>>>> more
> > >>>>>> than a routine to be executed for each version and we should be
> able
> > >> to
> > >>>>>> define an order of execution. Additionally, to go to an upper
> > >> version, we
> > >>>>>> have to run all of the routines from smaller versions first, until
> > we
> > >>>>>> achieve the desired version.
> > >>>>>>
> > >>>>>> We could also add another requirement that is the downgrade from a
> > >>>>> version,
> > >>>>>> which we currently do not support. With that comes my first
> question
> > >> for
> > >>>>>> discussion:
> > >>>>>> • Do we want/need a method to downgrade from a version to a
> previous
> > >>>>> one?
> > >>>>>>
> > >>>>>
> > >>>>> I personally do not care. Usually people should create a backup
> PRIOR
> > >> to
> > >>>>> an upgrade. If that fails they can restore the backup.
> > >>>>>
> > >>>>>> I found an explanation for not supporting downgrades, and I liked
> > it:
> > >>>>>> http://flywaydb.org/documentation/faq.html#downgrade
> > >>>>>>
> > >>>>>> So, what I devised for us:
> > >>>>>> First the bureaucracy part - our migrations occur basically in
> three
> > >> (3)
> > >>>>>> steps, first we have a "prepare script", then a cleanup script and
> > >>>>> finally
> > >>>>>> the migration per se that is written in Java, at least, that is
> what
> > >> we
> > >>>>> can
> > >>>>>> expect when reading the interface
> “com.cloud.upgrade.dao.DbUpgrade”.
> > >>>>>>
> > >>>>>> Additionally, our scripts have the following naming convention:
> > >>>>>> schema-<currentVersion>to<desiredVersion>, which IMHO may cause
> > >> some
> > >>>>>> confusion because at first sight we may think that from the same
> > >> version
> > >>>>> we
> > >>>>>> could have different paths to an upper version, which in practice
> is
> > >> not
> > >>>>>> happening. Instead of a <currentVersion>to<version> we could
> simply
> > >> use
> > >>>>>> V_<numberOfVersion>_<sequential>.<fileExtension>, given that we
> > >> have to
> > >>>>>> execute all of the V_<version> scripts that are smaller than the
> > >> version
> > >>>>> we
> > >>>>>> want to upgrade.
> > >>>>>>
> > >>>>>> To clarify what I am saying, I will use an example. Let’s say we
> > have
> > >>>>> just
> > >>>>>> installed ACS and ran the cloudstack-setup-database. That command
> > will
> > >>>>>> create a database schema in version 4.0.0. To upgrade that schema
> to
> > >>>>>> version 4.3.0 (it is just an example, it could be any other
> > version),
> > >> ACS
> > >>>>>> will use the following mapping:
> > >>>>>>
> > >>>>>> _upgradeMap.put("4.0.0", new DbUpgrade[] {new Upgrade40to41(), new
> > >>>>>> Upgrade410to420(), new Upgrade420to421(), new Upgrade421to430()});
> > >>>>>>
> > >>>>>> After loading the mapping, ACS will execute the scripts defined in
> > >> each
> > >>>>> one
> > >>>>>> of the Upgrade path classes and the migration code per se.
> > >>>>>>
> > >>>>>> Now, let’s say we change the “.sql” scripts name to the pattern I
> > >>>>>> mentioned, we would have the following scripts; those are the
> > scripts
> > >>>>> found
> > >>>>>> that aim to upgrade to versions between the interval 4.0.0 – 4.3.0
> > >>>>>> (considering 4.3.0, since that is the goal version):
> > >>>>>>
> > >>>>>>
> > >>>>>> - schema-40to410, can be named to: V_410_A.sql
> > >>>>>> - schema-40to410-cleanup, can be named to: V_410_B.sql
> > >>>>>> - schema-410to420, can be named to: V_420_A.sql
> > >>>>>> - schema-410to420-cleanup, can be named to: V_420_B.sql
> > >>>>>> - schema-420to421, can be named to: V_421_A.sql
> > >>>>>> - schema-421to430, can be named to: V_430_A.sql
> > >>>>>> - schema-421to430-cleanup, can be named to: V_430_B.sql
> > >>>>>>
> > >>>>>>
> > >>>>>> Additionally, all of the java code would have to follow the same
> > >>>>>> convention. For instance, we have
> > >> “com.cloud.upgrade.dao.Upgrade40to41”,
> > >>>>>> which has some java code to migrate from 4.0.0 to 4.1.0. The idea
> is
> > >> to
> > >>>>>> extract that migration code to a Java class named: V_410_C.java,
> > >> giving
> > >>>>>> that it has to execute the SQL scripts before the java code.
> > >>>>>>
> > >>>>>> In order to go from a smaller version (4.0.0) to an upper one
> > >> (4.3.0), we
> > >>>>>> have to run all of the migration routines from intermediate
> > versions.
> > >>>>> That
> > >>>>>> is what we are already doing, but we do all of that manually.
> > >>>>>>
> > >>>>>> Bottom line, I think we could simply use the convention
> > >>>>>> V_<numberOfVersion>_<sequential>.<fileExtension> to name upgrade
> > >>>>> routines.
> > >>>>>> That would make it easier for us to use a framework to help with that
> > >> process.
> > >>>>>> Additionally, I believe that we should always assume that to go
> > from a
> > >>>>>> smaller version to a higher one, we should run all of the scripts
> > that
> > >>>>>> exist between them. What do you guys think of that?
> > >>>>>>
> > >>>>>
> > >>>>> That seems good to me. But we still have to prevent that we perform
> > >>>>> database changes in a X.Y.Z release since that is branched off to a
> > >>>>> different branch.
> > >>>>>
> > >>>>> Imho database changes should only happen in X.Y releases.
> > >>>>>
> > >>>>>> After the bureaucracy, we can discuss tools. If we use that
> > >> convention to
> > >>>>>> name migration (upgrade) routines, we can start thinking on tools
> to
> > >>>>>> support our migration process. I found two (2) promising ones:
> > >> Liquibase
> > >>>>>> and Flywaydb (both seem to be under Apache license, but the first
> > one
> > >> has
> > >>>>>> an enterprise version?!). After reading the documentation and some
> > >> usage
> > >>>>>> examples I found the flywaydb easier and simpler to use.
> > >>>>>>
> > >>>>>> What are the options of tools that we can use to help us manage
> the
> > >>>>>> database upgrade, without needing to code the upgrade path that
> you
> > >> know?
> > >>>>>>
> > >>>>>> After that, I think we should decide if we should create another
> > >>>>>> project/component to take care of migrations, or we can just add
> the
> > >>>>>> dependency of the tool to a project such as “cloud-framework-db”
> and
> > >>>>> start
> > >>>>>> using it.
> > >>>>>>
> > >>>>>> The “cloud-framework-db” project seems to have a focus on other
> > things
> > >>>>> such
> > >>>>>> as managing transactions and generating SQLs from annotations (?!?
> > >> That
> > >>>>>> should be a topic for another discussion). Therefore, I would
> rather
> > >>>>> create
> > >>>>>> a new project that has the specific goal of managing ACS DB
> > upgrades.
> > >> I
> > >>>>>> would also move all of the routines (SQL and Java) to this new
> > >> project.
> > >>>>>> This project would be a module of the CloudStack project and it
> > would
> > >>>>>> execute the upgrade routines at the startup of ACS.
> > >>>>>>
> > >>>>>> I believe that going from a homemade solution to one that is more
> > >>>>>> consolidated and used by other communities would be the way to go.
> > >>>>>>
> > >>>>>> I can volunteer myself to create a PR with the aforementioned
> > changes
> > >> and
> > >>>>>> using flywaydb to manage our upgrades. However, I prefer to have a
> > >> good
> > >>>>>> discussion with other devs first, before starting coding.
> > >>>>>>
> > >>>>>> Do you have suggestions or points that should be raised before we
> > >> start
> > >>>>>> working on that?
> > >>>>>
> > >>>>> Rohit suggested Chimp earlier this year:
> > >>>>>
> > >>>>>
> > >>
> >
> http://mail-archives.apache.org/mod_mbox/cloudstack-dev/201508.mbox/%[email protected]%3E
> > >>>>>
> > >>>>> The thread is called: "[DISCUSS] Let's fix CloudStack Upgrades and
> DB
> > >>>>> migrations with CloudStack Chimp"
> > >>>>>
> > >>>>> Maybe there is something good in there.
> > >>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Rafael Weingärtner
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>
> > >> Regards.
> > >>
> > >>
> > >
> > >
> > >
> >
>
>
>
> --
> Rafael Weingärtner
>
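
Wido's suggestion further down in the thread, that the code only requires
database version >= X and simply checks that during boot, could look roughly
like the sketch below. Everything in it is an assumption made for
illustration: the class name SchemaVersionCheck, the required version 103
(taken from the 100, 101, 102, ... numbering example), and the fact that the
database version is passed in as a parameter instead of being read from the
database.

```java
// Hypothetical boot-time check; names and numbers are made up for illustration.
final class SchemaVersionCheck {

    // Assumption: this code release needs database version 103 or newer.
    static final int REQUIRED_DB_VERSION = 103;

    static boolean isCompatible(int databaseVersion) {
        return databaseVersion >= REQUIRED_DB_VERSION;
    }

    // In a real system databaseVersion would be read from the database;
    // here it is passed in to keep the sketch self-contained.
    static void checkAtBoot(int databaseVersion) {
        if (!isCompatible(databaseVersion)) {
            throw new IllegalStateException("Database version " + databaseVersion
                    + " is older than required version " + REQUIRED_DB_VERSION);
        }
    }

    public static void main(String[] args) {
        checkAtBoot(105); // newer than required: boot continues
        System.out.println("database version accepted");
    }
}
```

With such a check the migration itself can be run by a separate tool, and
the management server only refuses to start on a database that is too old.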



-- 
Daan
