On Wed, Mar 23, 2016 at 2:21 PM, Joshua Cohen <jco...@apache.org> wrote:
> Hi Aurorans, > > As you may have seen (https://issues.apache.org/jira/browse/AURORA-1648), > we ran into an issue when upgrading a cluster that uses dbScripts in > snapshots. To restate the problem from the ticket, when the scheduler > starts up it creates the H2 database from schema.sql which contains newly > added (or changed) tables. If the dbScript exists in the snapshot, it then > drops all tables and runs the script which re-creates the database in the > form it was previously in and repopulates all the data. At this point, if a > table was added that was not in the snapshot, that table is lost. > > The solution here is to be run a migration after restoring the database, my > question is what level of complexity do we want for migrations. I see two > possible options: > > > 1. The relatively simple approach where we have a single migration.sql > alongside schema.sql which contains any changes to run against the > database. After restoring the db from a dbScript in the snapshot, we > would > run this script. On release boundaries(?), we clear the script out. The > main caveat here is that the ddl statements in this script must be > idempotent since they would potentially be run multiple times. Another > issue is this would make it more complicated to jump over versions > (upgrading from v0.12.0 to v0.14.0 means you might miss a migration that > only existed in the 0.13.0 release). > 2. A slightly more involved approach would be to track the schema > version in the database itself. Instead of having a single migration.sql > script that contains all migrations, we could instead have > migrations/001.sql, migrations/002.sql, etc. We'd add a new table to the > database to track which migrations have been run. After restoring via > dbScript, we'd check the table and run any migrations necessary (or all > of > the table does not exist). > > Thoughts on these options? > I have used 2 a few times in the past and my take is it only more complex in 1 fixed piece of (java) code that can be well unit tested. The sql scripts are easier to read and write and reason about for devs and operators alike. > > Anyone have any alternatives? > > Cheers, > > Joshua >