Hi Aurorans,

As you may have seen (https://issues.apache.org/jira/browse/AURORA-1648),
we ran into an issue when upgrading a cluster that uses dbScripts in
snapshots. To restate the problem from the ticket: when the scheduler
starts up, it creates the H2 database from schema.sql, which contains any
newly added (or changed) tables. If a dbScript exists in the snapshot, the
scheduler then drops all tables and runs the script, which re-creates the
database in its previous form and repopulates all the data. At this point,
any table that was added to schema.sql but is not in the snapshot is lost.
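
To make the failure mode concrete, here is a rough sketch of the sequence
in SQL (the table name below is made up for illustration, not part of
Aurora's actual schema):

   -- 1. On startup, the scheduler builds the H2 database from
   --    schema.sql, which in the new release defines a new table:
   CREATE TABLE IF NOT EXISTS new_feature_state (
     id BIGINT PRIMARY KEY,
     data VARCHAR NOT NULL
   );

   -- 2. If the snapshot carries a dbScript, all tables are dropped and
   --    the script (generated by the previous release) is replayed.
   --    It only re-creates the old tables, so new_feature_state no
   --    longer exists once the restore completes.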

The solution here is to run a migration after restoring the database; my
question is what level of complexity we want for migrations. I see two
possible options:


   1. The relatively simple approach: a single migration.sql alongside
   schema.sql that contains any changes to run against the database. After
   restoring the db from a dbScript in the snapshot, we would run this
   script. On release boundaries(?), we clear the script out. The main
   caveat here is that the DDL statements in this script must be idempotent
   since they would potentially be run multiple times (see the first sketch
   after this list). Another issue is that this would make it more
   complicated to jump over versions (upgrading from v0.12.0 to v0.14.0
   means you might miss a migration that only existed in the 0.13.0
   release).
   2. A slightly more involved approach would be to track the schema
   version in the database itself. Instead of a single migration.sql script
   that contains all migrations, we could have migrations/001.sql,
   migrations/002.sql, etc. We'd add a new table to the database to track
   which migrations have been run. After restoring via dbScript, we'd check
   the table and run any migrations necessary (or all of them if the table
   does not exist). A sketch of this is also included after the list.
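
For option 1, every statement in migration.sql would need an idempotent
form since the script may be replayed. A rough sketch of what that could
look like (table and column names are invented; the IF NOT EXISTS forms
are H2 syntax):

   -- migration.sql: run after every dbScript restore, so it must be
   -- safe against a database that may already contain these changes.
   CREATE TABLE IF NOT EXISTS new_feature_state (
     id BIGINT PRIMARY KEY,
     data VARCHAR NOT NULL
   );

   ALTER TABLE some_existing_table
     ADD COLUMN IF NOT EXISTS retry_count INT DEFAULT 0;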
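
For option 2, the bookkeeping is a small table plus a version check. A
minimal sketch, using table and column names of my own choosing (not an
existing Aurora structure):

   -- Created if missing after restoring from a dbScript; if it does not
   -- exist we know nothing has been recorded and run every migration.
   CREATE TABLE IF NOT EXISTS applied_migrations (
     version INT PRIMARY KEY,
     applied_on TIMESTAMP NOT NULL
   );

   -- Each migrations/NNN.sql records itself once it succeeds, e.g.
   -- migrations/002.sql would end with:
   INSERT INTO applied_migrations (version, applied_on)
   VALUES (2, CURRENT_TIMESTAMP);

The scheduler would then only run the migrations/NNN.sql files whose
version does not already appear in that table.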

Thoughts on these options?

Anyone have any alternatives?

Cheers,

Joshua
