betodealmeida opened a new pull request #15032: URL: https://github.com/apache/superset/pull/15032
### SUMMARY <!--- Describe the change below, including rationale and design decisions --> The script used to benchmark migrations was failing with [this simple migration](https://github.com/apache/superset/blob/master/superset/migrations/versions/453530256cea_add_save_option_column_to_db_model.py). The root cause was obvious in hindsight, but it also raised errors on two unrelated migration scripts that have implicit relationships. After changing the two migration scripts I realized that the migration was failing because the model being benchmarked was out of sync with the underlying table. The script works this way: 1. Identify models 2. Downgrade DB 3. Add instances of the models 4. Upgrade DB 5. Calculate duration The problem is that it fails with a migration that adds a column. Imagine a migration that adds a column `b` to a model that has only a column `a`. This would happen: 1. Identify models: found 1 model with 2 columns, `a` and `b`. 2. Downgrade DB: table has only column `a`. 3. Add instances of the model — this fails because the model has columns `a` and `b`! The fix is easy: 1. Downgrade DB 2. [Build models from table schema](https://docs.sqlalchemy.org/en/14/orm/extensions/automap.html) instead of reusing models from the codebase 3. Add instances of the models 4. Upgrade DB 5. Calculate duration Now it works. I also fixed a couple minor things needed to get the migration working with Postgres UUIDs and encrypted columns from the `dbs` table. ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF <!--- Skip this if not applicable --> N/A ### TESTING INSTRUCTIONS <!--- Required! What steps can be taken to manually verify the changes? --> Tested with the script that was failing: ```bash (superset) [beto] ~/Projects/superset [fix_migration_benchmark_script] M % python scripts/benchmark_migration.py superset/migrations/versions/453530256cea_add_save_option_column_to_db_model.py Loaded your LOCAL configuration at [/Users/beto/Projects/superset/superset_config.py] logging was configured successfully 2021-06-07 21:10:40,176:INFO:superset.utils.logging_configurator:logging was configured successfully 2021-06-07 21:10:40,183:INFO:root:Configured event logger of type <class 'superset.utils.log.DBEventLogger'> /Users/beto/.pyenv/versions/superset/lib/python3.8/site-packages/flask_caching/__init__.py:191: UserWarning: Flask-Caching: CACHE_TYPE is set to null, caching is effectively disabled. warnings.warn( Importing migration script: superset/migrations/versions/453530256cea_add_save_option_column_to_db_model.py Migration goes from f1410ed7ec95 to 453530256cea Current version of the DB is 453530256cea Running benchmark will downgrade the Superset DB to f1410ed7ec95 and upgrade to 453530256cea again. There may be data loss in downgrades. Continue? [y/N]: y INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running downgrade 453530256cea -> f1410ed7ec95, add_save_form_column_to_db_model Identifying models used in the migration: - dbs (0 rows in table dbs) Benchmarking migration INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running upgrade f1410ed7ec95 -> 453530256cea, add_save_form_column_to_db_model Migration on current DB took: 0.20 seconds INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running downgrade 453530256cea -> f1410ed7ec95, add_save_form_column_to_db_model Running with at least 10 entities of each model - Adding 10 entities to the dbs model INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running upgrade f1410ed7ec95 -> 453530256cea, add_save_form_column_to_db_model Migration for 10+ entities took: 0.35 seconds INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running downgrade 453530256cea -> f1410ed7ec95, add_save_form_column_to_db_model Running with at least 100 entities of each model - Adding 90 entities to the dbs model INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running upgrade f1410ed7ec95 -> 453530256cea, add_save_form_column_to_db_model Migration for 100+ entities took: 0.22 seconds INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running downgrade 453530256cea -> f1410ed7ec95, add_save_form_column_to_db_model Running with at least 1000 entities of each model - Adding 900 entities to the dbs model INFO [alembic.runtime.migration] Context impl PostgresqlImpl. INFO [alembic.runtime.migration] Will assume transactional DDL. INFO [alembic.runtime.migration] Running upgrade f1410ed7ec95 -> 453530256cea, add_save_form_column_to_db_model Migration for 1000+ entities took: 0.19 seconds Cleaning up DB Results: Current: 0.20 s 10+: 0.35 s 100+: 0.22 s 1000+: 0.19 s ``` Since I also modified 2 migrations, I ran them on a fresh install with `superset db upgrade` to make sure nothing broke. ### ADDITIONAL INFORMATION <!--- Check any relevant boxes with "x" --> <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue --> - [ ] Has associated issue: - [ ] Changes UI - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351)) - [ ] Migration is atomic, supports rollback & is backwards-compatible - [ ] Confirm DB migration upgrade and downgrade tested - [ ] Runtime estimates and downtime expectations provided - [ ] Introduces new feature or API - [ ] Removes existing feature or API -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
