ktmud edited a comment on pull request #19421: URL: https://github.com/apache/superset/pull/19421#issuecomment-1086218951
@eschutho I propose to change current migration to no-op and move my updated code to a new migration. I DM'ed @betodealmeida and @hughhhh earlier on Slack. Reposting the messages here for visibility: --- Hi, I noticed we are making more adjustments to SIP-68 models and have prepared a [couple](https://github.com/apache/superset/pull/19425) of [more](https://github.com/apache/superset/pull/19487) db migrations. I’m wondering whether we should bundle all these migrations (including the first one that’s already merged) into one new migration and change the original migration to no-op. **Pros:** - Reduced total migration time: bundle everything should be faster than running them separately - We get a chance to fix a couple of more errors such as [using MediumText for MySQL](https://github.com/apache/superset/pull/19421#discussion_r839942807) and [incorrect additive_metric_types matching](https://github.com/apache/superset/pull/19421#discussion_r839903477) - We get a chance to copy over other missing data such as [changed on and last updated](https://github.com/apache/superset/pull/19421#discussion_r840089807) - We can re-ID the copied entities to follow the original ones, making it easier to spot-check potential data inconsistency bugs down the road - Everyone’s db is in a clean and consistent state - It's easier to review the db structure in the future **Cons:** - Those who already ran the migration and bore the slowness may have to experience it again Happy to incorporate [#19487](https://github.com/apache/superset/pull/19487/) and [#19425](https://github.com/apache/superset/pull/19425) to [my PR](https://github.com/apache/superset/pull/19421) if they are still needed. Btw, I think the `Dataset` model may need a `database_id` column as well. There is the implicit assumption that a dataset can only run on one database. I cannot imagine a case where we need to support a virtual dataset being used on different tables in different databases. Having direct link to databases makes sure existing virtual datasets can be linked to the correct database without relying on an unreliable table name extraction process. Currently if table name extraction fails, a virtual dataset lost its association with a correct table, hence the only link to database. It would require joining `SqlaTable` with `sqlatable_id` to get the correct database id. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org For additional commands, e-mail: notifications-h...@superset.apache.org