ktmud opened a new pull request #19421:
URL: https://github.com/apache/superset/pull/19421


   ### SUMMARY
   
   This is another take on #19406 and #19416. The goal is to move the bulk 
of the loading and rewriting operations from Python to native SQL statements by 
using `INSERT SELECT FROM`.
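
As a rough illustration of the idea (not the code in this PR), the sketch below copies columns and metrics into a new table with one `INSERT ... SELECT` statement per source table, so no rows pass through Python. It uses an in-memory SQLite database. The schemas are heavily simplified stand-ins, and the `is_metric` flag is a hypothetical column added only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, assumed stand-ins for the old tables.
    CREATE TABLE table_columns (
        id INTEGER PRIMARY KEY, table_id INTEGER,
        column_name TEXT, expression TEXT
    );
    CREATE TABLE sql_metrics (
        id INTEGER PRIMARY KEY, table_id INTEGER,
        metric_name TEXT, expression TEXT
    );
    -- Simplified stand-in for the new table; is_metric is hypothetical.
    CREATE TABLE sl_columns (
        id INTEGER PRIMARY KEY, name TEXT,
        expression TEXT, is_metric INTEGER
    );
    """
)
conn.executemany(
    "INSERT INTO table_columns (table_id, column_name, expression) VALUES (?, ?, ?)",
    [(1, "ds", None), (1, "revenue", None)],
)
conn.executemany(
    "INSERT INTO sql_metrics (table_id, metric_name, expression) VALUES (?, ?, ?)",
    [(1, "count", "COUNT(*)")],
)

# Each statement runs entirely inside the database engine.
conn.execute(
    """
    INSERT INTO sl_columns (name, expression, is_metric)
    SELECT column_name, COALESCE(expression, column_name), 0
    FROM table_columns ORDER BY id
    """
)
conn.execute(
    """
    INSERT INTO sl_columns (name, expression, is_metric)
    SELECT metric_name, expression, 1
    FROM sql_metrics ORDER BY id
    """
)
conn.commit()

copied = conn.execute(
    "SELECT name, expression, is_metric FROM sl_columns ORDER BY id"
).fetchall()
```

Compared with loading ORM objects and writing them back one at a time, this keeps the per-row work inside the database, which is the property the approach relies on.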
   
   There is still a lot of work to do, but this route seems promising. Loading millions 
of columns + metrics took only 20 seconds on my test box. I'd imagine the other 
operations are not as expensive.
   
   The whole migration happens in the following steps:
   
   - [ ] Copy columns and metrics to the new `sl_columns` table
   - [ ] Tuck additional metadata columns (`verbose_name`, etc.) under 
`extra_json`
   - [ ] Copy `SqlaTable` to the new `sl_datasets` and `sl_tables` tables
   - [ ] Copy the relationship tables
      - [ ] table + columns       | via SQL joins
      - [ ] dataset + columns   | via SQL joins
      - [ ] dataset + tables       | via SQL parse
   - [ ] Apply dataset level `is_managed_externally` and `external_url` to 
columns
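
One possible shape for the "via SQL joins" steps above is sketched here, again with in-memory SQLite and simplified schemas. It assumes the new rows keep a reference to the old row they were copied from. The `old_table_id` and `old_column_id` columns and the `sl_table_columns` layout are assumptions made for this example, not necessarily what the migration uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Old model: each column row points directly at its table.
    CREATE TABLE table_columns (
        id INTEGER PRIMARY KEY, table_id INTEGER, column_name TEXT
    );
    -- New model (simplified): entities plus an association table.
    -- old_table_id / old_column_id are hypothetical back-references.
    CREATE TABLE sl_tables (
        id INTEGER PRIMARY KEY, name TEXT, old_table_id INTEGER
    );
    CREATE TABLE sl_columns (
        id INTEGER PRIMARY KEY, name TEXT, old_column_id INTEGER
    );
    CREATE TABLE sl_table_columns (table_id INTEGER, column_id INTEGER);

    INSERT INTO table_columns (id, table_id, column_name)
    VALUES (10, 7, 'ds'), (11, 7, 'revenue'), (12, 8, 'user_id');
    INSERT INTO sl_tables (id, name, old_table_id)
    VALUES (1, 'sales', 7), (2, 'users', 8);
    INSERT INTO sl_columns (id, name, old_column_id)
    VALUES (100, 'ds', 10), (101, 'revenue', 11), (102, 'user_id', 12);
    """
)

# Build every table-to-column link in a single statement by joining the
# old rows to the new rows that were copied from them.
conn.execute(
    """
    INSERT INTO sl_table_columns (table_id, column_id)
    SELECT t.id, c.id
    FROM table_columns AS old
    JOIN sl_columns AS c ON c.old_column_id = old.id
    JOIN sl_tables AS t ON t.old_table_id = old.table_id
    """
)
conn.commit()

links = sorted(
    conn.execute("SELECT table_id, column_id FROM sl_table_columns").fetchall()
)
```

The dataset + tables relationship is listed as "via SQL parse", so it presumably cannot be derived with a join alone and is not covered by this sketch.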
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   <!--- Skip this if not applicable -->
   
   ### TESTING INSTRUCTIONS
   <!--- Required! What steps can be taken to manually verify the changes? -->
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in 
[SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


