john-bodley opened a new pull request #5449:
URL: https://github.com/apache/superset/pull/5449


   We've noticed a number of anomalies in our database caused by ill-defined 
forms and/or table schema definitions. This PR resolves a number of issues 
related to the `tables` table including:
   - Modifies the uniqueness logic defined by PR 
https://github.com/apache/incubator-superset/pull/3978 (which was never created 
in a migration) to include the schema name (previously it was only based on the 
database/table). This is now handled by a SQLAlchemy event listener as many 
dialects permit multiple NULL values for columns than can contain NULL.
   - Removes the pre-add for checking for uniqueness. This is no longer 
required as the uniqueness constraint defines the same logic. This also only 
worked when adding a table and thus it was possible to have duplicates names if 
someone edited the record.
   - Adds form validation to ensure that the schema name is not part of the 
table name. 
   - Ensures the `table_name` column is non-nullable.
   
   ![Screen Shot 2019-03-22 at 5 09 58 
PM](https://user-images.githubusercontent.com/4567245/54858973-63508380-4cc5-11e9-9c37-0c25e2b94500.png)
   
   Note this migration will fail if the `table_name` column is NULL. One must 
manually fix these records as programmatically trying to remedy these invalid 
records is difficult as the intent is unclear and the tables may function (from 
a query standpoint) if SQL is provided. The following query determines which 
records are problematic:
   ```
   SELECT 
       *
   FROM 
      tables
   WHERE 
      table_name IS NULL
   ```
   
   
   Finally this migration _will_ fail if the `tables` table is corrupt  in 
terms of the uniqueness. One **must** manually consolidate duplicate or 
non-valid records given there's no way of programmatically removing invalid 
records. The following query determines whether there are duplicates:
   ```
   SELECT 
       database_id,
       `schema`, 
       table_name,
       COUNT(1)
   FROM 
       tables
   GROUP BY 
       1, 2, 3
   HAVING 
       COUNT(1) > 1
   ORDER BY 
       COUNT(1) DESC
   ```
   
   Note this PR is gated by 
https://github.com/apache/incubator-superset/pull/5445 and 
https://github.com/apache/incubator-superset/pull/7084 which ensure that empty 
strings associated with form-data wont persist in the database and is necessary 
for ensuring that the relevant entries are non-NULL.
   
   to: @fabianmenges @graceguo-supercat @michellethomas @mistercrunch 
@timifasubaa 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to