hughhhh commented on a change in pull request #16859:
URL: https://github.com/apache/superset/pull/16859#discussion_r717015572



##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
       The issue started happening once this PR landed: #15909
   
    From my understanding, the whole point of the uniqueness check is to make sure we don't end up with the same name on two datasets that may be updated from the Explore view.
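   
    For context, here's a self-contained sketch of the pattern this hook implements (paraphrased, not the actual Superset code; `validate_uniqueness` below is a stand-in for `DatasetDAO.validate_uniqueness`): a `before_update` listener that refuses the flush when another dataset already uses the same `(database_id, schema, table_name)` triple.
   ```
   # Paraphrased sketch (not the actual Superset code) of a before_update
   # listener that enforces dataset-name uniqueness at flush time.
   import sqlalchemy as sa
   from sqlalchemy.orm import declarative_base

   Base = declarative_base()

   class SqlaTable(Base):  # minimal stand-in for Superset's SqlaTable model
       __tablename__ = "tables"
       id = sa.Column(sa.Integer, primary_key=True)
       database_id = sa.Column(sa.Integer, nullable=False)
       schema = sa.Column(sa.String(255))
       table_name = sa.Column(sa.String(250), nullable=False)

   @sa.event.listens_for(SqlaTable, "before_update")
   def before_update(mapper, connection, target):
       # Stand-in for DatasetDAO.validate_uniqueness: is there *another*
       # dataset with the same database / schema / table_name?
       duplicate = sa.select(SqlaTable.id).where(
           SqlaTable.database_id == target.database_id,
           SqlaTable.schema == target.schema,
           SqlaTable.table_name == target.table_name,
           SqlaTable.id != target.id,
       )
       if connection.execute(duplicate).first() is not None:
           raise ValueError(f"Dataset {target.table_name} already exists")
   ```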

##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
      Here's an example of the payload that is failing whenever I try to overwrite the dataset:
   ```
   {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 
0x124877c40>,
    'cache_timeout': None,
    'changed_by_fk': 1,
    'changed_on': datetime.datetime(2021, 9, 27, 14, 49, 25, 830168),
    'created_by_fk': 1,
    'created_on': datetime.datetime(2021, 9, 27, 13, 55, 20, 78105),
    'database_id': 1,
    'default_endpoint': None,
    'description': None,
    'extra': None,
    'fetch_values_predicate': None,
    'filter_select_enabled': False,
    'id': 29,
    'is_featured': False,
    'is_sqllab_view': True,
    'main_dttm_col': None,
    'offset': 0,
    'params': None,
    'perm': '[examples].[hmiles.test_dataset](id:29)',
    'schema': 'public',
    'schema_perm': '[examples].[public]',
    'sql': 'select * from flights',
    'table_name': 'hmiles.test_dataset',
    'template_params': None,
    'uuid': UUID('dfeaabe4-838e-4379-8536-307a3c95988b')}
   
   ```
   
   Here's a successful payload example:
   ```
   {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 
0x122a126a0>,
    'cache_timeout': None,
    'changed_by_fk': 1,
    'changed_on': datetime.datetime(2021, 9, 27, 14, 53, 30, 837841),
    'columns': [YEAR,
                MONTH,
                DAY,
                DAY_OF_WEEK,
                AIRLINE,
                FLIGHT_NUMBER,
                TAIL_NUMBER,
                ORIGIN_AIRPORT,
                DESTINATION_AIRPORT,
                SCHEDULED_DEPARTURE,
                DEPARTURE_TIME,
                DEPARTURE_DELAY,
                TAXI_OUT,
                WHEELS_OFF,
                SCHEDULED_TIME,
                ELAPSED_TIME,
                AIR_TIME,
                DISTANCE,
                WHEELS_ON,
                TAXI_IN,
                SCHEDULED_ARRIVAL,
                ARRIVAL_TIME,
                ARRIVAL_DELAY,
                DIVERTED,
                CANCELLED,
                CANCELLATION_REASON,
                AIR_SYSTEM_DELAY,
                SECURITY_DELAY,
                AIRLINE_DELAY,
                LATE_AIRCRAFT_DELAY,
                WEATHER_DELAY,
                ds,
                AIRPORT,
                CITY,
                STATE,
                COUNTRY,
                LATITUDE,
                LONGITUDE,
                AIRPORT_DEST,
                CITY_DEST,
                STATE_DEST,
                COUNTRY_DEST,
                LATITUDE_DEST,
                LONGITUDE_DEST],
    'created_by_fk': 1,
    'created_on': datetime.datetime(2021, 9, 27, 13, 55, 20, 78105),
    'database': examples,
    'database_id': 1,
    'default_endpoint': None,
    'description': None,
    'extra': None,
    'fetch_values_predicate': None,
    'filter_select_enabled': False,
    'id': 29,
    'is_featured': False,
    'is_sqllab_view': True,
    'main_dttm_col': None,
    'offset': 0,
    'override_columns': True,
    'params': None,
    'perm': '[examples].[hmiles.test_dataset](id:29)',
    'schema': 'public',
    'schema_perm': '[examples].[public]',
    'sql': 'select * from flights',
    'table_name': 'hmiles.test_dataset',
    'template_params': None,
    'uuid': UUID('dfeaabe4-838e-4379-8536-307a3c95988b')}
   ```
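   
   Comparing the two dumps, the failing payload has no `columns` entry populated on the instance, which looks like what the `hasattr(target, "columns")` guard in this diff keys off. A more explicit way to test that (hypothetical, illustrative only) would be SQLAlchemy's inspection API, which reports which attributes are actually loaded without triggering a lazy load:
   ```
   # Illustrative only: check whether the `columns` relationship is actually
   # populated on the instance; `target` is the SqlaTable instance passed to
   # the before_update listener.
   from sqlalchemy import inspect

   state = inspect(target)
   columns_populated = "columns" in state.dict         # present in instance dict
   columns_never_loaded = "columns" in state.unloaded  # not loaded / expired
   ```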

##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
      So I looked into the instance `state` whenever we hit the failing branch of the if condition, and we're getting the following state:
   ```
   {'_instance_dict': <weakref at 0x12dccc1d0; to 'WeakInstanceDict' at 
0x12dc9f2e0>,
    '_strong_obj': public.hmiles.test_dataset,
    'class_': <class 'superset.connectors.sqla.models.SqlaTable'>,
    'committed_state': {'database_id': symbol('NO_VALUE')},
    'expired': False,
    'expired_attributes': set(),
    'identity_token': None,
    'key': (<class 'superset.connectors.sqla.models.SqlaTable'>, (29,), None),
    'load_options': set(),
    'load_path': CachingEntityRegistry((<Mapper at 0x12ccec700; SqlaTable>,)),
    'manager': {'cache_timeout': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfa680>,
                'changed_by': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa040>,
                'changed_by_fk': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfaf40>,
                'changed_on': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa220>,
                'columns': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12d0da770>,
                'created_by': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccf0ef0>,
                'created_by_fk': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfaea0>,
                'created_on': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa180>,
                'database': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccf0d10>,
                'database_id': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfaa40>,
                'default_endpoint': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfa400>,
                'description': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa360>,
                'extra': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfae00>,
                'fetch_values_predicate': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfaae0>,
                'filter_select_enabled': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfa540>,
                'id': <sqlalchemy.orm.attributes.InstrumentedAttribute object 
at 0x12ccfa2c0>,
                'is_featured': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa4a0>,
                'is_sqllab_view': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfacc0>,
                'main_dttm_col': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfa9a0>,
                'metrics': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12d0f64a0>,
                'offset': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa5e0>,
                'owners': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccf0c20>,
                'params': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa720>,
                'perm': <sqlalchemy.orm.attributes.InstrumentedAttribute object 
at 0x12ccfa7c0>,
                'row_level_security_filters': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12d147130>,
                'schema': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfab80>,
                'schema_perm': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa860>,
                'slices': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccf0e00>,
                'sql': <sqlalchemy.orm.attributes.InstrumentedAttribute object 
at 0x12ccfac20>,
                'table_name': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x12ccfa900>,
                'template_params': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x12ccfad60>,
                'uuid': <sqlalchemy.orm.attributes.InstrumentedAttribute object 
at 0x12ccfa0e0>},
    'modified': True,
    'obj': <weakref at 0x12dd36950; to 'SqlaTable' at 0x12dd37790>,
    'parents': {},
    'runid': 2037,
    'session_id': 2}
   ```
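   
   (In case anyone wants to reproduce this: the dump above is just the SQLAlchemy instance state, and a minimal way it could have been pulled, assuming `target` is the SqlaTable instance in `before_update`, is shown below.)
   ```
   # Minimal sketch of how the state above can be dumped for debugging;
   # `target` is the SqlaTable instance handed to the before_update listener.
   from pprint import pprint
   from sqlalchemy import inspect

   state = inspect(target)          # sqlalchemy.orm.state.InstanceState
   pprint(vars(state))              # the dict dumped above
   pprint(state.committed_state)    # e.g. {'database_id': symbol('NO_VALUE')}
   ```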

##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
      Realizing that the committed state is returning `NO_VALUE` for `database_id`, even though there is definitely a value for `database_id` on the target:
   ```
   {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 
0x12dd37760>,
    'cache_timeout': None,
    'changed_by_fk': 1,
    'changed_on': datetime.datetime(2021, 9, 27, 17, 3, 16, 324223),
    'columns': [],
    'created_by_fk': 1,
    'created_on': datetime.datetime(2021, 9, 27, 13, 55, 20, 78105),
    'database_id': 1, # <------
    'default_endpoint': None,
    'description': None,
    'extra': None,
    'fetch_values_predicate': None,
    'filter_select_enabled': False,
    'id': 29,
    'is_featured': False,
    'is_sqllab_view': True,
    'main_dttm_col': None,
    'offset': 0,
    'params': None,
    'perm': '[examples].[hmiles.test_dataset](id:29)',
    'schema': 'public',
    'schema_perm': '[examples].[public]',
    'sql': 'select * from flights',
    'table_name': 'hmiles.test_dataset',
    'template_params': None,
    'uuid': UUID('dfeaabe4-838e-4379-8536-307a3c95988b')}
   ```
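   
   If I'm reading SQLAlchemy's bookkeeping right, `committed_state` holds the last loaded/flushed value for each modified attribute, so `NO_VALUE` here would mean `database_id` was set on the instance without ever having been loaded first: there is a current value on the target, but no committed baseline to compare it against. The attribute history makes that visible (rough sketch; printed values are illustrative):
   ```
   # Rough sketch: attribute history separates the current value from the
   # committed baseline, which is where NO_VALUE shows up. `target` is the
   # SqlaTable instance in before_update.
   from sqlalchemy.orm.attributes import get_history

   hist = get_history(target, "database_id")
   print(hist.added)      # e.g. [1]  -> value currently set on the instance
   print(hist.unchanged)  # e.g. ()   -> nothing matches a committed value
   print(hist.deleted)    # e.g. ()   -> no prior loaded value being replaced
   ```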

##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
      We shouldn't, since it is a foreign key, plus these objects have already been created given that we are just doing an update:
   
https://github.com/apache/superset/blob/f6c30fcb8d1a3fc4a02a81964b13582d86ad1208/superset/connectors/sqla/models.py?plain=1#L491
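   
   (For reference, the linked line declares `database_id` roughly as the non-nullable foreign key below; paraphrased from memory, so double-check the source.)
   ```
   from sqlalchemy import Column, ForeignKey, Integer

   # Paraphrased from the linked line: database_id is a non-nullable FK into
   # the databases ("dbs") table, which is why we can't just drop it.
   database_id = Column(Integer, ForeignKey("dbs.id"), nullable=False)
   ```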

##########
File path: superset/connectors/sqla/models.py
##########
@@ -1673,7 +1673,7 @@ def before_update(
 
         if not DatasetDAO.validate_uniqueness(
             target.database_id, target.schema, target.table_name
-        ):
+        ) and hasattr(target, "columns"):

Review comment:
       I feel like it might just be an issue with the event handler, because 
the error is very intermittent.
   
   One possible solution is to remove `database_id` from the attr array, and then, only if the `table_name` or `schema` has changed, run the uniqueness check and throw the error when it fails (rough sketch below).
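   
   Something along these lines (hypothetical sketch of that idea, using SQLAlchemy's attribute history to detect the change; not tested against the real handler):
   ```
   # Hypothetical sketch of the proposal: only run the uniqueness check when
   # table_name or schema actually changed in this update, rather than keying
   # on database_id at all. DatasetDAO is Superset's existing DAO.
   from sqlalchemy.orm.attributes import get_history

   def before_update(mapper, connection, target):
       name_or_schema_changed = any(
           get_history(target, attr).has_changes()
           for attr in ("table_name", "schema")
       )
       if name_or_schema_changed and not DatasetDAO.validate_uniqueness(
           target.database_id, target.schema, target.table_name
       ):
           # raise whatever exception the handler currently raises
           raise Exception(f"Dataset {target.table_name} already exists")
   ```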
   
   Thoughts?



