sadpandajoe commented on code in PR #37557:
URL: https://github.com/apache/superset/pull/37557#discussion_r2794902617


##########
tests/unit_tests/databases/commands/importers/v1/import_test.py:
##########
@@ -120,6 +120,39 @@ def test_import_database_sqlite_invalid(
     current_app.config["PREVENT_UNSAFE_DB_CONNECTIONS"] = True
 
 
+def test_import_database_sqlite_allowed_with_ignore_permissions(
+    mocker: MockerFixture, session: Session
+) -> None:
+    """
+    Test that SQLite imports succeed when ignore_permissions=True.
+
+    System imports (like examples) use URIs from server config, not user input,
+    so they should bypass the PREVENT_UNSAFE_DB_CONNECTIONS check. This is the
+    key fix from PR #37577 that allows example loading to work in CI/showtime
+    environments where PREVENT_UNSAFE_DB_CONNECTIONS is enabled.
+    """
+    from superset.commands.database.importers.v1.utils import import_database
+    from superset.models.core import Database
+    from tests.integration_tests.fixtures.importexport import database_config_sqlite
+
+    current_app.config["PREVENT_UNSAFE_DB_CONNECTIONS"] = True
+    mocker.patch("superset.commands.database.importers.v1.utils.add_permissions")
+
+    engine = db.session.get_bind()
+    Database.metadata.create_all(engine)  # pylint: disable=no-member
+
+    config = copy.deepcopy(database_config_sqlite)
+    # With ignore_permissions=True, the security check should be skipped
+    database = import_database(config, ignore_permissions=True)
+
+    assert database.database_name == "imported_database"
+    assert "sqlite" in database.sqlalchemy_uri
+
+    # Cleanup
+    db.session.delete(database)
+    db.session.flush()

Review Comment:
   Fixed in 80d94386a1: switched to `mocker.patch.dict(current_app.config, ...)`, which auto-restores the original value after the test.
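
   For reference, pytest-mock's `mocker.patch.dict` is a thin wrapper around `unittest.mock.patch.dict`, which snapshots the mapping and restores it on exit. A minimal sketch of the auto-restore behavior, using a plain dict in place of the real `current_app.config`:

   ```python
   # Sketch of patch.dict auto-restore; mocker.patch.dict delegates to
   # unittest.mock.patch.dict under the hood.
   from unittest.mock import patch

   config = {"PREVENT_UNSAFE_DB_CONNECTIONS": False}

   with patch.dict(config, {"PREVENT_UNSAFE_DB_CONNECTIONS": True}):
       # The override is visible only inside the patched block.
       assert config["PREVENT_UNSAFE_DB_CONNECTIONS"] is True

   # On exit the original value is restored, so no manual cleanup is needed.
   assert config["PREVENT_UNSAFE_DB_CONNECTIONS"] is False
   ```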



##########
superset/examples/generic_loader.py:
##########
@@ -34,6 +34,45 @@
 logger = logging.getLogger(__name__)
 
 
+def _find_dataset(
+    table_name: str,
+    database_id: int,
+    uuid: Optional[str] = None,
+    schema: Optional[str] = None,
+) -> tuple[Optional[SqlaTable], bool]:
+    """Find a dataset by UUID first, then fall back to table_name + database_id.
+
+    Includes schema in the fallback lookup to prevent cross-schema collisions.
+
+    This avoids unique constraint violations when a duplicate row exists.
+
+    Args:
+        table_name: The table name to look up
+        database_id: The database ID
+        uuid: Optional UUID to look up first
+        schema: Optional schema to include in fallback lookup
+
+    Returns:
+        A tuple of (dataset or None, found_by_uuid bool)
+    """
+    tbl = None
+    found_by_uuid = False
+
+    if uuid:
+        tbl = db.session.query(SqlaTable).filter_by(uuid=uuid).first()
+        if tbl:
+            found_by_uuid = True
+
+    if not tbl:
+        tbl = (
+            db.session.query(SqlaTable)
+            .filter_by(table_name=table_name, database_id=database_id, schema=schema)

Review Comment:
   Not applicable — this PR was rebased to a test-only scope. The source code 
changes to `generic_loader.py` are no longer part of this PR (the production 
fix was merged in PR #37577).
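
   For readers following the thread, the UUID-first lookup with a name-plus-database fallback can be sketched without SQLAlchemy. The names below are illustrative, not the merged code from PR #37577:

   ```python
   # Illustrative sketch of the UUID-first lookup pattern, using an in-memory
   # list in place of a db.session query. Dataset and find_dataset are
   # hypothetical stand-ins for SqlaTable and _find_dataset.
   from dataclasses import dataclass
   from typing import Optional

   @dataclass
   class Dataset:
       table_name: str
       database_id: int
       uuid: Optional[str] = None
       schema: Optional[str] = None

   def find_dataset(datasets, table_name, database_id, uuid=None, schema=None):
       """Return (dataset or None, found_by_uuid)."""
       # First pass: exact UUID match, the stable identifier.
       if uuid:
           for d in datasets:
               if d.uuid == uuid:
                   return d, True
       # Fallback: table_name + database_id + schema, so datasets in
       # different schemas do not collide.
       for d in datasets:
           if (
               d.table_name == table_name
               and d.database_id == database_id
               and d.schema == schema
           ):
               return d, False
       return None, False

   datasets = [Dataset("t1", 1, uuid="u1", schema="s"), Dataset("t2", 1)]
   assert find_dataset(datasets, "t1", 1, uuid="u1", schema="s") == (datasets[0], True)
   assert find_dataset(datasets, "t2", 1) == (datasets[1], False)
   assert find_dataset(datasets, "missing", 1) == (None, False)
   ```

   The `found_by_uuid` flag lets the caller backfill a missing UUID only when the match came from the name-based fallback, which is the behavior shown in the hunk below.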



##########
superset/examples/generic_loader.py:
##########
@@ -90,12 +131,20 @@ def load_parquet_table(  # noqa: C901
     table_exists = database.has_table(Table(table_name, schema=schema))
     if table_exists and not force:
         logger.info("Table %s already exists, skipping data load", table_name)
-        tbl = (
-            db.session.query(SqlaTable)
-            .filter_by(table_name=table_name, database_id=database.id)
-            .first()
-        )
+        tbl, found_by_uuid = _find_dataset(table_name, database.id, uuid, schema)
         if tbl:
+            needs_update = False
+            # Backfill UUID if found by table_name (not UUID) and UUID not set
+            if uuid and not tbl.uuid and not found_by_uuid:
+                tbl.uuid = uuid
+                needs_update = True
+            # Backfill schema if existing table has no schema set
+            if schema and not tbl.schema:
+                tbl.schema = schema
+                needs_update = True

Review Comment:
   Not applicable — this PR was rebased to a test-only scope. The source code 
changes to `generic_loader.py` are no longer part of this PR (the production 
fix was merged in PR #37577).



##########
superset/examples/generic_loader.py:
##########
@@ -163,17 +212,21 @@ def safe_serialize(x: Any, column_name: str) -> Optional[str]:
 
         logger.info("Loaded %d rows into %s", len(pdf), table_name)
 
-    # Create or update SqlaTable metadata
-    tbl = (
-        db.session.query(SqlaTable)
-        .filter_by(table_name=table_name, database_id=database.id)
-        .first()
-    )
+    # Create or update SqlaTable metadata using UUID-first lookup
+    tbl, found_by_uuid = _find_dataset(table_name, database.id, uuid, schema)
 
     if not tbl:
         tbl = SqlaTable(table_name=table_name, database_id=database.id)

Review Comment:
   Not applicable — this PR was rebased to a test-only scope. The source code 
changes to `generic_loader.py` are no longer part of this PR (the production 
fix was merged in PR #37577).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

