codeant-ai-for-open-source[bot] commented on code in PR #37582:
URL: https://github.com/apache/superset/pull/37582#discussion_r2747421181
##########
superset/extensions/__init__.py:
##########
@@ -45,11 +45,15 @@ def get_sqla_class() -> Any:
from superset.extensions.stats_logger import BaseStatsLoggerManager
from superset.security.manager import SupersetSecurityManager
from superset.utils.cache_manager import CacheManager
+from superset.utils.database import apply_mariadb_ddl_fix
from superset.utils.encrypt import EncryptedFieldFactory
from superset.utils.feature_flag_manager import FeatureFlagManager
from superset.utils.machine_auth import MachineAuthProviderFactory
from superset.utils.profiler import SupersetProfiler
+# Apply MariaDB DDL fix early in the import chain
+apply_mariadb_ddl_fix()
Review Comment:
**Suggestion:** Import-time side-effect: calling `apply_mariadb_ddl_fix()`
at module import is unguarded and can raise any exception (NameError,
ImportError, runtime errors from the patched function), which would abort
importing this module and crash app startup; wrap the call to avoid crashing
the whole process and log failures instead. [possible bug]
<details>
<summary><b>Severity Level:</b> Major ⚠️</summary>
```mdx
- ❌ App startup failure during imports
(superset/extensions/__init__.py:54-55)
- ❌ 'superset db upgrade' startup sequence may abort
- ⚠️ Patch logging missing makes diagnosis harder
```
</details>
```suggestion
import logging
from superset.utils.database import apply_mariadb_ddl_fix
# Apply MariaDB DDL fix early in the import chain
try:
apply_mariadb_ddl_fix()
except Exception:
logging.exception("Applying MariaDB DDL fix failed; continuing without
patch")
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open a Python REPL or start the Superset process that imports the package
(e.g. run
`python -c "import superset.extensions"`). The import executes
`superset/extensions/__init__.py` and reaches the call site at
`superset/extensions/__init__.py:54-55` which invokes
`apply_mariadb_ddl_fix()`.
2. `apply_mariadb_ddl_fix()` is defined in `superset/utils/database.py`
(function
beginning at `superset/utils/database.py:90`). When called, it directly
accesses
SQLAlchemy compiler internals (e.g., `compiler.DDLCompiler`) inside that
function.
3. If `apply_mariadb_ddl_fix()` raises any exception (NameError,
ImportError, or runtime
error inside the patch) during that import, the exception propagates out of
`superset/extensions/__init__.py:55` and aborts the import, causing the
process to fail to
start.
4. Observe import failure stack trace pointing to
`superset/extensions/__init__.py:55` and
`superset/utils/database.py:90-105`. Wrapping the call prevents an
import-time crash and
logs the failure instead.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/extensions/__init__.py
**Line:** 48:55
**Comment:**
*Possible Bug: Import-time side-effect: calling
`apply_mariadb_ddl_fix()` at module import is unguarded and can raise any
exception (NameError, ImportError, runtime errors from the patched function),
which would abort importing this module and crash app startup; wrap the call to
avoid crashing the whole process and log failures instead.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
superset/utils/database.py:
##########
@@ -84,3 +85,21 @@ def remove_database(database: Database) -> None:
db.session.delete(database)
db.session.flush()
+
+
+def apply_mariadb_ddl_fix() -> None:
+ """
+ Fix MariaDB "NO CYCLE" syntax issue - MariaDB uses "NOCYCLE" (no space).
+
+ This fix will be included in SQLAlchemy v2.1.0.
+ See:
https://github.com/sqlalchemy/sqlalchemy/blob/rel_2_1_0b1/lib/sqlalchemy/dialects/mysql/_mariadb_shim.py
+ """
+ original_visit_create_sequence = compiler.DDLCompiler.visit_create_sequence
+
+ def patched_visit_create_sequence(self: Any, create: Any, **kw: Any) ->
str:
+ text = original_visit_create_sequence(self, create, **kw)
+ if self.dialect.name == "mariadb":
Review Comment:
**Suggestion:** The dialect name check is too strict: comparing
`self.dialect.name == "mariadb"` will miss dialect names that include a driver
or prefix (for example "mysql+mariadb" or variants), causing the patch to not
run for some MariaDB connections; use a case-insensitive substring check (or
normalize the name) before applying the replacement. [logic error]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ 'superset db upgrade' migration SQL uses invalid syntax.
- ⚠️ Table/sequence creation fails for some MariaDB connections.
- ⚠️ Patch silently ineffective for certain engine names.
```
</details>
```suggestion
dialect_name = getattr(self.dialect, "name", "") or ""
if "mariadb" in dialect_name.lower():
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Open file superset/utils/database.py and locate apply_mariadb_ddl_fix()
(def at line
90, patched function body at lines 97-105). The equality check is
implemented at
superset/utils/database.py:101.
2. Apply the patch by calling the function directly (simulates how the PR
intends it to be
used): run from project root:
python -c "from superset.utils.database import apply_mariadb_ddl_fix;
apply_mariadb_ddl_fix()"
(this executes the closure defined at superset/utils/database.py:97-105).
3. Create an SQLAlchemy engine using a MariaDB URL that includes a driver
prefix, e.g.
"mysql+mariadb://user:pw@host/db", then compile a CreateSequence against
that engine:
- example (run in project environment): from sqlalchemy import
create_engine, Sequence;
from sqlalchemy.schema import CreateSequence
e = create_engine('mysql+mariadb://user:pw@localhost/test'); seq_sql =
str(CreateSequence(Sequence('s')).compile(dialect=e.dialect))
The patched visitor at superset/utils/database.py:99-103 will be invoked,
but
self.dialect.name for this engine is "mysql+mariadb" (not exactly
"mariadb"), so the
equality check at superset/utils/database.py:101 fails.
4. Observe generated SQL still contains "NO CYCLE" (not replaced by
"NOCYCLE"),
reproducing the original MariaDB DDL error when migrations run (e.g., during
"superset db
upgrade" when sequences are created). This demonstrates the strict equality
check misses
variant dialect names like "mysql+mariadb".
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** superset/utils/database.py
**Line:** 101:101
**Comment:**
*Logic Error: The dialect name check is too strict: comparing
`self.dialect.name == "mariadb"` will miss dialect names that include a driver
or prefix (for example "mysql+mariadb" or variants), causing the patch to not
run for some MariaDB connections; use a case-insensitive substring check (or
normalize the name) before applying the replacement.
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
##########
tests/unit_tests/utils/test_database.py:
##########
@@ -0,0 +1,48 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""Tests for superset.utils.database module."""
+
+from sqlalchemy import Sequence
+from sqlalchemy.dialects import mysql, postgresql
+from sqlalchemy.schema import CreateSequence
+from sqlalchemy.sql.compiler import DDLCompiler
+
+from superset.utils.database import apply_mariadb_ddl_fix
+
+apply_mariadb_ddl_fix()
Review Comment:
**Suggestion:** Calling `apply_mariadb_ddl_fix()` at module import time can
double-patch SQLAlchemy's `DDLCompiler.visit_create_sequence` when the same fix
is applied elsewhere in the import chain, causing infinite recursion or
test-suite-wide side effects; remove the module-level call from the test module
(invoke the fix within a scoped test or rely on package initialization
instead). [possible bug]
<details>
<summary><b>Severity Level:</b> Critical 🚨</summary>
```mdx
- ❌ RecursionError when DDLCompiler.visit_create_sequence called.
- ⚠️ Unit tests may fail unpredictably across test runs.
- ⚠️ Global test-suite state polluted by permanent patch.
```
</details>
```suggestion
# Do not apply the global MariaDB DDL fix at module import time in tests;
# the application code (extensions initialization) applies it. Tests that
# need the patch should apply it in a scoped manner to avoid double-patching.
```
<details>
<summary><b>Steps of Reproduction ✅ </b></summary>
```mdx
1. Run the unit test module `tests/unit_tests/utils/test_database.py`. The
test file
imports and executes the module-level call at
`tests/unit_tests/utils/test_database.py:26`
which calls `apply_mariadb_ddl_fix()` (defined in
`superset/utils/database.py`).
2. The function `apply_mariadb_ddl_fix()` is defined in
`superset/utils/database.py` (see
function body at `superset/utils/database.py:90-105`). The first call
captures
`original_visit_create_sequence =
compiler.DDLCompiler.visit_create_sequence` and then
replaces `compiler.DDLCompiler.visit_create_sequence` with the patched
implementation.
3. If `apply_mariadb_ddl_fix()` is invoked again (for example, by application
initialization or another test), the `original_visit_create_sequence`
captured will be the
already-patched function. When `DDLCompiler.visit_create_sequence` is later
invoked (as
done in this test at `tests/unit_tests/utils/test_database.py:33-36`), the
patched
function calls `original_visit_create_sequence(...)` which refers back to
the patched
function, causing infinite recursion (RecursionError) at runtime.
4. Reproduce locally: (a) start Python REPL or test run; (b) import and call
`apply_mariadb_ddl_fix()` twice using `from superset.utils.database import
apply_mariadb_ddl_fix` (function defined at
`superset/utils/database.py:90-105`); (c)
create a `DDLCompiler` (as in `tests/unit_tests/utils/test_database.py:33`)
and call
`visit_create_sequence(...)` (as in
`tests/unit_tests/utils/test_database.py:36`) to
observe RecursionError. The module-level call in the test
(`tests/unit_tests/utils/test_database.py:26`) makes this double-call
realistic in full
test runs when the application also applies the patch.
```
</details>
<details>
<summary><b>Prompt for AI Agent 🤖 </b></summary>
```mdx
This is a comment left during a code review.
**Path:** tests/unit_tests/utils/test_database.py
**Line:** 26:26
**Comment:**
*Possible Bug: Calling `apply_mariadb_ddl_fix()` at module import time
can double-patch SQLAlchemy's `DDLCompiler.visit_create_sequence` when the same
fix is applied elsewhere in the import chain, causing infinite recursion or
test-suite-wide side effects; remove the module-level call from the test module
(invoke the fix within a scoped test or rely on package initialization instead).
Validate the correctness of the flagged issue. If correct, How can I resolve
this? If you propose a fix, implement it and please make it concise.
```
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]