Copilot commented on code in PR #39340:
URL: https://github.com/apache/superset/pull/39340#discussion_r3190484267
##########
superset/sql/parse.py:
##########
@@ -135,6 +137,29 @@ class CTASMethod(enum.Enum):
VIEW = enum.auto()
+def _normalized_generator(
+ dialect_name: DialectType,
+ *,
+ pretty: bool,
+ comments: bool,
+) -> Generator:
Review Comment:
Type mismatch risk: `_normalized_generator` requires `DialectType`, but call
sites pass `self._dialect` / `SQLGLOT_DIALECTS.get(engine)` which appear to be
`DialectType | None` (unknown engine). This can break static type checking and
makes the contract misleading. Consider updating the parameter annotation to
`DialectType | None` (and keep using `Dialect.get_or_raise(...)`), or
explicitly coerce/fallback (eg to `Dialects.DIALECT`) in the helper so the type
matches actual usage.
##########
tests/unit_tests/sql/parse_tests.py:
##########
@@ -2717,6 +2757,30 @@ def test_sanitize_clause(sql: str, expected: str |
Exception, engine: str) -> No
sanitize_clause(sql, engine)
[email protected](
+ "engine",
+ [
+ "postgresql",
+ "presto",
+ "trino",
+ "duckdb",
+ "dremio",
+ ],
+)
+def test_sqlstatement_format_preserves_multi_arg_distinct(engine: str) -> None:
+ """
+ Regression guard for https://github.com/apache/superset/issues/39223:
+ ``SQLStatement.format()`` must not rewrite user-defined multi-argument
+ DISTINCT aggregates into row-expression null guards. This is the SQL Lab /
+ executor path; the metric-expression path is covered by
+ ``test_sanitize_clause``.
+ """
+ sql = "SELECT DISTINCT_AVG(DISTINCT a, b) FROM t"
+ formatted = SQLScript(sql, engine).format()
+ assert "DISTINCT_AVG(DISTINCT a, b)" in formatted
+ assert "CASE WHEN" not in formatted
+
+
Review Comment:
These assertions are somewhat brittle because `SQLScript(...).format()` is a
pretty-printer and could legally introduce line breaks/extra whitespace inside
the call (eg `DISTINCT_AVG(DISTINCT a,\\n b)`), causing a false negative. A
more robust regression assertion would parse `formatted` back into an AST and
assert the `DISTINCT_AVG` call still has two distinct args and does not contain
a `CASE` expression in its argument tree (instead of relying on substring
matching).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]