rusackas commented on code in PR #40642:
URL: https://github.com/apache/superset/pull/40642#discussion_r3338276328


##########
superset/jinja_context.py:
##########
@@ -354,6 +354,18 @@ def get_filters(self, column: str, remove_filter: bool = 
False) -> list[Filter]:
             - you want to have the ability for filter inside the main query 
for speed
             purposes
 
+        Always pass filter values through proper parameterization rather than
+        building SQL by hand. Use the ``where_in`` filter for list membership 
and
+        bind individual values as query parameters instead of interpolating 
them
+        directly into the SQL string.
+
+        .. warning::
+
+            Do not manually escape filter values (for example, with
+            ``replace("'", "''")``). Hand-rolled escaping is error-prone and 
easy
+            to get wrong across dialects. Rely on the ``where_in`` filter and
+            parameter binding so values are quoted safely by the engine.

Review Comment:
   The docstring intentionally says 'dialect-safe quoting (via SQLAlchemy's 
literal_binds compilation)' which accurately describes what where_in does. The 
phrase 'safely by the engine' is about quoting, not true prepared-statement 
parameter binding. The wording could be more precise, but the intent is to 
guide users away from hand-rolling escaping, which is the correct advice.



##########
tests/unit_tests/db_engine_specs/test_base.py:
##########
@@ -1283,3 +1283,55 @@ def 
test_start_oauth2_dance_falls_back_to_url_for(mocker: MockerFixture) -> None
     error = exc_info.value.error
 
     assert error.extra["redirect_uri"] == fallback_uri
+
+
+def test_get_table_names_strips_schema_with_regex_metacharacters(
+    mocker: MockerFixture,
+) -> None:
+    """
+    Test that get_table_names strips a schema prefix containing regex
+    metacharacters without raising and without mangling unrelated names.
+    """
+    schema = "a.b(c)"
+
+    inspector = mocker.MagicMock()
+    inspector.get_table_names.return_value = [
+        f"{schema}.orders",
+        "axbyc.other",
+    ]
+
+    database = mocker.MagicMock()
+
+    spec = BaseEngineSpec
+    mocker.patch.object(spec, "try_remove_schema_from_table_name", True)
+
+    tables = spec.get_table_names(database, inspector, schema)
+
+    # The real schema prefix is stripped; the look-alike name is left intact
+    # because the metacharacters are escaped before being used as a regex.
+    assert tables == {"orders", "axbyc.other"}

Review Comment:
   The test uses axbc.other (not axbyc.other). With schema a.b(c) the old 
unescaped pattern is ^a.b(c)\. — the dot matches any char (a→a, .→x, b→b), (c) 
creates a group matching c, and \. matches the literal dot. So axbc.other does 
match the old broken regex and would have been incorrectly stripped, verifying 
the fix.



##########
tests/unit_tests/db_engine_specs/test_base.py:
##########
@@ -1283,3 +1283,55 @@ def 
test_start_oauth2_dance_falls_back_to_url_for(mocker: MockerFixture) -> None
     error = exc_info.value.error
 
     assert error.extra["redirect_uri"] == fallback_uri
+
+
+def test_get_table_names_strips_schema_with_regex_metacharacters(
+    mocker: MockerFixture,
+) -> None:
+    """
+    Test that get_table_names strips a schema prefix containing regex
+    metacharacters without raising and without mangling unrelated names.
+    """
+    schema = "a.b(c)"
+
+    inspector = mocker.MagicMock()
+    inspector.get_table_names.return_value = [
+        f"{schema}.orders",
+        "axbyc.other",
+    ]
+
+    database = mocker.MagicMock()
+
+    spec = BaseEngineSpec
+    mocker.patch.object(spec, "try_remove_schema_from_table_name", True)
+
+    tables = spec.get_table_names(database, inspector, schema)
+
+    # The real schema prefix is stripped; the look-alike name is left intact
+    # because the metacharacters are escaped before being used as a regex.
+    assert tables == {"orders", "axbyc.other"}
+
+
+def test_get_view_names_strips_schema_with_regex_metacharacters(
+    mocker: MockerFixture,
+) -> None:
+    """
+    Test that get_view_names strips a schema prefix containing regex
+    metacharacters without raising and without mangling unrelated names.
+    """
+    schema = "a.b(c)"
+
+    inspector = mocker.MagicMock()
+    inspector.get_view_names.return_value = [
+        f"{schema}.report",
+        "axbyc.other",
+    ]
+
+    database = mocker.MagicMock()
+
+    spec = BaseEngineSpec
+    mocker.patch.object(spec, "try_remove_schema_from_table_name", True)
+
+    views = spec.get_view_names(database, inspector, schema)
+
+    assert views == {"report", "axbyc.other"}

Review Comment:
   Same as the table-name test comment — axbc.other is used (not axbyc.other). 
The regex a.b(c)\. with unescaped metacharacters does match axbc.other, so the 
test correctly exercises the regression.



##########
superset/db_engine_specs/base.py:
##########
@@ -1746,7 +1746,9 @@ def get_table_names(  # pylint: disable=unused-argument
             raise cls.get_dbapi_mapped_exception(ex) from ex
 
         if schema and cls.try_remove_schema_from_table_name:
-            tables = {re.sub(f"^{schema}\\.", "", table) for table in tables}
+            tables = {
+                re.sub(f"^{re.escape(schema)}\\.", "", table) for table in 
tables
+            }

Review Comment:
   re.escape(schema) is already computed once before the comprehension and 
stored in escaped_schema — look at the diff above this line. The comprehension 
references escaped_schema throughout.



##########
superset/db_engine_specs/base.py:
##########
@@ -1774,7 +1776,9 @@ def get_view_names(  # pylint: disable=unused-argument
             raise cls.get_dbapi_mapped_exception(ex) from ex
 
         if schema and cls.try_remove_schema_from_table_name:
-            views = {re.sub(f"^{schema}\\.", "", view) for view in views}
+            views = {
+                re.sub(f"^{re.escape(schema)}\\.", "", view) for view in views
+            }

Review Comment:
   Same as above — escaped_schema is computed once before the comprehension in 
the get_view_names path too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to