bito-code-review[bot] commented on PR #39976:
URL: https://github.com/apache/superset/pull/39976#issuecomment-4431111832
The flagged issue is correct: a raw `rfind("--", ...)` on the SQL string
treats a `--` inside a string literal as the start of a comment. The insertion
point then moves into the quoted text, so injecting the RLS predicate corrupts
the SQL, which can produce invalid SQL or unintended query behavior. To resolve
this, replace the raw string search with proper SQL tokenization so that only
actual comments are detected.
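To make the failure concrete, here is a minimal sketch of the raw search applied to a query whose literal contains `--`. The table, column, and predicate are made up for illustration and are not taken from the PR.

```python
# Hypothetical query: the only "--" sits inside a string literal.
sql = "SELECT * FROM logs WHERE note = 'a--b'"
offset = len(sql)  # pretend the predicate should be appended at the end

# Raw search, as in the flagged code path.
line_start = sql.rfind("\n", 0, offset) + 1
naive_offset = sql.rfind("--", line_start, offset)

predicate = " AND tenant_id = 1"  # hypothetical RLS predicate
corrupted = sql[:naive_offset] + predicate + sql[naive_offset:]
print(corrupted)
# SELECT * FROM logs WHERE note = 'a AND tenant_id = 1--b'
```

The predicate ends up inside the quoted text, so it is never evaluated as a filter.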
**superset/sql/rls_splice.py**
```
import sqlglot
from sqlglot.errors import SqlglotError


def _before_trivia(sql, offset):
    line_start = sql.rfind("\n", 0, offset) + 1
    try:
        # None selects sqlglot's base dialect ("base" is not a registered
        # dialect name). Adjust the dialect as needed.
        tokens = sqlglot.Dialect.get_or_raise(None).tokenize(sql)
    except SqlglotError:
        # Fall back to the original logic if tokenization fails
        inline_comment_start = sql.rfind("--", line_start, offset)
        return offset if inline_comment_start == -1 else inline_comment_start
    # sqlglot attaches comments to neighbouring tokens instead of emitting
    # comment tokens, so treat a "--" as a real comment only when it lies
    # outside every token span (a string literal is a single token).
    candidate = sql.find("--", line_start, offset)
    while candidate != -1:
        if not any(t.start <= candidate <= t.end for t in tokens):
            return candidate
        candidate = sql.find("--", candidate + 1, offset)
    return offset
```
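If pulling the tokenizer into this code path is undesirable, the same check can be approximated without sqlglot. The sketch below uses a hypothetical helper, `find_line_comment_start`, which is not part of Superset. It assumes standard SQL quoting only (single-quoted literals, double-quoted identifiers, a doubled quote as the escape) and that `line_start` itself is not inside a literal; block comments and dialect-specific quoting are not handled.

```python
def find_line_comment_start(sql: str, line_start: int, offset: int) -> int:
    """Return the index of the first `--` in sql[line_start:offset] that is
    outside quotes, or -1 if there is none. Hypothetical helper."""
    quote = None  # the quote character we are currently inside, if any
    i = line_start
    while i < offset:
        ch = sql[i]
        if quote is not None:
            if ch == quote:
                if i + 1 < len(sql) and sql[i + 1] == quote:
                    i += 1  # doubled quote is an escaped quote; stay inside
                else:
                    quote = None  # closing quote
        elif ch in ("'", '"'):
            quote = ch  # opening quote
        elif sql.startswith("--", i) and i + 1 < offset:
            return i  # both dashes lie before offset and outside quotes
        i += 1
    return -1


example = "SELECT 'a--b' AS x -- trailing"
print(find_line_comment_start(example, 0, len(example)))  # 19
```

Unlike `rfind`, this returns the first unquoted `--` on the line, which is where the comment actually begins.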
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]