Copilot commented on code in PR #40585:
URL: https://github.com/apache/superset/pull/40585#discussion_r3336919877


##########
superset/sql/execution/celery_task.py:
##########
@@ -99,9 +100,12 @@ def _handle_query_error(
         query.set_extra_json_key("errors", errors_payload)
 
     db.session.commit()  # pylint: disable=consider-using-transaction
+    logger.exception("Query %s failed: %s", query.id, ex)

Review Comment:
   Same duplicate-logging issue as in `superset/sql_lab.py`: 
`_handle_query_error` calls `logger.exception(...)` here, and its sole caller 
at line 358 is preceded by another `logger.exception("Query %d: %s", query_id, 
ex)` at line 354. Every failed query will produce two identical tracebacks in 
the Celery worker logs. Pick one location to emit the traceback.



##########
superset/sql/execution/celery_task.py:
##########
@@ -99,9 +100,12 @@ def _handle_query_error(
         query.set_extra_json_key("errors", errors_payload)
 
     db.session.commit()  # pylint: disable=consider-using-transaction
+    logger.exception("Query %s failed: %s", query.id, ex)
     payload.update(
         {"status": query.status.value, "error": msg, "errors": errors_payload}
     )
+    if stacktrace := traceback.format_exc():
+        payload["stacktrace"] = stacktrace

Review Comment:
   Same information-disclosure concern as in `sql_lab.py`: the stacktrace is 
unconditionally added to the payload that is serialized to the results backend 
/ returned to the user. Recommend gating on the existing `SHOW_STACKTRACE` 
config (see `superset/config.py`) so production deployments don't leak server 
internals to SQL Lab users by default.



##########
superset/sql_lab.py:
##########
@@ -121,7 +122,10 @@ def handle_query_error(
         query.set_extra_json_key("errors", errors_payload)
 
     db.session.commit()
+    logger.exception("Query %s failed: %s", query.id, ex)
     payload.update({"status": query.status, "error": msg, "errors": 
errors_payload})
+    if stacktrace := traceback.format_exc():
+        payload["stacktrace"] = stacktrace

Review Comment:
   Returning the raw Python traceback in the API payload exposes server-side 
implementation details (file paths, module structure, internal variable names, 
library versions) to every SQL Lab user, including non-admin Gamma/sql_lab 
principals. This is a meaningful information-disclosure regression compared to 
the prior behavior, which only returned `status`/`error`/`errors`. Consider 
gating the `stacktrace` field on an explicit config flag (e.g. 
`SHOW_STACKTRACE`, which already exists in `superset/config.py` and defaults to 
False in production) or restricting it to admins, rather than always including 
it in the payload.



##########
superset/sql_lab.py:
##########
@@ -121,7 +122,10 @@ def handle_query_error(
         query.set_extra_json_key("errors", errors_payload)
 
     db.session.commit()
+    logger.exception("Query %s failed: %s", query.id, ex)

Review Comment:
   `handle_query_error` now calls `logger.exception(...)` internally, but the 
only callers (the outer `except` blocks at lines 198–203 and the inner one at 
526) also pass through `logger.exception(...)` either directly above (line 199) 
or from prior error logging in `execute_sql_statements`. As a result, the same 
traceback is logged twice for every failed query. Consider removing the new 
`logger.exception` here and relying on the caller-side `logger.exception` at 
line 199 (which already preserves the traceback), or vice versa — but not both.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to