codeant-ai-for-open-source[bot] commented on code in PR #38490:
URL: https://github.com/apache/superset/pull/38490#discussion_r2900103359


##########
superset/databases/schemas.py:
##########
@@ -277,6 +277,20 @@ def extra_validator(value: str) -> str:
                         )
                     ]
                 )
+
+        metadata_cache_timeout = extra_.get("metadata_cache_timeout", {})
+        for key in ("schema_cache_timeout", "table_cache_timeout"):
+            timeout = metadata_cache_timeout.get(key)
+            if timeout is not None and (not isinstance(timeout, int) or 
timeout < 0):
+                raise ValidationError(
+                    [
+                        _(
+                            "The %(key)s in metadata_cache_timeout must be a "
+                            "non-negative integer.",
+                            key=key,
+                        )
+                    ]
+                )

Review Comment:
   **Suggestion:** The new validation assumes that `metadata_cache_timeout` is 
always a dict and calls `.get()` on it; if existing configs have 
`metadata_cache_timeout` set to a non-dict (e.g. an integer, string, list, or 
null), this will raise an AttributeError instead of a clean ValidationError, 
breaking schema loading for such payloads. Add a type check (and gracefully 
handle non-dict/None cases) before iterating over keys so that invalid 
structures are rejected with a proper validation error rather than causing an 
unexpected exception. [type error]
   
   <details>
   <summary><b>Severity Level:</b> Major ⚠️</summary>
   
   ```mdx
   - ❌ Database creation with malformed metadata_cache_timeout crashes backend.
   - ❌ Database update using DatabasePutSchema fails with AttributeError.
   - ⚠️ Test-connection flows using DatabaseTestConnectionSchema can error.
   - ⚠️ Admin database connection UI may show generic server error.
   - ⚠️ Invalid extra structure not reported as clear validation error.
   ```
   </details>
   
   ```suggestion
           metadata_cache_timeout = extra_.get("metadata_cache_timeout") or {}
           if not isinstance(metadata_cache_timeout, dict):
               raise ValidationError(
                   [
                       _(
                           "The metadata_cache_timeout must be a mapping from 
string keys "
                           "to non-negative integer values."
                       )
                   ]
               )
           for key in ("schema_cache_timeout", "table_cache_timeout"):
               timeout = metadata_cache_timeout.get(key)
               if timeout is not None and (not isinstance(timeout, int) or 
timeout < 0):
                   raise ValidationError(
                       [
                           _(
                               "The %(key)s in metadata_cache_timeout must be a 
"
                               "non-negative integer.",
                               key=key,
                           )
                       ]
                   )
   ```
   <details>
   <summary><b>Steps of Reproduction ✅ </b></summary>
   
   ```mdx
   1. In `superset/databases/schemas.py`, note that `extra_validator` (around 
lines 246–294
   in the final file) deserializes the JSON string `extra` and then executes:
   
      `metadata_cache_timeout = extra_.get("metadata_cache_timeout", {})`
   
      followed by `metadata_cache_timeout.get(...)` on lines 281–283.
   
   2. Also in the same file, observe that multiple Marshmallow schemas attach
   `extra_validator` to their `extra` field, for example:
   
      - `DatabasePostSchema.extra` (defined later in 
`superset/databases/schemas.py`) with
      `validate=extra_validator`
   
      - `DatabasePutSchema.extra`, `DatabaseValidateParametersSchema.extra`,
      `DatabaseTestConnectionSchema.extra`, and 
`DatabaseConnectionSchema.extra` all
      similarly use `validate=extra_validator`.
   
   3. In a real execution path, any of these schemas will be used to validate 
payloads
   containing a string `extra` field. To reproduce via code, run the Superset 
backend (or
   open a Python shell in the project) and execute:
   
      ```python
   
      from superset.databases.schemas import DatabasePostSchema
   
      # `metadata_cache_timeout` is incorrectly set to an integer, not a dict
   
      data = {
   
          "database_name": "test_db",
   
          "sqlalchemy_uri": "sqlite:///test.db",
   
          "extra": '{"metadata_cache_timeout": 600}'
   
      }
   
      DatabasePostSchema().load(data)
   
      ```
   
      This triggers `extra_validator` via the `extra` field validation.
   
   4. During `extra_validator` execution, `extra_["metadata_cache_timeout"]` 
will be the
   integer `600`, so `metadata_cache_timeout = 600`. On line 283, `timeout =
   metadata_cache_timeout.get(key)` is executed on an `int`, raising 
`AttributeError: 'int'
   object has no attribute 'get'`. This bubbles up as an unexpected exception 
instead of a
   controlled `ValidationError`, causing the database create/update/test flow 
that uses this
   schema to fail with a 500-style error rather than a user-facing validation 
message.
   ```
   </details>
   <details>
   <summary><b>Prompt for AI Agent 🤖 </b></summary>
   
   ```mdx
   This is a comment left during a code review.
   
   **Path:** superset/databases/schemas.py
   **Line:** 281:293
   **Comment:**
        *Type Error: The new validation assumes that `metadata_cache_timeout` 
is always a dict and calls `.get()` on it; if existing configs have 
`metadata_cache_timeout` set to a non-dict (e.g. an integer, string, list, or 
null), this will raise an AttributeError instead of a clean ValidationError, 
breaking schema loading for such payloads. Add a type check (and gracefully 
handle non-dict/None cases) before iterating over keys so that invalid 
structures are rejected with a proper validation error rather than causing an 
unexpected exception.
   
   Validate the correctness of the flagged issue. If correct, How can I resolve 
this? If you propose a fix, implement it and please make it concise.
   ```
   </details>
   <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38490&comment_hash=ccbe8c8a905309512e50c4d029311390f181cf42d9471a51bb4022f24f0fcc55&reaction=like'>👍</a>
 | <a 
href='https://app.codeant.ai/feedback?pr_url=https%3A%2F%2Fgithub.com%2Fapache%2Fsuperset%2Fpull%2F38490&comment_hash=ccbe8c8a905309512e50c4d029311390f181cf42d9471a51bb4022f24f0fcc55&reaction=dislike'>👎</a>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to