Copilot commented on code in PR #35920:
URL: https://github.com/apache/superset/pull/35920#discussion_r2535667784


##########
superset/themes/schemas.py:
##########
@@ -87,20 +93,20 @@ def validate_and_sanitize_json_data(self, value: str) -> 
None:
         # Note: This modifies the input data to ensure sanitized content is 
stored
         if sanitized_config != theme_config:
             # Re-serialize the sanitized config
-            self.context["sanitized_json_data"] = json.dumps(sanitized_config)
+            sanitized_json_context.set(json.dumps(sanitized_config))

Review Comment:
   The `sanitized_json_context` ContextVar is set here but never appears to be 
read anywhere in the codebase. If the sanitized data needs to be retrieved 
later, you should add code to read it using `sanitized_json_context.get()`. If 
it's not needed, consider removing this context variable and the `.set()` calls 
entirely.



##########
superset/marshmallow_compatibility.py:
##########
@@ -0,0 +1,68 @@
+"""
+Marshmallow 4.x Compatibility Module for Flask-AppBuilder 5.0.0
+
+This module provides compatibility between Flask-AppBuilder 5.0.0 and
+marshmallow 4.x, specifically handling missing auto-generated fields
+during schema initialization.
+"""
+
+from marshmallow import fields
+
+
+def patch_marshmallow_for_flask_appbuilder():
+    """
+    Patches marshmallow Schema._init_fields to handle Flask-AppBuilder 5.0.0
+    compatibility with marshmallow 4.x.
+
+    Flask-AppBuilder 5.0.0 automatically generates schema fields that reference
+    SQL relationship fields that may not exist in marshmallow 4.x's stricter
+    field validation. This patch dynamically adds missing fields as Raw fields
+    to prevent KeyError exceptions during schema initialization.
+    """
+    import marshmallow
+
+    # Store the original method
+    original_init_fields = marshmallow.Schema._init_fields
+
+    def patched_init_fields(self):
+        """Patched version that handles missing declared fields."""
+        max_retries = 10  # Prevent infinite loops in case of unexpected errors
+        retries = 0
+
+        while retries < max_retries:
+            try:
+                return original_init_fields(self)
+            except KeyError as e:
+                # Extract the missing field name from the KeyError
+                missing_field = str(e).strip("'\"")
+
+                # Initialize declared_fields if it doesn't exist
+                if not hasattr(self, "declared_fields"):
+                    self.declared_fields = {}
+
+                # Only add if it doesn't already exist
+                if missing_field not in self.declared_fields:
+                    # Use Raw field as a safe fallback for unknown 
auto-generated fields
+                    self.declared_fields[missing_field] = fields.Raw(
+                        allow_none=True,
+                        dump_only=True,  # Prevent validation issues during 
serialization
+                    )
+
+                    print(
+                        f"Marshmallow compatibility: Added missing field "
+                        f"'{missing_field}' as Raw field"
+                    )
+
+                retries += 1
+                # Continue the loop to retry initialization
+            except Exception:
+                # For any other type of error, just propagate it
+                raise
+
+        # If we've exhausted retries, something is seriously wrong
+        raise RuntimeError(
+            f"Marshmallow field initialization failed after {max_retries} 
retries"
+        )
+
+    # Apply the patch
+    marshmallow.Schema._init_fields = patched_init_fields

Review Comment:
   Multiple calls to `patch_marshmallow_for_flask_appbuilder()` will repeatedly 
wrap the `_init_fields` method, creating nested wrappers. Each call to the 
patch function stores a reference to what it thinks is the "original" method, 
but on subsequent calls, this will be the already-patched version. Consider 
adding a guard to prevent multiple applications of the patch:
   ```python
   def patch_marshmallow_for_flask_appbuilder():
       import marshmallow
       
       # Guard against multiple patches
       if hasattr(marshmallow.Schema._init_fields, '_patched_for_fab'):
           return
       
       original_init_fields = marshmallow.Schema._init_fields
       
       def patched_init_fields(self):
           # ... existing implementation
       
       patched_init_fields._patched_for_fab = True
       marshmallow.Schema._init_fields = patched_init_fields
   ```



##########
superset/themes/schemas.py:
##########
@@ -118,7 +124,7 @@ def validate_and_sanitize_json_data(self, value: str) -> 
None:
         # Note: This modifies the input data to ensure sanitized content is 
stored
         if sanitized_config != theme_config:
             # Re-serialize the sanitized config
-            self.context["sanitized_json_data"] = json.dumps(sanitized_config)
+            sanitized_json_context.set(json.dumps(sanitized_config))

Review Comment:
   The `sanitized_json_context` ContextVar is set here but never appears to be 
read anywhere in the codebase. If the sanitized data needs to be retrieved 
later, you should add code to read it using `sanitized_json_context.get()`. If 
it's not needed, consider removing this context variable and the `.set()` calls 
entirely.



##########
superset/marshmallow_compatibility.py:
##########
@@ -0,0 +1,68 @@
+"""
+Marshmallow 4.x Compatibility Module for Flask-AppBuilder 5.0.0
+
+This module provides compatibility between Flask-AppBuilder 5.0.0 and
+marshmallow 4.x, specifically handling missing auto-generated fields
+during schema initialization.
+"""
+
+from marshmallow import fields
+
+
+def patch_marshmallow_for_flask_appbuilder():
+    """
+    Patches marshmallow Schema._init_fields to handle Flask-AppBuilder 5.0.0
+    compatibility with marshmallow 4.x.
+
+    Flask-AppBuilder 5.0.0 automatically generates schema fields that reference
+    SQL relationship fields that may not exist in marshmallow 4.x's stricter
+    field validation. This patch dynamically adds missing fields as Raw fields
+    to prevent KeyError exceptions during schema initialization.
+    """
+    import marshmallow
+
+    # Store the original method
+    original_init_fields = marshmallow.Schema._init_fields
+
+    def patched_init_fields(self):
+        """Patched version that handles missing declared fields."""
+        max_retries = 10  # Prevent infinite loops in case of unexpected errors
+        retries = 0
+
+        while retries < max_retries:
+            try:
+                return original_init_fields(self)
+            except KeyError as e:
+                # Extract the missing field name from the KeyError
+                missing_field = str(e).strip("'\"")
+
+                # Initialize declared_fields if it doesn't exist
+                if not hasattr(self, "declared_fields"):
+                    self.declared_fields = {}
+
+                # Only add if it doesn't already exist
+                if missing_field not in self.declared_fields:
+                    # Use Raw field as a safe fallback for unknown 
auto-generated fields
+                    self.declared_fields[missing_field] = fields.Raw(
+                        allow_none=True,
+                        dump_only=True,  # Prevent validation issues during 
serialization
+                    )
+
+                    print(
+                        f"Marshmallow compatibility: Added missing field "
+                        f"'{missing_field}' as Raw field"
+                    )

Review Comment:
   Using `print()` for logging in production code is not recommended. Consider 
using the Python `logging` module instead for better control over log levels 
and outputs. For example:
   ```python
   import logging
   logger = logging.getLogger(__name__)
   # Then use:
   logger.debug(f"Marshmallow compatibility: Added missing field 
'{missing_field}' as Raw field")
   ```



##########
superset/advanced_data_type/schemas.py:
##########
@@ -23,10 +23,10 @@
 advanced_data_type_convert_schema = {
     "type": "object",
     "properties": {
-        "type": {"type": "string", "default": "port"},
+        "type": {"type": "string", "dump_default": "port"},
         "values": {
             "type": "array",
-            "items": {"default": "http"},
+            "items": {"dump_default": "http"},

Review Comment:
   This appears to be a JSON Schema (not a Marshmallow schema), and 
`dump_default` is not a valid JSON Schema keyword. The correct JSON Schema 
keyword is `default`. The marshmallow 3.x -> 4.x migration applies to 
Marshmallow schemas, not JSON schemas. This change should be reverted to use 
`"default"` instead of `"dump_default"`.
   ```suggestion
               "items": {"default": "http"},
   ```



##########
superset/advanced_data_type/schemas.py:
##########
@@ -23,10 +23,10 @@
 advanced_data_type_convert_schema = {
     "type": "object",
     "properties": {
-        "type": {"type": "string", "default": "port"},
+        "type": {"type": "string", "dump_default": "port"},

Review Comment:
   This appears to be a JSON Schema (not a Marshmallow schema), and 
`dump_default` is not a valid JSON Schema keyword. The correct JSON Schema 
keyword is `default`. The marshmallow 3.x -> 4.x migration applies to 
Marshmallow schemas, not JSON schemas. This change should be reverted to use 
`"default"` instead of `"dump_default"`.



##########
superset/themes/schemas.py:
##########
@@ -56,20 +62,20 @@ def validate_json_data(self, value: dict[str, Any]) -> None:
                 value.clear()
                 value.update(sanitized_config)
             else:
-                self.context["sanitized_json_data"] = 
json.dumps(sanitized_config)
+                sanitized_json_context.set(json.dumps(sanitized_config))

Review Comment:
   The `sanitized_json_context` ContextVar is set here but never appears to be 
read anywhere in the codebase. If the sanitized data needs to be retrieved 
later, you should add code to read it using `sanitized_json_context.get()`. If 
it's not needed, consider removing this context variable and the `.set()` calls 
entirely. The validation already modifies the dict in-place when 
`isinstance(value, dict)` is true (line 61-63), so this may be redundant.



##########
superset/marshmallow_compatibility.py:
##########
@@ -0,0 +1,68 @@
+"""
+Marshmallow 4.x Compatibility Module for Flask-AppBuilder 5.0.0
+
+This module provides compatibility between Flask-AppBuilder 5.0.0 and
+marshmallow 4.x, specifically handling missing auto-generated fields
+during schema initialization.
+"""
+
+from marshmallow import fields
+
+
+def patch_marshmallow_for_flask_appbuilder():
+    """
+    Patches marshmallow Schema._init_fields to handle Flask-AppBuilder 5.0.0
+    compatibility with marshmallow 4.x.
+
+    Flask-AppBuilder 5.0.0 automatically generates schema fields that reference
+    SQL relationship fields that may not exist in marshmallow 4.x's stricter
+    field validation. This patch dynamically adds missing fields as Raw fields
+    to prevent KeyError exceptions during schema initialization.
+    """
+    import marshmallow
+
+    # Store the original method
+    original_init_fields = marshmallow.Schema._init_fields
+
+    def patched_init_fields(self):
+        """Patched version that handles missing declared fields."""
+        max_retries = 10  # Prevent infinite loops in case of unexpected errors
+        retries = 0
+
+        while retries < max_retries:
+            try:
+                return original_init_fields(self)
+            except KeyError as e:
+                # Extract the missing field name from the KeyError
+                missing_field = str(e).strip("'\"")
+
+                # Initialize declared_fields if it doesn't exist
+                if not hasattr(self, "declared_fields"):
+                    self.declared_fields = {}
+
+                # Only add if it doesn't already exist
+                if missing_field not in self.declared_fields:
+                    # Use Raw field as a safe fallback for unknown 
auto-generated fields
+                    self.declared_fields[missing_field] = fields.Raw(
+                        allow_none=True,
+                        dump_only=True,  # Prevent validation issues during 
serialization
+                    )
+
+                    print(
+                        f"Marshmallow compatibility: Added missing field "
+                        f"'{missing_field}' as Raw field"
+                    )
+
+                retries += 1
+                # Continue the loop to retry initialization
+            except Exception:
+                # For any other type of error, just propagate it
+                raise

Review Comment:
   [nitpick] The broad exception handler `except Exception:` catches all 
exceptions and re-raises them, but this includes `KeyboardInterrupt` and 
`SystemExit` which inherit from `BaseException` in Python 3, not `Exception`. 
While this specific code is technically correct, it's better practice to be 
more specific about what you're catching. If you want to propagate all 
non-KeyError exceptions, consider catching specific exception types or using 
`except (SomeException, AnotherException):` for known exceptions that should be 
propagated.
   ```suggestion
               # Any other exception will propagate naturally
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to