codeant-ai-for-open-source[bot] commented on PR #36458: URL: https://github.com/apache/superset/pull/36458#issuecomment-3628244034
## Nitpicks 🔍 <table> <tr><td>🔒 <strong>No security issues identified</strong></td></tr> <tr><td>⚡ <strong>Recommended areas for review</strong><br><br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-c895cf0d2e35cd84dd278b251c1e4ab8a0ecb9588e21b13e7c6ace5113f862b4R56-R103'><strong>Eager init</strong></a><br>`ModelGetSchemaCore` instances are created at import time. If construction becomes heavier (or DAOs have side effects), this can increase startup cost and complicate testing/mocking. Consider lazy initialization or a factory to create cores on first use.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-33f87750cdefd39870a50f191b5993739c162d0841accdb11e4646ea019da04aR115-R121'><strong>Parsing Issue</strong></a><br>`request.select_columns` is used directly when building the serialization context. If the request field can be provided as a JSON string, CSV, or other serialized form, using it directly may lead to incorrect behavior. Use the same parsing utility used by `ModelListCore` to normalize the value (e.g., parse JSON/CSV -> list) before applying it.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-e1a76d1f013e6c55119080fd7ec635310f15bc77f7c49e235244cb3121eb1aa3R120-R125'><strong>Columns vs. serialization mismatch</strong></a><br>The response returned by ModelListCore includes `columns_loaded` (the actual columns loaded). The code uses `request.select_columns` (or default) to filter fields for output; this can cause the reported `columns_loaded` to diverge from the fields actually serialized, producing inconsistent API responses.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-05a9898ba969a7986371667690eb49e5f685d450c31a1f642e020f6caaf5f28aR136-R145'><strong>Unparsed select_columns (Possible Bug)</strong></a><br>The code now uses `request.select_columns` directly for the serialization context when calling `result.model_dump(...)`. However, `request.select_columns` may be a JSON string or CSV (the same inputs that `ModelListCore.run_tool` parses internally). Passing an unparsed string into the context can break field filtering logic which expects a list of column names. Use the parsed column list that the DAO used (or coerce/parse the request value) to ensure consistent behavior.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-e1a76d1f013e6c55119080fd7ec635310f15bc77f7c49e235244cb3121eb1aa3R147-R156'><strong>Request parsing mismatch</strong></a><br>`request.select_columns` is used directly to build the serialization context. However, ModelListCore internally parses `select_columns` (accepting CSV / JSON / list). If `request.select_columns` is a CSV or raw string, the serialization context may receive a string instead of a list which can break filtering logic in DatasetInfo serialization.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-c53108521a676f135ad7846c809dddc3eb33240abb7d123d0948cd5f7b2e9765R356-R396'><strong>Identifier mutation</strong></a><br>The `ColumnRef.sanitize_name` validator returns `html.escape(...)` (escaped value) for `name`, which can change column identifiers used for queries/joins (e.g., converting spaces/quotes to HTML entities). That may make the `name` no longer match database/metric identifiers and break query generation or equality checks. Consider preserving a canonical identifier for internals and only escaping for display.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-7399392e20028281da2c39f4cccfcffeb9ea5d5a0cbaa739d8d072fd1f35c6ebR506-R540'><strong>Serialization mismatch</strong></a><br>The simple serializer `serialize_dashboard_object` populates keys like `created_by_name` and `changed_by_name` when constructing `DashboardInfo`, but `DashboardInfo` defines `created_by` and `changed_by` (no `_name` fields). This can lead to ignored fields or validation errors / unexpected missing data in responses.<br> - [ ] <a href='https://github.com/apache/superset/pull/36458/files#diff-367d5eb5f000133464ca2fde918009da6b5467e821cb3a94fabd369771201f50R553-R561'><strong>Schema field name mismatch</strong></a><br>The run_tool constructs the output schema using the keyword `default_select`. Ensure the target Pydantic output_schema actually expects `default_select` (not `default_columns` or similar). A mismatch will raise validation errors at runtime.<br> </td></tr> </table> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
