Zouxxyy opened a new pull request, #8034: URL: https://github.com/apache/paimon/pull/8034
### Purpose

When `merge-schema` is enabled and source column names differ only in case from target columns (e.g. source `ID` vs target `id`), `SchemaMergingUtils` treats them as new columns due to case-sensitive `HashMap` lookups. This causes duplicate columns in the schema and makes the table unreadable (`Field names must be unique`).

This PR adds a `caseSensitive` parameter through the schema merge chain (`SchemaMergingUtils` → `SchemaManager` → `FileStore` → Spark `SchemaHelper`), using `TreeMap(String.CASE_INSENSITIVE_ORDER)` for field matching when `caseSensitive=false`. Spark callers pass the `spark.sql.caseSensitive` config (default `false`).

Affects both `INSERT ... merge-schema=true` and `MERGE INTO ... merge-schema=true` paths.

### Tests

Added 13 case-sensitivity tests in `WriteMergeSchemaTest` covering:

- INSERT and MERGE INTO with case-mismatched column names
- Nested struct fields with case mismatch
- Schema unchanged when only case differs (no new columns)
- Repeated writes with alternating case
- Mixed case-mismatch with genuinely new columns
- Case-sensitive mode correctly treats different case as new columns
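
For readers unfamiliar with the lookup difference described under Purpose, here is a minimal standalone sketch of how a `HashMap` and a `TreeMap(String.CASE_INSENSITIVE_ORDER)` behave when matching field names. The class `FieldMatchSketch` and its methods `indexByName` and `mergeFieldNames` are hypothetical names made up for illustration; this is not the actual `SchemaMergingUtils` code, which works on Paimon field types rather than plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not Paimon's SchemaMergingUtils. */
public class FieldMatchSketch {

    /** Builds a name -> position lookup whose key comparison depends on caseSensitive. */
    static Map<String, Integer> indexByName(List<String> fields, boolean caseSensitive) {
        Map<String, Integer> index;
        if (caseSensitive) {
            // Exact-match keys: "ID" and "id" are different entries.
            index = new HashMap<>();
        } else {
            // Keys compared ignoring case: "ID" and "id" hit the same entry.
            index = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        }
        for (int i = 0; i < fields.size(); i++) {
            index.put(fields.get(i), i);
        }
        return index;
    }

    /** Returns the target field names followed by source names that have no match in the target. */
    static List<String> mergeFieldNames(
            List<String> target, List<String> source, boolean caseSensitive) {
        Map<String, Integer> index = indexByName(target, caseSensitive);
        List<String> merged = new ArrayList<>(target);
        for (String name : source) {
            if (!index.containsKey(name)) {
                // No matching target field: append it as a new column.
                merged.add(name);
                index.put(name, merged.size() - 1);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> target = Arrays.asList("id", "name");
        List<String> source = Arrays.asList("ID", "name", "age");
        // Case-insensitive: "ID" matches "id", only "age" is new.
        System.out.println(mergeFieldNames(target, source, false)); // [id, name, age]
        // Case-sensitive: "ID" is treated as a new column.
        System.out.println(mergeFieldNames(target, source, true)); // [id, name, ID, age]
    }
}
```

In this sketch the target's existing spelling (`id`) is kept when a case-insensitive match is found, which is consistent with the "schema unchanged when only case differs" test described above.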
