tsreaper opened a new pull request, #7846: URL: https://github.com/apache/paimon/pull/7846
### Purpose

Add schema-level validation for the file-index.<index-type>.columns options (bloom-filter / bitmap / bsi / range-bitmap).

Currently, if a user configures a column name that does not exist in the table schema (or uses the nested col[k] syntax on a non-map column or on a map with a non-string key), the table can be created and written to successfully, but the write job fails much later: when MergeTreeWriter.flushWriteBuffer triggers DataFileIndexWriter.<init> during checkpoint, it throws IllegalArgumentException: xxx does not exist in column fields (or is not map type / Only support map data type with key field of CHAR、VARCHAR、STRING.). The same misconfiguration also blocks compaction.

This PR moves these three checks up-front into SchemaValidation so the error is surfaced at create-table / alter-table time:

1. Every column listed in any file-index.<type>.columns must exist in the schema.
2. If nested syntax col[k] is used, col must be of MAP type.
3. A MAP referenced via nested syntax must have CHAR / VARCHAR key type (matching MapFileIndexMaintainer's runtime requirement).

### Tests

Added two tests in SchemaValidationTest, each looping over all four index types (bloom-filter, bitmap, bsi, range-bitmap) to cover the configuration uniformly:

- testFileIndexColumns: valid columns pass; an unknown column throws with a clear "does not exist in table schema" message.
- testFileIndexNestedColumn: m[k] on MAP<STRING, INT> passes; f3[k] on a non-map column throws "is not a map type"; mi[k] on MAP<INT, INT> throws "Only CHAR/VARCHAR is supported".
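For readers who want a concrete picture of the three up-front checks described under Purpose, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Paimon's classes: FileIndexColumnCheck, ColumnType, and validate are hypothetical names, the schema is modelled as a plain map from column name to a simplified type, and STRING is assumed to be represented as a VARCHAR key. Only the quoted message fragments come from the test description above; the rest of each message is illustrative.

```java
import java.util.Map;

public class FileIndexColumnCheck {

    /** Simplified stand-in for a column type (hypothetical, not Paimon's DataType). */
    public static final class ColumnType {
        final String root;       // e.g. "INT", "VARCHAR", "MAP"
        final String mapKeyRoot; // key type root for MAP columns, otherwise null

        private ColumnType(String root, String mapKeyRoot) {
            this.root = root;
            this.mapKeyRoot = mapKeyRoot;
        }

        public static ColumnType of(String root) {
            return new ColumnType(root, null);
        }

        public static ColumnType mapOf(String keyRoot) {
            return new ColumnType("MAP", keyRoot);
        }
    }

    // The four index types named in the PR description.
    private static final String[] INDEX_TYPES = {"bloom-filter", "bitmap", "bsi", "range-bitmap"};

    /** Throws IllegalArgumentException on the first misconfigured file-index column. */
    public static void validate(Map<String, ColumnType> schema, Map<String, String> options) {
        for (String indexType : INDEX_TYPES) {
            String key = "file-index." + indexType + ".columns";
            String columns = options.get(key);
            if (columns == null) {
                continue; // option not set for this index type
            }
            for (String raw : columns.split(",")) {
                String column = raw.trim();
                if (column.isEmpty()) {
                    continue;
                }
                // Nested syntax is col[k]: the part before '[' is the column name.
                int bracket = column.indexOf('[');
                boolean nested = bracket > 0 && column.endsWith("]");
                String name = nested ? column.substring(0, bracket) : column;

                // Check 1: the column must exist in the schema.
                ColumnType type = schema.get(name);
                if (type == null) {
                    throw new IllegalArgumentException(
                            "Column '" + name + "' in '" + key + "' does not exist in table schema.");
                }
                if (!nested) {
                    continue;
                }
                // Check 2: nested syntax requires a MAP column.
                if (!"MAP".equals(type.root)) {
                    throw new IllegalArgumentException(
                            "Column '" + name + "' in '" + key + "' is not a map type.");
                }
                // Check 3: the MAP key must be CHAR or VARCHAR.
                if (!"CHAR".equals(type.mapKeyRoot) && !"VARCHAR".equals(type.mapKeyRoot)) {
                    throw new IllegalArgumentException(
                            "Column '" + name + "' has key type " + type.mapKeyRoot
                                    + ". Only CHAR/VARCHAR is supported.");
                }
            }
        }
    }
}
```

In this sketch, a schema with f3 as INT, m as a map with a VARCHAR key, and mi as a map with an INT key accepts "f3, m[k]" for any of the four options, while an unknown column, f3[k], and mi[k] each fail at validation time instead of at checkpoint.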
