tsreaper opened a new pull request, #7846:
URL: https://github.com/apache/paimon/pull/7846

   ### Purpose
   
   Add schema-level validation for the file-index.<index-type>.columns options 
(bloom-filter / bitmap / bsi / range-bitmap).
   
   Currently, if a user configures a column name that does not exist in the
table schema (or uses the nested col[k] syntax on a non-map column or on a map
with a non-string key), the table can still be created and written to. The
write job only fails much later, when MergeTreeWriter.flushWriteBuffer
triggers DataFileIndexWriter.<init> during a checkpoint and throws
IllegalArgumentException: xxx does not exist in column fields (or is not map
type / Only support map data type with key field of CHAR、VARCHAR、STRING.).
The same misconfiguration also blocks compaction.
   
   This PR moves these three checks up-front into SchemaValidation so the error 
is surfaced at create-table / alter-table time:
   
   1. Every column listed in any file-index.<type>.columns must exist in the 
schema.
   2. If nested syntax col[k] is used, col must be of MAP type.
   3. A MAP referenced via nested syntax must have a CHAR / VARCHAR key type
(matching MapFileIndexMaintainer's runtime requirement).
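
The three checks above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `FileIndexColumnCheck`, `ColumnType`, and `validate` are hypothetical names, the type model is a simplified stand-in for Paimon's data types (STRING is represented as `VARCHAR` here), and the exception messages are assumptions, apart from the fragments quoted in the Tests section.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; this is not Paimon's SchemaValidation.
final class FileIndexColumnCheck {

    // Simplified column type: a type name, plus the key type when the column is a MAP.
    static final class ColumnType {
        final String name;
        final String mapKeyType; // null for non-map columns

        private ColumnType(String name, String mapKeyType) {
            this.name = name;
            this.mapKeyType = mapKeyType;
        }

        static ColumnType of(String name) {
            return new ColumnType(name, null);
        }

        static ColumnType mapWithKey(String keyType) {
            return new ColumnType("MAP", keyType);
        }
    }

    static final List<String> INDEX_TYPES =
            List.of("bloom-filter", "bitmap", "bsi", "range-bitmap");

    // Throws IllegalArgumentException on the first invalid file-index column entry.
    static void validate(Map<String, ColumnType> schema, Map<String, String> options) {
        for (String indexType : INDEX_TYPES) {
            String key = "file-index." + indexType + ".columns";
            String value = options.get(key);
            if (value == null || value.trim().isEmpty()) {
                continue; // this index type is not configured
            }
            for (String raw : value.split(",")) {
                String entry = raw.trim();
                int bracket = entry.indexOf('[');
                boolean nested = bracket > 0 && entry.endsWith("]");
                String column = nested ? entry.substring(0, bracket) : entry;

                // Check 1: the column must exist in the schema.
                ColumnType type = schema.get(column);
                if (type == null) {
                    throw new IllegalArgumentException(
                            "Column '" + column + "' in '" + key
                                    + "' does not exist in table schema.");
                }
                if (!nested) {
                    continue;
                }
                // Check 2: nested syntax col[k] requires a MAP column.
                if (type.mapKeyType == null) {
                    throw new IllegalArgumentException(
                            "Column '" + column + "' in '" + key + "' is not a map type.");
                }
                // Check 3: the MAP key must be CHAR or VARCHAR.
                if (!"CHAR".equals(type.mapKeyType) && !"VARCHAR".equals(type.mapKeyType)) {
                    throw new IllegalArgumentException(
                            "Key type of map column '" + column + "' is " + type.mapKeyType
                                    + ". Only CHAR/VARCHAR is supported.");
                }
            }
        }
    }
}
```

Running a check of this shape at create-table / alter-table time means a bad `file-index.<type>.columns` value is rejected before any writer or compaction job is started, rather than at the first checkpoint.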
   
   ### Tests
   
   Added two tests in SchemaValidationTest, each looping over all four index 
types (bloom-filter, bitmap, bsi, range-bitmap) to cover the configuration 
uniformly:
   
   - testFileIndexColumns — valid columns pass; an unknown column throws with a 
clear "does not exist in table schema" message.
   - testFileIndexNestedColumn — m[k] on MAP<STRING, INT> passes; f3[k] on a 
non-map column throws "is not a map type"; mi[k] on MAP<INT, INT> throws "Only 
CHAR/VARCHAR is supported".
   

