gortiz commented on PR #18668:
URL: https://github.com/apache/pinot/pull/18668#issuecomment-4647621154

   ## Review notes (non-blocking — flagging for maintainer consideration)
   
   I reviewed this PR focusing on the validation/serialization changes. CI is 
green, but I think the green is partly misleading (details below). Two things 
I'd like the maintainers to weigh in on before merge.
   
   ### 1. Update path for existing tables
   `TableConfigUtils.validate` unconditionally rejects any `FieldConfig` with a 
field-level `encodingType`, and that method runs on table **update** 
(`PinotTableRestletResource.java:364`) and the validate endpoint (`:868`), not 
just on create. Any deployment with a table whose **stored** config carries 
`fieldConfigList[].encodingType` — historically the documented and common form 
— would be unable to apply *any* subsequent config update until an operator 
manually rewrites it into `indexes.forward.encodingType`. No auto-migration 
runs on this path (`createTableConfigFromOldFormat` has no production callers).
   
   Suggestion: auto-rewrite the deprecated field into the forward block during 
validation/normalization and emit a deprecation warning, rather than 
hard-rejecting — or gate the rejection behind a flag / staged rollout with an 
explicit migration step. Per the repo's mixed-version compat mandate, removing 
an accepted config key on the write path probably wants a migration story 
(migration notes / rollback / flag), none of which is in the PR today.
   
   ### 2. CI green doesn't cover the break
   The compatibility-verifier suite creates tables on the *old* binary and 
never updates a legacy config on the *new* controller, so the update-path break 
above is unverified. Relatedly, those suite configs themselves still use the 
deprecated top-level form — e.g. 
`compatibility-verifier/sample-test-suite/config/feature-test-1.json:4` still 
has `"encodingType": "RAW"`. The PR migrated the pinot-tools samples and README 
but missed these. Worth either migrating them or documenting why the compat 
path is intentionally exempt. A regression test that updates a stored legacy 
config on the new controller would close the gap.
   
   ### Smaller notes
   - **Serialization shape change:** `FieldConfig` no longer defaults 
`_encodingType` to DICTIONARY and omits the key when unset. Self-roundtrip is 
tested; mixed-version (old broker/server reading a new-controller config that 
omits the key) is not.
   - **Silent precedence change:** `ForwardIndexType.createDeserializer` 
dropped the conflict check, so a config setting both field-level and 
forward-block encoding to *conflicting* values now silently uses the forward 
value instead of erroring.
   
   ### Positives
   - The `List`→`LinkedHashSet` change in 
`DictionaryIndexType.handleIndexSpecificCleanup` fixes a latent double-update 
bug.
   - The rejection message is excellent — it names the column and the exact 
replacement key.
   - README, UI, and pinot-tools sample configs were migrated consistently.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to