xiangfu0 commented on code in PR #18668:
URL: https://github.com/apache/pinot/pull/18668#discussion_r3365846313
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/dictionary/DictionaryIndexType.java:
##########
@@ -178,8 +180,22 @@ private static boolean
hasExplicitDictionaryConfig(FieldConfig fieldConfig) {
return indexes != null && indexes.isObject() &&
indexes.has(StandardIndexes.DICTIONARY_ID);
}
+ private static FieldConfig.EncodingType getForwardEncodingType(FieldConfig
fieldConfig) {
Review Comment:
**Minor — single-source the forward-encoding resolution.**
This helper re-implements the `indexes.forward.encodingType` parse that
`ForwardIndexType.createDeserializer` already does, and the JSON keys
`"forward"`/`"encodingType"` are hardcoded here, in `withForwardEncoding`
below, and in the test helpers. If the precedence/shape ever changes, these can
drift. Consider one shared accessor (e.g. on `FieldConfig` or a forward-index
util) plus a constant for the `"encodingType"` key, used by both resolvers.
Also note the fallback `return fieldConfig.getEncodingType();` here is now
mostly unreachable for explicit values once `validate()` forbids field-level
encoding — worth reconciling with that validation so the legacy path isn't
half-alive.
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/utils/TableConfigUtils.java:
##########
@@ -1594,11 +1593,13 @@ private static void
validateIndexingConfigAndFieldConfigList(TableConfig tableCo
String column = fieldConfig.getName();
Preconditions.checkState(seenColumns.add(column), "Duplicate
FieldConfig for column: %s", column);
Preconditions.checkState(schema.hasColumn(column), "Failed to find
column: %s in schema", column);
+ Preconditions.checkState(!fieldConfig.hasFieldLevelEncodingType(),
Review Comment:
**Backward-compat: this hard-rejects the legacy field-level `encodingType`
with no migration path.**
`hasFieldLevelEncodingType()` is true whenever `encodingType` is explicitly
set (`_encodingType != null`), so a long-standing config like
`{"name":"col","encodingType":"RAW"}` now fails validation.
`TableConfigUtils.validate` runs on the as-submitted config at create/update
(`PinotTableRestletResource#868`, `TableConfigsRestletResource#594/604`) and
nothing migrates field-level→forward beforehand
(`createTableConfigFromOldFormat`/`convertFromLegacyTableConfig` are
test-only). So editing any existing table that uses field-level `encodingType`
is blocked until the user hand-migrates each column.
This contradicts the PR description ("Existing configs using top-level
`encodingType` continue to work"). Suggest either auto-migrating
field-level→`indexes.forward.encodingType` on the write path *before*
validation (true deprecation, no user impact), or — if the hard break is
intended — labeling it `backward-incompat`/`release-notes`, adding
rolling-upgrade notes, and updating the description.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]