Jackie-Jiang opened a new pull request, #18470: URL: https://github.com/apache/pinot/pull/18470
## Summary

- Extract a common **`ColumnShape`** interface from `ColumnStatistics`, `ColumnMetadata`, and `IndexCreationContext` so column-shape attributes (cardinality, element lengths, isAscii, maxRowLengthInBytes, partition info, etc.) flow through a single accessor surface. Add **`EmptyColumnShape`** / **`EmptyColumnMetadata`** for zero-row segments.
- Rework **`IndexCreationContext.Builder`** to require `(File indexDir, TableConfig tableConfig, ColumnStatistics | ColumnMetadata)` at construction. `tableNameWithType` and `continueOnError` are derived from the `TableConfig`; a `String` / `ColumnShape` fallback constructor exists for callers without a `TableConfig`. Dead `forwardIndexDisabled` plumbing is removed and verbose Javadoc is trimmed.
- **`ColumnMetadataImpl`** gains a `_maxRowLengthInBytes` field derived in `Builder.build()` from canonicalized shape fields: correct for SV and fixed-width MV, plus uniform-length var-width MV; `UNAVAILABLE` for varying-length var-width MV. `extractFieldSpec` / `extractPartitionFunction` / `extractPartitions` are exposed as public static helpers.
- **`SegmentMetadataImpl`** branches on `_totalDocs == 0` and uses `EmptyColumnMetadata.fromPropertiesConfiguration` for empty segments, skipping V3 index-map loading.
- **`ForwardIndexHandler.rewriteDictToRawForwardIndex`** leans on `columnMetadata.getMaxRowLengthInBytes()` and only scans when the metadata returns `UNAVAILABLE`.
- Bug fix: missing `_totalDocs++` in `NoDictColumnStatisticsCollector.collect(Object)` that was causing `BufferOverflowException` at segment build.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
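For readers skimming the summary, the `maxRowLengthInBytes` derivation and the "scan only when `UNAVAILABLE`" fallback can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the class, method names, parameters, the `-1` sentinel value, and the byte arithmetic (which ignores any per-row header overhead) are all assumptions made for the example. Only the case split (SV, fixed-width MV, uniform-length var-width MV, otherwise `UNAVAILABLE`) comes from the description.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Illustrative sketch only: hypothetical names, not Pinot's actual classes or signatures.
final class MaxRowLengthSketch {
  // Assumed sentinel for "cannot be derived from metadata alone".
  static final int UNAVAILABLE = -1;

  // Derives the maximum row length from shape fields, following the case split in the summary.
  static int deriveMaxRowLength(boolean singleValue, boolean fixedWidth, int minElementLength,
      int maxElementLength, int maxNumValuesPerRow) {
    if (singleValue) {
      // One element per row, so the longest element bounds the row.
      return maxElementLength;
    }
    if (fixedWidth || minElementLength == maxElementLength) {
      // Every element has the same length, so the widest row is count * length.
      return maxNumValuesPerRow * maxElementLength;
    }
    // Varying-length var-width MV: the longest row is not implied by the shape fields.
    return UNAVAILABLE;
  }

  // Fallback: scan the rows and measure each one (string MV values, UTF-8 bytes).
  static int scanMaxRowLength(List<String[]> rows) {
    int max = 0;
    for (String[] row : rows) {
      int rowLength = 0;
      for (String value : row) {
        rowLength += value.getBytes(StandardCharsets.UTF_8).length;
      }
      max = Math.max(max, rowLength);
    }
    return max;
  }

  // Uses the metadata value when present and scans only when it is UNAVAILABLE.
  static int resolveMaxRowLength(int fromMetadata, List<String[]> rows) {
    return fromMetadata != UNAVAILABLE ? fromMetadata : scanMaxRowLength(rows);
  }
}
```

Under these assumptions, a fixed-width MV column with 8-byte elements and at most 4 values per row resolves to 32 without touching the data, while a var-width MV column with element lengths between 2 and 7 falls through to the scan.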
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
