jasperjiaguo opened a new pull request, #14016: URL: https://github.com/apache/pinot/pull/14016
This condition check introduced in https://github.com/apache/pinot/issues/945 is breaking our internal segment build:

```
Caused by: java.lang.IllegalStateException: targetMaxChunkSize should only be used when deriveNumDocsPerChunk is true or rawIndexWriterVersion is 4
    at org.apache.pinot.segment.spi.index.ForwardIndexConfig.<init>(ForwardIndexConfig.java:69)
    at org.apache.pinot.segment.spi.index.ForwardIndexConfig$Builder.build(ForwardIndexConfig.java:325)
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator.adaptConfig(SegmentColumnarIndexCreator.java:246)
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator.init(SegmentColumnarIndexCreator.java:172)
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:266)
    at ...
```

The root cause is that the existing config contains neither `_targetMaxChunkSize` nor `_deriveNumDocsPerChunk` and has `_rawIndexWriterVersion` < 4. When this config is read into `SegmentGeneratorConfig`, `_targetMaxChunkSize` gets populated with a non-null default value and `_deriveNumDocsPerChunk` is set to false. Later, in `SegmentColumnarIndexCreator`, when this config is used to build a `ForwardIndexConfig`, it gets rejected because `_targetMaxChunkSize` is non-null. The check and the default value assignment contradict each other, which causes the regression.

Meanwhile, after reading the new code, I'm not sure about the rationale behind the check. `_targetMaxChunkSize` and `_targetMaxChunkSizeBytes` are used in `MultiValueVarByteRawIndexCreator` for version < 4, regardless of `_deriveNumDocsPerChunk`: https://github.com/apache/pinot/blob/b4dfd04c4db8539b1d286786ebf904442877714a/pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/creator/impl/fwd/MultiValueVarByteRawIndexCreator.java#L82
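To make the contradiction concrete, here is a minimal standalone sketch of the interaction. It does not use Pinot's classes: `LegacyConfigSketch`, `Resolved`, `loadWithDefaults`, `check`, and `buildFails` are hypothetical names, the default size is an assumed placeholder, and the condition is modeled only on the wording of the exception message above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical standalone sketch; none of these names are Pinot APIs. */
public class LegacyConfigSketch {
  // Assumed placeholder default, chosen only for illustration.
  static final int ASSUMED_DEFAULT_TARGET_MAX_CHUNK_SIZE_BYTES = 1024 * 1024;

  /** Config values as they look after defaults have been filled in. */
  static final class Resolved {
    final Integer targetMaxChunkSizeBytes;
    final boolean deriveNumDocsPerChunk;
    final int rawIndexWriterVersion;

    Resolved(Integer targetMaxChunkSizeBytes, boolean deriveNumDocsPerChunk, int rawIndexWriterVersion) {
      this.targetMaxChunkSizeBytes = targetMaxChunkSizeBytes;
      this.deriveNumDocsPerChunk = deriveNumDocsPerChunk;
      this.rawIndexWriterVersion = rawIndexWriterVersion;
    }
  }

  /** Step 1: reading a raw config fills every missing field with a default. */
  static Resolved loadWithDefaults(Map<String, String> raw) {
    String size = raw.get("targetMaxChunkSizeBytes");
    int target = size != null ? Integer.parseInt(size) : ASSUMED_DEFAULT_TARGET_MAX_CHUNK_SIZE_BYTES;
    boolean derive = Boolean.parseBoolean(raw.getOrDefault("deriveNumDocsPerChunk", "false"));
    int version = Integer.parseInt(raw.getOrDefault("rawIndexWriterVersion", "2"));
    return new Resolved(target, derive, version); // target is never null here
  }

  /** Step 2: a check that treats any non-null size as "explicitly set by the user". */
  static void check(Resolved c) {
    if (c.targetMaxChunkSizeBytes != null && !c.deriveNumDocsPerChunk && c.rawIndexWriterVersion != 4) {
      throw new IllegalStateException(
          "targetMaxChunkSize should only be used when deriveNumDocsPerChunk is true or rawIndexWriterVersion is 4");
    }
  }

  /** Returns true when the load-then-check sequence rejects the raw config. */
  static boolean buildFails(Map<String, String> raw) {
    try {
      check(loadWithDefaults(raw));
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Legacy config: no chunk-size field, no derive flag, writer version below 4.
    Map<String, String> legacy = new HashMap<>();
    legacy.put("rawIndexWriterVersion", "2");
    System.out.println("legacy config rejected: " + buildFails(legacy));
  }
}
```

In this sketch the legacy config is rejected even though the user never set a chunk size, because the check cannot tell a defaulted value from an explicit one. Either the default has to stay null when the field is absent, or the check has to go.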
