mneedham opened a new pull request, #9134:
URL: https://github.com/apache/pinot/pull/9134
This is a bug fix for an issue I found when using the timestamp index with
streaming data.
The problem is that the schema passed into `LLRealtimeSegmentDataManager` (and
then into `MutableSegmentImpl`) doesn't include the extra timestamp fields.
As a result, rows are indexed without those fields, and when Pinot tries to
commit the segment it throws an exception like this:
```
java.lang.NullPointerException: null
    at org.apache.pinot.segment.spi.creator.ColumnIndexCreationInfo.getDistinctValueCount(ColumnIndexCreationInfo.java:67) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentColumnarIndexCreator.init(SegmentColumnarIndexCreator.java:201) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.segment.local.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:216) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.segment.local.realtime.converter.RealtimeSegmentConverter.build(RealtimeSegmentConverter.java:123) ~[pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentInternal(LLRealtimeSegmentDataManager.java:851) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.buildSegmentForCommit(LLRealtimeSegmentDataManager.java:778) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:677) [pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:0.11.0-SNAPSHOT-0c1037ed90d75bb7cd95315cd6a6bdd00f34a6c2]
    at java.lang.Thread.run(Thread.java:829) [?:?]
```
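For reviewers who want a mental model of the failure, here is a toy sketch in plain Java. It does not use any Pinot classes: `ColumnStats`, `indexRows`, `totalDistinct`, and the derived column name `$mtime$DAY` are all hypothetical stand-ins chosen for illustration. The assumption is that the commit path looks up per-column statistics for every expected column, including the derived timestamp columns, and gets `null` for any column the consuming segment's schema never knew about.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MissingColumnSketch {
    // Hypothetical stand-in for the statistics collected per column while indexing.
    static final class ColumnStats {
        private final Set<Object> distinct = new HashSet<>();
        void add(Object value) { distinct.add(value); }
        int getDistinctValueCount() { return distinct.size(); }
    }

    // Index rows, keeping only the columns present in the schema handed to the segment.
    static Map<String, ColumnStats> indexRows(Set<String> schemaColumns,
                                              List<Map<String, Object>> rows) {
        Map<String, ColumnStats> stats = new HashMap<>();
        for (String column : schemaColumns) {
            stats.put(column, new ColumnStats());
        }
        for (Map<String, Object> row : rows) {
            for (Map.Entry<String, Object> entry : row.entrySet()) {
                ColumnStats columnStats = stats.get(entry.getKey());
                if (columnStats != null) {
                    columnStats.add(entry.getValue());
                }
                // Columns outside the schema are silently ignored here.
            }
        }
        return stats;
    }

    // "Commit" step: walks every column the table is expected to have.
    static int totalDistinct(Set<String> expectedColumns, Map<String, ColumnStats> stats) {
        int total = 0;
        for (String column : expectedColumns) {
            // Throws NullPointerException if the column was never indexed.
            total += stats.get(column).getDistinctValueCount();
        }
        return total;
    }

    public static void main(String[] args) {
        Set<String> schemaWithoutDerived = Set.of("mtime");
        Set<String> expected = Set.of("mtime", "$mtime$DAY");
        List<Map<String, Object>> rows = List.of(Map.of("mtime", 1L, "$mtime$DAY", 0L));
        try {
            totalDistinct(expected, indexRows(schemaWithoutDerived, rows));
        } catch (NullPointerException e) {
            System.out.println("commit failed: derived column was never indexed");
        }
        // Passing the full column set to the indexing step avoids the failure.
        System.out.println(totalDistinct(expected, indexRows(expected, rows)));
    }
}
```

In this toy model the failure disappears once the indexing step is given the same column set that the commit step expects, which is the shape of the fix in this PR.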
I have updated the streaming QuickStart to add the timestamp index. While
doing that I had to change the value for `mtime` because there is another bug
where Pinot runs the following expression when it tries to add the extra date
columns:
```
dateTrunc('DAY', '2022-07-29 11:18:23')
```
This doesn't work because the second parameter of this function needs to be
a `LONG` value, which isn't yet the case at that point because the
`DataTypeTransformer` hasn't coerced the type. I'm not sure what the proper
fix for that issue should be, so I'm working around it for the sake of this PR.
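To make the type mismatch concrete, here is a minimal sketch using only `java.time`. It is not Pinot's `dateTrunc` implementation and not a proposed fix. The class and method names are hypothetical, and it assumes the string uses the `yyyy-MM-dd HH:mm:ss` pattern and should be interpreted as UTC. It shows that day truncation operates on epoch milliseconds, so a string value has to be converted to a long first.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

final class DayTruncSketch {
    // Assumed input pattern, matching the example value in this PR description.
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Convert a timestamp string to epoch milliseconds, treating it as UTC (an assumption).
    static long toEpochMillis(String timestamp) {
        return LocalDateTime.parse(timestamp, FORMAT).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Truncate epoch milliseconds down to the start of the UTC day.
    static long truncToDay(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).truncatedTo(ChronoUnit.DAYS).toEpochMilli();
    }

    public static void main(String[] args) {
        long millis = toEpochMillis("2022-07-29 11:18:23");
        System.out.println(millis);             // 1659093503000
        System.out.println(truncToDay(millis)); // 1659052800000
    }
}
```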