jadami10 opened a new issue, #12721:
URL: https://github.com/apache/pinot/issues/12721
We have the following datetime column in our schema:
```
date_time_field_specs:
  - name: created
    data_type: TIMESTAMP
    format: 1:MILLISECONDS:EPOCH
    granularity: 1:HOURS
```
The Pinot [schema
docs](https://docs.pinot.apache.org/configuration-reference/schema) are very
confusing and vague, but this is seemingly an allowed way to configure the
column.
When we try to query the table with the time pruner enabled and a `where
created between '2024-03-22 16:18:51.000' and '2024-03-22 21:37:41.000'`
clause, the broker fails with the following stacktrace.
```
java.lang.NumberFormatException: Character array is missing "e" notation exponential mark.
    at java.math.BigDecimal.<init>(BigDecimal.java:581) ~[?:?]
    at java.math.BigDecimal.<init>(BigDecimal.java:405) ~[?:?]
    at java.math.BigDecimal.<init>(BigDecimal.java:838) ~[?:?]
    at org.apache.pinot.spi.data.DateTimeFormatSpec.fromFormatToMillis(DateTimeFormatSpec.java:305)
    at org.apache.pinot.broker.routing.segmentpruner.TimeSegmentPruner.getFilterTimeIntervals(TimeSegmentPruner.java:281)
    at org.apache.pinot.broker.routing.segmentpruner.TimeSegmentPruner.getFilterTimeIntervals(TimeSegmentPruner.java:178)
    at org.apache.pinot.broker.routing.segmentpruner.TimeSegmentPruner.prune(TimeSegmentPruner.java:138)
    at org.apache.pinot.broker.routing.BrokerRoutingManager$RoutingEntry.calculateRouting(BrokerRoutingManager.java:788)
    at org.apache.pinot.broker.routing.BrokerRoutingManager.getRoutingTable(BrokerRoutingManager.java:612)
    at org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:593)
    at org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:263)
    at org.apache.pinot.broker.requesthandler.BrokerRequestHandlerDelegate.handleRequest(BrokerRequestHandlerDelegate.java:107)
    at org.apache.pinot.broker.api.resources.PinotClientRequest.executeSqlQuery(PinotClientRequest.java:317)
    at org.apache.pinot.broker.api.resources.PinotClientRequest.executeSqlQuery(PinotClientRequest.java:294)
    at org.apache.pinot.broker.api.resources.PinotClientRequest.processSqlQueryPost(PinotClientRequest.java:157)
```
It seems related to the fact that `EPOCH` is parsed [one
way](https://github.com/apache/pinot/blob/0a08a919ea6122212a1b2963de1f40cee84f6742/pinot-spi/src/main/java/org/apache/pinot/spi/data/DateTimeFormatSpec.java#L276-L277)
and
[timestamp](https://github.com/apache/pinot/blob/0a08a919ea6122212a1b2963de1f40cee84f6742/pinot-spi/src/main/java/org/apache/pinot/spi/data/DateTimeFormatSpec.java#L278-L279)
a totally different way.
Changing the format to `1:MILLISECONDS:TIMESTAMP` allows the query to
succeed.
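To illustrate the mismatch outside of Pinot, here is a minimal sketch using only JDK classes. It is not Pinot's actual implementation: the class and method names (`EpochVsTimestampParsing`, `parseAsEpoch`, `parseAsTimestamp`) are hypothetical, and it only assumes, based on the stack trace above, that the `EPOCH` path ends up feeding the literal to a `BigDecimal` constructor.

```java
import java.math.BigDecimal;
import java.sql.Timestamp;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; these are not Pinot classes or methods.
public class EpochVsTimestampParsing {

  // EPOCH-style parsing: the literal must be a plain number of time units.
  static long parseAsEpoch(String value, int size, TimeUnit unit) {
    // BigDecimal throws NumberFormatException for any non-numeric literal.
    long units = new BigDecimal(value).longValue();
    return unit.toMillis(units * size);
  }

  // TIMESTAMP-style parsing: accepts "yyyy-mm-dd hh:mm:ss[.fffffffff]".
  static long parseAsTimestamp(String value) {
    // Timestamp.valueOf interprets the literal in the JVM default time zone.
    return Timestamp.valueOf(value).getTime();
  }

  public static void main(String[] args) {
    String literal = "2024-03-22 16:18:51.000";
    System.out.println("TIMESTAMP parse: " + parseAsTimestamp(literal));
    try {
      parseAsEpoch(literal, 1, TimeUnit.MILLISECONDS);
    } catch (NumberFormatException e) {
      System.out.println("EPOCH parse failed: " + e.getMessage());
    }
  }
}
```

With this sketch, the same string literal parses under the timestamp-style path and throws `NumberFormatException` under the epoch-style path, which matches the behaviour we see when switching the format between `EPOCH` and `TIMESTAMP`.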
Overall, there are several issues here:
- [ ] Failing timestamp parsing in the `TimeSegmentPruner` can fail the
whole query. This should not be allowed for something that's supposed to be an
optimization
- [ ] `1:MILLISECONDS:TIMESTAMP` is specifically allowed in code, but it's
not documented anywhere. Should it be used? Is it valid? Is it what we want
here?
- [ ] `1:MILLISECONDS:EPOCH` cannot parse `yyyy-mm-dd hh:mm:ss[.fffffffff]`
correctly
- [ ] I'm fairly certain `5:HOURS:TIMESTAMP` would be a valid format, but it
would just always ignore the `5`
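
For the first item, one possible shape of a fix is to treat an unparseable time bound as "cannot prune" rather than letting the exception propagate. The following is only a sketch of that idea with hypothetical names (`SafeTimeFilterParsing`, `tryParseMillis`, `mayOverlap`); it does not reflect how `TimeSegmentPruner` is actually structured, and the parser is passed in as a plain function to keep the example self-contained.

```java
import java.util.OptionalLong;
import java.util.function.ToLongFunction;

// Hypothetical illustration; these are not Pinot classes or methods.
public class SafeTimeFilterParsing {

  // Returns the parsed millis, or empty if the parser rejects the literal.
  static OptionalLong tryParseMillis(String literal, ToLongFunction<String> parser) {
    try {
      return OptionalLong.of(parser.applyAsLong(literal));
    } catch (RuntimeException e) {
      // Parsing failed: signal "unknown" instead of failing the query.
      return OptionalLong.empty();
    }
  }

  // Decides whether a segment [segStartMs, segEndMs] must be kept for a
  // BETWEEN lower AND upper filter. Pruning is only an optimization, so any
  // bound that cannot be parsed means the segment is conservatively kept.
  static boolean mayOverlap(long segStartMs, long segEndMs, String lower, String upper,
      ToLongFunction<String> parser) {
    OptionalLong lo = tryParseMillis(lower, parser);
    OptionalLong hi = tryParseMillis(upper, parser);
    if (!lo.isPresent() || !hi.isPresent()) {
      return true;
    }
    return segStartMs <= hi.getAsLong() && segEndMs >= lo.getAsLong();
  }
}
```

Under this approach, a query like the one above would scan more segments than necessary when the literals cannot be parsed, but it would still return results instead of failing in the broker.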