edg956 opened a new issue, #17701:
URL: https://github.com/apache/pinot/issues/17701
Hi.
While fixing our integration with PinotDb I have found what seem to be some
bugs in the query compilation or execution process.
For context, we (OpenMetadata) have a stack of metrics we usually compute
for our users' tables as part of our data profiling feature. I reckon
non-OpenMetadata users might find themselves writing similar queries for their
data observability needs.
I'll detail them separately in the order I have discovered them:
<details>
<summary>
Comparing with floats in some nested functions throws a
<code>NumberFormatException</code>
</summary>
One of those metrics is actually a histogram, which for all other DBs we
compute with a query roughly like this:
```sql
select
sum(case when (age < 35.0) then 1 else 0 end) as age_group_1,
sum(case when (age >= 35.0 and age < 45.0) then 1 else 0 end)
age_group_2,
sum(case when (age >= 45.0) then 1 else 0 end) age_group_3
from data_source;
```
Which returns the following error:
```text
Error Code: 200
QueryExecutionError:
Received error query execution result block: {200=QueryExecutionError:
org.apache.pinot.spi.exception.BadQueryRequestException:
java.lang.NumberFormatException: For input string: "35.0"
at
org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:94)
at
org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:100)
at
org.apache.pinot.core.plan.FilterPlanNode.constructPhysicalOperator(FilterPlanNode.java:310)
at org.apache.pinot.core.plan.FilterPlanNode.run(FilterPlanNode.java:93)
...
Caused by: java.lang.NumberFormatException: For input string: "35.0"
at
java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
at java.base/java.lang.Integer.parseInt(Integer.java:668)
at java.base/java.lang.Integer.parseInt(Integer.java:786)
at
org.apache.pinot.segment.local.segment.index.readers.IntDictionary.insertionIndexOf(IntDictionary.java:44)}
org.apache.pinot.query.service.dispatch.QueryDispatcher.runReducer(QueryDispatcher.java:306)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:96)
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:219)
org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:133)
```
This is solved when changing the boundaries to integers.
I initially thought it was an issue with comparing different types but, on
top of discovering that it also happens on float<->float comparisons, I
discovered another issue:
</details>
<details>
<summary>Nested casting can also break query execution</summary>
I attempted to cast the column to float as a fix (before realizing
float<->float comparisons also fail), and got the following:
```sql
select
sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as
age_group_1,
sum(case when (cast(age as float) >= 35.0 and cast(age as float) <
45.0) then 1 else 0 end) age_group_2,
sum(case when (cast(age as float) >= 45.0) then 1 else 0 end)
age_group_3
from data_source;
```
```text
ProcessingException(errorCode:150, message:SQLParsingError:
java.lang.Exception: Unable to find table for this query
at
org.apache.pinot.controller.api.resources.PinotQueryResource.getMultiStageQueryResponse(PinotQueryResource.java:222)
at
org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:179)
at
org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:127)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
Caused by: java.lang.RuntimeException: Error composing query plan for: select
sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as
age_group_1,
sum(case when (cast(age as float) >= 35.0 and cast(age as float) <
45.0) then 1 else 0 end) age_group_2,
sum(case when (cast(age as float) >= 45.0) then 1 else 0 end)
age_group_3
from data_source;
...
Caused by: java.lang.IllegalArgumentException: Table does not exist:
'data_source'
at
org.apache.pinot.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
at
org.apache.pinot.query.catalog.PinotCatalog.getTable(PinotCatalog.java:78)
at
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:128)
at
org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:283))
```
</details>
<details>
<summary>Finally, I discovered the source of these errors: <code>SUM</code>
breaks the whole thing</summary>
If I changed the query to:
```sql
select
case when (cast(age as float) < 35.0) then 1 else 0 end as age_group_1,
case when (cast(age as float) >= 35.0 and cast(age as float) < 45.0)
then 1 else 0 end age_group_2,
case when (cast(age as float) >= 45.0) then 1 else 0 end age_group_3
from data_source;
```
And got 1's and 0's where I expected. So there's must be something fishy
going about `SUM`.
</details>
For the record I am using PinotDB 1.2.0 in a simple docker compose
deployment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]