edg956 opened a new issue, #17701:
URL: https://github.com/apache/pinot/issues/17701

   Hi.
   
   While fixing our integration with PinotDb I have found what seem to be some 
bugs in the query compilation or execution process.
   
   For context, we (OpenMetadata) have a stack of metrics we usually compute 
for our users' tables as part of our data profiling feature. I reckon 
non-OpenMetadata users might find themselves writing similar queries for their 
data observability needs.
   
   I'll detail them separately in the order I have discovered them:
   
   <details>
   <summary>
   Comparing with floats in some nested functions throws a 
<code>NumberFormatException</code>
   </summary>
   
   One of those metrics is actually a histogram, which for all other DBs we 
compute with a query roughly like this:
   
   ```sql
   select
        sum(case when (age < 35.0) then 1 else 0 end) as age_group_1,
        sum(case when (age >= 35.0 and age < 45.0) then 1 else 0 end) 
age_group_2,
        sum(case when (age >= 45.0) then 1 else 0 end) age_group_3
   from data_source;
   ```
   
   Which returns the following error:
   
   ```text
   Error Code: 200
   
   QueryExecutionError:
   Received error query execution result block: {200=QueryExecutionError:
   org.apache.pinot.spi.exception.BadQueryRequestException: 
java.lang.NumberFormatException: For input string: "35.0"
        at 
org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:94)
        at 
org.apache.pinot.core.operator.filter.predicate.PredicateEvaluatorProvider.getPredicateEvaluator(PredicateEvaluatorProvider.java:100)
        at 
org.apache.pinot.core.plan.FilterPlanNode.constructPhysicalOperator(FilterPlanNode.java:310)
        at org.apache.pinot.core.plan.FilterPlanNode.run(FilterPlanNode.java:93)
   ...
   Caused by: java.lang.NumberFormatException: For input string: "35.0"
        at 
java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
        at java.base/java.lang.Integer.parseInt(Integer.java:668)
        at java.base/java.lang.Integer.parseInt(Integer.java:786)
        at 
org.apache.pinot.segment.local.segment.index.readers.IntDictionary.insertionIndexOf(IntDictionary.java:44)}
   
org.apache.pinot.query.service.dispatch.QueryDispatcher.runReducer(QueryDispatcher.java:306)
   
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:96)
   
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:219)
   
org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:133)
   ```
   
   This is solved when changing the boundaries to integers.
   
   I initially thought it was an issue with comparing different types but, on 
top of discovering that it also happens on float<->float comparisons, I 
discovered another issue:
   
   </details>
   
   <details>
   <summary>Nested casting can also break query execution</summary>
   
   I attempted to cast the column to float as a fix (before realizing 
float<->float comparisons also fail), and got the following:
   
   ```sql
   select
        sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as 
age_group_1,
        sum(case when (cast(age as float) >= 35.0 and cast(age as float) < 
45.0) then 1 else 0 end) age_group_2,
        sum(case when (cast(age as float) >= 45.0) then 1 else 0 end) 
age_group_3
   from data_source;
   ```
   
   ```text
   ProcessingException(errorCode:150, message:SQLParsingError:
   java.lang.Exception: Unable to find table for this query
        at 
org.apache.pinot.controller.api.resources.PinotQueryResource.getMultiStageQueryResponse(PinotQueryResource.java:222)
        at 
org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:179)
        at 
org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:127)
        at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   ...
   Caused by: java.lang.RuntimeException: Error composing query plan for: select
        sum(case when (cast(age as float) < 35.0) then 1 else 0 end) as 
age_group_1,
        sum(case when (cast(age as float) >= 35.0 and cast(age as float) < 
45.0) then 1 else 0 end) age_group_2,
        sum(case when (cast(age as float) >= 45.0) then 1 else 0 end) 
age_group_3
   from data_source;
   ...
   Caused by: java.lang.IllegalArgumentException: Table does not exist: 
'data_source'
        at 
org.apache.pinot.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:145)
        at 
org.apache.pinot.query.catalog.PinotCatalog.getTable(PinotCatalog.java:78)
        at 
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:128)
        at 
org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:283))
   ```
   </details>
   
   <details>
   <summary>Finally, I discovered the source of these errors: <code>SUM</code> 
breaks the whole thing</summary>
   
   If I changed the query to:
   
   ```sql
   select
        case when (cast(age as float) < 35.0) then 1 else 0 end as age_group_1,
        case when (cast(age as float) >= 35.0 and cast(age as float) < 45.0) 
then 1 else 0 end age_group_2,
        case when (cast(age as float) >= 45.0) then 1 else 0 end age_group_3
   from data_source;
   ```
   
   And got 1's and 0's where I expected. So there's must be something fishy 
going about `SUM`.
   </details>
   
   For the record I am using PinotDB 1.2.0 in a simple docker compose 
deployment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to