Thanks for the feedback.

The SUBSTR function is not a SQL standard function, so it requires some
specific configuration (specifically the "fun" parameter).
Please check
https://calcite.apache.org/docs/reference.html#dialect-specific-operators
for more info (and notice how SUBSTR is listed in the table in that
section).
The alternative would be modifying the queries to use the standard
SUBSTRING function.

I am trying to reproduce the partition query problem within Calcite, but so
far I have not succeeded.
I tried adding your query in a test in TpcdsTest.java [1]. SqlToRel
conversion seems to run fine (then the query fails at execution time with
"CalciteException: Cannot convert null to long", but that seems a different
problem than yours).
Could you please try to reproduce the exception with a unit test in Calcite?

Best,
Ruben

[1]
class TpcdsTest {
...
  @Test void testYXZ() {
    with()
        .query("select  i_item_id ...")
        .runs();
  }
...
}

On Fri, Feb 19, 2021 at 7:13 PM Priyendra Deshwal <[email protected]>
wrote:

> Thanks for confirming the decorrelation bugfix.
>
> Here's the info for the SUBSTR error. As noted, this query fails in
> SqlValidator. I have verified that this fails in both 1.25 and 1.26. Have
> not been able to test master yet.
>
> select  ca_zip
>        ,sum(cs_sales_price)
>  from catalog_sales
>      ,customer
>      ,customer_address
>      ,date_dim
>  where cs_bill_customer_sk = c_customer_sk
>   and c_current_addr_sk = ca_address_sk
>   and ( substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475',
>                                    '85392', '85460', '80348', '81792')
>         or ca_state in ('CA','WA','GA')
>         or cs_sales_price > 500)
>   and cs_sold_date_sk = d_date_sk
>   and d_qoy = 2 and d_year = 2001
>  group by ca_zip
>  order by ca_zip
>  limit 100
>
> Exception in thread "main"
> org.apache.calcite.runtime.CalciteContextException: From line 9, column 9
> to line 9, column 26: No match found for function signature
> SUBSTR(<CHARACTER>, <NUMERIC>, <NUMERIC>)
>   at
>
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>   at
>
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at
>
> java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at
> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
>   at
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:867)
>   at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:852)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:5003)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.handleUnresolvedFunction(SqlValidatorImpl.java:1835)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:313)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:219)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5766)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5753)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1753)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1738)
>   at
> org.apache.calcite.sql.type.InferTypes.lambda$static$0(InferTypes.java:47)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1933)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.inferUnknownTypes(SqlValidatorImpl.java:1941)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereOrOn(SqlValidatorImpl.java:4093)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereClause(SqlValidatorImpl.java:4085)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3424)
>   at
>
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>   at
>
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1064)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:1038)
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:232)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1013)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:721)
>   at
>
> io.netspring.playground.calcite.CalcitePoc.processQuery(CalcitePoc.java:621)
>   at io.netspring.playground.calcite.CalcitePoc.main(CalcitePoc.java:705)
> Caused by: org.apache.calcite.sql.validate.SqlValidatorException: No match
> found for function signature SUBSTR(<CHARACTER>, <NUMERIC>, <NUMERIC>)
>   at
>
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
> Method)
>   at
>
> java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at
>
> java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at
> java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
>   at
> org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:467)
>   at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:560)
>   ... 31 more
>
> Here's the info for the partition over query. As noted, this query fails in
> SqlToRelConverter. I have verified that this query fails in both 1.25 and
> 1.26. Have not been able to test master yet.
>
> select  i_item_id
>       ,i_item_desc
>       ,i_category
>       ,i_class
>       ,i_current_price
>       ,sum(ws_ext_sales_price) as itemrevenue
>       ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over
>           (partition by i_class) as revenueratio
> from
>   web_sales
>       ,item
>       ,date_dim
> where
>   ws_item_sk = i_item_sk
>     and i_category in ('Sports', 'Books', 'Home')
>     and ws_sold_date_sk = d_date_sk
>   and d_date between cast('1999-02-22' as date)
>         and (cast('1999-02-22' as date) + INTERVAL '30' day)
> group by
>   i_item_id
>         ,i_item_desc
>         ,i_category
>         ,i_class
>         ,i_current_price
> order by
>   i_category
>         ,i_class
>         ,i_item_id
>         ,i_item_desc
>         ,revenueratio
> limit 100
>
> Exception in thread "main" java.lang.RuntimeException: while converting
> SUM(`WS_EXT_SALES_PRICE`) * 100 / (SUM(SUM(`WS_EXT_SALES_PRICE`)) OVER
> (PARTITION BY `I_CLASS`))
>   at
>
> org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:86)
>   at
>
> org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:62)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5001)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4297)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4864)
>   at
>
> org.apache.calcite.sql2rel.StandardConvertletTable.lambda$new$9(StandardConvertletTable.java:207)
>   at
>
> org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:62)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:5001)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4297)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4864)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:3052)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2896)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:674)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:636)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3380)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:562)
>   at
>
> io.netspring.playground.calcite.CalcitePoc.processQuery(CalcitePoc.java:643)
>   at io.netspring.playground.calcite.CalcitePoc.main(CalcitePoc.java:705)
> Caused by: java.lang.reflect.InvocationTargetException
>   at jdk.internal.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at
>
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at
>
> org.apache.calcite.sql2rel.ReflectiveConvertletTable.lambda$registerNodeTypeMethod$0(ReflectiveConvertletTable.java:83)
>   ... 19 more
> Caused by: java.lang.UnsupportedOperationException: class
> org.apache.calcite.sql.SqlBasicCall: SUM(SUM(`WS_EXT_SALES_PRICE`)) OVER
> (PARTITION BY `I_CLASS`)
>   at org.apache.calcite.util.Util.needToImplement(Util.java:975)
>   at
>
> org.apache.calcite.sql.validate.SqlValidatorImpl.getValidatedNodeType(SqlValidatorImpl.java:1655)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:2005)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter.access$1900(SqlToRelConverter.java:219)
>   at
>
> org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4857)
>   at
>
> org.apache.calcite.sql2rel.StandardConvertletTable.convertExpressionList(StandardConvertletTable.java:839)
>   at
>
> org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:815)
>   at
>
> org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:802)
>   ... 23 more
>
> On Fri, Feb 19, 2021 at 10:28 AM Ruben Q L <[email protected]> wrote:
>
> > Hello Priyendra,
> >
> > Indeed, your decorrelator issue (4) seems to be solved in master thanks
> to
> > https://issues.apache.org/jira/browse/CALCITE-4333
> >
> > Could you please provide the query and the stack-trace from substr issue
> > (2) ?
> >
> > What is the actual query that originates partition issue (3)?
> >
> > Best,
> > Ruben
> >
> >
> > On Fri, Feb 19, 2021 at 5:40 PM Priyendra Deshwal <[email protected]>
> > wrote:
> >
> > > FYI - I was running my tests on the v1.26 release. When I ran the same
> > test
> > > on master, it seems the query does get decorrelated. Have there been
> some
> > > recent bugs or bugfixes in that area?
> > >
> > > with customer_total_return as
> > > (select sr_customer_sk as ctr_customer_sk
> > > ,sr_store_sk as ctr_store_sk
> > > ,sum(sr_return_amt) as ctr_total_return
> > > from store_returns
> > > ,date_dim
> > > where sr_returned_date_sk = d_date_sk
> > > and d_year = 2000
> > > group by sr_customer_sk
> > > ,sr_store_sk)
> > >  select  c_customer_id
> > > from customer_total_return ctr1
> > > ,store
> > > ,customer
> > > where ctr1.ctr_total_return > (select avg(ctr_total_return)*1.2
> > > from customer_total_return ctr2
> > > where ctr1.ctr_store_sk = ctr2.ctr_store_sk)
> > > and s_store_sk = ctr1.ctr_store_sk
> > > and s_state = 'TN'
> > > and ctr1.ctr_customer_sk = c_customer_sk
> > > order by c_customer_id
> > > limit 100
> > >
> > >
> > > *[Initial Logical Plan]*LogicalSort(sort0=[$0], dir0=[ASC],
> fetch=[100])
> > >   LogicalProject(C_CUSTOMER_ID=[$33])
> > >     LogicalFilter(condition=[AND(>($2, $50), =($3, $1), =($27, 'TN'),
> > =($0,
> > > $32))])
> > >       LogicalCorrelate(correlation=[$cor0], joinType=[left],
> > > requiredColumns=[{1}])
> > >         LogicalJoin(condition=[true], joinType=[inner])
> > >           LogicalJoin(condition=[true], joinType=[inner])
> > >             LogicalProject(CTR_CUSTOMER_SK=[$0], CTR_STORE_SK=[$1],
> > > CTR_TOTAL_RETURN=[$2])
> > >               LogicalAggregate(group=[{0, 1}],
> > CTR_TOTAL_RETURN=[SUM($2)])
> > >                 LogicalProject(sr_customer_sk=[$3], sr_store_sk=[$7],
> > > sr_return_amt=[$11])
> > >                   LogicalFilter(condition=[AND(=($0, $20), =($26,
> > 2000))])
> > >                     LogicalJoin(condition=[true], joinType=[inner])
> > >                       LogicalTableScan(table=[[store_returns]])
> > >                       LogicalTableScan(table=[[date_dim]])
> > >             LogicalTableScan(table=[[store]])
> > >           LogicalTableScan(table=[[customer]])
> > >         LogicalProject(EXPR$0=[*($0, 1.2:DECIMAL(2, 1))])
> > >           LogicalAggregate(group=[{}], agg#0=[AVG($0)])
> > >             LogicalProject(CTR_TOTAL_RETURN=[$2])
> > >               LogicalFilter(condition=[=($cor0.CTR_STORE_SK, $1)])
> > >                 LogicalProject(CTR_CUSTOMER_SK=[$0], CTR_STORE_SK=[$1],
> > > CTR_TOTAL_RETURN=[$2])
> > >                   LogicalAggregate(group=[{0, 1}],
> > > CTR_TOTAL_RETURN=[SUM($2)])
> > >                     LogicalProject(sr_customer_sk=[$3],
> sr_store_sk=[$7],
> > > sr_return_amt=[$11])
> > >                       LogicalFilter(condition=[AND(=($0, $20), =($26,
> > > 2000))])
> > >                         LogicalJoin(condition=[true], joinType=[inner])
> > >                           LogicalTableScan(table=[[store_returns]])
> > >                           LogicalTableScan(table=[[date_dim]])
> > >
> > >
> > > *[Decorrelated Logical Plan]*LogicalSort(sort0=[$0], dir0=[ASC],
> > > fetch=[100])
> > >   LogicalProject(C_CUSTOMER_ID=[$33])
> > >     LogicalProject(CTR_CUSTOMER_SK=[$0], CTR_STORE_SK=[$1],
> > > CTR_TOTAL_RETURN=[$2], s_store_sk=[$3], s_store_id=[$4],
> > > s_rec_start_date=[$5], s_rec_end_date=[$6], s_closed_date_sk=[$7],
> > > s_store_name=[$8], s_number_employees=[$9], s_floor_space=[$10],
> > > s_hours=[$11], s_manager=[$12], s_market_id=[$13],
> > s_geography_class=[$14],
> > > s_market_desc=[$15], s_market_manager=[$16], s_division_id=[$17],
> > > s_division_name=[$18], s_company_id=[$19], s_company_name=[$20],
> > > s_street_number=[$21], s_street_name=[$22], s_street_type=[$23],
> > > s_suite_number=[$24], s_city=[$25], s_county=[$26], s_state=[$27],
> > > s_zip=[$28], s_country=[$29], s_gmt_offset=[$30],
> s_tax_precentage=[$31],
> > > c_customer_sk=[$32], c_customer_id=[$33], c_current_cdemo_sk=[$34],
> > > c_current_hdemo_sk=[$35], c_current_addr_sk=[$36],
> > > c_first_shipto_date_sk=[$37], c_first_sales_date_sk=[$38],
> > > c_salutation=[$39], c_first_name=[$40], c_last_name=[$41],
> > > c_preferred_cust_flag=[$42], c_birth_day=[$43], c_birth_month=[$44],
> > > c_birth_year=[$45], c_birth_country=[$46], c_login=[$47],
> > > c_email_address=[$48], c_last_review_date_sk=[$49],
> > > CTR_STORE_SK0=[CAST($50):BIGINT], $f1=[CAST($51):DOUBLE])
> > >       LogicalJoin(condition=[AND(=($1, $50), >($2, *($51,
> 1.2:DECIMAL(2,
> > > 1))))], joinType=[inner])
> > >         LogicalJoin(condition=[=($0, $32)], joinType=[inner])
> > >           LogicalJoin(condition=[=($3, $1)], joinType=[inner])
> > >             LogicalProject(CTR_CUSTOMER_SK=[$0], CTR_STORE_SK=[$1],
> > > CTR_TOTAL_RETURN=[$2])
> > >               LogicalAggregate(group=[{0, 1}],
> > CTR_TOTAL_RETURN=[SUM($2)])
> > >                 LogicalProject(sr_customer_sk=[$3], sr_store_sk=[$7],
> > > sr_return_amt=[$11])
> > >                   LogicalJoin(condition=[=($0, $20)], joinType=[inner])
> > >                     LogicalTableScan(table=[[store_returns]])
> > >                     LogicalFilter(condition=[=($6, 2000)])
> > >                       LogicalTableScan(table=[[date_dim]])
> > >             LogicalFilter(condition=[=($24, 'TN')])
> > >               LogicalTableScan(table=[[store]])
> > >           LogicalTableScan(table=[[customer]])
> > >         LogicalAggregate(group=[{0}], agg#0=[AVG($1)])
> > >           LogicalProject(CTR_STORE_SK=[$1], CTR_TOTAL_RETURN=[$2])
> > >             LogicalAggregate(group=[{0, 1}],
> CTR_TOTAL_RETURN=[SUM($2)])
> > >               LogicalProject(sr_customer_sk=[$3], sr_store_sk=[$7],
> > > sr_return_amt=[$11])
> > >                 LogicalJoin(condition=[=($0, $20)], joinType=[inner])
> > >                   LogicalTableScan(table=[[store_returns]])
> > >                   LogicalFilter(condition=[=($6, 2000)])
> > >                     LogicalTableScan(table=[[date_dim]])
> > >
> >
>

Reply via email to