[jira] [Comment Edited] (CALCITE-2696) Filter containing IN clause not passed to Enumerable.scan

2018-11-23 Thread Dirk Mahler (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16697096#comment-16697096
 ] 

Dirk Mahler edited comment on CALCITE-2696 at 11/23/18 12:54 PM:
-

Yes, it's the default threshold. I've built the current 1.18.0-SNAPSHOT from 
source with the threshold set to Integer.MAX_VALUE, and the "problem" 
disappeared. Is there a way to make the setting configurable (see setup in the 
description/attached test case)?
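
As a rough illustration of the kind of setting asked about here, the sketch below resolves a threshold from connection properties and falls back to the default of 20. The property name "inSubQueryThreshold" and the class ThresholdPropertySketch are hypothetical; nothing in this thread confirms that Calcite reads such a property.

```java
import java.util.Properties;

// Hypothetical sketch: "inSubQueryThreshold" is NOT a confirmed Calcite
// connection property. The name and this helper exist only for illustration.
class ThresholdPropertySketch {
    static final String PROPERTY = "inSubQueryThreshold"; // hypothetical name
    static final int DEFAULT_THRESHOLD = 20;              // default reported in this thread

    // Returns the configured threshold, or the default when the property is absent or blank.
    static int resolve(Properties connectionProperties) {
        String raw = connectionProperties.getProperty(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_THRESHOLD;
        }
        int value = Integer.parseInt(raw.trim());
        if (value < 0) {
            throw new IllegalArgumentException(PROPERTY + " must not be negative: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        System.out.println(resolve(properties)); // no property set: default applies

        // Same effect as the workaround above: a threshold no IN list will reach.
        properties.setProperty(PROPERTY, String.valueOf(Integer.MAX_VALUE));
        System.out.println(resolve(properties));
    }
}
```

With such a property, the workaround of building from source with Integer.MAX_VALUE would reduce to one extra entry in the Properties object passed to DriverManager.getConnection.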


was (Author: dirk.mahler):
Yes, it's the default threshold. I've built the current 1.18.0-SNAPSHOT from 
source and used Integer.MAX_VALUE and the "problem" disappeared. Is there a way 
to make the setting configurabe (see setup in the description/attached test 
case)?

> Filter containing IN clause not passed to Enumerable.scan
> -
>
> Key: CALCITE-2696
> URL: https://issues.apache.org/jira/browse/CALCITE-2696
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.17.0
> Reporter: Dirk Mahler
> Assignee: Julian Hyde
> Priority: Major
> Attachments: calcite-in-clause.zip
>
>
> I'm using the Calcite JDBC driver with my own SchemaFactory (defined by a 
> model property) that provides a schema containing a 
> ProjectableFilterableTable:
> {code:java}
> String model = "inline:" //
> + "{" //
> + " version: '1.0', " //
> + " defaultSchema: 'test'," //
> + " schemas: [" //
> + " {" //
> + " name: 'test'," //
> + " type: 'custom'," //
> + " factory: '" + TestSchemaFactory.class.getName() + "'" //
> + " }"
> + " ]" //
> + "}";
> Properties properties = new Properties();
> properties.put(CalciteConnectionProperty.MODEL.camelName(), model);
> connection = DriverManager.getConnection("jdbc:calcite:", properties);
> {code}
>  
>  
> {code:java}
> class TestTable extends AbstractQueryableTable
>     implements ProjectableFilterableTable {
>   public Enumerable<Object[]> scan(DataContext root, List<RexNode> filters,
>       int[] projects) {
>     ...
>   }
>   ...
> }{code}
>  
> It maps to a Java class and provides two Integer typed columns "value1" and 
> "value2".
> Executing the following statement leads to quite expensive behavior in the 
> scan method:
>  
> {code:java}
> SELECT "value" FROM "TEST_TABLE" WHERE "value1" = 1 AND "value2" in 
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
> {code}
> The scan method is invoked with a filter that only covers the part "value1" = 
> 1; the IN clause is completely omitted. The result on the JDBC side is still 
> valid, but in my case this leads to a full scan of a large underlying 
> data set (millions of rows).
> Interestingly, the filter part reflecting the IN operator is provided if the 
> number of elements in the list is below 20. It seems that this is controlled 
> by 
> org.apache.calcite.sql2rel.SqlToRelConverter.Config#getInSubQueryThreshold. 
> It would be very helpful if this behavior could be configured at the JDBC 
> property level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2696) Filter containing IN clause not passed to Enumerable.scan

2018-11-22 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16696376#comment-16696376
 ] 

Julian Hyde edited comment on CALCITE-2696 at 11/23/18 4:19 AM:


I think you're hitting {{SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD}} 
(which is 20). If an IN clause has more values than that threshold, Calcite 
converts the IN clause to a JOIN to a VALUES. Therefore it's not implemented as 
a filter on the table, and so it is not in the {{filters}} parameter when your 
table's {{scan}} method is called.
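
A minimal sketch of that decision, written as a stand-alone model rather than Calcite's actual code. The class, enum and method names are hypothetical, and the strict comparison is an assumption based on the report that a 20-element list is not pushed down while shorter lists are.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the planning decision; not Calcite's implementation.
class InListThresholdSketch {
    enum Plan { OR_FILTER, JOIN_TO_VALUES }

    static final int DEFAULT_THRESHOLD = 20; // value of DEFAULT_IN_SUB_QUERY_THRESHOLD quoted above

    // Assumption: lists strictly shorter than the threshold stay a filter,
    // everything else becomes a join against a VALUES relation.
    static Plan choose(int valueCount, int threshold) {
        return valueCount < threshold ? Plan.OR_FILTER : Plan.JOIN_TO_VALUES;
    }

    // Renders the filter form: col = v1 OR col = v2 OR ...
    static String expandToOr(String column, List<Integer> values) {
        return values.stream()
                .map(v -> column + " = " + v)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<Integer> shortList = Arrays.asList(1, 2, 3);
        System.out.println(choose(shortList.size(), DEFAULT_THRESHOLD)); // short list: filter
        System.out.println(expandToOr("\"value2\"", shortList));

        System.out.println(choose(20, DEFAULT_THRESHOLD));  // 20 values, as in the reported query
        System.out.println(choose(20, Integer.MAX_VALUE));  // raised threshold keeps the filter
    }
}
```

Only the OR_FILTER form can appear in the {{filters}} list handed to {{scan}}; in the JOIN_TO_VALUES form the table is scanned with whatever other filters remain.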


was (Author: julianhyde):
I think you're hitting {{SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD}} 
(which is 20). Above that threshold, Calcite converts an IN clause to a JOIN to 
a VALUES. Therefore it's not implemented as a filter on the table, and so it is 
not in the {{filters}} parameter when your table's {{scan}} method is called.



