Hello,

I have noticed that our queries get converted into full table scans when
they contain IN predicates with long list of elements (eg. id IN ($manyIds)
).

Upon some debuging the reason seems to be
SqlToRelConverter.substituteSubQuery()
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1001>.
Calcite will automatically translate subqueries into joins when
DEFAULT_IN_SUB_QUERY_THRESHOLD
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L221>
is
reached (currently 20 elements). We have instances when $manyIds can have
10K of elements (out of 50M rows table).

Some questions:

   1.

   In order to skip this query translation seems like I have to change
   SqlToRelConverter.Config.InSubQueryThreshold to MAX_VALUE. How can one
   do it in CalciteConnection or PreparedStatement ? Is is the only way ?
   2.

   Is SubQueryConverter
   
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SubQueryConverter.java>
   of any use in this particular case ?
   3.

   Can one keep IN predicate "as is" without converting to disjunctions
   (see convertInToOr
   
<https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L1396>)
   ?
   The reason I’m asking is because our elastic queries get pretty big if
   constructucted using ORs (multiple nested JSONs for each element vs
   simple JSON array).

Example to reproduce (for Mongo Adapter)

@Testpublic void subQueryTheshold() {
    // SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD = 20
    // just a long list of ids (>20 to trigger query rewrite)
    final String ids = IntStream.range(0,
SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD)
        .mapToObj(i -> String.format("'%d'", i))
        .collect(Collectors.joining(","));

    assertModel(MODEL)
        .query(
            "select * from zips where state in (" + ids + ")")
        .explainContains("MongoFilter")
        .returns("");
}

As usual, thanks a lot for your help.

Andrei.

Reply via email to