Hi all,
This is a rough idea; I'd like to hear what the community thinks about it.
class RexListCmp extends RexNode /* or RexCall */ {
  public final SqlOperator op;
  public final RexNode left;
  public final ImmutableList<RexNode> list;
  public final RexQuantifier quantifier;
  public final RelDataType type;
}
enum RexQuantifier {
  ALL,
  ANY
}
Background:
It is not uncommon for a query to contain an IN list with a large number of
constants, e.g.
1) SELECT * FROM foo WHERE a NOT IN (1, 2, 3, ...., 10000);
2) SELECT * FROM bar WHERE b IN (1, 2, 3, ...., 10000);
Currently, Calcite either translates such a predicate into a join or expands
it into a chain of OR/AND comparisons, which is inefficient and may cause
problems.
With RexListCmp, the predicate in query 1) will be represented as:
RexListCmp {
  op = "<>",
  left = "a",
  list = "1,2,3...10000",
  quantifier = "ALL"
}
The predicate in query 2) will be represented as:
RexListCmp {
  op = "=",
  left = "b",
  list = "1,2,3...10000",
  quantifier = "ANY"
}
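To make the intended meaning of the two quantifiers concrete, here is a minimal
Java sketch of how such a node could be evaluated. It is not Calcite code:
Quantifier, ListCmpSketch and eval are hypothetical names used only for
illustration, plain Java values stand in for RexNode operands, and SQL's
three-valued logic (NULL handling) is deliberately not modelled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical stand-in for the proposed RexQuantifier; not an existing Calcite type.
enum Quantifier { ALL, ANY }

final class ListCmpSketch {
  private ListCmpSketch() {}

  /**
   * Evaluates "left op QUANTIFIER (list)" under two-valued logic.
   * Assumption: no operand is NULL, so the result is always true or false.
   */
  static <T> boolean eval(BiPredicate<T, T> op, T left, List<T> list, Quantifier q) {
    switch (q) {
      case ALL:
        // Every list element must satisfy the comparison.
        return list.stream().allMatch(v -> op.test(left, v));
      case ANY:
        // At least one list element must satisfy the comparison.
        return list.stream().anyMatch(v -> op.test(left, v));
      default:
        throw new AssertionError(q);
    }
  }

  public static void main(String[] args) {
    List<Integer> values = Arrays.asList(1, 2, 3);
    BiPredicate<Integer, Integer> eq = (x, y) -> x.equals(y);
    BiPredicate<Integer, Integer> ne = eq.negate();

    // b IN (1, 2, 3) is b = ANY (1, 2, 3)
    System.out.println(eval(eq, 2, values, Quantifier.ANY));
    // a NOT IN (1, 2, 3) is a <> ALL (1, 2, 3)
    System.out.println(eval(ne, 2, values, Quantifier.ALL));
  }
}
```

The point of the sketch is that IN and NOT IN become the same node shape,
differing only in op and quantifier, so the list never has to be unrolled into
one comparison per element.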
It may also be used to represent the predicate in the following query:
SELECT * FROM bar WHERE (a,b) IN / NOT IN ((1,1), (2,2), (3,3), ... (1000,1000));
Furthermore, it is extensible. The op is not limited to equals or not-equals;
it can also be >, <, >=, <=, IS DISTINCT FROM (IDF), IS NOT DISTINCT FROM
(INDF), or even a custom SQL operator such as the geospatial intersection
operator:
  boolean &&(geometry A, geometry B)
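As a rough illustration of that extensibility, the Java sketch below plugs a
non-equality operator into the same ANY/ALL evaluation. Everything in it is a
hypothetical stand-in: Interval is a toy one-dimensional substitute for a
geometry value, INTERSECTS is a toy analogue of the && operator, and the any
and all helpers are not part of Calcite.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

final class CustomOpSketch {
  private CustomOpSketch() {}

  /** Closed 1-D interval, a toy stand-in for a geometry value. */
  static final class Interval {
    final int lo;
    final int hi;

    Interval(int lo, int hi) {
      this.lo = lo;
      this.hi = hi;
    }
  }

  /** Toy analogue of "A && B": true when the two intervals overlap. */
  static final BiPredicate<Interval, Interval> INTERSECTS =
      (a, b) -> a.lo <= b.hi && b.lo <= a.hi;

  /** "left op ANY (list)": some element satisfies op. */
  static <T> boolean any(BiPredicate<T, T> op, T left, List<T> list) {
    return list.stream().anyMatch(v -> op.test(left, v));
  }

  /** "left op ALL (list)": every element satisfies op. */
  static <T> boolean all(BiPredicate<T, T> op, T left, List<T> list) {
    return list.stream().allMatch(v -> op.test(left, v));
  }

  public static void main(String[] args) {
    List<Interval> regions = Arrays.asList(new Interval(0, 1), new Interval(6, 9));
    Interval probe = new Interval(5, 7);

    // probe overlaps [6, 9] but not [0, 1]
    System.out.println(any(INTERSECTS, probe, regions));
    System.out.println(all(INTERSECTS, probe, regions));

    // An ordering operator works the same way: 4 > ALL (1, 2, 3)
    BiPredicate<Integer, Integer> gt = (x, y) -> x > y;
    System.out.println(all(gt, 4, Arrays.asList(1, 2, 3)));
  }
}
```

The evaluation logic never inspects which operator it was given, which is the
property the op field is meant to provide.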
Thoughts?
Thanks,
Haisheng Yuan