I wanted to reply and share our recent requirement for handling SQL like the following `error_code IN (3002, 3030)' and the challenges we faced. For our implementation on top of Apache Kudu, each disjunction creates a `Scanner' – a resource we need to limit as it represents a denial of service attack vector (e.g. too many scanners, heap fills up). Good news for us is Kudu ships with an [`inListPredicate'] and we expected a plan to include the `SqlKind.IN' as the function which we could translate into `inListPredicate'. We were surprised when it didn't do that. We did eventually make this work for our customers with a hack below but it is not valid plan – for instance unparsing the plan produces invalid SQL query – and therefore is brittle (but *works* :fingers-crossed:) .
┌──── │ // This is not the correct use of Array. │ final RelDataType listType = builder.getTypeFactory().createArrayType(fieldType, -1); │ return builder.call(SqlStdOperatorTable.IN, │ builder.field(conditionTableName, columnName), │ rexBuilder.makeLiteral(resultValue, listType, true)); └──── We filed a ticket to do it the correct way, which is to take all the disjunctions, and "un-parse" them into `inListPredicate' calls *if possible*. This struck us as pretty dense code *but* would apply to other disjunctions. It would be *great* if Calcite shipped with a `RexCall' that our implementation could translate with little effort into a `inListPredicate'. [`inListPredicate'] https://kudu.apache.org/apidocs/org/apache/kudu/client/KuduPredicate.html#newInListPredicate-org.apache.kudu.ColumnSchema-java.util.List- On Mon, Jul 20, 2020 at 3:09 PM Stamatis Zampetakis <[email protected]> wrote: > Another quick thought as far as it concerns the IN operator would be to use > RexCall as it is right now where the first operand in the list is a > RexInputRef for instance and the rest are the literals. > I assume that taking this direction would need to change a bit the > respective SqlOperator. > > I haven't thought of this thoroughly so maybe there are important things > that I am missing. > > Best, > Stamatis > > > On Tue, Jul 21, 2020 at 12:41 AM Julian Hyde <[email protected]> wrote: > > > The name isn't very intuitive. > > > > The concept of a list and a comparison operator seems OK. As Vladimir > > points out, it is somewhat similar to RexSubQuery, so maybe this could > > be a sub-class (but organizing the data a bit more efficiently). > > > > I would be very wary of null semantics. RexNode scalar operators are > > forced to do 3-valued logic, but this is almost a relational operator > > and it would be better without that burden. > > > > Julian > > > > > > > > On Mon, Jul 20, 2020 at 3:45 AM Vladimir Sitnikov > > <[email protected]> wrote: > > > > > > >Do you know what is the impact on Enumerable implementation? > > > > > > I guess there are plenty of options there. > > > > > > The key question regarding RexListCmp is as we introduce a new Rex > node, > > > all the planning rules and all engines > > > must support it somehow. > > > > > > Technically speaking, we have RexSubQuery. > > > Haisheng, have you considered an option to stick with RexSubQuery to > > avoid > > > having two more-or-less the same rex classes? > > > > > > Vladimir > > >
