SEARCH is a semi-join too. It’s just a semi-join to an infinite set (the points 
on the number line, or any totally-ordered type). :)

I agree that it’s useful to have multiple functions, and mappings between them. 
We certainly allow multiple functions in the SQL we consume, and the SQL we 
produce (via the JDBC adapter). And we map between them by desugaring (e.g. "a 
BETWEEN b AND c" ==> "a >= b AND a <= c”.)

It is very advantageous to restrict the set of functions that can be used in 
planner rules and simplifications. Since SEARCH can represent BETWEEN (on a 
constant range) and >, the argument goes, let’s convert them both to SEARCH, 
and our rules can be simpler.

I am a fan of this approach. It is much harder to produce smart, consistent, 
robust planning rules if we let the surface area get too large.

This approach is at tension with people who want and expect their operators to 
pass through planning unchanged. If someone wrote "NVL(a, b)" they are 
surprised if it comes out as "CASE WHEN a IS NULL THEN b ELSE a”. I guess we 
can mitigate by re-sugaring on the backend, before we generate SQL.

Julian




> On Oct 14, 2025, at 1:51 PM, Stephen Carlin <[email protected]> 
> wrote:
> 
> Heh, it's funny you say that because Stamatis filed this improvement
> request in a meeting where I was complaining about things being more
> complicated and having broken code because of SEARCH :)
> 
> My main complaint is this, and perhaps it makes sense to file another Jira
> for a potential new improvement:
> 
> I totally get SEARCH and its usefulness.  So I have no complaints about the
> existence of SEARCH, why it is needed, and why it makes sense.
> 
> My problem with SEARCH is simply that it makes my code more complex.  I
> kinda think of it this way:  The SEARCH operator is a superset of the
> BETWEEN operator.  And we either are ok with having a representation for
> both or we are stuck with it due to legacy reasons.  I prefer the former
> explanation, heh.  So why not have an IN_FUNC operator as well?  No need to
> get rid of SEARCH!  Just keep it as a superset.  And then create a RexUtil
> method that does the work for me that converts a relevant SEARCH to an
> IN_FUNC.
> 
> Perhaps the general issue with IN is this: Like SEARCH, IN also has a
> broader definition than just a function, but the broader definition is
> different from SEARCH.  In our internal discussion, we talked about how
> "IN" is always a semi-join of some nature, sometimes of course with a
> subquery.  I believe this is how Calcite treats IN.  But in practice, IN
> has a specialized case as something that fits right into what a
> RexCall/function does.  I think many databases (at least the two I work on)
> implement this specialized case of IN as a function internally, just like
> BETWEEN.  So to have a 1:1 RexCall operator to function for IN just seems
> to make intuitive sense for me.
> 
> Perhaps we don't have a problem with the existence of a BETWEEN operator
> stems from the fact that BETWEEN doesn't have the same type of overload
> that IN does.
> 
> So yeah though: If we don't think that an IN type operator is good for
> Calcite, I need to develop code local to my database that does a conversion
> to a function which differs from my normal RexCall conversion.   And my
> intuition tells me that this effort has already been done elsewhere.  Which
> is why I think it belongs in Calcite, where the community would benefit.
> 
> Anyway, having said all this:  the improvement in CALCITE-7226 will
> definitely help me out because creating all those ORs before generating a
> SEARCH seems quite inefficient. It might even lead to problems in my code
> as the multiple ORs generate many layers of RexNodes which I think created
> some really slow compilations for me during one of my tests.
> 
> On Tue, Oct 14, 2025 at 11:30 AM Julian Hyde <[email protected]> wrote:
> 
>> It’s nice to see issues like
>> https://issues.apache.org/jira/browse/CALCITE-7226 "Convert IN to SEARCH
>> in SqlToRelConverter” being logged.
>> 
>> When I introduced SEARCH I got a lot of complaints about how I was making
>> plans more complicated and generally breaking things. All I was really
>> doing was bringing the concept of a Sarg [1] into our logical algebra. I’m
>> glad that people now see that SEARCH leads to more robust, efficient plans.
>> 
>> Julian
>> 
>> [1] https://en.wikipedia.org/wiki/Sargable

Reply via email to