Yes, it would make sense to have this logic reside in Calcite - at least
the core part that does not depend on the histogram.
I will capture some thoughts on this in a JIRA. Thanks for your input.

Aman

On Mon, Feb 25, 2019 at 3:48 PM Julian Hyde <jh...@apache.org> wrote:

> Many of the default selectivity formulas are simple but poor.
>
> I totally agree that your approach - treating BETWEEN as one range
> condition rather than an AND of two independent conditions - is superior.
> Of course you can override it in your class but it would be better if you
> could contribute it back to Calcite.
>
> Julian
>
>
> > On Feb 23, 2019, at 10:07 PM, Aman Sinha <amansi...@gmail.com> wrote:
> >
> > Hi devs,
> > I am trying to estimate the selectivity of BETWEEN predicates using
> > histograms. Calcite converts such a predicate to a conjunction,
> > e.g. WHERE c1 BETWEEN 10 AND 20  ==>  WHERE c1 >= 10 AND c1 <= 20
> >
> > The question is: what's the formula for the selectivity of the top-level
> > AND expression? Since these are not conditions on independent columns, I
> > don't want to multiply the selectivities of the individual conjuncts.
> > Ideally, I want to supply the [low, high] values of the range to my
> > histogram and have it return the selectivity based on bucket boundaries.
> >
> > However, looking at the code in RelMdSelectivity.java and
> > RelMdUtil.guessSelectivity(), the behavior is to treat each conjunct
> > independently. I can override the relevant methods in my derived class
> > and implement the selectivity calculation, but I am wondering if there's
> > someplace else in Calcite that deals with such a calculation.
> >
> > thanks,
> > Aman
>
>
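To make the difference discussed in this thread concrete, here is a minimal Java sketch that contrasts the two estimates for c1 BETWEEN low AND high. It assumes an equi-depth histogram (every bucket holds the same number of rows) and a uniform distribution inside each bucket. The class EquiDepthHistogram and its methods are hypothetical names chosen for illustration; they are not part of Calcite's API and not necessarily how the histogram mentioned above is implemented.

```java
/**
 * Hypothetical equi-depth histogram: each bucket holds the same number of rows.
 * Illustration only; this is not a Calcite class.
 */
final class EquiDepthHistogram {
  // boundaries[i] and boundaries[i + 1] delimit bucket i,
  // so the array has (number of buckets + 1) entries in ascending order.
  private final double[] boundaries;

  EquiDepthHistogram(double[] boundaries) {
    if (boundaries.length < 2) {
      throw new IllegalArgumentException("need at least one bucket");
    }
    this.boundaries = boundaries.clone();
  }

  /**
   * Estimates the fraction of rows with low <= value <= high by summing the
   * covered fraction of every bucket (values assumed uniform within a bucket).
   * A single-point range (low == high) yields 0 under this continuous model.
   */
  double rangeSelectivity(double low, double high) {
    if (low > high) {
      return 0.0;
    }
    int numBuckets = boundaries.length - 1;
    double coveredBuckets = 0.0;
    for (int i = 0; i < numBuckets; i++) {
      double start = boundaries[i];
      double end = boundaries[i + 1];
      double overlap = Math.min(high, end) - Math.max(low, start);
      if (overlap <= 0 || end <= start) {
        continue; // range misses this bucket, or the bucket has zero width
      }
      coveredBuckets += overlap / (end - start);
    }
    return coveredBuckets / numBuckets;
  }

  /**
   * For comparison: treats (value >= low) and (value <= high) as independent
   * conjuncts and multiplies their individual selectivities.
   */
  double independentSelectivity(double low, double high) {
    double atLeastLow = rangeSelectivity(low, boundaries[boundaries.length - 1]);
    double atMostHigh = rangeSelectivity(boundaries[0], high);
    return atLeastLow * atMostHigh;
  }
}
```

With the made-up boundaries {0, 10, 20, 40, 100} (four buckets), the range [10, 20] covers exactly one bucket, so rangeSelectivity(10, 20) gives 0.25. The independent estimate multiplies 0.75 (for c1 >= 10) by 0.5 (for c1 <= 20) and gives 0.375, which overstates the fraction of rows in the range under this model.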
