My team and I are considering adding a getSelectivity(RelSubset, …)
override in our codebase, and I'd like to check whether there's a known
reason core RelMdSelectivity doesn't do this, i.e. whether we'd be walking
into something the project has already considered and decided against.

I searched the mailing list archive at https://lists.apache.org
<https://lists.apache.org/[email protected]:gte=0d:getSelectivity>
and JIRA at https://issues.apache.org
<https://issues.apache.org/jira/browse/CALCITE-3298?jql=project%20%3D%20CALCITE%20AND%20text%20~%20getSelectivity>
and don't think this subject has already been discussed there.

We're planning this override because, during Volcano exploration,
mq.getSelectivity(subset, p) for a RelSubset falls through to the RelNode
catch-all in RelMdSelectivity and returns
RelMdUtil.guessSelectivity(predicate). That is a pure function of the
predicate's syntactic shape (per-SqlKind factors multiplied across
conjuncts), with no dependency on the underlying RelNode.
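
To make "pure function of the predicate's shape" concrete, here is a toy
model in plain Java with no Calcite types. The Kind enum and the numeric
factors are stand-ins I chose for illustration; please don't read them as
the exact constants in RelMdUtil.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: Kind and the factors below are illustrative stand-ins,
// not Calcite's SqlKind or the constants used by RelMdUtil.
class SelectivityGuessSketch {
    enum Kind { EQUALS, COMPARISON, IS_NOT_NULL, OTHER }

    // Assumed per-kind factor; the values are placeholders for illustration.
    static double factor(Kind kind) {
        switch (kind) {
            case IS_NOT_NULL: return 0.9;
            case EQUALS:      return 0.15;
            case COMPARISON:  return 0.5;
            default:          return 0.25;
        }
    }

    // Multiplies one factor per conjunct. Note that no input relation,
    // row count, or column statistic is ever consulted.
    static double guess(List<Kind> conjuncts) {
        double selectivity = 1.0;
        for (Kind kind : conjuncts) {
            selectivity *= factor(kind);
        }
        return selectivity;
    }

    public static void main(String[] args) {
        // "a = 1 AND b < 2" modelled as two conjuncts.
        System.out.println(guess(Arrays.asList(Kind.EQUALS, Kind.COMPARISON)));
    }
}
```

The same list of conjuncts gives the same answer whatever relation sits
underneath, which is the behaviour we see for a RelSubset.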

The override exists in Apache Flink and Apache Drill, which makes its
absence in core feel intentional rather than accidental.

1. Is the absence of a RelSubset handler in RelMdSelectivity deliberate?
2. Are there pitfalls in the Flink/Drill-style override that we'd be
inheriting? Delegating to subset.getBestOrOriginal() seems like the obvious
shape, but I want to make sure I'm not missing a known footgun before we
ship it.
3. If you've tried this in a Calcite-based engine and hit a problem, I'd
love to hear what.
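
For concreteness, the delegation shape from question 2 looks roughly like
the toy model below. It is plain Java with hypothetical stand-ins (Node,
NodeWithStats, Subset) in place of RelNode, a rel with real statistics, and
RelSubset; it is not the Flink or Drill code, and the 0.25 fallback is just
a placeholder for the predicate-only guess.

```java
// Toy model only: every type here is a hypothetical stand-in, not a Calcite class.
class SubsetDelegationSketch {
    interface Node {}

    // Stand-in for a rel whose handler can give a statistics-based answer.
    static final class NodeWithStats implements Node {
        final double selectivity;
        NodeWithStats(double selectivity) { this.selectivity = selectivity; }
    }

    // Stand-in for RelSubset: "best" may be null until a plan has been costed.
    static final class Subset implements Node {
        final Node best;
        final Node original;
        Subset(Node best, Node original) {
            this.best = best;
            this.original = original;
        }
        Node bestOrOriginal() { return best != null ? best : original; }
    }

    // Placeholder for the predicate-only guess returned by the catch-all.
    static final double CATCH_ALL_GUESS = 0.25;

    static double getSelectivity(Node node) {
        if (node instanceof Subset) {
            // The override under discussion: unwrap the subset and ask again.
            return getSelectivity(((Subset) node).bestOrOriginal());
        }
        if (node instanceof NodeWithStats) {
            return ((NodeWithStats) node).selectivity;
        }
        return CATCH_ALL_GUESS; // anything else falls to the catch-all
    }
}
```

One thing the toy makes visible: the answer for a subset depends on which
member is currently "best", so it can change as planning proceeds. I don't
know whether that matters in practice, which is part of what I'm asking.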

Not asking for any changes in core — just trying to sanity-check our
downstream decision before we commit to it.

Thanks,
Etienne Pelissier
