It sounds like a Runtime Filter[1], which is commonly used by many systems.
As Stamatis mentioned, integrating it into the cost model is much more
challenging than implementing the rule. Fortunately, we can refer to the
practices of other systems.
[1]
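
For readers unfamiliar with the idea, here is a minimal sketch of how a runtime filter might work for the emp/dept join discussed in this thread. It is not Calcite code: `BloomFilter`, `runtime_filtered_join`, the bit-array size and the hash count are all hypothetical names and values chosen for illustration. The filter is built from the join keys of the small (filtered) side and used to discard rows of the large side before the real join runs.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter (illustrative only; sizes are arbitrary assumptions)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def runtime_filtered_join(emp, dept, dname):
    """Return emp rows whose deptno matches a dept row with the given dname."""
    # Build side: the small, already-filtered input.
    build = {d["deptno"]: d for d in dept if d["dname"] == dname}
    bloom = BloomFilter()
    for deptno in build:
        bloom.add(deptno)

    # Probe side: drop rows that cannot possibly match before joining.
    candidates = [e for e in emp if bloom.might_contain(e["deptno"])]

    # The exact join still runs, which removes any Bloom false positives.
    return [e for e in candidates if e["deptno"] in build]
```

Because a Bloom filter never produces false negatives, the result is the same as the plain join; the saving is only in how many probe-side rows reach the join, which is the part a cost model would have to estimate.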
The topic is really interesting, thanks for sharing your ideas Zoltan!
I see no drawbacks to adding the new transformation rule; it is definitely
worth having! However, adding it to the default rule set or using it in a
cost-based decision may require much more work/thinking.
Calcite's built-in cost
It would be great to have such a rule. People who don’t want it can disable it;
and people who enable it can use a cost function.
Some systems that use Bloom filters (and other probabilistic filters) don’t
execute the query twice but use a side-channel to send the Bloom filter from
one scan to
Hi,
I was wondering about the pros and cons of having a Calcite rule which could
rewrite a join to utilize bloom filters; something like:
select e.*
from emp e
join dept d on(e.deptno=d.deptno)
where d.dname='Sales';
into something like:
select e.*
from (