As far as I am aware, the optimizer has no access to data, only metadata.
The traditional way to solve such problems is to choose among different
join algorithms, each of which performs better for particular cardinalities
on each side of the join. Unfortunately, I think you're likely to have a
tough time extracting the data you'd need for the rewrite you're aiming for.
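
For what it's worth, the rewrite you describe can be done by hand outside
the planner, at execution time. The following is only a minimal sketch of
that idea, not anything Calcite provides: it assumes two in-memory SQLite
databases standing in for the heterogeneous sources, and the function name
fetch_join and the sample rows are made up for illustration.

```python
import sqlite3


def fetch_join(src1, src2):
    """Join t1 and t2 across two connections by pushing t1's ids into the t2 query."""
    # Step 1: evaluate the low-cardinality predicate on the first source.
    left = src1.execute(
        "SELECT id, attr FROM t1 WHERE attr = ? ORDER BY id", ("foo",)
    ).fetchall()
    ids = [row[0] for row in left]
    if not ids:
        return []  # nothing can match, so skip the second source entirely

    # Step 2: send those ids to the second source as a parameterized IN list.
    placeholders = ", ".join("?" for _ in ids)
    right = src2.execute(
        f"SELECT id, attr FROM t2 WHERE id IN ({placeholders}) AND attr = ?",
        (*ids, "bar"),
    ).fetchall()

    # Step 3: finish the join locally on id.
    right_by_id = {}
    for rid, rattr in right:
        right_by_id.setdefault(rid, []).append(rattr)
    return [
        (lid, lattr, rattr)
        for lid, lattr in left
        for rattr in right_by_id.get(lid, [])
    ]


# Hypothetical sample data: two separate databases play the two sources.
src1 = sqlite3.connect(":memory:")
src1.execute("CREATE TABLE t1 (id INTEGER, attr TEXT)")
src1.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "foo"), (2, "foo"), (3, "foo"), (4, "baz")],
)

src2 = sqlite3.connect(":memory:")
src2.execute("CREATE TABLE t2 (id INTEGER, attr TEXT)")
src2.executemany(
    "INSERT INTO t2 VALUES (?, ?)",
    [(1, "bar"), (2, "qux"), (3, "bar"), (4, "bar"), (5, "bar")],
)

result = fetch_join(src1, src2)
print(result)
```

This only makes sense when the first predicate returns a small number of
ids, since the IN list grows with it. Because the where clause filters on
t2.attr, rows without a t2 match are dropped anyway, so the sketch treats
the join as an inner join.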

--
Michael Mior
[email protected]



On Tue, Aug 28, 2018 at 20:34, Andrei Sereda <[email protected]> wrote:

> Hello,
>
> I’m looking for a way to improve performance of a join query.
>
> Suppose one joins two heterogeneous sources t1 and t2 with some predicates.
>
> Further assume that the cardinality of one of the predicates is very low
> (compared to the cardinality of the second one). (How) is it possible to
> convert the second query (predicate) to include the results (primary keys)
> of the first one (with low selectivity)?
> Example
>
> select * from
>   t1 left join t2 on (t1.id = t2.id)
> where
>   t1.attr = 'foo' and t2.attr = 'bar'
>
> Let’s say that the predicate t1.attr = 'foo' results in 3 rows (id = 1, 2, 3).
> Would it be possible to rewrite the t2 query to:
>
> select * from t2 where
>    id in (1, 2, 3) and t2.attr = 'bar'
>
> I’m aware of the existence of Metadata
> <https://calcite.apache.org/apidocs/org/apache/calcite/rel/metadata/Metadata.html>
> but I'm not sure how to use it.
>
> Any hints / directions are appreciated.
>
> Thanks,
> Andrei.
>
