I certainly agree that that query should use that MV. If it does not currently, please log a jira case.
Since c2 is one of the GROUP BY columns in the query, I think it would be also be valid (and probably more efficient) to rewrite to select case when c2 is null then 0 else 1 end from test_mv group by c2, c3 Julian > On Aug 6, 2024, at 11:13 PM, Da Dash <[email protected]> wrote: > > Hello everyone, recently I have been working on materialized views using > Calcite, and in our use case, there are a lot of queries involving > CountDistinct. > And generally, to support rewriting for Count Distinct, we will always use > bitmap. However, recently, I have developed a new capability in Calcite that > allows rewriting of Count Distinct queries to read from the materialized view > table without the need for bitmap, as long as the Count Distinct is querying > the group by columns of the materialized view. > > For example, let's assume we have the following materialized view: > > ```sql > CREATE MATERIALIZED VIEW test_mv AS > SELECT > c1, c2, c3, sum(c4) > FROM > t > GROUP BY > c1, c2, c3 > ``` > > After the materialized view created, the following query arrives: > > ```sql > > select count(distinct c2) from t group by c2, c3 > > ``` > > With the capability I've developed, the above query can be rewritten as: > > ```sql > > select count(distinct c2) from test_mv group by c2, c3 > > ``` > > The rewrite mentioned above, compared to calculating COUNT DISTINCT directly > on the original table, will significantly reduce the query time because the > materialized view contains a reduced amount of data. > > Is anyone interested in this? I can initiate a Pull Request. :) > > > > > > >
