Re: Materialized View Planner; How to rewrite query to use materialized view

Justin So Tue, 22 Oct 2024 23:08:18 -0700

Hi all,

Update on my progress with some issues:


After copying the code from
https://www.querifylabs.com/blog/assembling-a-query-optimizer-with-apache-calcite
to
setup the necessary tables, schema and optimizer, I was able to rewrite a
query to utilize an MV with the following code:

        SimpleTable lineitem = SimpleTable.newBuilder("lineitem")
                .addField("l_quantity", SqlTypeName.DECIMAL)
                .addField("l_extendedprice", SqlTypeName.DECIMAL)
                .addField("l_discount", SqlTypeName.DECIMAL)
                .addField("l_shipdate", SqlTypeName.DATE)
                .addField("l_orderkey", SqlTypeName.INTEGER)
                .addField("l_tax", SqlTypeName.DECIMAL)
                .withRowCount(60_000L)
                .build();

        SimpleTable mv = SimpleTable.newBuilder("mv0")
                .addField("l_discount", SqlTypeName.DECIMAL)
                .addField("l_tax", SqlTypeName.DECIMAL)
                .withRowCount(3_000L)
                .build();

        SimpleSchema schema =
SimpleSchema.newBuilder("tpch").addTable(lineitem).addTable(mv).build();

        Optimizer optimizer = Optimizer.create(schema);

        String sql = "select l_discount, sum(l_tax)\n" +
                "from lineitem\n" +
                "where l_discount > 100 group by l_discount";
        SqlNode sqlTree = optimizer.parse(sql);
        SqlNode validatedSqlTree = optimizer.validate(sqlTree);
        RelNode queryRel = optimizer.convert(validatedSqlTree);

        String mvSQL = "select * from mv0";
        SqlNode mvsqlTree = optimizer.parse(mvSQL);
        SqlNode mvvalidatedSqlTree = optimizer.validate(mvsqlTree);
        RelNode mvtableRel = optimizer.convert(mvvalidatedSqlTree);

        System.out.println(queryRel.explain());
        /*
        LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
            LogicalProject(l_discount=[$2], l_tax=[$5])
                LogicalFilter(condition=[>($2, 100)])
                    LogicalTableScan(table=[[tpch, lineitem]])
         */
        RelOptMaterialization mat1 =
                new RelOptMaterialization(mvtableRel, queryRel, null,
                        ImmutableList.of("default", "mv"));
        optimizer.planner.addMaterialization(mat1);

        RuleSet rules = RuleSets.ofList(
                CoreRules.FILTER_TO_CALC,
                CoreRules.PROJECT_TO_CALC,
                CoreRules.FILTER_CALC_MERGE,
                CoreRules.PROJECT_CALC_MERGE,
                EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE,
                EnumerableRules.ENUMERABLE_PROJECT_RULE,
                EnumerableRules.ENUMERABLE_FILTER_RULE,
                EnumerableRules.ENUMERABLE_CALC_RULE,
                EnumerableRules.ENUMERABLE_AGGREGATE_RULE
        );

        Program program = Programs.of(RuleSets.ofList(rules));

        RelNode optimizerRelTree = program.run(
                optimizer.planner,
                queryRel,
                queryRel.getTraitSet().plus(EnumerableConvention.INSTANCE),
                List.of(mat1),
                Collections.emptyList()
        );
        System.out.println(optimizerRelTree.explain());
        /*
        EnumerableProject(l_discount=[$0], EXPR$1=[$1])
            EnumerableTableScan(table=[[tpch, mv0]])
         */


These are my current issues:
1. My query is being rewritten to use a materialized view, however, I did
not set any rules from MaterializedViewsRules. How do I make my code use
MaterializedViewsRules?
2. The current optimizer uses the VolcanoPlanner, and when I change it to
use HepPlanner, I get an error relating to null root, specifically
java.lang.NullPointerException: root

at java.base/java.util.Objects.requireNonNull(Objects.java:246)
at org.apache.calcite.plan.hep.HepPlanner.changeTraits(HepPlanner.java:182)
at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:325)
at test.OptimizerTest.test_tpch_q6_materialization(OptimizerTest.java:158)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)

Not sure why this is happening because in Programs.run() it sets the root
in the planner.

Thanks,
Justin So

On Mon, Oct 21, 2024 at 8:00 AM Justin So <sojusti...@gmail.com> wrote:

> Hi all,
> Hopefully this is the right place to direct my questions. If not could
> someone direct me towards the right place.
>
> I am currently exploring creating a Materialized View planner rule and am
> running into issues on how to create a materialized view in Calcite and
> utilize the current materialized view planner rules for a query workload.
> The Apache Calcite docs
> <https://calcite.apache.org/docs/materialized_views.html> only show what
> it can do, but not the actual code on how to get started with using
> materialized views or creating materialized views. Specifically I want to
> integrate the materialized view rules under
> https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information
>  into
> a planner (e.g Volcano or HepPlanner). I tried looking into the codebase,
> but it is so large I am not sure where to start looking, especially with
> the many Materialized[a-z]+ classes. Could someone point me to any code
> examples that utilize materialized views or at least in the right direction?
>
> Is there a way to automatically discover materialized view candidates that
> is not through creating a model.json file and modifying it like in
> https://calcite.apache.org/docs/lattice.html.
>
> Thanks,
> Justin So
>
>
>
>
>

Re: Materialized View Planner; How to rewrite query to use materialized view

Reply via email to