Hi all, I got HepPlanner to utilize MaterializedViewRules! My code is a little messy so I'll give the general framework in case anyone else comes into a similar issue of wanting to rewrite the query to utilize materialized views
1. Create SQL statement* s* 2. Convert* s* to RelNode *qRel (*i.e logical plan) 3. Create HepPlanner *planner* 4. Add MaterializedViewRules to *planner* 5. Create MV SQL statement *mv* 6. Add MV table to schema 7. Convert* mv *to RelNode *mvRel* 8. Create RelOptMaterialization with query (*qRel)* and MV (*mvRel)* to be rewritten (e.g new RelOptMaterialization(mvRel, qRel, null, ImmutableList.of("default", "mv")) 9. Add the created RelOptMaterialization to planner (e.g planner.addMaterialization(...)) 10. Call the planner optimizer (HepPlanner.findBestExp()) If there's enough interest I could create a more thorough example with code that can be posted to the docs or examples/ or something Thanks, Justin So On Tue, Oct 29, 2024 at 2:45 AM Justin So <sojusti...@gmail.com> wrote: > Thank you so much for your response @suibianwanwan33! Sorry I can't seem > to figure out how to reply to these Apache emails since I keep getting > error 400 messages when I click "Reply with your own email client" on the > web interface. > > Hi, I didn't look at the code specifically, as far as I know, there are >> two types of materialized rewrites for calcite, one based on >> MaterializedViewRules and one based on SubstitutionRule rewrites. There are >> some differences in the implementation and rewriting capabilities of these >> two rewrites. If you don't add any rules about materialized rewriting but >> implement rewriting, it's probably SubstitutionRule implementation. You can >> read the SubstitutionVisitor code for this part. I hope it helps you. > > > I looked at VolcanoPlanner class and found registerMaterializations() is > using the SubstitutionVisitor code and is the reason why VolcanoPlanner can > use materialized views and HepPlanner is unable to since HepPlanner does > not have a similar implementation of registerMaterializations(). > > More specifically on why HepPlanner is unable to use Materialized Views > without rewriting the internals, VolcanoPlanner uses > SubstitutionVisitor.DEFAULT_RULES to rewrite the query to utilize the > materialized view and I am unable to add SubstitutionVisitor.DEFAULT_RULES > to the HepPlanner via HepProgramBuilder.addRuleInstance(). > > I am attempting to use MaterializedViewRules in my code, however, I am > unable to get the rules to work. I read over the code in > MaterializedViewRelOptRulesTest to utilize a materialized view, but am > still failing to do so after adding in the MaterializedViewsRules rules in > the above code. Can anyone point me in the right direction to make the > planner utilize any of the rules in MaterializedViewRules? > > Thanks, > Justin So > > > > On Wed, Oct 23, 2024 at 2:04 AM Justin So <sojusti...@gmail.com> wrote: > >> Hi all, >> >> Update on my progress with some issues: >> >> After copying the code from >> https://www.querifylabs.com/blog/assembling-a-query-optimizer-with-apache-calcite >> to >> setup the necessary tables, schema and optimizer, I was able to rewrite a >> query to utilize an MV with the following code: >> >> SimpleTable lineitem = SimpleTable.newBuilder("lineitem") >> .addField("l_quantity", SqlTypeName.DECIMAL) >> .addField("l_extendedprice", SqlTypeName.DECIMAL) >> .addField("l_discount", SqlTypeName.DECIMAL) >> .addField("l_shipdate", SqlTypeName.DATE) >> .addField("l_orderkey", SqlTypeName.INTEGER) >> .addField("l_tax", SqlTypeName.DECIMAL) >> .withRowCount(60_000L) >> .build(); >> >> SimpleTable mv = SimpleTable.newBuilder("mv0") >> .addField("l_discount", SqlTypeName.DECIMAL) >> .addField("l_tax", SqlTypeName.DECIMAL) >> .withRowCount(3_000L) >> .build(); >> >> SimpleSchema schema = >> SimpleSchema.newBuilder("tpch").addTable(lineitem).addTable(mv).build(); >> >> Optimizer optimizer = Optimizer.create(schema); >> >> String sql = "select l_discount, sum(l_tax)\n" + >> "from lineitem\n" + >> "where l_discount > 100 group by l_discount"; >> SqlNode sqlTree = optimizer.parse(sql); >> SqlNode validatedSqlTree = optimizer.validate(sqlTree); >> RelNode queryRel = optimizer.convert(validatedSqlTree); >> >> String mvSQL = "select * from mv0"; >> SqlNode mvsqlTree = optimizer.parse(mvSQL); >> SqlNode mvvalidatedSqlTree = optimizer.validate(mvsqlTree); >> RelNode mvtableRel = optimizer.convert(mvvalidatedSqlTree); >> >> System.out.println(queryRel.explain()); >> /* >> LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)]) >> LogicalProject(l_discount=[$2], l_tax=[$5]) >> LogicalFilter(condition=[>($2, 100)]) >> LogicalTableScan(table=[[tpch, lineitem]]) >> */ >> RelOptMaterialization mat1 = >> new RelOptMaterialization(mvtableRel, queryRel, null, >> ImmutableList.of("default", "mv")); >> optimizer.planner.addMaterialization(mat1); >> >> RuleSet rules = RuleSets.ofList( >> CoreRules.FILTER_TO_CALC, >> CoreRules.PROJECT_TO_CALC, >> CoreRules.FILTER_CALC_MERGE, >> CoreRules.PROJECT_CALC_MERGE, >> EnumerableRules.ENUMERABLE_TABLE_SCAN_RULE, >> EnumerableRules.ENUMERABLE_PROJECT_RULE, >> EnumerableRules.ENUMERABLE_FILTER_RULE, >> EnumerableRules.ENUMERABLE_CALC_RULE, >> EnumerableRules.ENUMERABLE_AGGREGATE_RULE >> ); >> >> Program program = Programs.of(RuleSets.ofList(rules)); >> >> RelNode optimizerRelTree = program.run( >> optimizer.planner, >> queryRel, >> queryRel.getTraitSet().plus(EnumerableConvention.INSTANCE), >> List.of(mat1), >> Collections.emptyList() >> ); >> System.out.println(optimizerRelTree.explain()); >> /* >> EnumerableProject(l_discount=[$0], EXPR$1=[$1]) >> EnumerableTableScan(table=[[tpch, mv0]]) >> */ >> >> >> These are my current issues: >> 1. My query is being rewritten to use a materialized view, however, I did >> not set any rules from MaterializedViewsRules. How do I make my code use >> MaterializedViewsRules? >> 2. The current optimizer uses the VolcanoPlanner, and when I change it to >> use HepPlanner, I get an error relating to null root, specifically >> java.lang.NullPointerException: root >> >> at java.base/java.util.Objects.requireNonNull(Objects.java:246) >> at >> org.apache.calcite.plan.hep.HepPlanner.changeTraits(HepPlanner.java:182) >> at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:325) >> at test.OptimizerTest.test_tpch_q6_materialization(OptimizerTest.java:158) >> at java.base/java.lang.reflect.Method.invoke(Method.java:580) >> at java.base/java.util.ArrayList.forEach(ArrayList.java:1597) >> at java.base/java.util.ArrayList.forEach(ArrayList.java:1597) >> >> Not sure why this is happening because in Programs.run() it sets the root >> in the planner. >> >> Thanks, >> Justin So >> >> On Mon, Oct 21, 2024 at 8:00 AM Justin So <sojusti...@gmail.com> wrote: >> >>> Hi all, >>> Hopefully this is the right place to direct my questions. If not could >>> someone direct me towards the right place. >>> >>> I am currently exploring creating a Materialized View planner rule and >>> am running into issues on how to create a materialized view in Calcite and >>> utilize the current materialized view planner rules for a query workload. >>> The Apache Calcite docs >>> <https://calcite.apache.org/docs/materialized_views.html> only show >>> what it can do, but not the actual code on how to get started with using >>> materialized views or creating materialized views. Specifically I want to >>> integrate the materialized view rules under >>> https://calcite.apache.org/docs/materialized_views.html#rewriting-using-plan-structural-information >>> into >>> a planner (e.g Volcano or HepPlanner). I tried looking into the codebase, >>> but it is so large I am not sure where to start looking, especially with >>> the many Materialized[a-z]+ classes. Could someone point me to any code >>> examples that utilize materialized views or at least in the right direction? >>> >>> Is there a way to automatically discover materialized view candidates >>> that is not through creating a model.json file and modifying it like in >>> https://calcite.apache.org/docs/lattice.html. >>> >>> Thanks, >>> Justin So >>> >>> >>> >>> >>>