Hi colleagues,

We are building a Calcite-based optimizer for Hazelcast, and I have some problems understanding Calcite's logic with respect to converters. Let me briefly explain the problem.

We have our own execution backend, so we do not need Bindable or Enumerable. Instead, we would like to use Calcite to convert the original SQL to a tree with our own convention, then convert it to our internal representation, and finally execute it.

We started by looking at other Calcite integrations and eventually arrived at a classical two-phase optimization approach. We have two internal conventions, LOGICAL and PHYSICAL. The goal is to optimize the tree as follows:
1) NONE -> LOGICAL - heuristic optimizations
2) LOGICAL -> PHYSICAL - cost-based planning

Suppose that after the first phase I have the following tree of our own operators:

  HZLogicalRoot -> HZLogicalProject -> HZLogicalScan

For this specific case there is not much to optimize, so we only need to transition to physical nodes and do some boilerplate trait propagation:

  HZPhysicalRoot -> HZPhysicalProject -> HZPhysicalScan

To achieve this, I define three rules, each of which just converts the relevant node. The Volcano optimizer is used.

Now, the problem: somehow this works only when I override Convention.Impl.canConvertConvention to return true for our PHYSICAL convention, but that blows up the search space, and the same rules are called many times. A lot of time is spent on endless PHYSICAL -> LOGICAL conversions, which are of no use. If I change canConvertConvention to return false, the rules are called a sensible number of times, but they cannot produce a complete PHYSICAL tree.

Here is how it works:
1) The "Root" rule is invoked, which converts "HZLogicalRoot" to "HZPhysicalRoot".
2) The "Project" rule is invoked, but does not produce any transformations, since it needs the Scan distribution, which is not known yet. This is the desired behavior at this point.
3) The "Scan" rule is invoked, and "HZLogicalScan" is converted to "HZPhysicalScan". The distribution is resolved.
4) At this point, we have the following sets: [LogicalRoot, PhysicalRoot] -> [LogicalProject] -> [LogicalScan, PhysicalScan].

I expect that, since a new scan was installed, the "Project" rule will be fired again. This time we know the distribution, so the transformation is possible. But the rule is not called, and we fail with an error.

So my questions are:
1) What is the real role of converters in this process? For some reason, when the PHYSICAL -> LOGICAL conversion (unnecessary from a logical standpoint) is allowed, even complex plans can be built. Drill also does this for some reason. But it costs multiple additional invocations of the same rules. Are there any docs or presentations explaining the mechanics behind this?
2) What are the minimum requirements that will allow a rule on the parent to be fired again after its child node has changed?

I can provide any additional information, source code, or even a working example of this problem if needed. I don't want to bother you with it at the moment, because it feels like I am missing something very simple.

Would appreciate your help.

Regards,
Vladimir.
