Let me elaborate on what I'm working on.

I'm converting a Calcite logical plan to a Storm Trident logical plan.
(Trident itself optimizes the topology, so I just create the Trident
topology and let Trident plan it.)

Thanks to the Samza SQL implementation, I have also succeeded in translating
the Calcite logical plan to a Storm Trident logical plan. After calling
planner.transform() I get the converted RelNode, and I traverse that tree to
build the Trident topology, just as I did with the Calcite logical plan.
Unlike the Calcite logical plan, though, the inputs of the converted RelNode
are RelSubsets, so I need to pick the 'best' node in each RelSubset to get
the selected input.
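
To make the above concrete, here is a minimal, self-contained sketch of that
traversal. It uses toy stand-in types (Node, Op, Subset, resolve and render
are all hypothetical names), not Calcite's real RelNode/RelSubset classes, so
it only illustrates the idea of replacing each subset by its 'best' member
while walking the tree:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BestPlanSketch {
    /** Toy stand-in for a plan node; NOT Calcite's RelNode. */
    interface Node { }

    /** A concrete operator with ordered inputs. */
    static final class Op implements Node {
        final String name;
        final List<Node> inputs;

        Op(String name, List<Node> inputs) {
            this.name = name;
            this.inputs = inputs;
        }
    }

    /** Toy stand-in for an equivalence set; 'best' is null until a plan is chosen. */
    static final class Subset implements Node {
        final List<Node> members = new ArrayList<>();
        Node best;
    }

    /** Returns a copy of the tree in which every Subset is replaced by its best member. */
    static Op resolve(Node node) {
        Node current = node;
        // Unwrap nested subsets until a concrete operator is reached.
        while (current instanceof Subset) {
            Subset subset = (Subset) current;
            if (subset.best == null) {
                throw new IllegalStateException("no best member chosen yet");
            }
            current = subset.best;
        }
        Op op = (Op) current;
        List<Node> resolvedInputs = new ArrayList<>();
        for (Node input : op.inputs) {
            resolvedInputs.add(resolve(input));
        }
        return new Op(op.name, resolvedInputs);
    }

    /** Renders a resolved tree (one that contains only Op nodes) as Name(child, ...). */
    static String render(Op op) {
        StringBuilder sb = new StringBuilder(op.name);
        if (!op.inputs.isEmpty()) {
            sb.append('(');
            for (int i = 0; i < op.inputs.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(render((Op) op.inputs.get(i)));
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Subset scanSet = new Subset();
        Op scan = new Op("StormScan", Collections.<Node>emptyList());
        scanSet.members.add(scan);
        scanSet.best = scan;

        Subset filterSet = new Subset();
        Op logicalFilter = new Op("LogicalFilter", Arrays.<Node>asList(scanSet));
        Op stormFilter = new Op("StormFilter", Arrays.<Node>asList(scanSet));
        filterSet.members.add(logicalFilter);
        filterSet.members.add(stormFilter);
        filterSet.best = stormFilter;

        Op root = new Op("StormModify", Arrays.<Node>asList(filterSet));
        // prints "StormModify(StormFilter(StormScan))"
        System.out.println(render(resolve(root)));
    }
}
```

The sketch assumes the chain of 'best' members never loops back on itself;
members that are not 'best' (such as LogicalFilter above) are simply never
visited.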

Is there any other way to do it? Or is there a way to get rid of the
RelSubsets so that only the operators actually selected remain in the tree?

Here is the PR under review in the Storm community:
https://github.com/apache/storm/pull/1736
The TridentPlanCreator, StormRelUtils and QueryPlanner classes are directly
related to my question.

Thanks in advance!
Jungtaek Lim (HeartSaVioR)


On Sat, Oct 15, 2016 at 7:22 AM, Julian Hyde <jhyde.apa...@gmail.com> wrote:

> If you’re wanting to traverse into RelSubsets you’re almost certainly
> doing it wrong. We have a very powerful mechanism for identifying
> sub-sections of the RelNode graph: planner rules and the VolcanoPlanner.
>
> Suppose that a RelSubset has 8 RelNodes in it, and also has 5 consumers.
> That is 40 pairs of RelNodes. If you write a planner rule that matches
> (Project, Filter) it will automatically be fired at the right relational
> operators throughout the graph, and will be fired again (just once) when
> new nodes are added to the graph that make new combinations possible.
>
> > On Oct 13, 2016, at 5:27 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >
> > This raises another question for me, since Storm SQL has to traverse the
> > selected plan and so has a similar requirement.
> >
> > How can we visit a RelSubset when it's an input of the current RelNode?
> >
> > Before asking this question I was following the 'best' rel in the
> > RelSubset, assuming that best is what the Volcano planner picks.
> >
> > If it's not, what is the recommended way to traverse the RelNode tree?
> >
> > Or if it is, we can just pick 'best' in RelSubset.explain() (only if
> > available) to represent the selected plan, like I said before.
> >
> > - Jungtaek Lim (HeartSaVioR)
> > On Fri, 14 Oct 2016 at 8:34 AM Julian Hyde <jh...@apache.org> wrote:
> >
> >> I agree, it would be useful to be able to print out the plan (or the
> >> best plan known at the moment). I suppose RelWriter could have a method
> >> that chooses which member of a RelSubset is to be preferred; and if that
> >> member is the best according to some cost model then what will come out
> >> is the best plan.
> >>
> >> RelNode.buildCheapestPlan basically does this, but it would be nice to
> >> have a way to print the plan non-destructively.
> >>
> >> Julian
> >>
> >>
> >>> On Oct 13, 2016, at 4:26 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>
> >>> Sorry I was not clear on that. RelSubset is correct, and while
> >>> executing RelOptUtil.toString(), it calls RelSubset.explain(), and
> >>> RelSubset.explain() picks the input in that manner.
> >>>
> >>> Yes, I'm talking about after planning. I agree that
> >>> RelOptUtil.toString() can be called before planning, so it can't handle
> >>> that.
> >>> If toString() doesn't care about that, I'd like to see the possibility
> >>> of a feature which prints out the actual plan, not just the structure
> >>> of the rel. As an end user I would like to see the actual plan, which
> >>> has more meaning and is what I was expecting.
> >>>
> >>> On Fri, Oct 14, 2016 at 8:15 AM, Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>>> No, it’s not possible to print the “actual plan” because until
> >>>> planning is finished there is no plan. The job of toString is to print
> >>>> the STRUCTURE of the rel, and cost should not come into that.
> >>>>
> >>>> There is no “RelSub” class - I presume you mean RelSubset? But
> >>>> RelSubset has no toString method (other than the one it inherits from
> >>>> AbstractRelNode). So I assume you are talking about
> >>>> RelOptUtil.toString(RelNode)?
> >>>>
> >>>> Julian
> >>>>
> >>>>
> >>>>> On Oct 13, 2016, at 4:05 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>>>
> >>>>> Thanks Julian for the quick response. I'll follow CALCITE-794.
> >>>>>
> >>>>> And I got more details:
> >>>>>
> >>>>> rels:
> >>>>> 0 = {LogicalFilter@3579}
> >>>>> "rel#10:LogicalFilter.NONE.[](input=rel#9:Subset#0.ENUMERABLE.[],condition=>($0, 3))"
> >>>>> 1 = {LogicalProject@3584}
> >>>>> "rel#12:LogicalProject.NONE.[](input=rel#11:Subset#1.NONE.[],ID=$0,NAME=$1,ADDR=$2)"
> >>>>> 2 = {StormProjectRel@3554}
> >>>>> "rel#21:StormProjectRel.STORM_LOGICAL.[](input=rel#20:Subset#1.STORM_LOGICAL.[],ID=$0,NAME=$1,ADDR=$2)"
> >>>>> 3 = {LogicalCalc@3585}
> >>>>> "rel#22:LogicalCalc.NONE.[[]](input=rel#11:Subset#1.NONE.[],expr#0..2={inputs},ID=$t0,NAME=$t1,ADDR=$t2)"
> >>>>> 4 = {TridentStormFilterRel@3569}
> >>>>> "rel#24:TridentStormFilterRel.STORM_LOGICAL.[](input=rel#23:Subset#0.STORM_LOGICAL.[],condition=>($0, 3))"
> >>>>> 5 = {TridentStormCalcRel@3586}
> >>>>> "rel#25:TridentStormCalcRel.STORM_LOGICAL.[[]](input=rel#20:Subset#1.STORM_LOGICAL.[],expr#0..2={inputs},ID=$t0,NAME=$t1,ADDR=$t2)"
> >>>>> 6 = {LogicalCalc@3587}
> >>>>> "rel#26:LogicalCalc.NONE.[[]](input=rel#9:Subset#0.ENUMERABLE.[],expr#0..2={inputs},expr#3=3,expr#4=>($t0, $t3),ID=$t0,NAME=$t1,ADDR=$t2,$condition=$t4)"
> >>>>> 7 = {TridentStormCalcRel@3588}
> >>>>> "rel#27:TridentStormCalcRel.STORM_LOGICAL.[[]](input=rel#23:Subset#0.STORM_LOGICAL.[],expr#0..2={inputs},expr#3=3,expr#4=>($t0, $t3),ID=$t0,NAME=$t1,ADDR=$t2,$condition=$t4)"
> >>>>>
> >>>>> best:
> >>>>> rel#24:TridentStormFilterRel.STORM_LOGICAL.[](input=rel#23:Subset#0.STORM_LOGICAL.[],condition=>($0, 3))
> >>>>>
> >>>>> Relsub.toString() just picks the first one, which might not be the same
> >>>>> as the one the Volcano planner selects. (Please correct me if I'm wrong.)
> >>>>> If so, I think that's not intuitive even though they're logically
> >>>>> equivalent, because when a user requests an explain, the user wants to
> >>>>> see the actual (selected) plan. Is it possible to provide the actual
> >>>>> plan? If 'best' represents the selection, why not just print out the
> >>>>> best?
> >>>>>
> >>>>> Btw, I filed an issue:
> >>>>> https://issues.apache.org/jira/browse/CALCITE-1438
> >>>>> I don't clearly understand your solution yet, but if you haven't had
> >>>>> time to resolve it by the time I've followed up on CALCITE-794, I'll try
> >>>>> to make a patch.
> >>>>>
> >>>>> Thanks,
> >>>>> Jungtaek Lim (HeartSaVioR)
> >>>>>
> >>>>> On Fri, Oct 14, 2016 at 7:30 AM, Julian Hyde <jh...@apache.org> wrote:
> >>>>>
> >>>>>> Cycles in the rel graph are difficult to avoid. See
> >>>>>> https://issues.apache.org/jira/browse/CALCITE-794 for details. They
> >>>>>> are not fatal for optimization (as long as the nodes in the graph have
> >>>>>> positive cost, the cheapest plan (which is basically a path through
> >>>>>> the graph) will not be a cycle) but they are still best avoided.
> >>>>>> Adding simplifications in RelBuilder eliminates some common causes of
> >>>>>> cycles.
> >>>>>>
> >>>>>> I agree that RelOptUtil.toString must not give StackOverflowError.
> >>>>>> Can you please log a bug for this?
> >>>>>>
> >>>>>> The solution is probably straightforward: maintain a set of “active”
> >>>>>> nodes in the RelWriter created by RelOptUtil.toString.
> >>>>>>
> >>>>>> Julian
> >>>>>>
> >>>>>>
> >>>>>>> On Oct 13, 2016, at 3:17 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi devs,
> >>>>>>>
> >>>>>>> While changing Storm SQL to convert the Calcite logical plan to
> >>>>>>> Storm's own logical plan, I found that one of Storm's unit tests is
> >>>>>>> failing. I put RelOptUtil.toString() in every test, and the broken
> >>>>>>> test throws StackOverflowError.
> >>>>>>>
> >>>>>>> Since it was also failing from IDEA, I dug into it further and found
> >>>>>>> that one of the rels in a Relsub has its parent Relsub as 'input'
> >>>>>>> (Relsub and Project). Fortunately it was not selected as 'best', but
> >>>>>>> Relsub prints out the first occurrence of a rel which matches the
> >>>>>>> trait, and unfortunately it's the first one.
> >>>>>>>
> >>>>>>> The query is really simple: INSERT INTO BAR SELECT ID, NAME, ADDR FROM
> >>>>>>> FOO WHERE ID > 3. It caused no issue when Storm SQL used the Calcite
> >>>>>>> logical plan.
> >>>>>>>
> >>>>>>> I'm not sure whether the cross reference is a problem, but IMO
> >>>>>>> throwing StackOverflowError in explain is a major problem.
> >>>>>>>
> >>>>>>> As a workaround I print out 'best' if it's available (not null)
> >>>>>>> instead of the first occurrence of a rel which matches the trait.
> >>>>>>>
> >>>>>>> Does it make sense? If so, I'll file an issue and follow up with a
> >>>>>>> pull request. I don't have an idea how to reproduce it, so it might
> >>>>>>> not have a test.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Jungtaek Lim (HeartSaVioR)
> >>>>>>>
> >>>>>>> ps. Below is the ruleset I'm experimenting with. Please correct me if
> >>>>>>> I'm using conflicting rules together, or if anything looks odd.
> >>>>>>>
> >>>>>>> SortRemoveRule.INSTANCE,
> >>>>>>> FilterToCalcRule.INSTANCE,
> >>>>>>> ProjectToCalcRule.INSTANCE,
> >>>>>>> FilterCalcMergeRule.INSTANCE,
> >>>>>>> ProjectCalcMergeRule.INSTANCE,
> >>>>>>> CalcMergeRule.INSTANCE,
> >>>>>>> PruneEmptyRules.FILTER_INSTANCE,
> >>>>>>> PruneEmptyRules.PROJECT_INSTANCE,
> >>>>>>> PruneEmptyRules.UNION_INSTANCE,
> >>>>>>> ProjectFilterTransposeRule.INSTANCE,
> >>>>>>> FilterProjectTransposeRule.INSTANCE,
> >>>>>>> ProjectRemoveRule.INSTANCE,
> >>>>>>> ReduceExpressionsRule.FILTER_INSTANCE,
> >>>>>>> ReduceExpressionsRule.PROJECT_INSTANCE,
> >>>>>>> ReduceExpressionsRule.CALC_INSTANCE,
> >>>>>>> UnionEliminatorRule.INSTANCE,
> >>>>>>> StormScanRule.INSTANCE,
> >>>>>>> StormFilterRule.INSTANCE,
> >>>>>>> StormProjectRule.INSTANCE,
> >>>>>>> StormAggregateRule.INSTANCE,
> >>>>>>> StormJoinRule.INSTANCE,
> >>>>>>> StormModifyRule.INSTANCE,
> >>>>>>> StormCalcRule.INSTANCE
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
