Thanks for the pointer! It helps a lot. We hadn't thought to look up
metadata in our own rules like this, but it makes sense.

On Wed, 4 Sept 2024 at 10:57, Julian Hyde <jhyde.apa...@gmail.com> wrote:

> +1 using metadata
>
> In query planning there are only 3 places: rules, RelNodes, metadata. We
> have tried putting ’state’ in rules and RelNodes, and it hasn’t worked out
> too well.
>
> Metadata is superior to RelNodes for making decisions during planning.
> Suppose you want to figure out whether to push down a filter ‘x > 0’. You
> could stop when you see a Filter RelNode with x > 0 (or indeed x > 10). But
> it’s better to look for any RelNode with a predicate ‘x > 0’. Metadata
> propagates through RelNodes, and belongs to entire equivalence sets. You
> don’t need to mutate a RelNode r to give it a property - you just add a
> RelNode with that property to an equivalence set that r belongs to.
>
> Julian
>
>
>
> > On Sep 3, 2024, at 4:22 AM, Stamatis Zampetakis <zabe...@gmail.com>
> wrote:
> >
> > Hey Victor,
> >
> > In general it is best to avoid stateful rules but in some cases it may
> > be unavoidable. A common way to check state for deciding whether to
> > apply a transformation (or not) is via the metadata mechanism
> > (RelMetadataQuery).
> >
> > It seems that when the union is already subject to a limit then the
> > new rule should not trigger. To achieve that one way would be to
> > introduce a condition that relies on RelMetadataQuery#getMaxRowCount
> > of the union (or its inputs) and bail out when the optimization is not
> > gonna bring some notable benefit.
> >
> > ```
> > LogicalSort[fetch = 10]
> >  LogicalUnion[all=true]
> >    LogicalSort[fetch = 10]
> >      <Input 1>
> >    LogicalSort[fetch = 10]
> >      <Input 2>
> > ```
> > In the plan above the max row count of the union should be 20 and
> > pushing a limit 10 is not gonna change that so the rule should detect
> > that (e.g., via getMaxRowCount) and not trigger again.
> >
> > If you start working with metadata classes you may find that some
> > cases are not fully handled so don't hesitate to raise JIRAs/PRs to
> > improve this area.
> >
> > Best,
> > Stamatis
> >
> > On Thu, Aug 29, 2024 at 11:31 PM Victor Barua
> > <victor.ba...@datadoghq.com.invalid> wrote:
> >>
> >> Hello!
> >>
> >> We've been attempting to implement a simple optimization rule in which
> we
> >> duplicate a limit through a UNION ALL to reduce the amount of data we
> need
> >> to fetch for a query.
> >>
> >> Starting from something like
> >> ```
> >> LogicalSort[fetch = 10]
> >>  LogicalUnion[all=true]
> >>    <Input 1>
> >>    <Input 2>
> >> ```
> >>
> >> We're trying to turn it into
> >> ```
> >> LogicalSort[fetch = 10]
> >>  LogicalUnion[all=true]
> >>    LogicalSort[fetch = 10]
> >>      <Input 1>
> >>    LogicalSort[fetch = 10]
> >>      <Input 2>
> >> ```
> >>
> >> This, somewhat expectedly, causes issues with the VolcanoPlanner because
> >> the newly generated relation is also a candidate for our rule so we end
> up
> >> with an infinite planning loop. We tried to take inspiration from the
> >> JoinCommuteRule, which uses the ensureRegistered method to prevent this
> (or
> >> at least that's what we think it's doing). Unfortunately, in our case
> this
> >> appears to be insufficient. I would appreciate any pointers and or
> >> suggestions around this. I've included the code for the raw rule below.
> >>
> >>
> >> ```
> >>
> >> @Value.Enclosing
> >> public class LimitThroughUnionRule extends
> RelRule<LimitThroughUnionRule.Config>
> >>    implements SubstitutionRule {
> >>
> >>  public static final LimitThroughUnionRule INSTANCE =
> >>      LimitThroughUnionRule.Config.DEFAULT.toRule();
> >>
> >>  protected LimitThroughUnionRule(Config config) {
> >>    super(config);
> >>  }
> >>
> >>  private RelNode pushLimitThrough(RelBuilder relBuilder, Sort sort,
> >> Union union) {
> >>    for (RelNode unionInput : union.getInputs()) {
> >>      relBuilder.push(unionInput).sortLimit(sort.offset, sort.fetch,
> >> Collections.emptyList());
> >>    }
> >>    relBuilder.union(true);
> >>    relBuilder.sortLimit(sort.offset, sort.fetch, sort.getSortExps());
> >>    return relBuilder.build();
> >>  }
> >>
> >>  @Override
> >>  public void onMatch(RelOptRuleCall call) {
> >>    Sort sort = call.rel(0);
> >>    Union union = call.rel(1);
> >>
> >>    RelNode newNode = pushLimitThrough(call.builder(), sort, union);
> >>    RelNode nextNode =
> >>        pushLimitThrough(call.builder(), (Sort) newNode, (Union)
> >> newNode.getInput(0));
> >>
> >>    call.transformTo(newNode);
> >>    call.getPlanner().ensureRegistered(nextNode, newNode);
> >>  }
> >>
> >>  @Value.Immutable
> >>  public interface Config extends RelRule.Config {
> >>    LimitThroughUnionRule.Config DEFAULT =
> >>        ImmutableLimitThroughUnionRule.Config.builder()
> >>            .operandSupplier(
> >>                b0 ->
> >>                    b0.operand(Sort.class)
> >>                        .predicate(sort -> !(RelOptUtil.isOrder(sort)
> >> || RelOptUtil.isOffset(sort)))
> >>                        .oneInput(u ->
> u.operand(Union.class).anyInputs()))
> >>            .build();
> >>
> >>    @Override
> >>    default LimitThroughUnionRule toRule() {
> >>      return new LimitThroughUnionRule(this);
> >>    }
> >>  }
> >> }
> >>
> >> ```
>
>

Reply via email to