Re: Optimizer question

Qifan Chen Thu, 01 Oct 2015 17:23:37 -0700

Good to know!  thanks --Qifan

On Thu, Oct 1, 2015 at 4:32 PM, Dave Birdsall <[email protected]>
wrote:


> Hi,
>
> That's true. I think I'll just leave this alone for now.
>
> A bit of good news: In my test results so far, I am seeing some of the
> regression test plans switch from "trafodion_delete" to a TSJ plan. In all
> the cases I've looked at, the TSJ plan seems like the better plan.
>
> Dave
>
> -----Original Message-----
> From: Qifan Chen [mailto:[email protected]]
> Sent: Thursday, October 1, 2015 2:04 PM
> To: [email protected]
> Subject: Re: Optimizer question
>
> Will this then favors too much on the TSJ plan?
>
> We still have the costing logic to help decide which one will win.
>
> Adding the two checks on missing keys do not disfavor the rule too much.
> And in general the costing will also disfavor such delete as well, if the
> checks are not added.
>
> Thanks.
>
>
> -Qifan
>
> Sent from my iPhone
>
> > On Oct 1, 2015, at 3:17 PM, Dave Birdsall <[email protected]>
> wrote:
> >
> > Hi,
> >
> > Actually, I've been wondering why the test is even present in
> > HbaseDeleteRule::nextSubstitute. Seems like we only want to consider
> > the "trafodion_delete" only plan when the number of rows is very small
> > (such as one). Maybe the right thing to do is simply remove the
> > "skey->areAllChosenPreds" call from the 'if' there (see the bottom of
> > this e-mail train)?
> >
> > Dave
> >
> > -----Original Message-----
> > From: Qifan Chen [mailto:[email protected]]
> > Sent: Thursday, October 1, 2015 1:13 PM
> > To: dev <[email protected]>
> > Subject: Re: Optimizer question
> >
> > Hi Dave,
> >
> > I guess the debating point is about the meaning of "areAllChosenPreds"
> > in method areAllChosenPredsEqualPreds().
> >
> > Since the subject is a searchKey instance, areAllChosenPreds is well
> > defined and should not be empty (at least my interpretation).
> >
> > I am afraid that if we remove those 4 lines of code, later if the same
> > method is used, we still forget to test the empty-ness of the key
> > which may be important.
> >
> > Thanks --Qifan
> >
> >> On Thu, Oct 1, 2015 at 2:31 PM, Qifan Chen <[email protected]>
> wrote:
> >>
> >> Hi Dave,
> >>
> >> You may be right with the mathematician's interpretation :-).
> >>
> >> My thinking of that method is that it tests two things:
> >> 1. existing predicates on keys
> >> 2. all these predicates are equal predicates
> >>
> >> If there is no predicate on keys (which is this case is about), then
> >> the 1st test above fail, and therefore the method should return false.
> >>
> >> Thanks --Qifan
> >>
> >> On Thu, Oct 1, 2015 at 1:44 PM, Dave Birdsall
> >> <[email protected]>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Hmmm... I would have thought the logical thing would be: every
> >>> universally quantified predicate is true on an empty set. At least
> >>> that's the way a mathematician or logician would look at it.
> >>>
> >>> I could add the extra checks, but this and one other place (similar
> >>> code in an update rule) are the only places where this particular
> >>> function
> >>> areAllChosenPredsEqualPreds() are called. It would seem simpler to
> >>> delete
> >>> 4
> >>> lines of code than to add 2 more tests in 2 places?
> >>>
> >>> Dave
> >>>
> >>> -----Original Message-----
> >>> From: Qifan Chen [mailto:[email protected]]
> >>> Sent: Thursday, October 1, 2015 11:39 AM
> >>> To: dev <[email protected]>
> >>> Subject: Re: Optimizer question
> >>>
> >>> Hi Dave,
> >>>
> >>> From logic point, I think the code below does the right thing,
> >>> because there is no predicates present.
> >>>
> >>>
> >>>  if ((allChosenPredsAreEqualPreds) &&
> >>>      (allBeginKeysMissing) &&
> >>>      (allEndKeysMissing))
> >>>    allChosenPredsAreEqualPreds = FALSE;
> >>>
> >>>
> >>> To fix the problem you found, I wonder if we add some additional
> >>> checks shown below.
> >>>
> >>>  if ((! skey->isUnique()) &&
> >>>
> >>>      (skey->areAllChosenPredsEqualPreds() ||
> >>> skey->areAllBeginKeysMissing() || skey->areAllEndKeysMissing()) &&
> >>>
> >>>      (NOT bef->noCheck()) &&
> >>>
> >>>      (CmpCommon::getDefault(HBASE_SQL_IUD_SEMANTICS) == DF_ON) &&
> >>>
> >>>      (CmpCommon::getDefault(HBASE_UPDEL_CURSOR_OPT) == DF_ON) &&
> >>>
> >>>      ((ActiveSchemaDB()->getDefaults()).getAsLong(COMP_INT_74) ==
> >>> 0))
> >>>
> >>>    result = NULL;
> >>>
> >>> thanks --Qifan
> >>>
> >>> On Thu, Oct 1, 2015 at 1:00 PM, Dave Birdsall
> >>> <[email protected]>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>>
> >>>>
> >>>> I'm working on code to cost DELETEs in Trafodion.
> >>>>
> >>>>
> >>>>
> >>>> One of the rules in the Optimizer (see ImplRule.cpp) is the
> >>>> HbaseDeleteRule. This rule decides whether to consider a delete
> >>>> plan consisting of just a "trafodion_delete" node. By
> >>>> experimentation, it seems to fire when all of the keys are covered
> >>>> with equality predicates. If only a leading subset of the keys are
> >>>> covered, it does not fire. But if none of the keys are covered by
> >>>> any predicates
> >>> (example:
> >>>> delete from t;), it fires!
> >>>>
> >>>>
> >>>>
> >>>> I can see considering a "trafodion_delete" only plan if all the
> >>>> keys are covered. We are deleting a unique row. This plan uses the
> >>>> Hbase delete API, deleting one row at a time. When there is only
> >>>> one row, this makes perfect sense.
> >>>>
> >>>>
> >>>>
> >>>> The alternative plan is a Scan + a Tuple_flow + a Trafodion_delete
> >>>> or Trafodion_delete_vsbb, the latter uses the Hbase deleteRows API
> >>>> to delete several rows in one shot. The alternative plan is likely
> >>>> more efficient whenever we are deleting multiple rows.
> >>>>
> >>>>
> >>>>
> >>>> So it makes sense not to consider a "trafodion_delete" only plan
> >>>> when not all the keys are covered. But what seems odd is the case
> >>>> where there are no key predicates at all. Why would we consider
> >>>> such a plan then?
> >>>>
> >>>>
> >>>>
> >>>> Drilling down into the code: In HbaseDeleteRule::nextSubstitute is
> >>>> the
> >>>> following:
> >>>>
> >>>>
> >>>>
> >>>>  if ((! skey->isUnique()) &&
> >>>>
> >>>>      (skey->areAllChosenPredsEqualPreds()) &&
> >>>>
> >>>>      (NOT bef->noCheck()) &&
> >>>>
> >>>>      (CmpCommon::getDefault(HBASE_SQL_IUD_SEMANTICS) == DF_ON) &&
> >>>>
> >>>>      (CmpCommon::getDefault(HBASE_UPDEL_CURSOR_OPT) == DF_ON) &&
> >>>>
> >>>>      ((ActiveSchemaDB()->getDefaults()).getAsLong(COMP_INT_74) ==
> >>>> 0))
> >>>>
> >>>>    result = NULL;
> >>>>
> >>>>
> >>>>
> >>>> When this 'if' is taken, the  "trafodion_delete" only plan is not
> >>>> considered.
> >>>>
> >>>>
> >>>>
> >>>> Here, "skey" is a SearchKey object which represents a begin/end key
> >>>> pair, that is, a subset of rows within the clustering key range to
> >>>> be processed.
> >>>> For the unique row case, "skey->isUnique()" returns true, so the "if"
> >>>> is not taken. So, the "trafodion_delete" only plan IS considered.
> >>>>
> >>>>
> >>>>
> >>>> For the non-unique case, "skey->isUnique()" returns false, and we
> >>>> look at the other conditions in the 'if'. If the key is partially
> >>>> covered with equality predicates,
> >>>> "skey->areAllChosenPredsEqualPreds()" returns true.
> >>>> The other conditions also return true, and the 'if' is taken. So
> >>>> the "trafodion_delete" only plan IS NOT considered. But if there
> >>>> are no key predicates at all (the "delete from t;" example),
> >>>> "skey->areAllChosenPredsEqualPreds()" returns false.
> >>>>
> >>>>
> >>>>
> >>>> I think this is a bug. I think "skey->areAllChosenPredsEqualPreds()"
> >>>> should return true in this case also. It's an example of the old
> >>>> not-handling-the-empty-set-case-properly bug.
> >>>>
> >>>>
> >>>>
> >>>> I investigated how this particular flag is set. At the end of
> >>>> SearchKeyWorkSpace::computeBeginKeyAndEndKeyValues (in file
> >>>> SearchKey.cpp) is the following code:
> >>>>
> >>>>
> >>>>
> >>>>  if ((allChosenPredsAreEqualPreds) &&
> >>>>
> >>>>      (allBeginKeysMissing) &&
> >>>>
> >>>>      (allEndKeysMissing))
> >>>>
> >>>>    allChosenPredsAreEqualPreds = FALSE;
> >>>>
> >>>>
> >>>>
> >>>> So, there is explicit code there to treat the empty set case
> >>> differently.
> >>>> So, someone, somewhere, thought this was the right thing to do.
> >>>>
> >>>>
> >>>>
> >>>> But, I'm guessing the right thing to do is to simply delete this
> code.
> >>>>
> >>>>
> >>>>
> >>>> Does anyone have an opinion?
> >>>>
> >>>>
> >>>>
> >>>> Thanks,
> >>>>
> >>>>
> >>>>
> >>>> Dave
> >>>
> >>>
> >>>
> >>> --
> >>> Regards, --Qifan
> >>
> >>
> >>
> >> --
> >> Regards, --Qifan
> >
> >
> > --
> > Regards, --Qifan
>



-- 
Regards, --Qifan

Re: Optimizer question

Reply via email to