RE: Optimizer question

Dave Birdsall Thu, 01 Oct 2015 14:33:13 -0700

Hi,

That's true. I think I'll just leave this alone for now.


A bit of good news: In my test results so far, I am seeing some of the
regression test plans switch from "trafodion_delete" to a TSJ plan. In all
the cases I've looked at, the TSJ plan seems like the better plan.

Dave

-----Original Message-----
From: Qifan Chen [mailto:[email protected]]
Sent: Thursday, October 1, 2015 2:04 PM
To: [email protected]
Subject: Re: Optimizer question

Will this then favors too much on the TSJ plan?

We still have the costing logic to help decide which one will win.

Adding the two checks on missing keys do not disfavor the rule too much.
And in general the costing will also disfavor such delete as well, if the
checks are not added.

Thanks.


-Qifan

Sent from my iPhone

> On Oct 1, 2015, at 3:17 PM, Dave Birdsall <[email protected]>
wrote:
>
> Hi,
>
> Actually, I've been wondering why the test is even present in
> HbaseDeleteRule::nextSubstitute. Seems like we only want to consider
> the "trafodion_delete" only plan when the number of rows is very small
> (such as one). Maybe the right thing to do is simply remove the
> "skey->areAllChosenPreds" call from the 'if' there (see the bottom of
> this e-mail train)?
>
> Dave
>
> -----Original Message-----
> From: Qifan Chen [mailto:[email protected]]
> Sent: Thursday, October 1, 2015 1:13 PM
> To: dev <[email protected]>
> Subject: Re: Optimizer question
>
> Hi Dave,
>
> I guess the debating point is about the meaning of "areAllChosenPreds"
> in method areAllChosenPredsEqualPreds().
>
> Since the subject is a searchKey instance, areAllChosenPreds is well
> defined and should not be empty (at least my interpretation).
>
> I am afraid that if we remove those 4 lines of code, later if the same
> method is used, we still forget to test the empty-ness of the key
> which may be important.
>
> Thanks --Qifan
>
>> On Thu, Oct 1, 2015 at 2:31 PM, Qifan Chen <[email protected]>
wrote:
>>
>> Hi Dave,
>>
>> You may be right with the mathematician's interpretation :-).
>>
>> My thinking of that method is that it tests two things:
>> 1. existing predicates on keys
>> 2. all these predicates are equal predicates
>>
>> If there is no predicate on keys (which is this case is about), then
>> the 1st test above fail, and therefore the method should return false.
>>
>> Thanks --Qifan
>>
>> On Thu, Oct 1, 2015 at 1:44 PM, Dave Birdsall
>> <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> Hmmm... I would have thought the logical thing would be: every
>>> universally quantified predicate is true on an empty set. At least
>>> that's the way a mathematician or logician would look at it.
>>>
>>> I could add the extra checks, but this and one other place (similar
>>> code in an update rule) are the only places where this particular
>>> function
>>> areAllChosenPredsEqualPreds() are called. It would seem simpler to
>>> delete
>>> 4
>>> lines of code than to add 2 more tests in 2 places?
>>>
>>> Dave
>>>
>>> -----Original Message-----
>>> From: Qifan Chen [mailto:[email protected]]
>>> Sent: Thursday, October 1, 2015 11:39 AM
>>> To: dev <[email protected]>
>>> Subject: Re: Optimizer question
>>>
>>> Hi Dave,
>>>
>>> From logic point, I think the code below does the right thing,
>>> because there is no predicates present.
>>>
>>>
>>>  if ((allChosenPredsAreEqualPreds) &&
>>>      (allBeginKeysMissing) &&
>>>      (allEndKeysMissing))
>>>    allChosenPredsAreEqualPreds = FALSE;
>>>
>>>
>>> To fix the problem you found, I wonder if we add some additional
>>> checks shown below.
>>>
>>>  if ((! skey->isUnique()) &&
>>>
>>>      (skey->areAllChosenPredsEqualPreds() ||
>>> skey->areAllBeginKeysMissing() || skey->areAllEndKeysMissing()) &&
>>>
>>>      (NOT bef->noCheck()) &&
>>>
>>>      (CmpCommon::getDefault(HBASE_SQL_IUD_SEMANTICS) == DF_ON) &&
>>>
>>>      (CmpCommon::getDefault(HBASE_UPDEL_CURSOR_OPT) == DF_ON) &&
>>>
>>>      ((ActiveSchemaDB()->getDefaults()).getAsLong(COMP_INT_74) ==
>>> 0))
>>>
>>>    result = NULL;
>>>
>>> thanks --Qifan
>>>
>>> On Thu, Oct 1, 2015 at 1:00 PM, Dave Birdsall
>>> <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> I'm working on code to cost DELETEs in Trafodion.
>>>>
>>>>
>>>>
>>>> One of the rules in the Optimizer (see ImplRule.cpp) is the
>>>> HbaseDeleteRule. This rule decides whether to consider a delete
>>>> plan consisting of just a "trafodion_delete" node. By
>>>> experimentation, it seems to fire when all of the keys are covered
>>>> with equality predicates. If only a leading subset of the keys are
>>>> covered, it does not fire. But if none of the keys are covered by
>>>> any predicates
>>> (example:
>>>> delete from t;), it fires!
>>>>
>>>>
>>>>
>>>> I can see considering a "trafodion_delete" only plan if all the
>>>> keys are covered. We are deleting a unique row. This plan uses the
>>>> Hbase delete API, deleting one row at a time. When there is only
>>>> one row, this makes perfect sense.
>>>>
>>>>
>>>>
>>>> The alternative plan is a Scan + a Tuple_flow + a Trafodion_delete
>>>> or Trafodion_delete_vsbb, the latter uses the Hbase deleteRows API
>>>> to delete several rows in one shot. The alternative plan is likely
>>>> more efficient whenever we are deleting multiple rows.
>>>>
>>>>
>>>>
>>>> So it makes sense not to consider a "trafodion_delete" only plan
>>>> when not all the keys are covered. But what seems odd is the case
>>>> where there are no key predicates at all. Why would we consider
>>>> such a plan then?
>>>>
>>>>
>>>>
>>>> Drilling down into the code: In HbaseDeleteRule::nextSubstitute is
>>>> the
>>>> following:
>>>>
>>>>
>>>>
>>>>  if ((! skey->isUnique()) &&
>>>>
>>>>      (skey->areAllChosenPredsEqualPreds()) &&
>>>>
>>>>      (NOT bef->noCheck()) &&
>>>>
>>>>      (CmpCommon::getDefault(HBASE_SQL_IUD_SEMANTICS) == DF_ON) &&
>>>>
>>>>      (CmpCommon::getDefault(HBASE_UPDEL_CURSOR_OPT) == DF_ON) &&
>>>>
>>>>      ((ActiveSchemaDB()->getDefaults()).getAsLong(COMP_INT_74) ==
>>>> 0))
>>>>
>>>>    result = NULL;
>>>>
>>>>
>>>>
>>>> When this 'if' is taken, the  "trafodion_delete" only plan is not
>>>> considered.
>>>>
>>>>
>>>>
>>>> Here, "skey" is a SearchKey object which represents a begin/end key
>>>> pair, that is, a subset of rows within the clustering key range to
>>>> be processed.
>>>> For the unique row case, "skey->isUnique()" returns true, so the "if"
>>>> is not taken. So, the "trafodion_delete" only plan IS considered.
>>>>
>>>>
>>>>
>>>> For the non-unique case, "skey->isUnique()" returns false, and we
>>>> look at the other conditions in the 'if'. If the key is partially
>>>> covered with equality predicates,
>>>> "skey->areAllChosenPredsEqualPreds()" returns true.
>>>> The other conditions also return true, and the 'if' is taken. So
>>>> the "trafodion_delete" only plan IS NOT considered. But if there
>>>> are no key predicates at all (the "delete from t;" example),
>>>> "skey->areAllChosenPredsEqualPreds()" returns false.
>>>>
>>>>
>>>>
>>>> I think this is a bug. I think "skey->areAllChosenPredsEqualPreds()"
>>>> should return true in this case also. It's an example of the old
>>>> not-handling-the-empty-set-case-properly bug.
>>>>
>>>>
>>>>
>>>> I investigated how this particular flag is set. At the end of
>>>> SearchKeyWorkSpace::computeBeginKeyAndEndKeyValues (in file
>>>> SearchKey.cpp) is the following code:
>>>>
>>>>
>>>>
>>>>  if ((allChosenPredsAreEqualPreds) &&
>>>>
>>>>      (allBeginKeysMissing) &&
>>>>
>>>>      (allEndKeysMissing))
>>>>
>>>>    allChosenPredsAreEqualPreds = FALSE;
>>>>
>>>>
>>>>
>>>> So, there is explicit code there to treat the empty set case
>>> differently.
>>>> So, someone, somewhere, thought this was the right thing to do.
>>>>
>>>>
>>>>
>>>> But, I'm guessing the right thing to do is to simply delete this
code.
>>>>
>>>>
>>>>
>>>> Does anyone have an opinion?
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>>
>>>> Dave
>>>
>>>
>>>
>>> --
>>> Regards, --Qifan
>>
>>
>>
>> --
>> Regards, --Qifan
>
>
> --
> Regards, --Qifan

RE: Optimizer question

Reply via email to