Re: Implement predicate propagation for non-equivalence clauses

Alexander Kuzmenkov Thu, 22 Nov 2018 08:17:02 -0800

Hi Richard,

I took a look at the v2, here are some comments:

* This feature needs tests, especially for the cases where opfamilies or data types or collations don't match, and other non-obvious cases where it shouldn't work.

* Deducing an inequality to a constant is not always helpful. If we know that a = b and a = const1 and b < const2, we will deduce b = const1 from the EC, and adding b < const2 doesn't improve selectivity and only makes the cost estimate worse. One situation where it does make sense is when we can detect contradictions in these clauses, e.g. if we know that const1 > const2 and therefore know that the above selection clause is always false. Looking at the regression test changes, I see that v2 doesn't do that. I think the handling of deduced inequalities should be modeled on the flow of generate_base_implied_equalities -> process_implied_equality -> distribute_qual_to_rels. This would allow us to correctly handle a deduced const1 < const2 clause and turn it into a gating Result qual.

* There are functions named generate_implied_quals, gen_implied_quals and gen_implied_qual. The names are confusingly similar; we could use something like generate_implied_quals_for_clause and generate_one_implied_qual for the latter two.
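
To illustrate the contradiction case from the second comment above, here is a toy sketch. It is not planner code: CmpOp, eval_const_cmp and deduced_clause_is_contradiction are hypothetical names, and constants are plain ints rather than Datums compared through an opfamily. It only shows that substituting the EC constant into "b op const2" yields a variable-free clause that can be evaluated once.

```c
#include <stdbool.h>

/* Hypothetical, simplified model: integer constants and four operators. */
typedef enum CmpOp { CMP_LT, CMP_LE, CMP_GT, CMP_GE } CmpOp;

/* Evaluate "lhs op rhs" for two known constants. */
bool
eval_const_cmp(int lhs, CmpOp op, int rhs)
{
    switch (op)
    {
        case CMP_LT: return lhs < rhs;
        case CMP_LE: return lhs <= rhs;
        case CMP_GT: return lhs > rhs;
        case CMP_GE: return lhs >= rhs;
    }
    return false;               /* unreachable for valid CmpOp values */
}

/*
 * The EC tells us b = ec_const, and there is a clause "b op bound".
 * Substituting gives the variable-free clause "ec_const op bound".
 * If that evaluates to false, the original qual is a contradiction.
 */
bool
deduced_clause_is_contradiction(int ec_const, CmpOp op, int bound)
{
    return !eval_const_cmp(ec_const, op, bound);
}
```

In the real planner the equivalent of a true result here would presumably be a constant-false clause that ends up as a gating Result qual, as described above; the sketch does not model that part.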



@@ gen_implied_quals
    else if (clause && IsA(clause, ScalarArrayOpExpr))
    {
* When can the clause be NULL?


@@ gen_implied_quals
    item1 = canonicalize_ec_expression(item1,
                                       exprType((Node *) item1),
                                       collation);
    item2 = canonicalize_ec_expression(item2,
                                       exprType((Node *) item2),
                                       collation);

* Why do we do this? If the collation or type of the original expression isn't right and we have to add RelabelType, the resulting expression won't match the original clause and we won't be able to substitute it with other equivalence members. So we can just check that the type and collation match.

* In gen_implied_quals, I'd rename item1 => left, item2 => right, em1 => orig_em, em2 => other_em, and same for list cells and types. As it is now, em1 can actually match both item1 and item2, and em2 is not related to either of them, so it took me some effort to understand what's going on.

* In gen_implied_qual, why do we search the clause for a matching subexpression? Reading the thread, I thought that we can only do the substitution for OpExprs of the same opfamilies as the generating EC. This code looks like it can do the substitution at an arbitrary depth, so we might change an argument of some unsuitable function, and the result will not be correct. What we should probably do is that after we matched one side of OpExpr to one EC member, we just replace it with another suitable member and add the resulting clause.
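
The top-level-only substitution suggested in the last comment above could look roughly like the following toy sketch. Everything in it is an assumption made for illustration: Expr, OpClause, EquivClass, ec_contains and generate_implied_clauses are hypothetical stand-ins, expressions are compared by pointer identity instead of equal(), and an opfamily is just an int. The point is that only the two direct operands of the clause are matched against EC members, with no descent into subexpressions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EC_MEMBERS 8

/* Hypothetical toy types; expressions are compared by pointer identity. */
typedef struct Expr { const char *name; } Expr;

typedef struct OpClause
{
    int         opfamily;
    const Expr *left;
    const Expr *right;
} OpClause;

typedef struct EquivClass
{
    int         opfamily;
    const Expr *members[MAX_EC_MEMBERS];
    size_t      nmembers;
} EquivClass;

/* Is e one of the members of ec? */
bool
ec_contains(const EquivClass *ec, const Expr *e)
{
    for (size_t i = 0; i < ec->nmembers; i++)
        if (ec->members[i] == e)
            return true;
    return false;
}

/*
 * For each direct operand of the clause that is an EC member, emit a copy
 * of the clause with that operand replaced by every other EC member.
 * Returns the number of clauses written to out (at most max_out).
 */
size_t
generate_implied_clauses(const OpClause *clause, const EquivClass *ec,
                         OpClause *out, size_t max_out)
{
    size_t      n = 0;

    /* Substitution is only attempted when the opfamilies agree. */
    if (clause->opfamily != ec->opfamily)
        return 0;

    for (int side = 0; side < 2; side++)
    {
        const Expr *matched = (side == 0) ? clause->left : clause->right;
        const Expr *other = (side == 0) ? clause->right : clause->left;

        if (!ec_contains(ec, matched))
            continue;

        for (size_t i = 0; i < ec->nmembers && n < max_out; i++)
        {
            const Expr *repl = ec->members[i];

            /* Skip the original member and avoid producing "x op x". */
            if (repl == matched || repl == other)
                continue;

            out[n] = *clause;
            if (side == 0)
                out[n].left = repl;
            else
                out[n].right = repl;
            n++;
        }
    }
    return n;
}
```

With an EC {a, b} and a clause "a < 5" in the same opfamily, this produces the single clause "b < 5"; with a different opfamily it produces nothing.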



@@ gen_implied_qual
    check_mergejoinable(new_rinfo);
    check_hashjoinable(new_rinfo);
...
    /*
     * If the clause has a mergejoinable operator, set the EquivalenceClass
     * links. Otherwise, a mergejoinable operator with NULL left_ec/right_ec
     * will cause update_mergeclause_eclasses fails at assertion.
     */
    if (new_rinfo->mergeopfamilies)
        initialize_mergeclause_eclasses(root, new_rinfo);

* It's not an equality clause, so it is not going to be mergejoinable nor hashjoinable and we can skip these checks.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
