Re: Teach predtest about IS [NOT] proofs

2024-04-05 Thread James Coleman
On Mon, Apr 1, 2024 at 8:06 AM James Coleman  wrote:
>
> On Mon, Mar 25, 2024 at 5:53 PM Tom Lane  wrote:
> >
> > James Coleman  writes:
> > > [ v6 patchset ]
> >
> > I went ahead and committed 0001 after one more round of review
> >
> > statements; my bad).  I also added the changes in test_predtest.c from
> > 0002.  I attach a rebased version of 0002, as well as 0003 which isn't
> > changed, mainly to keep the cfbot happy.
> >
> > I'm still not happy with what you did in predicate_refuted_by_recurse:
> > it feels wrong and rather expensively so.  There has to be a better
> > way.  Maybe strong vs. weak isn't quite the right formulation for
> > refutation tests?
>
> Possibly. Earlier I'd mused that:
>
> > Alternatively (to avoid unnecessary CPU burn) we could modify
> > predicate_implied_by_recurse (and functionals called by it) to have a
> > argument beyond "weak = true/false" Ie.g., an enum that allows for
> > something like "WEAK", "STRONG", and "EITHER". That's a bigger change,
> > so I didn't want to do that right away unless there was agreement on
> > that direction.
>
> I'm going to try implementing that and see how I feel about what it
> looks like in practice.

Attached is v8 which does this. Note that I kept the patch 0001 as
before and inserted a new 0002 to show exactly what's changed from the
previously version -- I wouldn't expect that to be committed
separately, of course. With this change we only need to recurse a
single time and can check for both strong and weak refutation when
either will do for proving refutation of the "NOT x" construct.

Regards,
James Coleman


v8-0001-Teach-predtest.c-about-BooleanTest.patch
Description: Binary data


v8-0003-Add-temporary-all-permutations-test.patch
Description: Binary data


v8-0002-Recurse-weak-and-strong-implication-at-the-same-t.patch
Description: Binary data


Re: Teach predtest about IS [NOT] proofs

2024-04-01 Thread James Coleman
On Mon, Mar 25, 2024 at 5:53 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > [ v6 patchset ]
>
> I went ahead and committed 0001 after one more round of review
>
> statements; my bad).  I also added the changes in test_predtest.c from
> 0002.  I attach a rebased version of 0002, as well as 0003 which isn't
> changed, mainly to keep the cfbot happy.
>
> I'm still not happy with what you did in predicate_refuted_by_recurse:
> it feels wrong and rather expensively so.  There has to be a better
> way.  Maybe strong vs. weak isn't quite the right formulation for
> refutation tests?

Possibly. Earlier I'd mused that:

> Alternatively (to avoid unnecessary CPU burn) we could modify
> predicate_implied_by_recurse (and functionals called by it) to have a
> argument beyond "weak = true/false" Ie.g., an enum that allows for
> something like "WEAK", "STRONG", and "EITHER". That's a bigger change,
> so I didn't want to do that right away unless there was agreement on
> that direction.

I'm going to try implementing that and see how I feel about what it
looks like in practice.

Regards,
James Coleman




Re: Teach predtest about IS [NOT] proofs

2024-04-01 Thread James Coleman
On Mon, Mar 25, 2024 at 11:45 PM Tom Lane  wrote:
>
> I wrote:
> > I went ahead and committed 0001 after one more round of review
> >
> > statements; my bad).  I also added the changes in test_predtest.c from
> > 0002.  I attach a rebased version of 0002, as well as 0003 which isn't
> > changed, mainly to keep the cfbot happy.
>
> [ squint.. ]  Apparently I managed to hit ^K right before sending this
> email.  The missing line was meant to be more or less
>
> > which found a couple of missing "break"
>
> Not too important, but perhaps future readers of the archives will
> be confused.

I was wondering myself :) so thanks for clarifying.

Regards,
James Coleman




Re: Teach predtest about IS [NOT] proofs

2024-03-25 Thread Tom Lane
I wrote:
> I went ahead and committed 0001 after one more round of review
> 
> statements; my bad).  I also added the changes in test_predtest.c from
> 0002.  I attach a rebased version of 0002, as well as 0003 which isn't
> changed, mainly to keep the cfbot happy.

[ squint.. ]  Apparently I managed to hit ^K right before sending this
email.  The missing line was meant to be more or less

> which found a couple of missing "break"

Not too important, but perhaps future readers of the archives will
be confused.

regards, tom lane




Re: Teach predtest about IS [NOT] proofs

2024-03-25 Thread Tom Lane
James Coleman  writes:
> [ v6 patchset ]

I went ahead and committed 0001 after one more round of review

statements; my bad).  I also added the changes in test_predtest.c from
0002.  I attach a rebased version of 0002, as well as 0003 which isn't
changed, mainly to keep the cfbot happy.

I'm still not happy with what you did in predicate_refuted_by_recurse:
it feels wrong and rather expensively so.  There has to be a better
way.  Maybe strong vs. weak isn't quite the right formulation for
refutation tests?

regards, tom lane

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index 6e3b376f3d..5bb5bb4f0e 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -99,6 +99,8 @@ static bool predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
 			   bool weak);
 static bool predicate_refuted_by_simple_clause(Expr *predicate, Node *clause,
 			   bool weak);
+static bool predicate_implied_not_null_by_clause(Expr *predicate, Node *clause,
+ bool weak);
 static Node *extract_not_arg(Node *clause);
 static Node *extract_strong_not_arg(Node *clause);
 static bool clause_is_strict_for(Node *clause, Node *subexpr, bool allow_false);
@@ -198,6 +200,11 @@ predicate_implied_by(List *predicate_list, List *clause_list,
  * (i.e., B must yield false or NULL).  We use this to detect mutually
  * contradictory WHERE clauses.
  *
+ * A notable difference between implication and refutation proofs is that
+ * strong/weak refutations don't vary the input of A (both must be true) but
+ * vary the allowed outcomes of B (false vs. non-truth), while for implications
+ * we vary both A (truth vs. non-falsity) and B (truth vs. non-falsity).
+ *
  * Weak refutation can be proven in some cases where strong refutation doesn't
  * hold, so it's useful to use it when possible.  We don't currently have
  * support for disproving one CHECK constraint based on another one, nor for
@@ -740,6 +747,16 @@ predicate_refuted_by_recurse(Node *clause, Node *predicate,
 			 !weak))
 return true;
 
+			/*
+			 * Because weak refutation expands the allowed outcomes for B
+			 * from "false" to "false or null", we can additionally prove
+			 * weak refutation in the case that strong refutation is proven.
+			 */
+			if (weak && not_arg &&
+predicate_implied_by_recurse(predicate, not_arg,
+			 true))
+return true;
+
 			switch (pclass)
 			{
 case CLASS_AND:
@@ -1137,21 +1154,27 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
 
 	Assert(list_length(op->args) == 2);
 	rightop = lsecond(op->args);
-	/* We might never see null Consts here, but better check */
-	if (rightop && IsA(rightop, Const) &&
-		!((Const *) rightop)->constisnull)
+	if (rightop && IsA(rightop, Const))
 	{
+		Const	*constexpr = (Const *) rightop;
 		Node	   *leftop = linitial(op->args);
 
-		if (DatumGetBool(((Const *) rightop)->constvalue))
+		/*
+		 * We might never see a null Const here, but better
+		 * check anyway.
+		 */
+		if (constexpr->constisnull)
+			return false;
+
+		if (DatumGetBool(constexpr->constvalue))
 		{
-			/* X = true implies X */
+			/* x = true implies x */
 			if (equal(predicate, leftop))
 return true;
 		}
 		else
 		{
-			/* X = false implies NOT X */
+			/* x = false implies NOT x */
 			if (is_notclause(predicate) &&
 equal(get_notclausearg(predicate), leftop))
 return true;
@@ -1160,6 +1183,97 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
 }
 			}
 			break;
+		case T_NullTest:
+			{
+NullTest *clausentest = (NullTest *) clause;
+
+/*
+ * row IS NOT NULL does not act in the simple way we have in
+ * mind
+ */
+if (clausentest->argisrow)
+	return false;
+
+switch (clausentest->nulltesttype)
+{
+	case IS_NULL:
+		/*
+		 * A clause in the form "foo IS NULL" implies a
+		 * predicate "NOT foo" that is strict for "foo", but
+		 * only weakly since "foo" being null will result in
+		 * the clause evaluating to true while the predicate
+		 * will evaluate to null.
+		 */
+		if (weak && is_notclause(predicate) &&
+			clause_is_strict_for((Node *) get_notclausearg(predicate), (Node *) clausentest->arg, true))
+			return true;
+
+		break;
+	case IS_NOT_NULL:
+		break;
+}
+			}
+			break;
+		case T_BooleanTest:
+			{
+BooleanTest	*clausebtest = (BooleanTest *) clause;
+
+switch (clausebtest->booltesttype)
+{
+	case IS_TRUE:
+		/* x IS TRUE implies x */
+		if (equal(predicate, clausebtest->arg))
+			return true;
+		break;
+	case IS_FALSE:
+		/* x IS FALSE implies NOT x */
+		if (is_notclause(predicate) &&
+			equal(get_notclausearg(predicate), clausebtest->arg))
+			

Re: Teach predtest about IS [NOT] proofs

2024-01-22 Thread Tom Lane
James Coleman  writes:
> 0001 does the initial pure refactor. 0003 makes a lot of modifications
> to what we can prove about implication and refutation. Finally, 0003
> isn't intended to be committed, but attempts to validate more
> holistically that none of the changes creates any invalid proofs
> beyond the mostly happy-path tests added in 0004.

> I ended up not tackling changing how test_predtest tests run for now.
> That's plausibly still useful, and I'd be happy to add that if you
> generally agree with the direction of the patch and with that
> abstraction being useful.

> I added some additional verifications to the test_predtest module to
> prevent additional obvious flaws.

I looked through 0001 and made some additional cosmetic changes,
mostly to get comments closer to the associated code; I also
ran pgindent on it (see v5-0001 attached).  That seems pretty
committable to me at this point.  I also like your 0002 additions to
test_predtest.c (although why the mixture of ERROR and WARNING?
ISTM they should all be WARNING, so we can press on with the test).

One other thought is that maybe separating out
predicate_implied_not_null_by_clause should be part of 0001?

I'm less excited about the rest of v4-0002.

@@ -740,6 +747,16 @@ predicate_refuted_by_recurse(Node *clause, Node *predicate,
  !weak))
 return true;
 
+/*
+ * Because weak refutation expands the allowed outcomes for B
+ * from "false" to "false or null", we can additionally prove
+ * weak refutation in the case that strong refutation is proven.
+ */
+if (weak && not_arg &&
+predicate_implied_by_recurse(predicate, not_arg,
+ true))
+return true;
+
 switch (pclass)
 {
 case CLASS_AND:

I don't buy this bit at all.  If the prior recursive call fails to
prove weak refutation in a case where strong refutation holds, how is
that not a bug lower down?  Moreover, in order to mask such a bug,
you're doubling the time taken by failed proofs, which is an
unfortunate thing --- we don't like spending a lot of time on
something that fails to improve the plan.

@@ -1138,32 +1155,114 @@ predicate_implied_by_simple_clause(Expr *predicate, 
Node *clause,
 Assert(list_length(op->args) == 2);
 rightop = lsecond(op->args);
 
-/*
- * We might never see a null Const here, but better check
- * anyway
- */
-if (rightop && IsA(rightop, Const) &&
-!((Const *) rightop)->constisnull)
+if (rightop && IsA(rightop, Const))
 {
+Const*constexpr = (Const *) rightop;
 Node   *leftop = linitial(op->args);
 
-if (DatumGetBool(((Const *) rightop)->constvalue))
-{
-/* X = true implies X */
-if (equal(predicate, leftop))
-return true;
-}
+if (constexpr->constisnull)
+return false;
+
+if (DatumGetBool(constexpr->constvalue))
+return equal(predicate, leftop);
 else
-{
-/* X = false implies NOT X */
-if (is_notclause(predicate) &&
-equal(get_notclausearg(predicate), leftop))
-return true;
-}
+return is_notclause(predicate) &&
+equal(get_notclausearg(predicate), leftop);
 }
 }
 }
 break;

I don't understand what this bit is doing ... and the fact that
the patch removes all the existing comments and adds none isn't
helping that.  What it seems to mostly be doing is adding early
"return false"s, which I'm not sure is a good thing, because
it seems possible that operator_predicate_proof could apply here.

+case IS_UNKNOWN:
+/*
+ * When the clause is in the form "foo IS UNKNOWN" then
+ * we can prove weak implication of a predicate that
+ * is strict for "foo" and negated. This doesn't work
+ * for strong implication since if "foo" is "null" then
+ * the predicate will evaluate to "null" rather than
+ * "true".
+ */

The phrasing of this comment seems randomly inconsistent with others
making similar arguments.

+ 

Re: Teach predtest about IS [NOT] proofs

2024-01-17 Thread James Coleman
On Fri, Dec 22, 2023 at 2:48 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > I've not yet applied all of your feedback, but I wanted to get an
> > initial read on your thoughts on how using switch statements ends up
> > looking. Attached is a single (pure refactor) patch that converts the
> > various if/else levels that check things like node tag and
> > boolean/null test type into switch statements. I've retained 'default'
> > keyword usages for now for simplicity (my intuition is that we
> > generally prefer to list out all options for compiler safety benefits,
> > though I'm not 100% sure that's useful in the outer node tag check
> > since it's unlikely that someone adding a new node would modify
> > this...).
>
> > My big question is: are you comfortable with the indentation explosion
> > this creates? IMO it's a lot wordier, but it is also more obvious what
> > the structural goal is. I'm not sure how we want to make the right
> > trade-off though.
>
> Yeah, I see what you mean.  Also, I'd wanted to shove most of
> the text in the function header in-line and get rid of the short
> restatements of those paras.  I carried that through just for
> predicate_implied_by_simple_clause, as attached.  The structure is
> definitely clearer, but we end up with an awful lot of indentation,
> which makes the comments less readable than I'd like.  (I did some
> minor rewording to make them flow better.)
>
> On balance I think this is probably better than what we have, but
> maybe we'd be best off to avoid doubly nested switches?  I think
> there's a good argument for the outer switch on nodeTag, but
> maybe we're getting diminishing returns from an inner switch.
>
> regards, tom lane
>

Apologies for the long delay.

Attached is a new patch series.

0001 does the initial pure refactor. 0003 makes a lot of modifications
to what we can prove about implication and refutation. Finally, 0003
isn't intended to be committed, but attempts to validate more
holistically that none of the changes creates any invalid proofs
beyond the mostly happy-path tests added in 0004.

I ended up not tackling changing how test_predtest tests run for now.
That's plausibly still useful, and I'd be happy to add that if you
generally agree with the direction of the patch and with that
abstraction being useful.

I added some additional verifications to the test_predtest module to
prevent additional obvious flaws.

Regards,
James Coleman


v4-0003-Add-temporary-all-permutations-test.patch
Description: Binary data


v4-0001-Use-switch-statements-in-predicate_-implied-refut.patch
Description: Binary data


v4-0002-Teach-predtest.c-about-BooleanTest.patch
Description: Binary data


Re: Teach predtest about IS [NOT] proofs

2023-12-22 Thread Tom Lane
James Coleman  writes:
> I've not yet applied all of your feedback, but I wanted to get an
> initial read on your thoughts on how using switch statements ends up
> looking. Attached is a single (pure refactor) patch that converts the
> various if/else levels that check things like node tag and
> boolean/null test type into switch statements. I've retained 'default'
> keyword usages for now for simplicity (my intuition is that we
> generally prefer to list out all options for compiler safety benefits,
> though I'm not 100% sure that's useful in the outer node tag check
> since it's unlikely that someone adding a new node would modify
> this...).

> My big question is: are you comfortable with the indentation explosion
> this creates? IMO it's a lot wordier, but it is also more obvious what
> the structural goal is. I'm not sure how we want to make the right
> trade-off though.

Yeah, I see what you mean.  Also, I'd wanted to shove most of
the text in the function header in-line and get rid of the short
restatements of those paras.  I carried that through just for
predicate_implied_by_simple_clause, as attached.  The structure is
definitely clearer, but we end up with an awful lot of indentation,
which makes the comments less readable than I'd like.  (I did some
minor rewording to make them flow better.)

On balance I think this is probably better than what we have, but
maybe we'd be best off to avoid doubly nested switches?  I think
there's a good argument for the outer switch on nodeTag, but
maybe we're getting diminishing returns from an inner switch.

regards, tom lane

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index fe83e45311..a9accbed53 100644
--- a/src/backend/optimizer/util/predtest.c
+++ b/src/backend/optimizer/util/predtest.c
@@ -1087,38 +1087,12 @@ arrayexpr_cleanup_fn(PredIterInfo info)
 }
 
 
-/*--
+/*
  * predicate_implied_by_simple_clause
  *	  Does the predicate implication test for a "simple clause" predicate
  *	  and a "simple clause" restriction.
  *
  * We return true if able to prove the implication, false if not.
- *
- * We have several strategies for determining whether one simple clause
- * implies another:
- *
- * A simple and general way is to see if they are equal(); this works for any
- * kind of expression, and for either implication definition.  (Actually,
- * there is an implied assumption that the functions in the expression are
- * immutable --- but this was checked for the predicate by the caller.)
- *
- * Another way that always works is that for boolean x, "x = TRUE" is
- * equivalent to "x", likewise "x = FALSE" is equivalent to "NOT x".
- * These can be worth checking because, while we preferentially simplify
- * boolean comparisons down to "x" and "NOT x", the other form has to be
- * dealt with anyway in the context of index conditions.
- *
- * If the predicate is of the form "foo IS NOT NULL", and we are considering
- * strong implication, we can conclude that the predicate is implied if the
- * clause is strict for "foo", i.e., it must yield false or NULL when "foo"
- * is NULL.  In that case truth of the clause ensures that "foo" isn't NULL.
- * (Again, this is a safe conclusion because "foo" must be immutable.)
- * This doesn't work for weak implication, though.
- *
- * Finally, if both clauses are binary operator expressions, we may be able
- * to prove something using the system's knowledge about operators; those
- * proof rules are encapsulated in operator_predicate_proof().
- *--
  */
 static bool
 predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
@@ -1127,65 +1101,115 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause,
 	/* Allow interrupting long proof attempts */
 	CHECK_FOR_INTERRUPTS();
 
-	/* First try the equal() test */
+	/*
+	 * A simple and general rule is that a clause implies itself, hence we
+	 * check if they are equal(); this works for any kind of expression, and
+	 * for either implication definition.  (Actually, there is an implied
+	 * assumption that the functions in the expression are immutable --- but
+	 * this was checked for the predicate by the caller.)
+	 */
 	if (equal((Node *) predicate, clause))
 		return true;
 
-	/* Next see if clause is boolean equality to a constant */
-	if (is_opclause(clause) &&
-		((OpExpr *) clause)->opno == BooleanEqualOperator)
+	/* Our remaining strategies are all clause-type-specific */
+	switch (nodeTag(clause))
 	{
-		OpExpr	   *op = (OpExpr *) clause;
-		Node	   *rightop;
-
-		Assert(list_length(op->args) == 2);
-		rightop = lsecond(op->args);
-		/* We might never see a null Const here, but better check anyway */
-		if (rightop && IsA(rightop, Const) &&
-			!((Const *) rightop)->constisnull)
-		{
-			Node	   *leftop = linitial(op->args);
-
-			if (DatumGetBool(((Const *) rightop)->constvalue))
+		case T_OpExpr:
 			{
-/* X = true implies X */
-if (equal(predicate, 

Re: Teach predtest about IS [NOT] proofs

2023-12-22 Thread James Coleman
On Thu, Dec 14, 2023 at 4:38 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > On Wed, Dec 13, 2023 at 1:36 PM Tom Lane  wrote:
> >> I don't have an objection in principle to adding more smarts to
> >> predtest.c.  However, we should be wary of slowing down cases where
> >> no BooleanTests are present to be optimized.  I wonder if it could
> >> help to use a switch on nodeTag rather than a series of if(IsA())
> >> tests.  (I'd be inclined to rewrite the inner if-then-else chains
> >> as switches too, really.  You get some benefit from the compiler
> >> noticing whether you've covered all the enum values.)
>
> > I think I could take this on; would you prefer it as a patch in this
> > series? Or as a new patch thread?
>
> No, keep it in the same thread (and make a CF entry, if you didn't
> already).  It might be best to make a series of 2 patches, first
> just refactoring what's there per this discussion, and then a
> second one to add BooleanTest logic.

CF entry is already created; I'll keep it here then.

> >> I note you've actively broken the function's ability to cope with
> >> NULL input pointers.  Maybe we don't need it to, but I'm not going
> >> to accept a patch that just side-swipes that case without any
> >> justification.
>
> > [ all callers have previously used predicate_classify ]
>
> OK, fair enough.  The checks for nulls are probably from ancient
> habit, but I agree we could remove 'em here.
>
> >> Perhaps, rather than hoping people will notice comments that are
> >> potentially offscreen from what they're modifying, we should relocate
> >> those comment paras to be adjacent to the relevant parts of the
> >> function?
>
> > Splitting up that block comment makes sense to me.
>
> Done, let's make it so.
>
> >> I've not gone through the patch in detail to see whether I believe
> >> the proposed proof rules.  It would help to have more comments
> >> justifying them.
>
> > Most of them are sufficiently simple -- e.g., X IS TRUE implies X --
> > that I don't think there's a lot to say in justification. In some
> > cases I've noted the cases that force only strong or weak implication.
>
> Yeah, it's the strong-vs-weak distinction that makes me cautious here.
> One's high-school-algebra instinct for what's obviously true tends to
> not think about NULL/UNKNOWN, and you do have to consider that.
>
> >>> As noted in a TODO in the patch itself, I think it may be worth 
> >>> refactoring
> >>> the test_predtest module to run the "x, y" case as well as the "y, x" case
>
> >> I think that's actively undesirable.  It is not typically the case that
> >> a proof rule for A => B also works in the other direction, so this would
> >> encourage wasting cycles in the tests.  I fear it might also cause
> >> confusion about which direction a proof rule is supposed to work in.
>
> > That makes sense in the general case.
>
> > Boolean expressions seem like a special case in that regard: (subject
> > to what it looks like) would you be OK with a wrapping function that
> > does both directions (with output that shows which direction is being
> > tested) used only for the cases where we do want to check both
> > directions?
>
> Using a wrapper where appropriate would remove the inefficiency
> concern, but I still worry whether it will promote confusion about
> which direction we're proving things in.  You'll need to be very clear
> about the labeling.

I've not yet applied all of your feedback, but I wanted to get an
initial read on your thoughts on how using switch statements ends up
looking. Attached is a single (pure refactor) patch that converts the
various if/else levels that check things like node tag and
boolean/null test type into switch statements. I've retained 'default'
keyword usages for now for simplicity (my intuition is that we
generally prefer to list out all options for compiler safety benefits,
though I'm not 100% sure that's useful in the outer node tag check
since it's unlikely that someone adding a new node would modify
this...).

My big question is: are you comfortable with the indentation explosion
this creates? IMO it's a lot wordier, but it is also more obvious what
the structural goal is. I'm not sure how we want to make the right
trade-off though.

Once there's agreement on this part, I'll add back a second patch
applying my changes on top of the refactor as well as apply other
feedback (e.g., splitting up the block comment).

Regards,
James Coleman


v2-0001-WIP-use-switch-statements.patch
Description: Binary data


Re: Teach predtest about IS [NOT] proofs

2023-12-14 Thread Tom Lane
James Coleman  writes:
> On Wed, Dec 13, 2023 at 1:36 PM Tom Lane  wrote:
>> I don't have an objection in principle to adding more smarts to
>> predtest.c.  However, we should be wary of slowing down cases where
>> no BooleanTests are present to be optimized.  I wonder if it could
>> help to use a switch on nodeTag rather than a series of if(IsA())
>> tests.  (I'd be inclined to rewrite the inner if-then-else chains
>> as switches too, really.  You get some benefit from the compiler
>> noticing whether you've covered all the enum values.)

> I think I could take this on; would you prefer it as a patch in this
> series? Or as a new patch thread?

No, keep it in the same thread (and make a CF entry, if you didn't
already).  It might be best to make a series of 2 patches, first
just refactoring what's there per this discussion, and then a
second one to add BooleanTest logic.

>> I note you've actively broken the function's ability to cope with
>> NULL input pointers.  Maybe we don't need it to, but I'm not going
>> to accept a patch that just side-swipes that case without any
>> justification.

> [ all callers have previously used predicate_classify ]

OK, fair enough.  The checks for nulls are probably from ancient
habit, but I agree we could remove 'em here.

>> Perhaps, rather than hoping people will notice comments that are
>> potentially offscreen from what they're modifying, we should relocate
>> those comment paras to be adjacent to the relevant parts of the
>> function?

> Splitting up that block comment makes sense to me.

Done, let's make it so.

>> I've not gone through the patch in detail to see whether I believe
>> the proposed proof rules.  It would help to have more comments
>> justifying them.

> Most of them are sufficiently simple -- e.g., X IS TRUE implies X --
> that I don't think there's a lot to say in justification. In some
> cases I've noted the cases that force only strong or weak implication.

Yeah, it's the strong-vs-weak distinction that makes me cautious here.
One's high-school-algebra instinct for what's obviously true tends to
not think about NULL/UNKNOWN, and you do have to consider that.

>>> As noted in a TODO in the patch itself, I think it may be worth refactoring
>>> the test_predtest module to run the "x, y" case as well as the "y, x" case

>> I think that's actively undesirable.  It is not typically the case that
>> a proof rule for A => B also works in the other direction, so this would
>> encourage wasting cycles in the tests.  I fear it might also cause
>> confusion about which direction a proof rule is supposed to work in.

> That makes sense in the general case.

> Boolean expressions seem like a special case in that regard: (subject
> to what it looks like) would you be OK with a wrapping function that
> does both directions (with output that shows which direction is being
> tested) used only for the cases where we do want to check both
> directions?

Using a wrapper where appropriate would remove the inefficiency
concern, but I still worry whether it will promote confusion about
which direction we're proving things in.  You'll need to be very clear
about the labeling.

regards, tom lane




Re: Teach predtest about IS [NOT] proofs

2023-12-13 Thread James Coleman
Thanks for taking a look!

On Wed, Dec 13, 2023 at 1:36 PM Tom Lane  wrote:
>
> James Coleman  writes:
> > Attached is a patch that solves that issue. It also teaches predtest about
> > quite a few more cases involving BooleanTest expressions (e.g., how they
> > relate to NullTest expressions). One thing I could imagine being an
> > objection is that not all of these warrant cycles in planning. If that
> > turns out to be the case there's not a particularly clear line in my mind
> > about where to draw that line.
>
> I don't have an objection in principle to adding more smarts to
> predtest.c.  However, we should be wary of slowing down cases where
> no BooleanTests are present to be optimized.  I wonder if it could
> help to use a switch on nodeTag rather than a series of if(IsA())
> tests.  (I'd be inclined to rewrite the inner if-then-else chains
> as switches too, really.  You get some benefit from the compiler
> noticing whether you've covered all the enum values.)

I think I could take this on; would you prefer it as a patch in this
series? Or as a new patch thread?

> I note you've actively broken the function's ability to cope with
> NULL input pointers.  Maybe we don't need it to, but I'm not going
> to accept a patch that just side-swipes that case without any
> justification.

I should have explained that. I don't think I've broken it:

1. predicate_implied_by_simple_clause() is only ever called by
predicate_implied_by_recurse()
2. predicate_implied_by_recurse() starts with:
pclass = predicate_classify(predicate, _info);
3. predicate_classify(Node *clause, PredIterInfo info) starts off with:
Assert(clause != NULL);

I believe this means we are currently guaranteed by the caller to
receive a non-NULL pointer, but I could be missing something.

The same argument (just substituting the equivalent "refute" function
names) applies to predicate_refuted_by_simple_clause().

> Another way in which the patch needs more effort is that you've
> not bothered to update the large comment block atop the function.
> Perhaps, rather than hoping people will notice comments that are
> potentially offscreen from what they're modifying, we should relocate
> those comment paras to be adjacent to the relevant parts of the
> function?

Splitting up that block comment makes sense to me.

> I've not gone through the patch in detail to see whether I believe
> the proposed proof rules.  It would help to have more comments
> justifying them.

Most of them are sufficiently simple -- e.g., X IS TRUE implies X --
that I don't think there's a lot to say in justification. In some
cases I've noted the cases that force only strong or weak implication.

There are a few cases, though, (e.g., "X is unknown weakly implies X
is not true") that, reading over this again, don't immediately strike
me as obvious, so I'll expand on those.

> > As noted in a TODO in the patch itself, I think it may be worth refactoring
> > the test_predtest module to run the "x, y" case as well as the "y, x" case
> > with a single call so as to eliminate a lot of repetition in
> > clause/expression test cases. If reviewers agree that's desirable, then I
> > could do that as a precursor.
>
> I think that's actively undesirable.  It is not typically the case that
> a proof rule for A => B also works in the other direction, so this would
> encourage wasting cycles in the tests.  I fear it might also cause
> confusion about which direction a proof rule is supposed to work in.

That makes sense in the general case.

Boolean expressions seem like a special case in that regard: (subject
to what it looks like) would you be OK with a wrapping function that
does both directions (with output that shows which direction is being
tested) used only for the cases where we do want to check both
directions?

Thanks,
James Coleman




Re: Teach predtest about IS [NOT] proofs

2023-12-13 Thread Tom Lane
James Coleman  writes:
> Attached is a patch that solves that issue. It also teaches predtest about
> quite a few more cases involving BooleanTest expressions (e.g., how they
> relate to NullTest expressions). One thing I could imagine being an
> objection is that not all of these warrant cycles in planning. If that
> turns out to be the case there's not a particularly clear line in my mind
> about where to draw that line.

I don't have an objection in principle to adding more smarts to
predtest.c.  However, we should be wary of slowing down cases where
no BooleanTests are present to be optimized.  I wonder if it could
help to use a switch on nodeTag rather than a series of if(IsA())
tests.  (I'd be inclined to rewrite the inner if-then-else chains
as switches too, really.  You get some benefit from the compiler
noticing whether you've covered all the enum values.)

I note you've actively broken the function's ability to cope with
NULL input pointers.  Maybe we don't need it to, but I'm not going
to accept a patch that just side-swipes that case without any
justification.

Another way in which the patch needs more effort is that you've
not bothered to update the large comment block atop the function.
Perhaps, rather than hoping people will notice comments that are
potentially offscreen from what they're modifying, we should relocate
those comment paras to be adjacent to the relevant parts of the
function?

I've not gone through the patch in detail to see whether I believe
the proposed proof rules.  It would help to have more comments
justifying them.

> As noted in a TODO in the patch itself, I think it may be worth refactoring
> the test_predtest module to run the "x, y" case as well as the "y, x" case
> with a single call so as to eliminate a lot of repetition in
> clause/expression test cases. If reviewers agree that's desirable, then I
> could do that as a precursor.

I think that's actively undesirable.  It is not typically the case that
a proof rule for A => B also works in the other direction, so this would
encourage wasting cycles in the tests.  I fear it might also cause
confusion about which direction a proof rule is supposed to work in.

regards, tom lane




Teach predtest about IS [NOT] proofs

2023-12-11 Thread James Coleman
Hello,

I recently encountered a case where partial indexes were surprisingly not
being used. The issue is that predtest doesn't understand how boolean
values and IS  expressions relate.

For example if I have:

create table foo(i int, bar boolean);
create index on foo(i) where bar is true;

then this query:

select * from foo where i = 1 and bar;

doesn't use the partial index.

Attached is a patch that solves that issue. It also teaches predtest about
quite a few more cases involving BooleanTest expressions (e.g., how they
relate to NullTest expressions). One thing I could imagine being an
objection is that not all of these warrant cycles in planning. If that
turns out to be the case there's not a particularly clear line in my mind
about where to draw that line.

As noted in a TODO in the patch itself, I think it may be worth refactoring
the test_predtest module to run the "x, y" case as well as the "y, x" case
with a single call so as to eliminate a lot of repetition in
clause/expression test cases. If reviewers agree that's desirable, then I
could do that as a precursor.

Regards,
James Coleman


v1-0001-Teach-predtest-about-IS-NOT-bool-proofs.patch
Description: Binary data