Hello

In several queries relying on views, I noticed that the optimizer miss a quite 
simple to implement optimization. My views contain several branches, with 
different paths that are simplified by the caller of the view. This 
simplification is based on columns to be null or not.

Today, even with a single table, the following (silly) query is not optimized 
away:
        SELECT * FROM test WHERE a IS NULL AND a IS NOT NULL;

In more complex cases, it of course isn't any better:
        SELECT * FROM (
        SELECT a, NULL::integer AS b FROM foo
      UNION ALL
        SELECT a, b FROM bar WHERE b IS NOT NULL
        ) WHERE a = 1 AND b IS NULL;

The attached patch handles both situations. When flattening and simplifying 
the AND clauses, a list of the NullChecks is built, and subsequent NullChecks 
are compared to the list. If opposite NullChecks on the same variable are 
found, the whole AND is optimized away.
This lead to nice boosts, since instead of having 'never executed' branches, 
the optimizer can go even further. Right now, the algorithmic complexity of 
this optimization is not great: it is in O(n²), with n being the number of 
NullCheck in a given AND clause. But compared to the possible benefits, and 
the very low risk of n being high enough to have a real planification-time 
impact, I feel this optimization would be worth it.


Regards

 Pierre
>From 0a300b6fdd934daa6fb79e9bc67bdc2adfa3cc72 Mon Sep 17 00:00:00 2001
From: Pierre Ducroquet <p.p...@pinaraf.info>
Date: Wed, 6 Nov 2019 18:05:01 +0100
Subject: [PATCH] Simplify AND clauses that have NOT NULL clauses on similar
 variables

It's not uncommon to have views with several branches, with fields that are null depending on the branch.
If the view is queried with an IS NULL or IS NOT NULL on such a field, huge simplification can be done.

This patch thus looks, when flattening and simplifying AND clauses, for opposite NOT NULL on similar variables.
If this happens, the whole AND clause is discarded and simplified into false.
---
 src/backend/optimizer/util/clauses.c | 44 ++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index a04b62274d..2ba66a62a5 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3781,6 +3781,7 @@ simplify_and_arguments(List *args,
 {
 	List	   *newargs = NIL;
 	List	   *unprocessed_args;
+	List	   *null_checks = NIL;
 
 	/* See comments in simplify_or_arguments */
 	unprocessed_args = list_copy(args);
@@ -3844,6 +3845,49 @@ simplify_and_arguments(List *args,
 			continue;
 		}
 
+		if (IsA(arg, NullTest))
+		{
+			NullTest	*null_check = (NullTest*) arg;
+
+			/* Only check for $VAR IS NULL, ignore other expressions */
+			if (!IsA(null_check->arg, Var))
+				goto next;
+
+			if (null_checks == NIL)
+			{
+				null_checks = lappend(null_checks, null_check);
+				goto next;
+			}
+			else
+			{
+				Var			*null_check_var = (Var *) null_check->arg;
+
+				/* Now, we can check every other null check from our list */
+				ListCell	*other_null;
+				foreach (other_null, null_checks)
+				{
+					NullTest	*other_null_check = (NullTest *) lfirst(other_null);
+					Var			*other_null_var = (Var *) other_null_check->arg;
+
+					if ((other_null_check->nulltesttype != null_check->nulltesttype) &&
+						(other_null_var->varno == null_check_var->varno) &&
+						(other_null_var->varattno == null_check_var->varattno))
+					{
+						*forceFalse = true;
+
+						/*
+						* Once we detect a FALSE result we can just exit the loop
+						* immediately.  However, if we ever add a notion of
+						* non-removable functions, we'd need to keep scanning.
+						*/
+						return NIL;
+					}
+				}
+				lappend(null_checks, null_check);
+			}
+		}
+
+next:
 		/* else emit the simplified arg into the result list */
 		newargs = lappend(newargs, arg);
 	}
-- 
2.24.0.rc2

Reply via email to