Re: POC, WIP: OR-clause support for indexes

Peter Geoghegan Tue, 25 Jul 2023 16:48:21 -0700

On Thu, Jun 29, 2023 at 2:32 AM Alena Rybakina <lena.riback...@yandex.ru> wrote:
> Hi! I'm sorry I didn't answer you right away, I was too busy with work.


Same for me, this time. I was busy working on my patch, which I
finally posted yesterday.

> To be honest, I didn't think about the fact that my optimization can help 
> eliminate duplicates before reading the data before.

I'm not surprised that you didn't specifically think of that, because
it's very subtle.

> I am still only in the process of familiarizing myself with the thread [1] 
> (reference from your letter), but I have already seen that there are problems 
> related, for example, to when "or" expressions refer to the parent element.

I didn't intend to imply that you might have the same problem here. I
just meant that OR optimizations can have problems with duplicate
elimination, in general. I suspect that your patch doesn't have that
problem, because you are performing a transformation of one kind of OR
into another kind of OR.

> I think, I would face the similar problems if I complicate the current code, 
> for example, so that not only or expressions standing on the same level are 
> written in any, but also on different ones without violating the logic of the 
> priority of executing operators.

I can't say that I am particularly experienced in this general area --
I have never tried to formally reason about how two different
statements are equivalent. It just hasn't been something that I've
needed to have a totally rigorous understanding of up until now. But
my recent patch changes that. Now I need to study this area to make
sure that I have a truly rigorous understanding.

Jeff Davis suggested that I read "Logic and Databases", by C.J. Date.
So now I have some homework to do.

> Unfortunately, when I tried to make a transformation at the stage of index 
> formation, I encountered too incorrect an assessment of the selectivity of 
> relation, which affected the incorrect calculation of the cost and 
> cardinality.

It's not surprising that a weird shift in the plan chosen by the
optimizer is seen with some random test case, as a result of this
added transformation. Even changes that are 100% strictly better (e.g.
changes in a selectivity estimation function that is somehow
guaranteed to be more accurate in all cases) might do that. Here is a
recent example of that with another patch, involving a bitmap OR:

https://postgr.es/m/cah2-wzncdk9n2tz6j_-iln563_epuc3nzp6vsvtl6jhzs6n...@mail.gmail.com

This example was *not* a regression, if you go by conventional
measures. It was merely a less robust plan than the bitmap OR plan,
because it didn't pass down both columns as index quals.

BTW, there are various restrictions on the sort order of SAOPs that
you might want to try to familiarize yourself with. I describe them
(perhaps not very clearly) here:

https://postgr.es/m/CAH2-Wz=ksvn_sjcnd1+bt-wtifra5ok48adynq3pkkhxgmq...@mail.gmail.com

Currently, the optimizer doesn't recognize multi-column indexes with
SAOPs on every column as having a valid sort order, except on the
first column. It seems possible that that has consequences for your
patch. (I'm really only guessing, though; don't trust anything that I
say about the optimizer too much.)

--
Peter Geoghegan

Re: POC, WIP: OR-clause support for indexes

Reply via email to