Re: [HACKERS] Surjective functional indexes

Konstantin Knizhnik Thu, 08 Nov 2018 00:09:02 -0800


On 07.11.2018 22:25, Tom Lane wrote:

Robert Haas <robertmh...@gmail.com> writes:

On Fri, May 11, 2018 at 11:01 AM, Simon Riggs <si...@2ndquadrant.com> wrote:

I have no problem if you want to replace this with an even better
design in a later release.

Meh. The author / committer should get a patch into the right shape

They have done, at length. Claiming otherwise is just trash talk.

Some people might call it "honest disagreement".

So where we are today is that this patch was lobotomized by commits
77366d90f/d06fe6ce2 as a result of this bug report:

https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql

We need to move forward, either by undertaking a more extensive
clean-out, or by finding a path to a version of the code that is
satisfactory.  I wanted to enumerate my concerns while yesterday's
events are still fresh in mind.  (Andres or Robert might have more.)

* I do not understand why this feature is on-by-default in the first
place.  It can only be a win for expression indexes that are many-to-one
mappings; for indexes that are one-to-one or few-to-one, it's a pretty
big loss.  I see no reason to assume that most expression indexes fall
into the first category.  I suggest that the design ought to be to use
this optimization only for indexes for which the user has explicitly
enabled recheck_on_update.  That would allow getting rid of the cost check
in IsProjectionFunctionalIndex, about which we'd otherwise have to have
an additional fight: I do not like its ad-hoc-ness, nor the modularity
violation (and potential circularity) involved in having the relcache call
cost_qual_eval.

* Having heapam.c calling the executor also seems like a
modularity/layering violation that will bite us in the future.

* The implementation seems incredibly inefficient.  ProjIndexIsUnchanged
is doing creation/destruction of an EState, index_open/index_close
(including acquisition of a lock we should already have), BuildIndexInfo,
and expression compilation, all of which are completely redundant with
work done elsewhere in the executor.  And it's doing that *over again for
every tuple*, which totally aside from wasted cycles probably means there
are large query-lifespan memory leaks in an UPDATE affecting many rows.
I think a minimum expectation should be that one-time work is done only
one time; but ideally none of those things would happen at all because
we could share work with the regular code path.  Perhaps it's too much
to hope that we could also avoid duplicate computation of the new index
expression values, but as long as I'm listing complaints ...

* As noted in the bug thread, the implementation of the new reloption
is broken because (a) it fails to work for some built-in index AMs
and (b) by design, it can't work for add-on AMs.  I agree with Andres
that that's not acceptable.

* This seems like bad data structure design:

+   Bitmapset  *rd_projidx;     /* Oids of projection indexes */

That comment is a lie, although IMO it'd be better if it were true;
a list of target index OIDs would be a far better design here.  The use
of rd_projidx as a set of indexes into the relation's indexlist is
inefficient and overcomplicated, plus it creates an unnecessary and not
very safe (even if it were documented) coupling between rd_indexlist and
rd_projidx.

* Having ProjIndexIsUnchanged examine relation->rd_projidx directly is
broken by design anyway, both from a modularity standpoint and because
its inner loop involves catalog accesses that could result in relcache
flushes.  I'm somewhat amazed that the regression tests passed on
CLOBBER_CACHE_ALWAYS buildfarm animals, although possibly this is
explained by the fact that they're only testing cases with a single
expression index, so that the bitmap isn't checked again after the cache
flush happens.  Again, this could be fixed if what was returned was just
a list of relevant index OIDs.

* This bit of coding is unsafe, for the reason explained in the existing
comment:

      /*
       * Now save copies of the bitmaps in the relcache entry.  We intentionally
       * set rd_indexattr last, because that's the one that signals validity of
       * the values; if we run out of memory before making that copy, we won't
       * leave the relcache entry looking like the other ones are valid but
       * empty.
       */
      oldcxt = MemoryContextSwitchTo(CacheMemoryContext);
      relation->rd_keyattr = bms_copy(uindexattrs);
      relation->rd_pkattr = bms_copy(pkindexattrs);
      relation->rd_idattr = bms_copy(idindexattrs);
      relation->rd_indexattr = bms_copy(indexattrs);
+    relation->rd_projindexattr = bms_copy(projindexattrs);
+    relation->rd_projidx = bms_copy(projindexes);
      MemoryContextSwitchTo(oldcxt);

* There's a lot of other inattention to comments.  For example, I noticed
that this patch added new responsibilities to RelationGetIndexList without
updating its header comment to mention them.

* There's a lack of attention to terminology, too.  I do not think that
"projection index" is an appropriate term at all, nor do I think that
"recheck_on_update" is a particularly helpful option name, though we
may be stuck with the latter at this point.

* I also got annoyed by minor sloppiness like not adding the new
regression test in the same place in serial_schedule and
parallel_schedule.  The whole thing needed more careful review
than it got.

In short, it seems likely to me that large parts of this patch need to
be pulled out, rewritten, and then put back in different places than
they are today.  I'm not sure if a complete revert is the best next
step, or if we can make progress without that.

                        regards, tom lane

First of all I am sorry for this bug. It's a pity that I was notinvolved in this investigation: I am not subscribed on"postgresql-general" list.I have found the same bug in yet another my code few month ago. Shame onme, but I didn't realized that values returned by FormIndexDatumdata doesn't match with descriptor of index relation, because AM mayconvert them to own types. By the way, I will be please if somebody cansuggest the best

way of building proper tuple descriptor.

If the idea of "projection" index optimization is still considered asvaluable by community, I can continue improvement of this patch.Tom, thank you for the critics.But if it's not too brazen of me, I'dlike to receive more constructive critics:

> Having heapam.c calling the executor also seems like amodularity/layering violation that will bite us in the future.Function ProjIndexIsUnchanged is implemented and used in heapam.cbecause this is the place where decision weather to perform or noperform hot update is made.Insertion of index is done later. So the only possibility is to somehowcache results of this operation to be able to reuse in future.

> The implementation seems incredibly inefficient. ProjIndexIsUnchangedis doing creation/destruction of an EState, index_open/index_close...

Well, please look for example at get_actual_variable_range function. Itis doing almost the same. Is it also extremely inefficient?Almost all functions dealing with indexes and calling FormIndexDatum areperforming the same actions. But the main differences, is that most ofthem (like IndexCheckExclusion) initialize EState and create slot onlyonce and then iterate through all tuples matching search condition. AndProjIndexIsUnchanged is doing it for each tuple. Once again, I will bepleased to receive from you (or somebody else) advice how to addressthis problem. I have no idea how it it possible to avoid to perform thischeck for each affected tuple. So the only solution I can imagine is tosomehow cache and reused created executor state.

> Having ProjIndexIsUnchanged examine relation->rd_projidx directly isbroken by design anyway, both from a modularity standpoint and becauseits inner loop involves catalog accesses that could result in relcacheflushes.

Sorry, can you clarify it? As far as there are shared lock forcorrespondent relation and index, nobody else can alter this table orindex. And there is rd_refcnt which should prevent destruction ofRelation even if relcache is invalidated. There are a lot of otherplaces where components of relation are accessed directly (most oftenrelation->rd_rel stuff). I do not understand why replacing rd_projidxwith list of Oid will solve the problem.

> There's a lack of attention to terminology, too. I do not think that"projection index" is an appropriate term at all, nor do I think that"recheck_on_update" is a particularly helpful option name, though we maybe stuck with the latter at this point.

As I am not native english speaking, I will agree with any proposedterminology.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Surjective functional indexes

Reply via email to