On Fri, Mar 31, 2017 at 4:31 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> That's all I have for now. Maybe I can look again later, or tomorrow.

I took another look, this time at code used during CREATE INDEX. More feedback:

* I see no reason to expose _bt_pgaddtup() (to modify it to not be
static, so it can be called during CREATE INDEX for truncated high
key). You could call PageAddItem() directly, just as _bt_pgaddtup()
itself does, and lose nothing. This is the case because the special
steps within _bt_pgaddtup() are only when inserting the first real
item (and only on an internal page). You're only ever using
_bt_pgaddtup() for the high key offset. Would a raw PageAddItem() call
lose anything?

I think I see why you've done this -- the existing CREATE INDEX
_bt_sortaddtup() routine (which is very similar to _bt_pgaddtup(), a
routine used for *insertion*) doesn't do the correct thing were you to
use it, because it assumes that the page is always right most (i.e.,
always has no high key yet).

The reason _bt_sortaddtup() exists is explained here:

 * This is almost like nbtinsert.c's _bt_pgaddtup(), but we can't use
 * that because it assumes that P_RIGHTMOST() will return the correct
 * answer for the page.  Here, we don't know yet if the page will be
 * rightmost.  Offset P_FIRSTKEY is always the first data key.
static void
_bt_sortaddtup(Page page,
               Size itemsize,
               IndexTuple itup,
               OffsetNumber itup_off)

(...thinks some more...)

So, this difference only matters when you have a non-leaf item, which
is never subject to truncation in your patch. So, in fact, it doesn't
matter at all. I guess you should just use _bt_pgaddtup() after all,
rather than bothering with a raw PageAddItem(), even. But, don't
forget to note why this is okay above _bt_sortaddtup().

* Calling PageIndexTupleDelete() within _bt_buildadd(), which
memmove()s all other items on the leaf page, seems wasteful in the
context of CREATE INDEX. Can we do better?

* I also think that calling PageIndexTupleDelete() has a page space
accounting bug, because the following thing happens a second time for
highkey ItemId when new code does this call:

phdr->pd_lower -= sizeof(ItemIdData);

(The first time this happens is within _bt_buildadd() itself, just
before your patch calls PageIndexTupleDelete().)

* I don't think it's okay to let index_truncate_tuple() leak memory
within _bt_buildadd(). It's probably okay for nbtinsert.c callers to
index_truncate_tuple() to not be too careful, though, since those
calls occur in a per-tuple memory context. The same cannot be said for
_bt_buildadd()/CREATE INDEX calls.

* Speaking of memory management: is this really needed?

> @@ -554,7 +580,11 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, 
> IndexTuple itup)
>          * Save a copy of the minimum key for the new page.  We have to copy
>          * it off the old page, not the new one, in case we are not at leaf
>          * level.
> +        * Despite oitup is already initialized, it's important to get high
> +        * key from the page, since we could have replaced it with truncated
> +        * copy. See comment above.
>          */
> +       oitup = (IndexTuple) PageGetItem(opage,PageGetItemId(opage, P_HIKEY));
>         state->btps_minkey = CopyIndexTuple(oitup);

You didn't modify/truncate oitup in-place -- you effectively made a
(truncated) copy by calling index_truncate_tuple(). Maybe you can
manage the memory by assigning keytup to state->btps_minkey, in place
of a CopyIndexTuple(), just for the truncation case?

I haven't studied this in enough detail to be sure that that would be
correct, but it seems clear that a better strategy is needed for
managing memory within _bt_buildadd().

Peter Geoghegan

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to