Re: [HACKERS] Indirect indexes

Alexander Korotkov Tue, 18 Oct 2016 14:47:43 -0700

On Wed, Oct 19, 2016 at 12:21 AM, Alexander Korotkov <
[email protected]> wrote:

> On Tue, Oct 18, 2016 at 9:28 PM, Alvaro Herrera <[email protected]>
> wrote:
>
>> Vacuuming presents an additional challenge: in order to remove index
>> items from an indirect index, it's critical to scan the PK index first
>> and collect the PK values that are being removed.  Then scan the
>> indirect index and remove any items that match the PK items removed.
>> This is a bit problematic because of the additional memory needed to
>> store the array of PK values.  I haven't implemented this yet.
>>
>
> Imagine another situation: PK column was not updated, but indirect indexed
> column was updated.
> Thus, for single heap tuple we would have single PK tuple and two indirect
> index tuples (correct me if I'm wrong).
> How are we going to delete old indirect index tuple?
>

Let me explain it in more details.

There is a table with two columns and indirect index on it.

CREATE TABLE tbl (id integer primary key, val integer);
CREAET INDIRECT INDEX tbl_val_indirect_idx ON tbl (val);

Then do insert and update.

INSERT INTO tbl VALUES (1, 1);
UPDATE tbl SET val = 2 WHERE id = 1;

Then heap would contain two tuples.

 ctid  | id | val
-------+----+-----
 (0;1) |  1 |   1
 (0;2) |  1 |   2

tbl_pk_idx would contain another two tuples

 id | item_pointer
----+--------------
  1 |        (0;1)
  1 |        (0;2)

And tbl_val_indirect_idx would have also two tuples

 val | id
-----+----
   1 |  1
   2 |  1

Then vacuum removes (0;1) from heap, reference to (0;1) from tbl_pk_idx.
But how will it remove (1,1) tuple from tbl_val_indirect_idx?  Thus, before
vacuuming tbl_val_indirect_idx we should know not only values of id which
are being removed, but actually (id, val) pairs which are being removed.
Should we collect those paris while scanning heap?  But we should also take
into account that multiple heap tuples might have same (id, val) pair
values (assuming there could be other columns being updated).  Therefore,
we should take into account when last pair of particular (id, val) pair
value was deleted from heap.  That would be very huge change to vacuum, may
be even writing way more complex vacuum algorithm from scratch.  Probably,
you see the better solution of this problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] Indirect indexes

Reply via email to