Re: [HACKERS][PROPOSAL] Covering + unique indexes.

Anastasia Lubennikova Tue, 15 Sep 2015 09:58:28 -0700

Proposal Clarification.

I see that discussion become too complicated. So, I'd like to clarifywhat we are talking about.


We are discussing 2 different improvements of index.

The one is "partially unique index" and the other "index with includedcolumns".

Let's look at example.

- We have a table tbl(f1, f2, f3, f4).
- We want to have an unique index on (f1,f2).

- We want to have an index on (f1, f2, f3) which allow us to use indexfor complex "where" clauses.- We would like to have a covering index on all columns to avoid readingof heap pages.


What are we doing now:
CREATE UNIQUE INDEX on tbl(f1,f2);
CREATE INDEX on tbl(f1, f2, f3, f4);

What's wrong?

- Two indexes with repeated data. Overhead to data manipulationoperations and database size.

- We don't need f4 as index key. But we have to...

- Problem related to previous. It's possible that f4 has no opclass forour index. So there's no way to include it to index.

While we don't need any opclass at all.

Suggestions.
CREATE INDEX idx ON tbl (f1, f2, f3) [UNIQUE ON (f1, f2)] [INCLUDE (f4)];
Let's review it stepby step.

1. "partially unique index"
CREATE INDEX idx ON tbl (f1, f2, f3) UNIQUE ON (f1, f2);

It means that we want to have columns (f1, f2, f3) as index keys in ourindex.

But we want enforce uniqueness only on first two.
It allows us insert (1,1,1), (1,2,1) and restricts insert (1,1,1), (1,1,2).

It doesn't affect "select" queries.
Following query can use index-only scan.
SELECT f1,f2, f3 where f1 ... and f2 ... and f3 ....;

We haven't to maintain two indexes now. Just one!

_bt_iseual()<http://doxygen.postgresql.org/nbtinsert_8c.html#abcfb7a3dcd40a5d1759652239f3a0115>works with (f1, f2)_bt_compare()<http://doxygen.postgresql.org/nbtsearch_8c.html#af689dabb25e99f551b68aa9b7a1e6ea6>works with (f1,f2,f3)

2. "index with included columns" It goes well with both unique andnon-unique indexes.

CREATE INDEX idx ON tbl (f1, f2, f3) INCLUDE (f4);
What we get here:
- f4 is not search key.
- f4 could not have opclass for our index

- f4 is only in the leaf pages and it's not bloating internal nodes ofb-tree.

- f4 can still be retrieved without going to the heap, so including it
in the leaf nodes makes it possible to do index-only scans more often

Following query can use index-only scan:
SELECT f1,f2, f3, f4 where f1 ... and f2 ... and f3 ....;

And this one wouldn't use index-only scan because recheck on f4 is required.
SELECT f1,f2, f3, f4 where f1 ... and f2 ... and f3 .... and f4;


Catalog changes:
Now:

pg_index<http://doxygen.postgresql.org/pg__index_8h.html#a5065be0408f03576083a30c97b43a09a>

int16 indnatts; /* number of columns in index */
bool indisunique; /* is this a unique index? */

New:
pg_index

int16 ind_n_unique_atts; /* number of unique columns in index. countedfrom begin of index. 0 means that index is not unique */int16 ind_n_key_atts; /* number of index key columns in index. countedfrom begin of index.*/

int16 ind_n_total_atts; /* number of columns in index.*/

In our case:
ind_n_unique_atts = 2; // f1, f2
ind_n_key_atts = 3; // f1, f2, f3
ind_n_total_atts = 4; // f1, f2, f3, f4

P.S. I use many ideas from discussion without quotations just becauseI'd like to keep this message readable. Thanks to everyone.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS][PROPOSAL] Covering + unique indexes.

Reply via email to