On Wed, Aug 21, 2019 at 10:19 AM Anastasia Lubennikova
<a.lubennik...@postgrespro.ru> wrote:
> I'm going to look through the patch once more to update nbtxlog
> comments, where needed and
> answer to your remarks that are still left in the comments.

Have you been using amcheck's rootdescend verification? I see this
problem with v8, with the TPC-H test data:

DEBUG:  finished verifying presence of 1500000 tuples from table
"customer" with bitset 51.09% set
ERROR:  could not find tuple using search from root page in index
"idx_customer_nationkey2"

I've been running my standard amcheck query against these databases, which is:

SELECT bt_index_parent_check(index => c.oid, heapallindexed => true,
rootdescend => true),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree'
AND c.relpersistence != 't'
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC;
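
If it helps to narrow this down, the one suspect index from the error above can also be verified directly, without going through the full catalog query (a minimal sketch, assuming amcheck is installed in the database and the index name from the error message):

```sql
-- Verify just the index that failed rootdescend verification.
CREATE EXTENSION IF NOT EXISTS amcheck;

SELECT bt_index_parent_check(
  index => 'idx_customer_nationkey2'::regclass,
  heapallindexed => true,
  rootdescend => true
);
```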

There were many large indexes that amcheck didn't detect a problem
with. I don't yet understand what the problem is, or why we only see
the problem for a small number of indexes. Note that all of these
indexes passed verification with v5, so this is some kind of
regression.

I also noticed some regressions in index size -- in some cases, indexes
were not nearly as small as they were in v5.
The overall picture was a clear regression in how effective
deduplication is.

I think that it would save time if you had direct access to my test
data, even though it's a bit cumbersome. You'll have to download about
10GB of dumps, which require plenty of disk space when restored:

regression=# \l+
                                                           List of databases
    Name    | Owner | Encoding |  Collate   |   Ctype    | Access privileges |  Size   | Tablespace |                Description
------------+-------+----------+------------+------------+-------------------+---------+------------+--------------------------------------------
 land       | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 6425 MB | pg_default |
 mgd        | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 61 GB   | pg_default |
 postgres   | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 7753 kB | pg_default | default administrative connection database
 regression | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 886 MB  | pg_default |
 template0  | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 | =c/pg            +| 7609 kB | pg_default | unmodifiable empty database
            |       |          |            |            | pg=CTc/pg         |         |            |
 template1  | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 | =c/pg            +| 7609 kB | pg_default | default template for new databases
            |       |          |            |            | pg=CTc/pg         |         |            |
 tpcc       | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 10 GB   | pg_default |
 tpce       | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 26 GB   | pg_default |
 tpch       | pg    | UTF8     | en_US.UTF8 | en_US.UTF8 |                   | 32 GB   | pg_default |
(9 rows)

I have found it very valuable to use this test data when changing
nbtsplitloc.c, or anything that could affect where page splits make
free space available. If this is too much data to handle conveniently,
then you could skip "mgd" and have almost as much test coverage. There
really does seem to be a benefit to using diverse test cases like
this, because sometimes regressions only affect a small number of
specific indexes for specific reasons. For example, only TPC-H has a
small number of indexes that have tuples that are inserted in order,
but also have many duplicates. Removing the BT_COMPRESS_THRESHOLD
stuff really helped with those indexes.

Want me to send this data and the associated tests script over to you?

-- 
Peter Geoghegan

