To test WAL replay, I often set up a master-standby system with streaming replication and run "make installcheck" on the master. However, the regression suite doesn't generate all WAL record types. I spent some time looking at the lcov report (make coverage-html) and crafted new tests for the redo functions that were not otherwise covered.
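
The standby-side check is basically just that replay keeps making progress. Roughly, and assuming a 9.x-era setup where these functions are available, it amounts to:

    -- on the master, once "make installcheck" has finished
    SELECT pg_current_xlog_location();

    -- on the standby: confirm it's still in recovery and has replayed up to
    -- (roughly) the location reported on the master
    SELECT pg_is_in_recovery(), pg_last_xlog_replay_location();

If the standby replays the whole run without errors, the redo routines at least didn't crash on the records the suite generated.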

All the new test cases are related to indexes, mostly vacuuming them. See attached. With this patch, all WAL record types are tested, although there are still a few codepaths within the redo functions (aside from "can't happen" error checks) that are not exercised.

There are a couple of problems with these new tests:

1. Whether the vacuum tests exercise what they're supposed to depends on what else is going on in the system. If another backend is holding onto a snapshot, vacuum won't be able to remove any rows, and that code will go untested. Running these tests in parallel with other tests makes it quite likely that nothing can be vacuumed.

2. These make the regression database larger. The following tables and indexes are added:

postgres=# \d+
                         List of relations
 Schema |       Name       | Type  | Owner  |  Size   | Description
--------+------------------+-------+--------+---------+-------------
 public | btree_tall_tbl   | table | heikki | 24 kB   |
 public | btree_test_tbl   | table | heikki | 392 kB  |
 public | gin_test_tbl     | table | heikki | 588 MB  |
 public | gist_point_tbl   | table | heikki | 1056 kB |
 public | spgist_point_tbl | table | heikki | 1056 kB |
 public | spgist_text_tbl  | table | heikki | 1472 kB |
(6 rows)

postgres=# \di+
                                   List of relations
 Schema |       Name       | Type  | Owner  |      Table       |  Size   | Description
--------+------------------+-------+--------+------------------+---------+-------------
 public | btree_tall_idx   | index | heikki | btree_tall_tbl   | 1176 kB |
 public | btree_test_idx   | index | heikki | btree_test_tbl   | 472 kB  |
 public | gin_test_idx     | index | heikki | gin_test_tbl     | 220 MB  |
 public | gist_pointidx    | index | heikki | gist_point_tbl   | 1744 kB |
 public | spgist_point_idx | index | heikki | spgist_point_tbl | 1120 kB |
 public | spgist_text_idx  | index | heikki | spgist_text_tbl  | 440 kB  |
(6 rows)

The GIN test needs to create a huge table to cover the case of splitting an internal posting tree page. That 588 MB table, plus its 220 MB index, is obviously too much to include in the regular regression suite. I'm not sure how much smaller it could be made, but it's going to be in that ballpark anyway. It also takes a long time to run.

I think the rest are tolerable: they make the regression database about 9 MB larger, from 45 MB to 53 MB, and only take a few seconds to run on my laptop.

My plan is to leave out that large GIN test for now, and commit the rest. I'll add comments to the vacuum tests explaining that it's hit-and-miss whether they manage to vacuum anything. Having the tests is still better than not having them at all, even if they sometimes fail to exercise vacuum as intended. In either case, they won't fail unless there's a bug somewhere, and they will still exercise some code that isn't otherwise tested at all.
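
To make the hit-and-miss part concrete: any concurrent backend holding an old enough snapshot is sufficient to defeat the vacuum. A hypothetical example, with the table name made up for illustration:

    -- in some other session, left idle in transaction while the index tests run
    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM some_table;  -- any query; it's the snapshot that matters

With that snapshot around, the VACUUMs in the new tests still run to completion, but they can't remove rows deleted after the snapshot was taken, so the index-vacuum redo code goes unexercised.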

Thoughts?

PS. The brin test case is currently in a funny position in serial_schedule, compared to parallel_schedule. This patch moves it to where I think it belongs.

- Heikki
diff --git a/src/test/regress/expected/btree_index.out b/src/test/regress/expected/btree_index.out
index 74d47be..c8e51fc 100644
--- a/src/test/regress/expected/btree_index.out
+++ b/src/test/regress/expected/btree_index.out
@@ -127,3 +127,32 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
  RI_FKey_setnull_del
 (5 rows)
 
+--
+-- Test B-tree insertion with a metapage update (XLOG_BTREE_INSERT_META
+-- WAL record type). This happens when a "fast-root" page is split.
+--
+-- First create a tree that's at least two levels deep.
+create table btree_test_tbl(id int4);
+insert into btree_test_tbl select generate_series(1, 10000);
+create index btree_test_idx on btree_test_tbl (id);
+-- Delete most entries, and vacuum. This turns the leaf page into a fast root.
+delete from btree_test_tbl where id < 9990;
+vacuum btree_test_tbl;
+-- Now do more insertions, creating a sibling for the fast root, so that
+-- it's not the fast root anymore.
+insert into btree_test_tbl select generate_series(1, 9900);
+--
+-- Test B-tree page deletion. In particular, deleting a non-leaf page.
+--
+-- First create a tree that's at least four levels deep. The text inserted
+-- is long and poorly compressible. That way only a few index tuples fit on
+-- each page, allowing us to get a tall tree with fewer pages.
+CREATE TABLE btree_tall_tbl(id int4, t text);
+CREATE INDEX btree_tall_idx on btree_tall_tbl (id, t) WITH (fillfactor = 10);
+INSERT INTO btree_tall_tbl
+SELECT g, g::text || '_' ||
+          (select string_agg(md5(i::text), '_') from generate_series(1, 50) i)
+FROM generate_series(1, 100) g;
+-- Delete most entries, and vacuum. This causes page deletions.
+delete from btree_tall_tbl where id < 950;
+vacuum btree_tall_tbl;
diff --git a/src/test/regress/expected/gin.out b/src/test/regress/expected/gin.out
new file mode 100644
index 0000000..3b87dc3
--- /dev/null
+++ b/src/test/regress/expected/gin.out
@@ -0,0 +1,28 @@
+--
+-- Test GIN indexes.
+--
+-- There are other tests to test different GIN opclasses. This is for testing
+-- GIN itself.
+-- Create and populate a test table with a GIN index.
+create table gin_test_tbl(i int4[]);
+create index gin_test_idx on gin_test_tbl using gin (i) with (fastupdate=on);
+insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 100000) g;
+insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 100000) g;
+-- Insert enough entries to create a three-level posting tree. That exercises
+-- splitting an internal page. With the default 8k block size, each posting
+-- tree leaf page can hold at most around 8000 TIDs, and each internal page
+-- can hold about 800 child pages. So we need > 8000*800 items.
+insert into gin_test_tbl select array[1] from generate_series(1, 10000) g1, generate_series(1, 1000) g2;
+vacuum gin_test_tbl; -- flush the fast update pages
+-- Test vacuuming 
+delete from gin_test_tbl where i @> array[2];
+vacuum gin_test_tbl;
+-- Disable fastupdate, and do more insertions. With fastupdate enabled, most
+-- insertions go through the pending list and are flushed in bulk, which mostly
+-- just causes page splits. Inserting directly causes more churn in the GIN
+-- data leaf pages, and exercises the recompression codepaths.
+alter index gin_test_idx set (fastupdate = off);
+insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 10000) g;
+insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 10000) g;
+delete from gin_test_tbl where i @> array[2];
+vacuum gin_test_tbl;
diff --git a/src/test/regress/expected/gist.out b/src/test/regress/expected/gist.out
new file mode 100644
index 0000000..fb9a54d
--- /dev/null
+++ b/src/test/regress/expected/gist.out
@@ -0,0 +1,19 @@
+--
+-- Test GiST indexes.
+--
+-- There are other tests to test different gist opclasses. This is for
+-- testing GiST code itself. Vacuuming in particular.
+create table gist_point_tbl(id int4, p point);
+create index gist_pointidx on gist_point_tbl using gist(p);
+-- Insert enough data to create a tree that's a couple of levels deep.
+insert into gist_point_tbl (id, p)
+select g,        point(g*10, g*10) from generate_series(1, 10000) g;
+insert into gist_point_tbl (id, p)
+select g+100000, point(g*10+1, g*10+1) from generate_series(1, 10000) g;
+-- To test vacuum, delete some entries from all over the index.
+delete from gist_point_tbl where id % 2 = 1;
+-- And also delete some concentration of values. (GiST doesn't currently
+-- attempt to delete pages even when they become empty, but if it did, this
+-- would exercise it)
+delete from gist_point_tbl where id < 10000;
+vacuum gist_point_tbl;
diff --git a/src/test/regress/expected/spgist.out b/src/test/regress/expected/spgist.out
new file mode 100644
index 0000000..1d1af56
--- /dev/null
+++ b/src/test/regress/expected/spgist.out
@@ -0,0 +1,38 @@
+--
+-- Test SP-GiST indexes.
+--
+-- There are other tests to test different SP-GiST opclasses. This is for
+-- testing SP-GiST code itself.
+create table spgist_point_tbl(id int4, p point);
+create index spgist_point_idx on spgist_point_tbl using spgist(p);
+-- Test vacuum-root action. It gets invoked when the root is also a leaf,
+-- i.e. the index is very small.
+insert into spgist_point_tbl (id, p)
+select g, point(g*10, g*10) from generate_series(1, 10) g;
+delete from spgist_point_tbl where id < 5;
+vacuum spgist_point_tbl;
+-- Insert more data, to make the index a few levels deep.
+insert into spgist_point_tbl (id, p)
+select g,      point(g*10, g*10) from generate_series(1, 10000) g;
+insert into spgist_point_tbl (id, p)
+select g+100000, point(g*10+1, g*10+1) from generate_series(1, 10000) g;
+-- To test vacuum, delete some entries from all over the index.
+delete from spgist_point_tbl where id % 2 = 1;
+-- And also delete some concentration of values. (SP-GiST doesn't currently
+-- attempt to delete pages even when they become empty, but if it did, this
+-- would exercise it)
+delete from spgist_point_tbl where id < 10000;
+vacuum spgist_point_tbl;
+-- The point opclass's choose method only uses the spgMatchNode action.
+-- Create an index using the text opclass, which uses the other actions.
+create table spgist_text_tbl(id int4, t text);
+create index spgist_text_idx on spgist_text_tbl using spgist(t);
+insert into spgist_text_tbl (id, t)
+select g, 'f' || repeat('o', 100) || g from generate_series(1, 10000) g
+union all
+select g, 'baaaaaaaaaaaaaar' || g from generate_series(1, 1000) g;
+-- Do a lot of insertions that have to split an existing node. Hopefully
+-- one of these will cause the page to run out of space, causing the inner
+-- tuple to be moved to another page.
+insert into spgist_text_tbl (id, t)
+select -g, 'f' || repeat('o', 100-g) || 'surprise' from generate_series(1, 100) g;
diff --git a/src/test/regress/output/misc.source b/src/test/regress/output/misc.source
index 0660608..2cba2b6 100644
--- a/src/test/regress/output/misc.source
+++ b/src/test/regress/output/misc.source
@@ -597,6 +597,8 @@ SELECT user_relns() AS user_relns
  bt_i4_heap
  bt_name_heap
  bt_txt_heap
+ btree_tall_tbl
+ btree_test_tbl
  c
  c_star
  char_tbl
@@ -622,6 +624,8 @@ SELECT user_relns() AS user_relns
  float4_tbl
  float8_tbl
  func_index_heap
+ gin_test_tbl
+ gist_point_tbl
  hash_f8_heap
  hash_i4_heap
  hash_name_heap
@@ -671,6 +675,8 @@ SELECT user_relns() AS user_relns
  road
  shighway
  slow_emp4000
+ spgist_point_tbl
+ spgist_text_tbl
  street
  stud_emp
  student
@@ -700,7 +706,7 @@ SELECT user_relns() AS user_relns
  tvvmv
  varchar_tbl
  xacttest
-(122 rows)
+(128 rows)
 
 SELECT name(equipment(hobby_construct(text 'skywalking', text 'mer')));
  name 
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index d4f02e5..e1afd4b 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -83,7 +83,7 @@ test: select_into select_distinct select_distinct_on select_implicit select_havi
 # ----------
 # Another group of parallel tests
 # ----------
-test: brin privileges security_label collate matview lock replica_identity rowsecurity
+test: brin gin gist spgist privileges security_label collate matview lock replica_identity rowsecurity
 
 # ----------
 # Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 611b0a8..e609ab0 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -95,6 +95,10 @@ test: update
 test: delete
 test: namespace
 test: prepared_xacts
+test: brin
+test: gin
+test: gist
+test: spgist
 test: privileges
 test: security_label
 test: collate
@@ -103,7 +107,6 @@ test: lock
 test: replica_identity
 test: rowsecurity
 test: alter_generic
-test: brin
 test: misc
 test: psql
 test: async
diff --git a/src/test/regress/sql/btree_index.sql b/src/test/regress/sql/btree_index.sql
index 3f26468..19bb68a 100644
--- a/src/test/regress/sql/btree_index.sql
+++ b/src/test/regress/sql/btree_index.sql
@@ -64,3 +64,39 @@ select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
 set enable_indexscan to false;
 set enable_bitmapscan to true;
 select proname from pg_proc where proname like E'RI\\_FKey%del' order by 1;
+
+--
+-- Test B-tree insertion with a metapage update (XLOG_BTREE_INSERT_META
+-- WAL record type). This happens when a "fast-root" page is split.
+--
+
+-- First create a tree that's at least two levels deep.
+create table btree_test_tbl(id int4);
+insert into btree_test_tbl select generate_series(1, 10000);
+create index btree_test_idx on btree_test_tbl (id);
+
+-- Delete most entries, and vacuum. This turns the leaf page into a fast root.
+delete from btree_test_tbl where id < 9990;
+vacuum btree_test_tbl;
+
+-- Now do more insertions, creating a sibling for the fast root, so that
+-- it's not the fast root anymore.
+insert into btree_test_tbl select generate_series(1, 9900);
+
+--
+-- Test B-tree page deletion. In particular, deleting a non-leaf page.
+--
+
+-- First create a tree that's at least four levels deep. The text inserted
+-- is long and poorly compressible. That way only a few index tuples fit on
+-- each page, allowing us to get a tall tree with fewer pages.
+CREATE TABLE btree_tall_tbl(id int4, t text);
+CREATE INDEX btree_tall_idx on btree_tall_tbl (id, t) WITH (fillfactor = 10);
+INSERT INTO btree_tall_tbl
+SELECT g, g::text || '_' ||
+          (select string_agg(md5(i::text), '_') from generate_series(1, 50) i)
+FROM generate_series(1, 100) g;
+
+-- Delete most entries, and vacuum. This causes page deletions.
+delete from btree_tall_tbl where id < 950;
+vacuum btree_tall_tbl;
diff --git a/src/test/regress/sql/gin.sql b/src/test/regress/sql/gin.sql
new file mode 100644
index 0000000..a779fc1
--- /dev/null
+++ b/src/test/regress/sql/gin.sql
@@ -0,0 +1,35 @@
+--
+-- Test GIN indexes.
+--
+-- There are other tests to test different GIN opclasses. This is for testing
+-- GIN itself.
+
+-- Create and populate a test table with a GIN index.
+create table gin_test_tbl(i int4[]);
+create index gin_test_idx on gin_test_tbl using gin (i) with (fastupdate=on);
+insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 100000) g;
+insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 100000) g;
+
+-- Insert enough entries to create a three-level posting tree. That exercises
+-- splitting an internal page. With the default 8k block size, each posting
+-- tree leaf page can hold at most around 8000 TIDs, and each internal page
+-- can hold about 800 child pages. So we need > 8000*800 items.
+insert into gin_test_tbl select array[1] from generate_series(1, 10000) g1, generate_series(1, 1000) g2;
+
+vacuum gin_test_tbl; -- flush the fast update pages
+
+-- Test vacuuming 
+delete from gin_test_tbl where i @> array[2];
+vacuum gin_test_tbl;
+
+-- Disable fastupdate, and do more insertions. With fastupdate enabled, most
+-- insertions go through the pending list and are flushed in bulk, which mostly
+-- just causes page splits. Inserting directly causes more churn in the GIN
+-- data leaf pages, and exercises the recompression codepaths.
+alter index gin_test_idx set (fastupdate = off);
+
+insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 10000) g;
+insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 10000) g;
+
+delete from gin_test_tbl where i @> array[2];
+vacuum gin_test_tbl;
diff --git a/src/test/regress/sql/gist.sql b/src/test/regress/sql/gist.sql
new file mode 100644
index 0000000..42bb027
--- /dev/null
+++ b/src/test/regress/sql/gist.sql
@@ -0,0 +1,25 @@
+--
+-- Test GiST indexes.
+--
+-- There are other tests to test different gist opclasses. This is for
+-- testing GiST code itself. Vacuuming in particular.
+
+create table gist_point_tbl(id int4, p point);
+create index gist_pointidx on gist_point_tbl using gist(p);
+
+-- Insert enough data to create a tree that's a couple of levels deep.
+insert into gist_point_tbl (id, p)
+select g,        point(g*10, g*10) from generate_series(1, 10000) g;
+
+insert into gist_point_tbl (id, p)
+select g+100000, point(g*10+1, g*10+1) from generate_series(1, 10000) g;
+
+-- To test vacuum, delete some entries from all over the index.
+delete from gist_point_tbl where id % 2 = 1;
+
+-- And also delete some concentration of values. (GiST doesn't currently
+-- attempt to delete pages even when they become empty, but if it did, this
+-- would exercise it)
+delete from gist_point_tbl where id < 10000;
+
+vacuum gist_point_tbl;
diff --git a/src/test/regress/sql/spgist.sql b/src/test/regress/sql/spgist.sql
new file mode 100644
index 0000000..cfd4cc8
--- /dev/null
+++ b/src/test/regress/sql/spgist.sql
@@ -0,0 +1,51 @@
+--
+-- Test SP-GiST indexes.
+--
+-- There are other tests to test different SP-GiST opclasses. This is for
+-- testing SP-GiST code itself.
+
+create table spgist_point_tbl(id int4, p point);
+create index spgist_point_idx on spgist_point_tbl using spgist(p);
+
+-- Test vacuum-root action. It gets invoked when the root is also a leaf,
+-- i.e. the index is very small.
+insert into spgist_point_tbl (id, p)
+select g, point(g*10, g*10) from generate_series(1, 10) g;
+delete from spgist_point_tbl where id < 5;
+vacuum spgist_point_tbl;
+
+-- Insert more data, to make the index a few levels deep.
+
+insert into spgist_point_tbl (id, p)
+select g,      point(g*10, g*10) from generate_series(1, 10000) g;
+
+insert into spgist_point_tbl (id, p)
+select g+100000, point(g*10+1, g*10+1) from generate_series(1, 10000) g;
+
+-- To test vacuum, delete some entries from all over the index.
+delete from spgist_point_tbl where id % 2 = 1;
+
+-- And also delete some concentration of values. (SP-GiST doesn't currently
+-- attempt to delete pages even when they become empty, but if it did, this
+-- would exercise it)
+delete from spgist_point_tbl where id < 10000;
+
+vacuum spgist_point_tbl;
+
+
+-- The point opclass's choose method only uses the spgMatchNode action.
+-- Create an index using the text opclass, which uses the other actions.
+
+create table spgist_text_tbl(id int4, t text);
+create index spgist_text_idx on spgist_text_tbl using spgist(t);
+
+insert into spgist_text_tbl (id, t)
+select g, 'f' || repeat('o', 100) || g from generate_series(1, 10000) g
+union all
+select g, 'baaaaaaaaaaaaaar' || g from generate_series(1, 1000) g;
+
+-- Do a lot of insertions that have to split an existing node. Hopefully
+-- one of these will cause the page to run out of space, causing the inner
+-- tuple to be moved to another page.
+insert into spgist_text_tbl (id, t)
+select -g, 'f' || repeat('o', 100-g) || 'surprise' from generate_series(1, 100) g;