On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote: > On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud > <julien.rouh...@dalibo.com> wrote: >> On 15/01/2016 22:59, Jeff Janes wrote: >>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud >>> <julien.rouh...@dalibo.com> wrote: >> >> All looks fine to me, I flag it as ready for committer. > > When I compiled the PostgreSQL with the patch, I got the following error. > ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
Thanks. Fixed. > gin_clean_pending_list() must check whether the server is in recovery or not. > If it's in recovery, the function must exit with an error. That is, IMO, > something like the following check must be added. > > if (RecoveryInProgress()) > ereport(ERROR, > > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > errmsg("recovery is in progress"), > errhint("GIN pending list cannot be > cleaned up during recovery."))); > > It's better to make gin_clean_pending_list() check whether the target index > is temporary index of other sessions or not, like pgstatginindex() does. I've added both of these checks. Sorry I overlooked your early email in this thread about those. > > + Relation indexRel = index_open(indexoid, AccessShareLock); > > ISTM that AccessShareLock is not safe when updating the pending list and > GIN index main structure. Probaby we should use RowExclusiveLock? Other callers of the ginInsertCleanup function also do so while holding only the AccessShareLock on the index. It turns out that there is a bug around this, as discussed in "Potential GIN vacuum bug" (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com) But, that bug has to be solved at a deeper level than this patch. I've also cleaned up some other conflicts, and chose a more suitable OID for the new catalog function. The number of new header includes needed to implement this makes me wonder if I put this code in the correct file, but I don't see a better location for it. New version attached. Thanks, Jeff
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index 9c143b2..ada228f *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *************** postgres=# SELECT * FROM pg_xlogfile_nam *** 18036,18041 **** --- 18036,18045 ---- <primary>brin_summarize_new_values</primary> </indexterm> + <indexterm> + <primary>gin_clean_pending_list</primary> + </indexterm> + <para> <xref linkend="functions-admin-index-table"> shows the functions available for index maintenance tasks. *************** postgres=# SELECT * FROM pg_xlogfile_nam *** 18056,18061 **** --- 18060,18072 ---- <entry><type>integer</type></entry> <entry>summarize page ranges not already summarized</entry> </row> + <row> + <entry> + <literal><function>gin_clean_pending_list(<parameter>index_oid</> <type>regclass</>)</function></literal> + </entry> + <entry><type>bigint</type></entry> + <entry>move fast-update pending list entries into main index structure</entry> + </row> </tbody> </tgroup> </table> *************** postgres=# SELECT * FROM pg_xlogfile_nam *** 18069,18074 **** --- 18080,18092 ---- into the index. </para> + <para> + <function>gin_clean_pending_list</> accepts a GIN index OID argument + and moves the fast-update entries from the pending list into the + main index structure for that index. It returns the number of pages + removed from the pending list. + </para> + </sect2> <sect2 id="functions-admin-genfile"> diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml new file mode 100644 index 9eb0b5a..17a5087 *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *************** *** 734,740 **** from the indexed item). As of <productname>PostgreSQL</productname> 8.4, <acronym>GIN</> is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. ! When the table is vacuumed, or if the pending list becomes larger than <xref linkend="guc-gin-pending-list-limit">, the entries are moved to the main <acronym>GIN</acronym> data structure using the same bulk insert techniques used during initial index creation. This greatly improves --- 734,742 ---- from the indexed item). As of <productname>PostgreSQL</productname> 8.4, <acronym>GIN</> is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. ! When the table is vacuumed or autoanalyzed, or when ! <function>gin_clean_pending_list(regclass)</function> is called, or if the ! pending list becomes larger than <xref linkend="guc-gin-pending-list-limit">, the entries are moved to the main <acronym>GIN</acronym> data structure using the same bulk insert techniques used during initial index creation. This greatly improves diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c new file mode 100644 index 681ce09..dfdc45f *** a/src/backend/access/gin/ginfast.c --- b/src/backend/access/gin/ginfast.c *************** *** 20,29 **** --- 20,32 ---- #include "access/gin_private.h" #include "access/xloginsert.h" + #include "access/xlog.h" #include "commands/vacuum.h" + #include "catalog/pg_am.h" #include "miscadmin.h" #include "utils/memutils.h" #include "utils/rel.h" + #include "utils/acl.h" #include "storage/indexfsm.h" /* GUC parameter */ *************** ginInsertCleanup(GinState *ginstate, *** 958,960 **** --- 961,1011 ---- MemoryContextSwitchTo(oldCtx); MemoryContextDelete(opCtx); } + + /* + * SQL-callable function to clean the insert pending list + */ + Datum + gin_clean_pending_list(PG_FUNCTION_ARGS) + { + Oid indexoid = PG_GETARG_OID(0); + Relation indexRel = index_open(indexoid, AccessShareLock); + IndexBulkDeleteResult stats; + GinState ginstate; + + if (RecoveryInProgress()) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("recovery is in progress"), + errhint("GIN pending list cannot be cleaned up during recovery."))); + + /* Must be a GIN index */ + if (indexRel->rd_rel->relkind != RELKIND_INDEX || + indexRel->rd_rel->relam != GIN_AM_OID) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("\"%s\" is not a GIN index", + RelationGetRelationName(indexRel)))); + /* + * Reject attempts to read non-local temporary relations; we would be + * likely to get wrong data since we have no visibility into the owning + * session's local buffers. + */ + if (RELATION_IS_OTHER_TEMP(indexRel)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot access temporary indexes of other sessions"))); + + /* User must own the index (comparable to privileges needed for VACUUM) */ + if (!pg_class_ownercheck(indexoid, GetUserId())) + aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS, + RelationGetRelationName(indexRel)); + + memset(&stats, 0, sizeof(stats)); + initGinState(&ginstate, indexRel); + ginInsertCleanup(&ginstate, true, &stats); + + index_close(indexRel, AccessShareLock); + + PG_RETURN_INT64((int64) stats.pages_deleted); + } diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h new file mode 100644 index 695959c..d2ea588 *** a/src/include/access/gin_private.h --- b/src/include/access/gin_private.h *************** extern void ginFreeScanKeys(GinScanOpaqu *** 881,886 **** --- 881,889 ---- /* ginget.c */ extern int64 gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm); + /* ginfast.c */ + extern Datum gin_clean_pending_list(PG_FUNCTION_ARGS); + /* ginlogic.c */ extern void ginInitConsistentFunction(GinState *ginstate, GinScanKey key); diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h new file mode 100644 index 79e92ff..96b148a *** a/src/include/catalog/pg_proc.h --- b/src/include/catalog/pg_proc.h *************** DATA(insert OID = 3087 ( gin_extract_ts *** 4517,4522 **** --- 4517,4524 ---- DESCR("GIN tsvector support (obsolete)"); DATA(insert OID = 3088 ( gin_tsquery_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i s 6 0 16 "2281 21 3615 23 2281 2281" _null_ _null_ _null_ _null_ _null_ gin_tsquery_consistent_6args _null_ _null_ _null_ )); DESCR("GIN tsvector support (obsolete)"); + DATA(insert OID = 3789 ( gin_clean_pending_list PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "2205" _null_ _null_ _null_ _null_ _null_ gin_clean_pending_list _null_ _null_ _null_ )); + DESCR("GIN: clean the fastupdate pending list"); DATA(insert OID = 3662 ( tsquery_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_lt _null_ _null_ _null_ )); DATA(insert OID = 3663 ( tsquery_le PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_le _null_ _null_ _null_ )); diff --git a/src/test/regress/expected/gin.out b/src/test/regress/expected/gin.out new file mode 100644 index c015fe7..cc7601c *** a/src/test/regress/expected/gin.out --- b/src/test/regress/expected/gin.out *************** create table gin_test_tbl(i int4[]); *** 8,14 **** --- 8,27 ---- create index gin_test_idx on gin_test_tbl using gin (i) with (fastupdate = on); insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 20000) g; insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 1000) g; + select gin_clean_pending_list('gin_test_idx')>10 as many; -- flush the fastupdate buffers + many + ------ + t + (1 row) + + insert into gin_test_tbl select array[3, 1, g] from generate_series(1, 1000) g; vacuum gin_test_tbl; -- flush the fastupdate buffers + select gin_clean_pending_list('gin_test_idx'); -- nothing to flush + gin_clean_pending_list + ------------------------ + 0 + (1 row) + -- Test vacuuming delete from gin_test_tbl where i @> array[2]; vacuum gin_test_tbl; diff --git a/src/test/regress/sql/gin.sql b/src/test/regress/sql/gin.sql new file mode 100644 index 4b35560..31890b4 *** a/src/test/regress/sql/gin.sql --- b/src/test/regress/sql/gin.sql *************** create index gin_test_idx on gin_test_tb *** 10,17 **** --- 10,23 ---- insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 20000) g; insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 1000) g; + select gin_clean_pending_list('gin_test_idx')>10 as many; -- flush the fastupdate buffers + + insert into gin_test_tbl select array[3, 1, g] from generate_series(1, 1000) g; + vacuum gin_test_tbl; -- flush the fastupdate buffers + select gin_clean_pending_list('gin_test_idx'); -- nothing to flush + -- Test vacuuming delete from gin_test_tbl where i @> array[2]; vacuum gin_test_tbl;
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers