On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>> <julien.rouh...@dalibo.com> wrote:
>>> On 15/01/2016 22:59, Jeff Janes wrote:
>>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>>> <julien.rouh...@dalibo.com> wrote:
>>>
>>> All looks fine to me, I flag it as ready for committer.
>>
>> When I compiled the PostgreSQL with the patch, I got the following error.
>> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
>
> Thanks.  Fixed.
>
>> gin_clean_pending_list() must check whether the server is in recovery or not.
>> If it's in recovery, the function must exit with an error. That is, IMO,
>> something like the following check must be added.
>>
>>          if (RecoveryInProgress())
>>                  ereport(ERROR,
>>
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                                   errmsg("recovery is in progress"),
>>                                   errhint("GIN pending list cannot be
>> cleaned up during recovery.")));
>>
>> It's better to make gin_clean_pending_list() check whether the target index
>> is temporary index of other sessions or not, like pgstatginindex() does.
>
> I've added both of these checks.  Sorry I overlooked your early email
> in this thread about those.
>
>>
>> +    Relation    indexRel = index_open(indexoid, AccessShareLock);
>>
>> ISTM that AccessShareLock is not safe when updating the pending list and
>> GIN index main structure. Probaby we should use RowExclusiveLock?
>
> Other callers of the ginInsertCleanup function also do so while
> holding only the AccessShareLock on the index.  It turns out that
> there is a bug around this, as discussed in "Potential GIN vacuum bug"
> (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)
>
> But, that bug has to be solved at a deeper level than this patch.
>
> I've also cleaned up some other conflicts, and chose a more suitable
> OID for the new catalog function.
>
> The number of new header includes needed to implement this makes me
> wonder if I put this code in the correct file, but I don't see a
> better location for it.
>
> New version attached.

Thanks for updating the patch! It looks good to me.

Based on your patch, I just improved the doc. For example,
I added the following note into the doc.

+    These functions cannot be executed during recovery.
+    Use of these functions is restricted to superusers and the owner
+    of the given index.

If there is no problem, I'm thinking to commit this version.

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 18036,18044 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18036,18051 ----
      <primary>brin_summarize_new_values</primary>
     </indexterm>
  
+    <indexterm>
+     <primary>gin_clean_pending_list</primary>
+    </indexterm>
+ 
     <para>
      <xref linkend="functions-admin-index-table"> shows the functions
      available for index maintenance tasks.
+     These functions cannot be executed during recovery.
+     Use of these functions is restricted to superusers and the owner
+     of the given index.
     </para>
  
     <table id="functions-admin-index-table">
***************
*** 18056,18061 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18063,18075 ----
         <entry><type>integer</type></entry>
         <entry>summarize page ranges not already summarized</entry>
        </row>
+       <row>
+        <entry>
+         <literal><function>gin_clean_pending_list(<parameter>index</> <type>regclass</>)</function></literal>
+        </entry>
+        <entry><type>bigint</type></entry>
+        <entry>move GIN pending list entries into main index structure</entry>
+       </row>
       </tbody>
      </tgroup>
     </table>
***************
*** 18069,18074 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18083,18100 ----
      into the index.
     </para>
  
+    <para>
+     <function>gin_clean_pending_list</> accepts the OID or name of
+     a GIN index and cleans up the pending list of the specified GIN index
+     by moving entries in it to the main GIN data structure in bulk.
+     It returns the number of pages cleaned up from the pending list.
+     Note that the cleanup does not happen and the return value is 0
+     if the argument is the GIN index built with <literal>fastupdate</>
+     option disabled because it doesn't have a pending list.
+     Please see <xref linkend="gin-fast-update"> and <xref linkend="gin-tips">
+     for details of the pending list and <literal>fastupdate</> option.
+    </para>
+ 
    </sect2>
  
    <sect2 id="functions-admin-genfile">
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***************
*** 734,740 ****
     from the indexed item). As of <productname>PostgreSQL</productname> 8.4,
     <acronym>GIN</> is capable of postponing much of this work by inserting
     new tuples into a temporary, unsorted list of pending entries.
!    When the table is vacuumed, or if the pending list becomes larger than
     <xref linkend="guc-gin-pending-list-limit">, the entries are moved to the
     main <acronym>GIN</acronym> data structure using the same bulk insert
     techniques used during initial index creation.  This greatly improves
--- 734,742 ----
     from the indexed item). As of <productname>PostgreSQL</productname> 8.4,
     <acronym>GIN</> is capable of postponing much of this work by inserting
     new tuples into a temporary, unsorted list of pending entries.
!    When the table is vacuumed or autoanalyzed, or when 
!    <function>gin_clean_pending_list</function> function is called, or if the
!    pending list becomes larger than
     <xref linkend="guc-gin-pending-list-limit">, the entries are moved to the
     main <acronym>GIN</acronym> data structure using the same bulk insert
     techniques used during initial index creation.  This greatly improves
*** a/doc/src/sgml/ref/create_index.sgml
--- b/doc/src/sgml/ref/create_index.sgml
***************
*** 362,369 **** CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
        Turning <literal>fastupdate</> off via <command>ALTER INDEX</> prevents
        future insertions from going into the list of pending index entries,
        but does not in itself flush previous entries.  You might want to
!       <command>VACUUM</> the table afterward to ensure the pending list is
!       emptied.
       </para>
      </note>
      </listitem>
--- 362,369 ----
        Turning <literal>fastupdate</> off via <command>ALTER INDEX</> prevents
        future insertions from going into the list of pending index entries,
        but does not in itself flush previous entries.  You might want to
!       <command>VACUUM</> the table or call <function>gin_clean_pending_list</>
!       function afterward to ensure the pending list is emptied.
       </para>
      </note>
      </listitem>
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***************
*** 20,29 ****
--- 20,32 ----
  
  #include "access/gin_private.h"
  #include "access/xloginsert.h"
+ #include "access/xlog.h"
  #include "commands/vacuum.h"
+ #include "catalog/pg_am.h"
  #include "miscadmin.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
+ #include "utils/acl.h"
  #include "storage/indexfsm.h"
  
  /* GUC parameter */
***************
*** 958,960 **** ginInsertCleanup(GinState *ginstate,
--- 961,1012 ----
  	MemoryContextSwitchTo(oldCtx);
  	MemoryContextDelete(opCtx);
  }
+ 
+ /*
+  * SQL-callable function to clean the insert pending list
+  */
+ Datum
+ gin_clean_pending_list(PG_FUNCTION_ARGS)
+ {
+ 	Oid			indexoid = PG_GETARG_OID(0);
+ 	Relation	indexRel = index_open(indexoid, AccessShareLock);
+ 	IndexBulkDeleteResult stats;
+ 	GinState	ginstate;
+ 
+ 	if (RecoveryInProgress())
+  		ereport(ERROR,
+ 				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ 				 errmsg("recovery is in progress"),
+ 				 errhint("GIN pending list cannot be cleaned up during recovery.")));
+ 
+ 	/* Must be a GIN index */
+ 	if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
+ 		indexRel->rd_rel->relam != GIN_AM_OID)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ 				 errmsg("\"%s\" is not a GIN index",
+ 						RelationGetRelationName(indexRel))));
+ 
+ 	/*
+ 	 * Reject attempts to read non-local temporary relations; we would be
+ 	 * likely to get wrong data since we have no visibility into the owning
+ 	 * session's local buffers.
+ 	 */
+ 	if (RELATION_IS_OTHER_TEMP(indexRel))
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 			   errmsg("cannot access temporary indexes of other sessions")));
+ 
+ 	/* User must own the index (comparable to privileges needed for VACUUM) */
+ 	if (!pg_class_ownercheck(indexoid, GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
+ 					   RelationGetRelationName(indexRel));
+ 
+ 	memset(&stats, 0, sizeof(stats));
+ 	initGinState(&ginstate, indexRel);
+ 	ginInsertCleanup(&ginstate, true, &stats);
+ 
+ 	index_close(indexRel, AccessShareLock);
+ 
+ 	PG_RETURN_INT64((int64) stats.pages_deleted);
+ }
*** a/src/include/access/gin_private.h
--- b/src/include/access/gin_private.h
***************
*** 881,886 **** extern void ginFreeScanKeys(GinScanOpaque so);
--- 881,889 ----
  /* ginget.c */
  extern int64 gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm);
  
+ /* ginfast.c */
+ extern Datum gin_clean_pending_list(PG_FUNCTION_ARGS);
+ 
  /* ginlogic.c */
  extern void ginInitConsistentFunction(GinState *ginstate, GinScanKey key);
  
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 4517,4522 **** DATA(insert OID = 3087 (  gin_extract_tsquery	PGNSP PGUID 12 1 0 0 0 f f f f t f
--- 4517,4524 ----
  DESCR("GIN tsvector support (obsolete)");
  DATA(insert OID = 3088 (  gin_tsquery_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i s 6 0 16 "2281 21 3615 23 2281 2281" _null_ _null_ _null_ _null_ _null_ gin_tsquery_consistent_6args _null_ _null_ _null_ ));
  DESCR("GIN tsvector support (obsolete)");
+ DATA(insert OID = 3789 (  gin_clean_pending_list PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "2205" _null_ _null_ _null_ _null_ _null_ gin_clean_pending_list _null_ _null_ _null_ ));
+ DESCR("clean up GIN pending list");
  
  DATA(insert OID = 3662 (  tsquery_lt			PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_lt _null_ _null_ _null_ ));
  DATA(insert OID = 3663 (  tsquery_le			PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "3615 3615" _null_ _null_ _null_ _null_ _null_ tsquery_le _null_ _null_ _null_ ));
*** a/src/test/regress/expected/gin.out
--- b/src/test/regress/expected/gin.out
***************
*** 8,14 **** create table gin_test_tbl(i int4[]);
--- 8,27 ----
  create index gin_test_idx on gin_test_tbl using gin (i) with (fastupdate = on);
  insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 20000) g;
  insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 1000) g;
+ select gin_clean_pending_list('gin_test_idx')>10 as many; -- flush the fastupdate buffers
+  many 
+ ------
+  t
+ (1 row)
+ 
+ insert into gin_test_tbl select array[3, 1, g] from generate_series(1, 1000) g;
  vacuum gin_test_tbl; -- flush the fastupdate buffers
+ select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
+  gin_clean_pending_list 
+ ------------------------
+                       0
+ (1 row)
+ 
  -- Test vacuuming
  delete from gin_test_tbl where i @> array[2];
  vacuum gin_test_tbl;
*** a/src/test/regress/sql/gin.sql
--- b/src/test/regress/sql/gin.sql
***************
*** 10,17 **** create index gin_test_idx on gin_test_tbl using gin (i) with (fastupdate = on);
--- 10,23 ----
  insert into gin_test_tbl select array[1, 2, g] from generate_series(1, 20000) g;
  insert into gin_test_tbl select array[1, 3, g] from generate_series(1, 1000) g;
  
+ select gin_clean_pending_list('gin_test_idx')>10 as many; -- flush the fastupdate buffers
+ 
+ insert into gin_test_tbl select array[3, 1, g] from generate_series(1, 1000) g;
+ 
  vacuum gin_test_tbl; -- flush the fastupdate buffers
  
+ select gin_clean_pending_list('gin_test_idx'); -- nothing to flush
+ 
  -- Test vacuuming
  delete from gin_test_tbl where i @> array[2];
  vacuum gin_test_tbl;
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to