On Tue, Sep 9, 2014 at 1:28 AM, Jeff Janes <[email protected]> wrote:
> On Sun, Aug 17, 2014 at 7:46 PM, Fujii Masao <[email protected]> wrote:
>>
>>
>> Thanks for reviewing the patch! ISTM that I failed to make the patch from
>> my git repository... Attached is the rebased version.
>
>
>
> I get some compiler warnings on v2 of this patch:
>
> reloptions.c:219: warning: excess elements in struct initializer
> reloptions.c:219: warning: (near initialization for 'intRelOpts[15]')

Thanks for testing the patch!

Attached is the updated version of the patch.

Previously the patch depended on another infrastructure patch
(which allows a user to specify the unit in a reloption (*1)). But that
infrastructure patch has a serious problem that is not easy to fix,
so I changed this patch so that it no longer depends on that
infrastructure patch at all. Even without the infrastructure
patch, the feature that this patch introduces is useful.

I also added a regression test to the patch.

(*1)
http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com
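
To make the fallback behaviour concrete, here is a minimal standalone
sketch of the threshold check, written outside the PostgreSQL tree. The
names effective_cleanup_size_kb, needs_cleanup and the SKETCH_* constants
are made up for illustration; in particular the per-page free size of
8160 bytes is only an assumed value for the default block size, not
something the patch defines.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-in for GIN_DEFAULT_PENDING_LIST_CLEANUP_SIZE (reloption not set). */
#define SKETCH_DEFAULT_CLEANUP_SIZE (-1)
/* Assumed usable bytes per pending-list page; the real value is GIN_PAGE_FREESIZE. */
#define SKETCH_PAGE_FREESIZE 8160L

/* Return the threshold in kilobytes: the reloption if set, otherwise work_mem. */
static int
effective_cleanup_size_kb(int reloption_kb, int work_mem_kb)
{
    if (reloption_kb == SKETCH_DEFAULT_CLEANUP_SIZE)
        return work_mem_kb;
    return reloption_kb;
}

/* True once the pending list has grown larger than the threshold. */
static bool
needs_cleanup(long n_pending_pages, int reloption_kb, int work_mem_kb)
{
    long threshold_bytes;

    assert(n_pending_pages >= 0);
    /* Both settings are in kilobytes, so convert before comparing with bytes. */
    threshold_bytes = effective_cleanup_size_kb(reloption_kb, work_mem_kb) * 1024L;
    return n_pending_pages * SKETCH_PAGE_FREESIZE > threshold_bytes;
}
```

Under these assumptions, an index created with
PENDING_LIST_CLEANUP_SIZE=32 would trigger a foreground cleanup once the
pending list reaches five pages, regardless of work_mem, while an index
without the option keeps using work_mem as the threshold.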
Regards,
--
Fujii Masao
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***************
*** 728,735 ****
from the indexed item). As of <productname>PostgreSQL</productname> 8.4,
<acronym>GIN</> is capable of postponing much of this work by inserting
new tuples into a temporary, unsorted list of pending entries.
! When the table is vacuumed, or if the pending list becomes too large
! (larger than <xref linkend="guc-work-mem">), the entries are moved to the
main <acronym>GIN</acronym> data structure using the same bulk insert
techniques used during initial index creation. This greatly improves
<acronym>GIN</acronym> index update speed, even counting the additional
--- 728,736 ----
from the indexed item). As of <productname>PostgreSQL</productname> 8.4,
<acronym>GIN</> is capable of postponing much of this work by inserting
new tuples into a temporary, unsorted list of pending entries.
! When the table is vacuumed, or if the pending list becomes larger than
! <literal>PENDING_LIST_CLEANUP_SIZE</literal> (or
! <xref linkend="guc-work-mem"> if not set), the entries are moved to the
main <acronym>GIN</acronym> data structure using the same bulk insert
techniques used during initial index creation. This greatly improves
<acronym>GIN</acronym> index update speed, even counting the additional
***************
*** 812,829 ****
</varlistentry>
<varlistentry>
! <term><xref linkend="guc-work-mem"></term>
<listitem>
<para>
During a series of insertions into an existing <acronym>GIN</acronym>
index that has <literal>FASTUPDATE</> enabled, the system will clean up
the pending-entry list whenever the list grows larger than
! <varname>work_mem</>. To avoid fluctuations in observed response time,
! it's desirable to have pending-list cleanup occur in the background
! (i.e., via autovacuum). Foreground cleanup operations can be avoided by
! increasing <varname>work_mem</> or making autovacuum more aggressive.
! However, enlarging <varname>work_mem</> means that if a foreground
! cleanup does occur, it will take even longer.
</para>
</listitem>
</varlistentry>
--- 813,839 ----
</varlistentry>
<varlistentry>
! <term><literal>PENDING_LIST_CLEANUP_SIZE</> and
! <xref linkend="guc-work-mem"></term>
<listitem>
<para>
During a series of insertions into an existing <acronym>GIN</acronym>
index that has <literal>FASTUPDATE</> enabled, the system will clean up
the pending-entry list whenever the list grows larger than
! <literal>PENDING_LIST_CLEANUP_SIZE</> (if not set, <varname>work_mem</>
! is used as that threshold, instead). To avoid fluctuations in observed
! response time, it's desirable to have pending-list cleanup occur in the
! background (i.e., via autovacuum). Foreground cleanup operations
! can be avoided by increasing <literal>PENDING_LIST_CLEANUP_SIZE</>
! (or <varname>work_mem</>) or making autovacuum more aggressive.
! However, enlarging the threshold of the cleanup operation means that
! if a foreground cleanup does occur, it will take even longer.
! </para>
! <para>
! <literal>PENDING_LIST_CLEANUP_SIZE</> is an index storage parameter,
! and allows each GIN index to have its own cleanup threshold.
! For example, it's possible to increase the threshold only for a GIN
! index that is updated heavily, and to decrease it for the others.
</para>
</listitem>
</varlistentry>
*** a/doc/src/sgml/ref/create_index.sgml
--- b/doc/src/sgml/ref/create_index.sgml
***************
*** 356,361 **** CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ <replaceable class="parameter">name</
--- 356,377 ----
</listitem>
</varlistentry>
</variablelist>
+ <variablelist>
+ <varlistentry>
+ <term><literal>PENDING_LIST_CLEANUP_SIZE</></term>
+ <listitem>
+ <para>
+ This setting specifies the maximum size of the GIN pending list which is
+ used when <literal>FASTUPDATE</> is enabled. If the list grows larger than
+ this maximum size, it is cleaned up by moving the entries in it to the
+ main GIN data structure in bulk. The value is specified in kilobytes.
+ If this is not set, <literal>work_mem</> is used as the maximum size
+ of the pending list, instead. See <xref linkend="gin-fast-update"> and
+ <xref linkend="gin-tips"> for more information.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</refsect2>
<refsect2 id="SQL-CREATEINDEX-CONCURRENTLY">
*** a/src/backend/access/common/reloptions.c
--- b/src/backend/access/common/reloptions.c
***************
*** 209,214 **** static relopt_int intRelOpts[] =
--- 209,222 ----
RELOPT_KIND_HEAP | RELOPT_KIND_TOAST
}, -1, 0, 2000000000
},
+ {
+ {
+ "pending_list_cleanup_size",
+ "Maximum size of the pending list for this GIN index, in kilobytes.",
+ RELOPT_KIND_GIN
+ },
+ -1, 0, MAX_KILOBYTES
+ },
/* list terminator */
{{NULL}}
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***************
*** 227,232 **** ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
--- 227,233 ----
ginxlogUpdateMeta data;
bool separateList = false;
bool needCleanup = false;
+ int cleanupSize;
if (collector->ntuples == 0)
return;
***************
*** 421,431 **** ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
* ginInsertCleanup could take significant amount of time, so we prefer to
* call it when it can do all the work in a single collection cycle. In
* non-vacuum mode, it shouldn't require maintenance_work_mem, so fire it
! * while pending list is still small enough to fit into work_mem.
*
* ginInsertCleanup() should not be called inside our CRIT_SECTION.
*/
! if (metadata->nPendingPages * GIN_PAGE_FREESIZE > work_mem * 1024L)
needCleanup = true;
UnlockReleaseBuffer(metabuffer);
--- 422,436 ----
* ginInsertCleanup could take significant amount of time, so we prefer to
* call it when it can do all the work in a single collection cycle. In
* non-vacuum mode, it shouldn't require maintenance_work_mem, so fire it
! * while pending list is still small enough to fit into
! * pending_list_cleanup_size (or work_mem if not set).
*
* ginInsertCleanup() should not be called inside our CRIT_SECTION.
*/
! cleanupSize = GinGetPendingListCleanupSize(index);
! if (cleanupSize == GIN_DEFAULT_PENDING_LIST_CLEANUP_SIZE)
! cleanupSize = work_mem;
! if (metadata->nPendingPages * GIN_PAGE_FREESIZE > cleanupSize * 1024L)
needCleanup = true;
UnlockReleaseBuffer(metabuffer);
*** a/src/backend/access/gin/ginutil.c
--- b/src/backend/access/gin/ginutil.c
***************
*** 524,530 **** ginoptions(PG_FUNCTION_ARGS)
GinOptions *rdopts;
int numoptions;
static const relopt_parse_elt tab[] = {
! {"fastupdate", RELOPT_TYPE_BOOL, offsetof(GinOptions, useFastUpdate)}
};
options = parseRelOptions(reloptions, validate, RELOPT_KIND_GIN,
--- 524,532 ----
GinOptions *rdopts;
int numoptions;
static const relopt_parse_elt tab[] = {
! {"fastupdate", RELOPT_TYPE_BOOL, offsetof(GinOptions, useFastUpdate)},
! {"pending_list_cleanup_size", RELOPT_TYPE_INT, offsetof(GinOptions,
! pendingListCleanupSize)}
};
options = parseRelOptions(reloptions, validate, RELOPT_KIND_GIN,
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***************
*** 95,108 ****
#define CONFIG_EXEC_PARAMS_NEW "global/config_exec_params.new"
#endif
- /* upper limit for GUC variables measured in kilobytes of memory */
- /* note that various places assume the byte size fits in a "long" variable */
- #if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
- #define MAX_KILOBYTES INT_MAX
- #else
- #define MAX_KILOBYTES (INT_MAX / 1024)
- #endif
-
#define KB_PER_MB (1024)
#define KB_PER_GB (1024*1024)
#define KB_PER_TB (1024*1024*1024)
--- 95,100 ----
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***************
*** 1171,1177 **** psql_completion(const char *text, int start, int end)
pg_strcasecmp(prev_wd, "(") == 0)
{
static const char *const list_INDEXOPTIONS[] =
! {"fillfactor", "fastupdate", NULL};
COMPLETE_WITH_LIST(list_INDEXOPTIONS);
}
--- 1171,1177 ----
pg_strcasecmp(prev_wd, "(") == 0)
{
static const char *const list_INDEXOPTIONS[] =
! {"fillfactor", "fastupdate", "pending_list_cleanup_size", NULL};
COMPLETE_WITH_LIST(list_INDEXOPTIONS);
}
*** a/src/include/access/gin_private.h
--- b/src/include/access/gin_private.h
***************
*** 314,325 **** typedef struct GinOptions
--- 314,331 ----
{
int32 vl_len_; /* varlena header (do not touch directly!) */
bool useFastUpdate; /* use fast updates? */
+ int pendingListCleanupSize; /* maximum size of pending list */
} GinOptions;
#define GIN_DEFAULT_USE_FASTUPDATE true
#define GinGetUseFastUpdate(relation) \
((relation)->rd_options ? \
((GinOptions *) (relation)->rd_options)->useFastUpdate : GIN_DEFAULT_USE_FASTUPDATE)
+ #define GIN_DEFAULT_PENDING_LIST_CLEANUP_SIZE -1
+ #define GinGetPendingListCleanupSize(relation) \
+ ((relation)->rd_options ? \
+ ((GinOptions *) (relation)->rd_options)->pendingListCleanupSize : \
+ GIN_DEFAULT_PENDING_LIST_CLEANUP_SIZE)
/* Macros for buffer lock/unlock operations */
*** a/src/include/utils/guc.h
--- b/src/include/utils/guc.h
***************
*** 18,23 ****
--- 18,31 ----
#include "utils/array.h"
+ /* upper limit for GUC variables measured in kilobytes of memory */
+ /* note that various places assume the byte size fits in a "long" variable */
+ #if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
+ #define MAX_KILOBYTES INT_MAX
+ #else
+ #define MAX_KILOBYTES (INT_MAX / 1024)
+ #endif
+
/*
* Certain options can only be set at certain times. The rules are
* like this:
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
***************
*** 2235,2240 **** SELECT COUNT(*) FROM array_gin_test WHERE a @> '{2}';
--- 2235,2253 ----
DROP TABLE array_gin_test;
--
+ -- Test GIN index's reloptions
+ --
+ CREATE INDEX gin_relopts_test ON array_index_op_test USING gin (i)
+ WITH (FASTUPDATE=on, PENDING_LIST_CLEANUP_SIZE=32);
+ \d+ gin_relopts_test
+ Index "public.gin_relopts_test"
+ Column | Type | Definition | Storage
+ --------+---------+------------+---------
+ i | integer | i | plain
+ gin, for table "public.array_index_op_test"
+ Options: fastupdate=on, pending_list_cleanup_size=32
+
+ --
-- HASH
--
CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
***************
*** 651,656 **** SELECT COUNT(*) FROM array_gin_test WHERE a @> '{2}';
--- 651,663 ----
DROP TABLE array_gin_test;
--
+ -- Test GIN index's reloptions
+ --
+ CREATE INDEX gin_relopts_test ON array_index_op_test USING gin (i)
+ WITH (FASTUPDATE=on, PENDING_LIST_CLEANUP_SIZE=32);
+ \d+ gin_relopts_test
+
+ --
-- HASH
--
CREATE INDEX hash_i4_index ON hash_i4_heap USING hash (random int4_ops);