Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Aug 11, 2015 at 2:58 AM, Jeff Janeswrote: > On Thu, Oct 30, 2014 at 5:30 AM, Fujii Masao wrote: >> >> On Thu, Oct 30, 2014 at 7:30 PM, Etsuro Fujita >> wrote: >> >> > + { >> > + {"pending_list_cleanup_size", PGC_USERSET, >> > CLIENT_CONN_STATEMENT, >> > + gettext_noop("Sets the maximum size of the >> > pending >> > list for GIN index."), >> > +NULL, >> > + GUC_UNIT_KB >> > + }, >> > + _list_cleanup_size, >> > + 4096, 0, MAX_KILOBYTES, >> > + NULL, NULL, NULL >> > + }, >> > >> > ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No? >> >> Yes if the pending list always exists in the memory. But not, IIUC. >> Thought? >> >> > Also why not set min to 64, not to 0, in accoradance with that of >> > work_mem? >> >> I'm OK to use 64. But I just chose 0 because I could not think of any >> reasonable >> reason why 64k is suitable as the minimum size of the pending list. >> IOW, I have no idea about whether it's reasonable to use the min value of >> work_mem as the min size of the pending list. > > > > I know I am late to the party here, but would like to have the minimum be 0, > not 64. As long as by zero, it means it doesn't insert anything into the > pending list, rather than inserting and promptly cleaning it up. > > The reason for this is that if I am trying to decide what > pending_list_cleanup_size I want to set for the index in the indexes storage > parameters, the way to find that out is try a bunch of different ones > through the guc while the index is still at the default of no overriding > storage parameter. It would be nice to try the fastupdate=off alternative > (i.e. the same as pending_list_cleanup_size=0) without having to change the > index itself and change the syntax used in the testing. Sounds OK to me. So we should change the minimum values of both gin_pending_list_limit GUC and storage parameters to 0? Or only GUC? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Thu, Oct 30, 2014 at 5:30 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Oct 30, 2014 at 7:30 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: + { + {pending_list_cleanup_size, PGC_USERSET, CLIENT_CONN_STATEMENT, + gettext_noop(Sets the maximum size of the pending list for GIN index.), +NULL, + GUC_UNIT_KB + }, + pending_list_cleanup_size, + 4096, 0, MAX_KILOBYTES, + NULL, NULL, NULL + }, ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No? Yes if the pending list always exists in the memory. But not, IIUC. Thought? Also why not set min to 64, not to 0, in accoradance with that of work_mem? I'm OK to use 64. But I just chose 0 because I could not think of any reasonable reason why 64k is suitable as the minimum size of the pending list. IOW, I have no idea about whether it's reasonable to use the min value of work_mem as the min size of the pending list. I know I am late to the party here, but would like to have the minimum be 0, not 64. As long as by zero, it means it doesn't insert anything into the pending list, rather than inserting and promptly cleaning it up. The reason for this is that if I am trying to decide what pending_list_cleanup_size I want to set for the index in the indexes storage parameters, the way to find that out is try a bunch of different ones through the guc while the index is still at the default of no overriding storage parameter. It would be nice to try the fastupdate=off alternative (i.e. the same as pending_list_cleanup_size=0) without having to change the index itself and change the syntax used in the testing. Cheers, Jeff
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Nov 12, 2014 at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Nov 11, 2014 at 10:14 PM, Robert Haas robertmh...@gmail.com wrote: Not to kibitz too much after-the-fact, but wouldn't it be better to give this a name that has GIN in it somewhere? Maybe. gin_pending_list_cleanup_size? gin_pending_list_limit? Better name? gin_pending_list_limit sounds good to me. OK, barring any objection, I will rename the reloption and GUC to gin_pending_list_limit. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Nov 12, 2014 at 9:30 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Nov 12, 2014 at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Nov 11, 2014 at 10:14 PM, Robert Haas robertmh...@gmail.com wrote: Not to kibitz too much after-the-fact, but wouldn't it be better to give this a name that has GIN in it somewhere? Maybe. gin_pending_list_cleanup_size? gin_pending_list_limit? Better name? gin_pending_list_limit sounds good to me. OK, barring any objection, I will rename the reloption and GUC to gin_pending_list_limit. Done. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/11/11 2:31), Fujii Masao wrote: On Mon, Nov 10, 2014 at 4:15 PM, Etsuro Fujita The patch looks good to me except for the following point: *** a/src/backend/access/gin/ginfast.c --- b/src/backend/access/gin/ginfast.c *** *** 25,30 --- 25,32 #include utils/memutils.h #include utils/rel.h + /* GUC parameter */ + int pending_list_cleanup_size = 0; I think we need to initialize the GUC to boot_val, 4096 in this case. No, IIUC basically the variable for GUC doesn't need to be initialized to its default value. OTOH, it's also harmless to initialize it to the default. I like the current code a bit because we don't need to change the initial value again when we decide to change the default value of GUC. I have no strong opinion about this, though. OK, so if there are no objections of others, I'll mark this as Ready for Committer. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Nov 11, 2014 at 7:01 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/11/11 2:31), Fujii Masao wrote: On Mon, Nov 10, 2014 at 4:15 PM, Etsuro Fujita The patch looks good to me except for the following point: *** a/src/backend/access/gin/ginfast.c --- b/src/backend/access/gin/ginfast.c *** *** 25,30 --- 25,32 #include utils/memutils.h #include utils/rel.h + /* GUC parameter */ + int pending_list_cleanup_size = 0; I think we need to initialize the GUC to boot_val, 4096 in this case. No, IIUC basically the variable for GUC doesn't need to be initialized to its default value. OTOH, it's also harmless to initialize it to the default. I like the current code a bit because we don't need to change the initial value again when we decide to change the default value of GUC. I have no strong opinion about this, though. OK, so if there are no objections of others, I'll mark this as Ready for Committer. I just pushed this. Thanks! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Nov 11, 2014 at 7:11 AM, Fujii Masao masao.fu...@gmail.com wrote: OK, so if there are no objections of others, I'll mark this as Ready for Committer. I just pushed this. Thanks! Not to kibitz too much after-the-fact, but wouldn't it be better to give this a name that has GIN in it somewhere? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Nov 11, 2014 at 10:14 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 11, 2014 at 7:11 AM, Fujii Masao masao.fu...@gmail.com wrote: OK, so if there are no objections of others, I'll mark this as Ready for Committer. I just pushed this. Thanks! Not to kibitz too much after-the-fact, but wouldn't it be better to give this a name that has GIN in it somewhere? Maybe. gin_pending_list_cleanup_size? gin_pending_list_limit? Better name? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
Fujii Masao masao.fu...@gmail.com writes: On Tue, Nov 11, 2014 at 10:14 PM, Robert Haas robertmh...@gmail.com wrote: Not to kibitz too much after-the-fact, but wouldn't it be better to give this a name that has GIN in it somewhere? Maybe. gin_pending_list_cleanup_size? gin_pending_list_limit? Better name? gin_pending_list_limit sounds good to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Mon, Nov 10, 2014 at 4:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/11/06 23:38), Fujii Masao wrote: On Tue, Nov 4, 2014 at 12:04 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: IIUC, I think that min = 0 disables fast update, so ISTM that it'd be appropriate to set min to some positive value. And ISTM that the idea of using the min value of work_mem is not so bad. OK. I changed the min value to 64kB. *** 356,361 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ --- 356,372 /listitem /varlistentry /variablelist +variablelist +varlistentry + termliteralPENDING_LIST_CLEANUP_SIZE//term The above is still in uppercse. Fixed. Attached is the updated version of the patch. Thanks for the review! Thanks for the updating the patch! The patch looks good to me except for the following point: Thanks for the review again! *** a/src/backend/access/gin/ginfast.c --- b/src/backend/access/gin/ginfast.c *** *** 25,30 --- 25,32 #include utils/memutils.h #include utils/rel.h + /* GUC parameter */ + int pending_list_cleanup_size = 0; I think we need to initialize the GUC to boot_val, 4096 in this case. No, IIUC basically the variable for GUC doesn't need to be initialized to its default value. OTOH, it's also harmless to initialize it to the default. I like the current code a bit because we don't need to change the initial value again when we decide to change the default value of GUC. I have no strong opinion about this, though. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/11/06 23:38), Fujii Masao wrote: On Tue, Nov 4, 2014 at 12:04 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: IIUC, I think that min = 0 disables fast update, so ISTM that it'd be appropriate to set min to some positive value. And ISTM that the idea of using the min value of work_mem is not so bad. OK. I changed the min value to 64kB. *** 356,361 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ --- 356,372 /listitem /varlistentry /variablelist +variablelist +varlistentry + termliteralPENDING_LIST_CLEANUP_SIZE//term The above is still in uppercse. Fixed. Attached is the updated version of the patch. Thanks for the review! Thanks for the updating the patch! The patch looks good to me except for the following point: *** a/src/backend/access/gin/ginfast.c --- b/src/backend/access/gin/ginfast.c *** *** 25,30 --- 25,32 #include utils/memutils.h #include utils/rel.h + /* GUC parameter */ + int pending_list_cleanup_size = 0; I think we need to initialize the GUC to boot_val, 4096 in this case. I asked the opinions of others about the config_group of the GUC. But there seems no opinions, so I agree with Fujii-san's idea of assigning the GUC CLIENT_CONN_STATEMENT. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Nov 4, 2014 at 12:04 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: IIUC, I think that min = 0 disables fast update, so ISTM that it'd be appropriate to set min to some positive value. And ISTM that the idea of using the min value of work_mem is not so bad. OK. I changed the min value to 64kB. *** 356,361 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ --- 356,372 /listitem /varlistentry /variablelist +variablelist +varlistentry + termliteralPENDING_LIST_CLEANUP_SIZE//term The above is still in uppercse. Fixed. Attached is the updated version of the patch. Thanks for the review! Regards, -- Fujii Masao *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** *** 5911,5916 SET XML OPTION { DOCUMENT | CONTENT }; --- 5911,5937 /listitem /varlistentry + varlistentry id=guc-pending-list-cleanup-size xreflabel=pending_list_cleanup_size + termvarnamepending_list_cleanup_size/varname (typeinteger/type) + indexterm +primaryvarnamepending_list_cleanup_size/ configuration parameter/primary + /indexterm + /term + listitem +para + Sets the maximum size of the GIN pending list which is used + when literalfastupdate/ is enabled. If the list grows + larger than this maximum size, it is cleaned up by moving + the entries in it to the main GIN data structure in bulk. + The default is four megabytes (literal4MB/). This setting + can be overridden for individual GIN indexes by changing + storage parameters. + See xref linkend=gin-fast-update and xref linkend=gin-tips + for more information. +/para + /listitem + /varlistentry + /variablelist /sect2 sect2 id=runtime-config-client-format *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *** *** 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes too large !(larger than xref linkend=guc-work-mem), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional --- 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes larger than !xref linkend=guc-pending-list-cleanup-size, the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional *** *** 750,756 para If consistent response time is more important than update speed, use of pending entries can be disabled by turning off the !literalFASTUPDATE/literal storage parameter for a acronymGIN/acronym index. See xref linkend=sql-createindex for details. /para --- 750,756 para If consistent response time is more important than update speed, use of pending entries can be disabled by turning off the !literalfastupdate/literal storage parameter for a acronymGIN/acronym index. See xref linkend=sql-createindex for details. /para *** *** 812,829 /varlistentry varlistentry !termxref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym ! index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than ! varnamework_mem/. To avoid fluctuations in observed response time, ! it's desirable to have pending-list cleanup occur in the background ! (i.e., via autovacuum). Foreground cleanup operations can be avoided by ! increasing varnamework_mem/ or making autovacuum more aggressive. ! However, enlarging varnamework_mem/ means that if a foreground ! cleanup does occur, it will take even longer. /para /listitem /varlistentry --- 812,837 /varlistentry varlistentry !termxref linkend=guc-pending-list-cleanup-size/term listitem para During a series of insertions into an existing acronymGIN/acronym ! index that has literalfastupdate/ enabled, the system will clean up the pending-entry list whenever the list grows larger than !
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/10/30 21:30), Fujii Masao wrote: On Thu, Oct 30, 2014 at 7:30 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Here are my review comments. * The patch applies cleanly and make and make check run successfully. I think that the patch is mostly good. Thanks! Attached is the updated version of the patch. Thank you for updating the patch! * In src/backend/utils/misc/guc.c: + { + {pending_list_cleanup_size, PGC_USERSET, CLIENT_CONN_STATEMENT, + gettext_noop(Sets the maximum size of the pending list for GIN index.), +NULL, + GUC_UNIT_KB + }, + pending_list_cleanup_size, + 4096, 0, MAX_KILOBYTES, + NULL, NULL, NULL + }, ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No? Yes if the pending list always exists in the memory. But not, IIUC. Thought? Exactly. But I think we can expect that in many cases, since I think that the users would often set the GUC to a small value to the extent that most of the pending list pages would be cached by shared buffer, to maintain *search* performance. I'd like to hear the opinions of others about the category for the GUC. Also why not set min to 64, not to 0, in accoradance with that of work_mem? I'm OK to use 64. But I just chose 0 because I could not think of any reasonable reason why 64k is suitable as the minimum size of the pending list. IOW, I have no idea about whether it's reasonable to use the min value of work_mem as the min size of the pending list. IIUC, I think that min = 0 disables fast update, so ISTM that it'd be appropriate to set min to some positive value. And ISTM that the idea of using the min value of work_mem is not so bad. * In doc/src/sgml/ref/create_index.sgml: + termliteralPENDING_LIST_CLEANUP_SIZE//term IMHO, it seems to me better for this variable to be in lowercase in accordance with the GUC version. Using lowercase only for pending_list_cleanup_size and uppercase for other options looks strange to me. What about using lowercase for all the storage options? +1 I changed the document in that way. *** 356,361 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ --- 356,372 /listitem /varlistentry /variablelist +variablelist +varlistentry + termliteralPENDING_LIST_CLEANUP_SIZE//term The above is still in uppercse. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/10/09 11:49), Etsuro Fujita wrote: (2014/10/08 22:51), Fujii Masao wrote: On Wed, Sep 24, 2014 at 11:10 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. OK, I'd vote for your idea of having both the GUC and the reloption. So, I think the patch needs to be updated. Fujii-san, what plan do you have about the patch? Please see the attached patch. In this patch, I introduced the GUC parameter, pending_list_cleanup_size. I chose 4MB as the default value of the parameter. But do you have any better idea about that default value? It seems reasonable to me that the GUC has the same default value as work_mem. So, +1 for the default value of 4MB. BTW, I moved the CommitFest entry of this patch to next CF 2014-10. OK, I'll review the patch in the CF. Thank you for updating the patch! Here are my review comments. * The patch applies cleanly and make and make check run successfully. I think that the patch is mostly good. * In src/backend/utils/misc/guc.c: + { + {pending_list_cleanup_size, PGC_USERSET, CLIENT_CONN_STATEMENT, + gettext_noop(Sets the maximum size of the pending list for GIN index.), +NULL, + GUC_UNIT_KB + }, + pending_list_cleanup_size, + 4096, 0, MAX_KILOBYTES, + NULL, NULL, NULL + }, ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No? Also why not set min to 64, not to 0, in accoradance with that of work_mem? * In src/backend/utils/misc/postgresql.conf.sample: Likewise, why not put this variable in the section of RESOURCE USAGE, not in that of CLIENT CONNECTION DEFAULTS. * In src/backend/access/common/reloptions.c: + { + { + pending_list_cleanup_size, + Maximum size of the pending list for this GIN index, in kilobytes., + RELOPT_KIND_GIN + }, + -1, 0, MAX_KILOBYTES + }, As in guc.c, why not set min to 64, not to 0? * In src/include/access/gin.h: /* GUC parameter */ extern PGDLLIMPORT int GinFuzzySearchLimit; + extern int pending_list_cleanup_size; The comment should be GUC parameters. * In src/backend/access/gin/ginfast.c: + /* GUC parameter */ + int pending_list_cleanup_size = 0; Do we need to substitute 0 for pending_list_cleanup_size? * In doc/src/sgml/config.sgml: + varlistentry id=guc-pending-list-cleanup-size xreflabel=pending_list_cleanup_size + termvarnamepending_list_cleanup_size/varname (typeinteger/type) As in postgresql.conf.sample, ISTM it'd be better to explain this variable in the section of Resource Consumption (maybe in Memory), not in that of Client Connection Defaults. * In doc/src/sgml/gin.sgml: ! literalpending_list_cleanup_size/. To avoid fluctuations in observed ISTM it'd be better to use varname for pending_list_cleanup_size, not literal, here. * In doc/src/sgml/ref/create_index.sgml: + termliteralPENDING_LIST_CLEANUP_SIZE//term IMHO, it seems to me better for this variable to be in lowercase in accordance with the GUC version. Also, I think that the words GIN indexes accept a different parameter: in the section of Index Storage Parameters in this reference page would be GIN indexes accept different parameters:. Sorry for the delay in reviewing the patch. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Thu, Oct 30, 2014 at 7:30 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/10/09 11:49), Etsuro Fujita wrote: (2014/10/08 22:51), Fujii Masao wrote: On Wed, Sep 24, 2014 at 11:10 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. OK, I'd vote for your idea of having both the GUC and the reloption. So, I think the patch needs to be updated. Fujii-san, what plan do you have about the patch? Please see the attached patch. In this patch, I introduced the GUC parameter, pending_list_cleanup_size. I chose 4MB as the default value of the parameter. But do you have any better idea about that default value? It seems reasonable to me that the GUC has the same default value as work_mem. So, +1 for the default value of 4MB. BTW, I moved the CommitFest entry of this patch to next CF 2014-10. OK, I'll review the patch in the CF. Thank you for updating the patch! Here are my review comments. * The patch applies cleanly and make and make check run successfully. I think that the patch is mostly good. Thanks! Attached is the updated version of the patch. * In src/backend/utils/misc/guc.c: + { + {pending_list_cleanup_size, PGC_USERSET, CLIENT_CONN_STATEMENT, + gettext_noop(Sets the maximum size of the pending list for GIN index.), +NULL, + GUC_UNIT_KB + }, + pending_list_cleanup_size, + 4096, 0, MAX_KILOBYTES, + NULL, NULL, NULL + }, ISTM it'd be better to use RESOURCES_MEM, not CLIENT_CONN_STATEMENT. No? Yes if the pending list always exists in the memory. But not, IIUC. Thought? Also why not set min to 64, not to 0, in accoradance with that of work_mem? I'm OK to use 64. But I just chose 0 because I could not think of any reasonable reason why 64k is suitable as the minimum size of the pending list. IOW, I have no idea about whether it's reasonable to use the min value of work_mem as the min size of the pending list. * In src/backend/utils/misc/postgresql.conf.sample: Likewise, why not put this variable in the section of RESOURCE USAGE, not in that of CLIENT CONNECTION DEFAULTS. Same as above. * In src/backend/access/common/reloptions.c: + { + { + pending_list_cleanup_size, + Maximum size of the pending list for this GIN index, in kilobytes., + RELOPT_KIND_GIN + }, + -1, 0, MAX_KILOBYTES + }, As in guc.c, why not set min to 64, not to 0? Same as above. * In src/include/access/gin.h: /* GUC parameter */ extern PGDLLIMPORT int GinFuzzySearchLimit; + extern int pending_list_cleanup_size; The comment should be GUC parameters. Yes, fixed. * In src/backend/access/gin/ginfast.c: + /* GUC parameter */ + int pending_list_cleanup_size = 0; Do we need to substitute 0 for pending_list_cleanup_size? Same as above. * In doc/src/sgml/config.sgml: + varlistentry id=guc-pending-list-cleanup-size xreflabel=pending_list_cleanup_size + termvarnamepending_list_cleanup_size/varname (typeinteger/type) As in postgresql.conf.sample, ISTM it'd be better to explain this variable in the section of Resource Consumption (maybe in Memory), not in that of Client Connection Defaults. Same as above. * In doc/src/sgml/gin.sgml: ! literalpending_list_cleanup_size/. To avoid fluctuations in observed ISTM it'd be better to use varname for pending_list_cleanup_size, not literal, here. Yes, fixed. * In doc/src/sgml/ref/create_index.sgml: + termliteralPENDING_LIST_CLEANUP_SIZE//term IMHO, it seems to me better for this variable to be in lowercase in accordance with the GUC version. Using lowercase only for pending_list_cleanup_size and uppercase for other options looks strange to me. What about using lowercase for all the storage options? I changed the document in that way. Also, I think that the words GIN indexes accept a different parameter: in the section of Index Storage Parameters in this reference page would be GIN indexes accept different parameters:. Yes,
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Sep 24, 2014 at 11:10 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/09/13 2:42), Fujii Masao wrote: On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. Agreed. I'm not sure about the idea of being able to change it per session, though. Do you mean that you would like insert processes use a very large value so that they can just append new values to the pending list, and have vacuum use a small value so that it cleans up as soon as it runs? Two things: 1. we could have an autovacuum_ reloption which only changes what autovacuum does; 2. we could have autovacuum run index cleanup actions separately from actual vacuuming. Yes, I was thinking something like that. But if autovacuum has already been able to handle that, it's nice. Anyway, as you pointed out, it's better to have both GUC and reloption for the cleanup size of pending list. OK, I'd vote for your idea of having both the GUC and the reloption. So, I think the patch needs to be updated. Fujii-san, what plan do you have about the patch? Please see the attached patch. In this patch, I introduced the GUC parameter, pending_list_cleanup_size. I chose 4MB as the default value of the parameter. But do you have any better idea about that default value? BTW, I moved the CommitFest entry of this patch to next CF 2014-10. Regards, -- Fujii Masao *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** *** 5907,5912 SET XML OPTION { DOCUMENT | CONTENT }; --- 5907,5933 /listitem /varlistentry + varlistentry id=guc-pending-list-cleanup-size xreflabel=pending_list_cleanup_size + termvarnamepending_list_cleanup_size/varname (typeinteger/type) + indexterm +primaryvarnamepending_list_cleanup_size/ configuration parameter/primary + /indexterm + /term + listitem +para + Sets the maximum size of the GIN pending list which is used + when literalFASTUPDATE/ is enabled. If the list grows + larger than this maximum size, it is cleaned up by moving + the entries in it to the main GIN data structure in bulk. + The default is four megabytes (literal4MB/). This setting + can be overridden for individual GIN indexes by changing + storage parameters. + See xref linkend=gin-fast-update and xref linkend=gin-tips + for more information. +/para + /listitem + /varlistentry + /variablelist /sect2 sect2 id=runtime-config-client-format *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *** *** 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes too large !(larger than xref linkend=guc-work-mem), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional --- 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes larger than !xref linkend=guc-pending-list-cleanup-size, the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional *** *** 812,829 /varlistentry varlistentry !termxref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/10/08 22:51), Fujii Masao wrote: On Wed, Sep 24, 2014 at 11:10 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. OK, I'd vote for your idea of having both the GUC and the reloption. So, I think the patch needs to be updated. Fujii-san, what plan do you have about the patch? Please see the attached patch. In this patch, I introduced the GUC parameter, pending_list_cleanup_size. I chose 4MB as the default value of the parameter. But do you have any better idea about that default value? It seems reasonable to me that the GUC has the same default value as work_mem. So, +1 for the default value of 4MB. BTW, I moved the CommitFest entry of this patch to next CF 2014-10. OK, I'll review the patch in the CF. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/09/13 2:42), Fujii Masao wrote: On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. Agreed. I'm not sure about the idea of being able to change it per session, though. Do you mean that you would like insert processes use a very large value so that they can just append new values to the pending list, and have vacuum use a small value so that it cleans up as soon as it runs? Two things: 1. we could have an autovacuum_ reloption which only changes what autovacuum does; 2. we could have autovacuum run index cleanup actions separately from actual vacuuming. Yes, I was thinking something like that. But if autovacuum has already been able to handle that, it's nice. Anyway, as you pointed out, it's better to have both GUC and reloption for the cleanup size of pending list. OK, I'd vote for your idea of having both the GUC and the reloption. So, I think the patch needs to be updated. Fujii-san, what plan do you have about the patch? Sorry for the delay. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Sep 10, 2014 at 10:37 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. Agreed. I'm not sure about the idea of being able to change it per session, though. Do you mean that you would like insert processes use a very large value so that they can just append new values to the pending list, and have vacuum use a small value so that it cleans up as soon as it runs? Two things: 1. we could have an autovacuum_ reloption which only changes what autovacuum does; 2. we could have autovacuum run index cleanup actions separately from actual vacuuming. Yes, I was thinking something like that. But if autovacuum has already been able to handle that, it's nice. Anyway, as you pointed out, it's better to have both GUC and reloption for the cleanup size of pending list. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/09/10 12:31), Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/09/09 22:17), Fujii Masao wrote: Attached is the updated version of the patch. I took a quick review on the patch. It looks good to me, but one thing I'm concerned about is You wrote: The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. As you mentioned, I think it's important to consider for the existing applications, but I'm wondering if it would be a bit confusing users to have two parameters, Yep. PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... Why not use ALTER INDEX ... SET (PENDING_LIST_CLEANUP_SIZE= ...)? Maybe I'm missing something, though. So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yeah, that's an idea. So, I'd like to hear the opinions of others. Thanks, Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Sep 10, 2014 at 5:35 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/09/10 12:31), Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/09/09 22:17), Fujii Masao wrote: Attached is the updated version of the patch. I took a quick review on the patch. It looks good to me, but one thing I'm concerned about is You wrote: The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. As you mentioned, I think it's important to consider for the existing applications, but I'm wondering if it would be a bit confusing users to have two parameters, Yep. PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... Why not use ALTER INDEX ... SET (PENDING_LIST_CLEANUP_SIZE= ...)? Maybe I'm missing something, though. It takes AccessExclusive lock and has an effect on every sessions (not only specified session). Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
Fujii Masao wrote: On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Yes, I think having both a GUC and a reloption makes sense -- the GUC applies to all indexes, and can be tweaked for individual indexes using the reloption. I'm not sure about the idea of being able to change it per session, though. Do you mean that you would like insert processes use a very large value so that they can just append new values to the pending list, and have vacuum use a small value so that it cleans up as soon as it runs? Two things: 1. we could have an autovacuum_ reloption which only changes what autovacuum does; 2. we could have autovacuum run index cleanup actions separately from actual vacuuming. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Tue, Sep 9, 2014 at 1:28 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Sun, Aug 17, 2014 at 7:46 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks for reviewing the patch! ISTM that I failed to make the patch from my git repository... Attached is the rebased version. I get some compiler warnings on v2 of this patch: reloptions.c:219: warning: excess elements in struct initializer reloptions.c:219: warning: (near initialization for 'intRelOpts[15]') Thanks for testing the patch! Attached is the updated version of the patch. Previously the patch depended on another infrastructure patch (which allows a user to specify the unit in reloption (*1)). But that infrastructure patch has serious problem and it's not easy to fix the problem. So I changed the patch so that it doesn't depend on that infrastructure patch at all. Even without the infrastructure patch, the feature that this patch introduces is useful. Also I added the regression test into the patch. (*1) http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com Regards, -- Fujii Masao *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *** *** 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes too large !(larger than xref linkend=guc-work-mem), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional --- 728,736 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes larger than !literalPENDING_LIST_CLEANUP_SIZE/literal (or !xref linkend=guc-work-mem if not set), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional *** *** 812,829 /varlistentry varlistentry !termxref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than ! varnamework_mem/. To avoid fluctuations in observed response time, ! it's desirable to have pending-list cleanup occur in the background ! (i.e., via autovacuum). Foreground cleanup operations can be avoided by ! increasing varnamework_mem/ or making autovacuum more aggressive. ! However, enlarging varnamework_mem/ means that if a foreground ! cleanup does occur, it will take even longer. /para /listitem /varlistentry --- 813,839 /varlistentry varlistentry !termliteralPENDING_LIST_CLEANUP_SIZE/ and !xref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than ! literalPENDING_LIST_CLEANUP_SIZE/ (if not set, varnamework_mem/ ! is used as that threshold, instead). To avoid fluctuations in observed ! response time, it's desirable to have pending-list cleanup occur in the ! background (i.e., via autovacuum). Foreground cleanup operations ! can be avoided by increasing literalPENDING_LIST_CLEANUP_SIZE/ ! (or varnamework_mem/) or making autovacuum more aggressive. ! However, enlarging the threshold of the cleanup operation means that ! if a foreground cleanup does occur, it will take even longer. ! /para ! para ! literalPENDING_LIST_CLEANUP_SIZE/ is an index storage parameter, ! and allows each GIN index to have its own cleanup threshold. ! For example, it's possible to increase the threshold only for the GIN ! index which can be updated heavily, and decrease it otherwise. /para /listitem /varlistentry *** a/doc/src/sgml/ref/create_index.sgml --- b/doc/src/sgml/ref/create_index.sgml *** *** 356,361 CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ --- 356,377 /listitem /varlistentry /variablelist +variablelist +varlistentry + termliteralPENDING_LIST_CLEANUP_SIZE//term + listitem + para + This setting specifies the maximum size
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
(2014/09/09 22:17), Fujii Masao wrote: On Tue, Sep 9, 2014 at 1:28 AM, Jeff Janes jeff.ja...@gmail.com wrote: I get some compiler warnings on v2 of this patch: reloptions.c:219: warning: excess elements in struct initializer reloptions.c:219: warning: (near initialization for 'intRelOpts[15]') Attached is the updated version of the patch. Thank you for updating the patch! I took a quick review on the patch. It looks good to me, but one thing I'm concerned about is You wrote: The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. As you mentioned, I think it's important to consider for the existing applications, but I'm wondering if it would be a bit confusing users to have two parameters, PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? Sorry for the delay. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Wed, Sep 10, 2014 at 12:15 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: (2014/09/09 22:17), Fujii Masao wrote: On Tue, Sep 9, 2014 at 1:28 AM, Jeff Janes jeff.ja...@gmail.com wrote: I get some compiler warnings on v2 of this patch: reloptions.c:219: warning: excess elements in struct initializer reloptions.c:219: warning: (near initialization for 'intRelOpts[15]') Attached is the updated version of the patch. Thank you for updating the patch! I took a quick review on the patch. It looks good to me, Thanks for reviewing the patch! but one thing I'm concerned about is You wrote: The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. As you mentioned, I think it's important to consider for the existing applications, but I'm wondering if it would be a bit confusing users to have two parameters, Yep. PENDING_LIST_CLEANUP_SIZE and work_mem, for this setting. Wouldn't it be easy-to-use to have only one parameter, PENDING_LIST_CLEANUP_SIZE? How about setting PENDING_LIST_CLEANUP_SIZE to work_mem as the default value when running the CREATE INDEX command? That's an idea. But there might be some users who want to change the cleanup size per session like they can do by setting work_mem, and your idea would prevent them from doing that... So what about introduing pending_list_cleanup_size also as GUC? That is, users can set the threshold by using either the reloption or GUC. Sorry for the delay. No problem. Thanks! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Sun, Aug 17, 2014 at 7:46 PM, Fujii Masao masao.fu...@gmail.com wrote: Thanks for reviewing the patch! ISTM that I failed to make the patch from my git repository... Attached is the rebased version. I get some compiler warnings on v2 of this patch: reloptions.c:219: warning: excess elements in struct initializer reloptions.c:219: warning: (near initialization for 'intRelOpts[15]') Cheers, Jeff
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Sat, Aug 16, 2014 at 4:23 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:45 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:43 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Apr 14, 2014 at 11:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Apr 1, 2014 at 1:41 AM, Robert Haas robertmh...@gmail.com wrote: Should we try to install some hack around fastupdate for 9.4? I fear the divergence between reasonable values of work_mem and reasonable sizes for that list is only going to continue to get bigger. I'm sure there's somebody out there who has work_mem = 16GB, and stuff like 263865a48973767ce8ed7b7788059a38a24a9f37 is only going to increase the appeal of large values. Controlling the threshold of the size of pending list only by GUC doesn't seem reasonable. Users may want to increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. So I think that it's better to add new storage parameter for GIN index to control the threshold, or both storage parameter and GUC. Yeah, -1 for a GUC. A GIN-index-specific storage parameter seems more appropriate. The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. This is an index storage parameter, and allows us to specify each threshold per GIN index. So, for example, we can increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. This patch uses another patch that I proposed (*1) as an infrastructure. Please apply that infrastructure patch first if you apply this patch. (*1) http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com Regards, -- Fujii Masao Sorry, I forgot to attached the patch This time, attached. I think that this patch should be rebased. It contains garbage code. Thanks for reviewing the patch! ISTM that I failed to make the patch from my git repository... Attached is the rebased version. Regards, -- Fujii Masao diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml index 80a578d..24c8dc1 100644 --- a/doc/src/sgml/gin.sgml +++ b/doc/src/sgml/gin.sgml @@ -728,8 +728,9 @@ from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. - When the table is vacuumed, or if the pending list becomes too large - (larger than xref linkend=guc-work-mem), the entries are moved to the + When the table is vacuumed, or if the pending list becomes larger than + literalPENDING_LIST_CLEANUP_SIZE/literal (or + xref linkend=guc-work-mem if not set), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional @@ -812,18 +813,27 @@ /varlistentry varlistentry - termxref linkend=guc-work-mem/term + termliteralPENDING_LIST_CLEANUP_SIZE/ and + xref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than - varnamework_mem/. To avoid fluctuations in observed response time, - it's desirable to have pending-list cleanup occur in the background - (i.e., via autovacuum). Foreground cleanup operations can be avoided by - increasing varnamework_mem/ or making autovacuum more aggressive. - However, enlarging varnamework_mem/ means that if a foreground - cleanup does occur, it will take even longer. + literalPENDING_LIST_CLEANUP_SIZE/ (if not set, varnamework_mem/ + is used as that threshold, instead). To avoid fluctuations in observed + response time, it's desirable to have pending-list cleanup occur in the + background (i.e., via autovacuum). Foreground cleanup operations + can be avoided by increasing literalPENDING_LIST_CLEANUP_SIZE/ + (or varnamework_mem/) or making autovacuum more aggressive. + However, enlarging the threshold of the cleanup operation means that + if a foreground cleanup does occur, it will take even longer. +/para +para + literalPENDING_LIST_CLEANUP_SIZE/ is an index storage parameter, + and allows each GIN index to have its own cleanup threshold. + For example, it's possible to increase the
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Fri, Aug 8, 2014 at 11:45 PM, Fujii Masao masao.fu...@gmail.com wrote: On Fri, Aug 8, 2014 at 11:43 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Apr 14, 2014 at 11:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Apr 1, 2014 at 1:41 AM, Robert Haas robertmh...@gmail.com wrote: Should we try to install some hack around fastupdate for 9.4? I fear the divergence between reasonable values of work_mem and reasonable sizes for that list is only going to continue to get bigger. I'm sure there's somebody out there who has work_mem = 16GB, and stuff like 263865a48973767ce8ed7b7788059a38a24a9f37 is only going to increase the appeal of large values. Controlling the threshold of the size of pending list only by GUC doesn't seem reasonable. Users may want to increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. So I think that it's better to add new storage parameter for GIN index to control the threshold, or both storage parameter and GUC. Yeah, -1 for a GUC. A GIN-index-specific storage parameter seems more appropriate. The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. This is an index storage parameter, and allows us to specify each threshold per GIN index. So, for example, we can increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. This patch uses another patch that I proposed (*1) as an infrastructure. Please apply that infrastructure patch first if you apply this patch. (*1) http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com Regards, -- Fujii Masao Sorry, I forgot to attached the patch This time, attached. I think that this patch should be rebased. It contains garbage code. Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Mon, Apr 14, 2014 at 11:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Apr 1, 2014 at 1:41 AM, Robert Haas robertmh...@gmail.com wrote: Should we try to install some hack around fastupdate for 9.4? I fear the divergence between reasonable values of work_mem and reasonable sizes for that list is only going to continue to get bigger. I'm sure there's somebody out there who has work_mem = 16GB, and stuff like 263865a48973767ce8ed7b7788059a38a24a9f37 is only going to increase the appeal of large values. Controlling the threshold of the size of pending list only by GUC doesn't seem reasonable. Users may want to increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. So I think that it's better to add new storage parameter for GIN index to control the threshold, or both storage parameter and GUC. Yeah, -1 for a GUC. A GIN-index-specific storage parameter seems more appropriate. The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. This is an index storage parameter, and allows us to specify each threshold per GIN index. So, for example, we can increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. This patch uses another patch that I proposed (*1) as an infrastructure. Please apply that infrastructure patch first if you apply this patch. (*1) http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Fri, Aug 8, 2014 at 11:43 PM, Fujii Masao masao.fu...@gmail.com wrote: On Mon, Apr 14, 2014 at 11:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Fujii Masao masao.fu...@gmail.com writes: On Tue, Apr 1, 2014 at 1:41 AM, Robert Haas robertmh...@gmail.com wrote: Should we try to install some hack around fastupdate for 9.4? I fear the divergence between reasonable values of work_mem and reasonable sizes for that list is only going to continue to get bigger. I'm sure there's somebody out there who has work_mem = 16GB, and stuff like 263865a48973767ce8ed7b7788059a38a24a9f37 is only going to increase the appeal of large values. Controlling the threshold of the size of pending list only by GUC doesn't seem reasonable. Users may want to increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. So I think that it's better to add new storage parameter for GIN index to control the threshold, or both storage parameter and GUC. Yeah, -1 for a GUC. A GIN-index-specific storage parameter seems more appropriate. The attached patch introduces the GIN index storage parameter PENDING_LIST_CLEANUP_SIZE which specifies the maximum size of GIN pending list. If it's not set, work_mem is used as that maximum size, instead. So this patch doesn't break the existing application which currently uses work_mem as the threshold of cleanup operation of the pending list. It has only not to set PENDING_LIST_CLEANUP_SIZE. This is an index storage parameter, and allows us to specify each threshold per GIN index. So, for example, we can increase the threshold only for the GIN index which can be updated heavily, and decrease it otherwise. This patch uses another patch that I proposed (*1) as an infrastructure. Please apply that infrastructure patch first if you apply this patch. (*1) http://www.postgresql.org/message-id/CAHGQGwEanQ_e8WLHL25=bm_8z5zkyzw0k0yir+kdmv2hgne...@mail.gmail.com Regards, -- Fujii Masao Sorry, I forgot to attached the patch This time, attached. Regards, -- Fujii Masao *** a/doc/src/sgml/gin.sgml --- b/doc/src/sgml/gin.sgml *** *** 728,735 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes too large !(larger than xref linkend=guc-work-mem), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional --- 728,736 from the indexed item). As of productnamePostgreSQL/productname 8.4, acronymGIN/ is capable of postponing much of this work by inserting new tuples into a temporary, unsorted list of pending entries. !When the table is vacuumed, or if the pending list becomes larger than !literalPENDING_LIST_CLEANUP_SIZE/literal (or !xref linkend=guc-work-mem if not set), the entries are moved to the main acronymGIN/acronym data structure using the same bulk insert techniques used during initial index creation. This greatly improves acronymGIN/acronym index update speed, even counting the additional *** *** 812,829 /varlistentry varlistentry !termxref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than ! varnamework_mem/. To avoid fluctuations in observed response time, ! it's desirable to have pending-list cleanup occur in the background ! (i.e., via autovacuum). Foreground cleanup operations can be avoided by ! increasing varnamework_mem/ or making autovacuum more aggressive. ! However, enlarging varnamework_mem/ means that if a foreground ! cleanup does occur, it will take even longer. /para /listitem /varlistentry --- 813,839 /varlistentry varlistentry !termliteralPENDING_LIST_CLEANUP_SIZE/ and !xref linkend=guc-work-mem/term listitem para During a series of insertions into an existing acronymGIN/acronym index that has literalFASTUPDATE/ enabled, the system will clean up the pending-entry list whenever the list grows larger than ! literalPENDING_LIST_CLEANUP_SIZE/ (if not set, varnamework_mem/ ! is used as that threshold, instead). To avoid fluctuations in observed ! response time, it's desirable to have pending-list cleanup occur in the ! background (i.e., via autovacuum). Foreground cleanup operations ! can be avoided by increasing literalPENDING_LIST_CLEANUP_SIZE/ ! (or
Re: PENDING_LIST_CLEANUP_SIZE - maximum size of GIN pending list Re: [HACKERS] HEAD seems to generate larger WAL regarding GIN index
On Fri, Aug 8, 2014 at 11:43 PM, Fujii Masao masao.fu...@gmail.com wrote: The attached patch introduces... A patch perhaps? :) -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers