Re: doc review for parallel vacuum
On Tue, Apr 14, 2020 at 2:54 AM Justin Pryzby wrote: > > Looks good. One more change: > > [-Even if-]{+If+} this option is specified with the ANALYZE > [-option-]{+option,+} > > Remove "even" and add comma. > Pushed after making this change. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: doc review for parallel vacuum
On Mon, Apr 13, 2020 at 03:22:06PM +0530, Amit Kapila wrote: > On Mon, Apr 13, 2020 at 2:00 PM Justin Pryzby wrote: > > > > |Copy the index > > |bulk-deletion result returned from ambulkdelete and amvacuumcleanup to > > |the DSM segment if it's the first time [???] because they allocate locally > > |and it's possible that an index will be vacuumed by a different > > |vacuum process the next time." > > > > Is it correct to say: "..if it's the first iteration" and "different > > process on > > the next iteration" ? Or "cycle" ? > > > > "cycle" sounds better. I have changed the patch as per your latest > comments. Let me know what you think? Looks good. One more change: [-Even if-]{+If+} this option is specified with the ANALYZE [-option-]{+option,+} Remove "even" and add comma. Thanks, -- Justin
Re: doc review for parallel vacuum
On Mon, Apr 13, 2020 at 2:00 PM Justin Pryzby wrote: > > |Copy the index > |bulk-deletion result returned from ambulkdelete and amvacuumcleanup to > |the DSM segment if it's the first time [???] because they allocate locally > |and it's possible that an index will be vacuumed by a different > |vacuum process the next time." > > Is it correct to say: "..if it's the first iteration" and "different process > on > the next iteration" ? Or "cycle" ? > "cycle" sounds better. I have changed the patch as per your latest comments. Let me know what you think? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com v5-0001-Comments-and-doc-fixes-for-commit-40d964ec99.patch Description: Binary data
Re: doc review for parallel vacuum
On Mon, Apr 13, 2020 at 10:44:42AM +0530, Amit Kapila wrote: > On Fri, Apr 10, 2020 at 7:16 PM Justin Pryzby wrote: > > > > Also, this part still doesn't read well: > > > > -* amvacuumcleanup to the DSM segment if it's the first time to get > > it? > > -* from them? because they? allocate it locally and it's possible > > that an > > -* index will be vacuumed by the different vacuum process at the > > next > > > > If you change "it" and "them" and "it" and say "*a* different", then it'll > > be > > ok. > > > > I am not sure if I follow how exactly you want to change it but still > let me know what you think about if we change it like: "Copy the index > bulk-deletion result returned from ambulkdelete and amvacuumcleanup to > the DSM segment if it's the first time because they allocate locally > and it's possible that an index will be vacuumed by the different > vacuum process at the next time." I changed "the" to "a" and removed "at": |Copy the index |bulk-deletion result returned from ambulkdelete and amvacuumcleanup to |the DSM segment if it's the first time [???] because they allocate locally |and it's possible that an index will be vacuumed by a different |vacuum process the next time." Is it correct to say: "..if it's the first iteration" and "different process on the next iteration" ? Or "cycle" ? -- Justin
Re: doc review for parallel vacuum
On Fri, Apr 10, 2020 at 7:16 PM Justin Pryzby wrote: > > Also, this part still doesn't read well: > > -* amvacuumcleanup to the DSM segment if it's the first time to get > it? > -* from them? because they? allocate it locally and it's possible > that an > -* index will be vacuumed by the different vacuum process at the next > > If you change "it" and "them" and "it" and say "*a* different", then it'll be > ok. > I am not sure if I follow how exactly you want to change it but still let me know what you think about if we change it like: "Copy the index bulk-deletion result returned from ambulkdelete and amvacuumcleanup to the DSM segment if it's the first time because they allocate locally and it's possible that an index will be vacuumed by the different vacuum process at the next time." -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: doc review for parallel vacuum
On Fri, Apr 10, 2020 at 12:56:08PM +0530, Amit Kapila wrote: > On Wed, Apr 8, 2020 at 12:49 PM Masahiko Sawada > wrote: > > > > On Tue, 7 Apr 2020 at 13:55, Justin Pryzby wrote: > > > > > > > I don't have comments on your change other than the comments Amit > > already sent. Thank you for reviewing this part! > > > > I have made the modifications as per my comments. What do you think > about the attached? Couple more changes (in bold): - The PARALLEL option is used only for vacuum PURPOSES. - Even if this option is specified with THE ANALYZE option Also, this part still doesn't read well: -* amvacuumcleanup to the DSM segment if it's the first time to get it? -* from them? because they? allocate it locally and it's possible that an -* index will be vacuumed by the different vacuum process at the next If you change "it" and "them" and "it" and say "*a* different", then it'll be ok. -- Justin
Re: doc review for parallel vacuum
On Fri, 10 Apr 2020 at 16:26, Amit Kapila wrote: > > On Wed, Apr 8, 2020 at 12:49 PM Masahiko Sawada > wrote: > > > > On Tue, 7 Apr 2020 at 13:55, Justin Pryzby wrote: > > > > > > > I don't have comments on your change other than the comments Amit > > already sent. Thank you for reviewing this part! > > > > I have made the modifications as per my comments. What do you think > about the attached? Thank you for updating the patch! Looks good to me. Regards, -- Masahiko Sawadahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: doc review for parallel vacuum
On Wed, Apr 8, 2020 at 12:49 PM Masahiko Sawada wrote: > > On Tue, 7 Apr 2020 at 13:55, Justin Pryzby wrote: > > > > I don't have comments on your change other than the comments Amit > already sent. Thank you for reviewing this part! > I have made the modifications as per my comments. What do you think about the attached? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com v4-0001-Comments-and-doc-fixes-for-commit-40d964ec99.patch Description: Binary data
Re: doc review for parallel vacuum
On Tue, 7 Apr 2020 at 13:55, Justin Pryzby wrote: > > On Tue, Apr 07, 2020 at 09:57:46AM +0530, Amit Kapila wrote: > > On Mon, Mar 23, 2020 at 10:34 AM Amit Kapila > > wrote: > > > On Sun, Mar 22, 2020 at 7:48 AM Justin Pryzby > > > wrote: > > > > > > > > Original, long thread > > > > https://www.postgresql.org/message-id/flat/CAA4eK1%2Bnw1FBK3_sDnW%2B7kB%2Bx4qbDJqetgqwYW8k2xv82RZ%2BKw%40mail.gmail.com#b1745ee853b137043e584b500b41300f > > > > > > > > > > I have comments/questions on the patches: > > > doc changes > > > - > > > 1. > > > > > > - Perform vacuum index and cleanup index phases of > > > VACUUM > > > + Perform vacuum index and index cleanup phases of > > > VACUUM > > > > > > Why is the proposed text improvement over what is already there? > > > > I have kept the existing text as it is for this case. > > Probably it should say "index vacuum and index cleanup", which is also what > the > comment in vacuumlazy.c says. > > > > 2. > > > If the > > > - PARALLEL option is omitted, then > > > - VACUUM decides the number of workers based on > > > number > > > - of indexes that support parallel vacuum operation on the relation > > > which > > > - is further limited by > > linkend="guc-max-parallel-workers-maintenance"/>. > > > - The index can participate in a parallel vacuum if and only if the > > > size > > > + PARALLEL option is omitted, then the number of > > > workers > > > + is determined based on the number of indexes that support parallel > > > vacuum > > > + operation on the relation, and is further limited by > > + linkend="guc-max-parallel-workers-maintenance"/>. > > > + An index can participate in parallel vacuum if and only if the size > > >of the index is more than > > linkend="guc-min-parallel-index-scan-size"/>. > > > > > > Here, it is not clear to me why the proposed text is better than what > > > we already have? > > Changed as per your suggestion. > > To answer your question: > > "VACUUM decides" doesn't sound consistent with docs. > > "based on {+the+} number" > => here, you're veritably missing an article... > > "relation which" technically means that the *relation* is "is further limited > by"... > > > > Patch changes > > > - > > > 1. > > > - * and index cleanup with parallel workers unless the parallel vacuum is > > > + * and index cleanup with parallel workers unless parallel vacuum is > > > > > > As per my understanding 'parallel vacuum' is a noun phrase, so we > > > should have a determiner before it. > > > > Changed as per your suggestion. > > Thanks (I think you meant an "article"). > > > > - * We allow each worker to update it as and when it has incurred any > > > cost and > > > + * We allow each worker to update it AS AND WHEN it has incurred any > > > cost and > > > > > > I don't see why it is necessary to make this part bold? We are using > > > it at one other place in the code in the way it is used here. > > > > > > > Kept the existing text as it is. > > I meant this as a question. I'm not sure what "as and when means" ? "If and > when" ? > > > I have combined both of your patches. Let me know if you have any > > more suggestions, otherwise, I am planning to push this tomorrow. > > In the meantime, I found some more issues, so I rebased on top of your patch > so > you can review it. > I don't have comments on your change other than the comments Amit already sent. Thank you for reviewing this part! Regards, -- Masahiko Sawadahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: doc review for parallel vacuum
On Tue, Apr 7, 2020 at 10:25 AM Justin Pryzby wrote: > > On Tue, Apr 07, 2020 at 09:57:46AM +0530, Amit Kapila wrote: > > On Mon, Mar 23, 2020 at 10:34 AM Amit Kapila > > wrote: > > > On Sun, Mar 22, 2020 at 7:48 AM Justin Pryzby > > > wrote: > > > > > > > > Original, long thread > > > > https://www.postgresql.org/message-id/flat/CAA4eK1%2Bnw1FBK3_sDnW%2B7kB%2Bx4qbDJqetgqwYW8k2xv82RZ%2BKw%40mail.gmail.com#b1745ee853b137043e584b500b41300f > > > > > > > > > > I have comments/questions on the patches: > > > doc changes > > > - > > > 1. > > > > > > - Perform vacuum index and cleanup index phases of > > > VACUUM > > > + Perform vacuum index and index cleanup phases of > > > VACUUM > > > > > > Why is the proposed text improvement over what is already there? > > > > I have kept the existing text as it is for this case. > > Probably it should say "index vacuum and index cleanup", which is also what > the > comment in vacuumlazy.c says. > Okay, that makes sense. > > > > - * We allow each worker to update it as and when it has incurred any > > > cost and > > > + * We allow each worker to update it AS AND WHEN it has incurred any > > > cost and > > > > > > I don't see why it is necessary to make this part bold? We are using > > > it at one other place in the code in the way it is used here. > > > > > > > Kept the existing text as it is. > > I meant this as a question. I'm not sure what "as and when means" ? "If and > when" ? > It means the "at the time when" worker performed any I/O. This has been used at some other place in code as well. > > I have combined both of your patches. Let me know if you have any > > more suggestions, otherwise, I am planning to push this tomorrow. > > In the meantime, I found some more issues, so I rebased on top of your patch > so > you can review it. > - The PARALLEL option is used only for vacuum purpose. - Even if this option is specified with ANALYZE option - it does not affect ANALYZE. + The PARALLEL option is used only for vacuum operations. + If specified along with ANALYZE, the behavior during + ANALYZE is unchanged. I think the proposed text makes the above text unclear especially "the behavior during ANALYZE is unchanged.". Basically this option has nothing to do with the behavior of vacuum or analyze. I think we should be more specific as the current text. * Copy the index bulk-deletion result returned from ambulkdelete and - * amvacuumcleanup to the DSM segment if it's the first time to get it - * from them, because they allocate it locally and it's possible that an - * index will be vacuumed by the different vacuum process at the next - * time. The copying of the result normally happens only after the first - * time of index vacuuming. From the second time, we pass the result on - * the DSM segment so that they then update it directly. + * amvacuumcleanup to the DSM segment if it's the first time to get a result + * from a worker, because workers allocate BulkDeleteResults locally, and it's possible that an + * index will be vacuumed by a different vacuum process the next + * time. This can be done by the leader backend as well, so we can't use workers terminology here. Also, I don't see the need to mention BulkDeleteResults. I will include some changes from this text. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: doc review for parallel vacuum
On Tue, Apr 07, 2020 at 09:57:46AM +0530, Amit Kapila wrote: > On Mon, Mar 23, 2020 at 10:34 AM Amit Kapila wrote: > > On Sun, Mar 22, 2020 at 7:48 AM Justin Pryzby wrote: > > > > > > Original, long thread > > > https://www.postgresql.org/message-id/flat/CAA4eK1%2Bnw1FBK3_sDnW%2B7kB%2Bx4qbDJqetgqwYW8k2xv82RZ%2BKw%40mail.gmail.com#b1745ee853b137043e584b500b41300f > > > > > > > I have comments/questions on the patches: > > doc changes > > - > > 1. > > > > - Perform vacuum index and cleanup index phases of > > VACUUM > > + Perform vacuum index and index cleanup phases of > > VACUUM > > > > Why is the proposed text improvement over what is already there? > > I have kept the existing text as it is for this case. Probably it should say "index vacuum and index cleanup", which is also what the comment in vacuumlazy.c says. > > 2. > > If the > > - PARALLEL option is omitted, then > > - VACUUM decides the number of workers based on > > number > > - of indexes that support parallel vacuum operation on the relation > > which > > - is further limited by > linkend="guc-max-parallel-workers-maintenance"/>. > > - The index can participate in a parallel vacuum if and only if the > > size > > + PARALLEL option is omitted, then the number of > > workers > > + is determined based on the number of indexes that support parallel > > vacuum > > + operation on the relation, and is further limited by > + linkend="guc-max-parallel-workers-maintenance"/>. > > + An index can participate in parallel vacuum if and only if the size > >of the index is more than > linkend="guc-min-parallel-index-scan-size"/>. > > > > Here, it is not clear to me why the proposed text is better than what > > we already have? > Changed as per your suggestion. To answer your question: "VACUUM decides" doesn't sound consistent with docs. "based on {+the+} number" => here, you're veritably missing an article... "relation which" technically means that the *relation* is "is further limited by"... > > Patch changes > > - > > 1. > > - * and index cleanup with parallel workers unless the parallel vacuum is > > + * and index cleanup with parallel workers unless parallel vacuum is > > > > As per my understanding 'parallel vacuum' is a noun phrase, so we > > should have a determiner before it. > > Changed as per your suggestion. Thanks (I think you meant an "article"). > > - * We allow each worker to update it as and when it has incurred any cost > > and > > + * We allow each worker to update it AS AND WHEN it has incurred any cost > > and > > > > I don't see why it is necessary to make this part bold? We are using > > it at one other place in the code in the way it is used here. > > > > Kept the existing text as it is. I meant this as a question. I'm not sure what "as and when means" ? "If and when" ? > I have combined both of your patches. Let me know if you have any > more suggestions, otherwise, I am planning to push this tomorrow. In the meantime, I found some more issues, so I rebased on top of your patch so you can review it. -- Justin >From 3551aba7445b46d60d2199d068c0e06a5828c5c3 Mon Sep 17 00:00:00 2001 From: Amit Kapila Date: Tue, 7 Apr 2020 09:48:28 +0530 Subject: [PATCH v3 1/2] Comments and doc fixes for commit 40d964ec99. Reported-by: Justin Pryzby Author: Justin Pryzby, with few changes by me Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/20200322021801.gb2...@telsasoft.com --- doc/src/sgml/ref/vacuum.sgml | 16 +++--- src/backend/access/heap/vacuumlazy.c | 30 +-- src/backend/access/transam/parallel.c | 2 +- src/backend/commands/vacuum.c | 20 +- src/include/commands/vacuum.h | 2 +- 5 files changed, 35 insertions(+), 35 deletions(-) diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml index 846056a353..b7e0a8af6b 100644 --- a/doc/src/sgml/ref/vacuum.sgml +++ b/doc/src/sgml/ref/vacuum.sgml @@ -234,13 +234,13 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ integer - background workers (for the detail of each vacuum phases, please + background workers (for the detail of each vacuum phase, please refer to ). If the - PARALLEL option is omitted, then - VACUUM decides the number of workers based on number - of indexes that support parallel vacuum operation on the relation which - is further limited by . - The index can participate in a parallel vacuum if and only if the size + PARALLEL option is omitted, then the number of workers + is determined based on the number of indexes that support parallel vacuum + operation on the relation, and is further limited by . + An index can participate in parallel vacuum if and only if the size of the index is more than . Please note that it is not guaranteed that the number
Re: doc review for parallel vacuum
On Mon, Mar 23, 2020 at 10:34 AM Amit Kapila wrote: > > On Sun, Mar 22, 2020 at 7:48 AM Justin Pryzby wrote: > > > > Original, long thread > > https://www.postgresql.org/message-id/flat/CAA4eK1%2Bnw1FBK3_sDnW%2B7kB%2Bx4qbDJqetgqwYW8k2xv82RZ%2BKw%40mail.gmail.com#b1745ee853b137043e584b500b41300f > > > > I have comments/questions on the patches: > doc changes > - > 1. > > - Perform vacuum index and cleanup index phases of > VACUUM > + Perform vacuum index and index cleanup phases of > VACUUM > > Why is the proposed text improvement over what is already there? > I have kept the existing text as it is for this case. > 2. > If the > - PARALLEL option is omitted, then > - VACUUM decides the number of workers based on number > - of indexes that support parallel vacuum operation on the relation which > - is further limited by linkend="guc-max-parallel-workers-maintenance"/>. > - The index can participate in a parallel vacuum if and only if the size > + PARALLEL option is omitted, then the number of > workers > + is determined based on the number of indexes that support parallel > vacuum > + operation on the relation, and is further limited by + linkend="guc-max-parallel-workers-maintenance"/>. > + An index can participate in parallel vacuum if and only if the size >of the index is more than linkend="guc-min-parallel-index-scan-size"/>. > > Here, it is not clear to me why the proposed text is better than what > we already have? > Changed as per your suggestion. > 3. > .. > - vacuum launches before starting each phase and exit at the end of > + vacuum are launched before the start of each phase and > terminate at the end of > > It is better to use 'exit' instead of 'ternimate' as we are not > forcing the workers to end rather we wait for them to exit. > I have used 'exit' instead of 'terminate' here. > > > Patch changes > - > 1. > - * and index cleanup with parallel workers unless the parallel vacuum is > + * and index cleanup with parallel workers unless parallel vacuum is > > As per my understanding 'parallel vacuum' is a noun phrase, so we > should have a determiner before it. > > 2. > - * Since writes are not allowed during the parallel mode, so we copy the > + * Since writes are not allowed during parallel mode, copy the > > Similar to above, I think here also we should have a determiner before > 'parallel mode'. > Changed as per your suggestion. > 3. > - * The basic idea of a cost-based vacuum delay for parallel vacuum is to > allow > - * each worker to sleep proportional to the work done by it. We achieve this > + * The basic idea of a cost-based delay for parallel vacuum is to force > + * each worker to sleep in proportion to the share of work it's done. > We achieve this > > I am not sure if it is an improvement to use 'to force' instead of 'to > allow' because there is another criteria as well which decides whether > the worker will sleep or not. I am also not sure if the second change > (share of work it's) in this sentence is a clear improvement. > I have used 'to allow' in above text, otherwise, acceptted your suggestions. > 4. > - * We allow each worker to update it as and when it has incurred any cost and > + * We allow each worker to update it AS AND WHEN it has incurred any cost and > > I don't see why it is necessary to make this part bold? We are using > it at one other place in the code in the way it is used here. > Kept the existing text as it is. > 5. > - * We allow any worker to sleep only if it has performed the I/O above a > + * We force a worker to sleep only if it has performed I/O above a > * certain threshold > > Hmm, again I am not sure if we should use 'force' here instead of 'allow'. > Kept the usage of 'allow'. I have combined both of your patches. Let me know if you have any more suggestions, otherwise, I am planning to push this tomorrow. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com 0001-Comments-and-doc-fixes-for-commit-40d964ec99.patch Description: Binary data
Re: doc review for parallel vacuum
On Sun, Mar 22, 2020 at 7:48 AM Justin Pryzby wrote: > > Original, long thread > https://www.postgresql.org/message-id/flat/CAA4eK1%2Bnw1FBK3_sDnW%2B7kB%2Bx4qbDJqetgqwYW8k2xv82RZ%2BKw%40mail.gmail.com#b1745ee853b137043e584b500b41300f > I have comments/questions on the patches: doc changes - 1. - Perform vacuum index and cleanup index phases of VACUUM + Perform vacuum index and index cleanup phases of VACUUM Why is the proposed text improvement over what is already there? 2. If the - PARALLEL option is omitted, then - VACUUM decides the number of workers based on number - of indexes that support parallel vacuum operation on the relation which - is further limited by . - The index can participate in a parallel vacuum if and only if the size + PARALLEL option is omitted, then the number of workers + is determined based on the number of indexes that support parallel vacuum + operation on the relation, and is further limited by . + An index can participate in parallel vacuum if and only if the size of the index is more than . Here, it is not clear to me why the proposed text is better than what we already have? 3. .. - vacuum launches before starting each phase and exit at the end of + vacuum are launched before the start of each phase and terminate at the end of It is better to use 'exit' instead of 'ternimate' as we are not forcing the workers to end rather we wait for them to exit. Patch changes - 1. - * and index cleanup with parallel workers unless the parallel vacuum is + * and index cleanup with parallel workers unless parallel vacuum is As per my understanding 'parallel vacuum' is a noun phrase, so we should have a determiner before it. 2. - * Since writes are not allowed during the parallel mode, so we copy the + * Since writes are not allowed during parallel mode, copy the Similar to above, I think here also we should have a determiner before 'parallel mode'. 3. - * The basic idea of a cost-based vacuum delay for parallel vacuum is to allow - * each worker to sleep proportional to the work done by it. We achieve this + * The basic idea of a cost-based delay for parallel vacuum is to force + * each worker to sleep in proportion to the share of work it's done. We achieve this I am not sure if it is an improvement to use 'to force' instead of 'to allow' because there is another criteria as well which decides whether the worker will sleep or not. I am also not sure if the second change (share of work it's) in this sentence is a clear improvement. 4. - * We allow each worker to update it as and when it has incurred any cost and + * We allow each worker to update it AS AND WHEN it has incurred any cost and I don't see why it is necessary to make this part bold? We are using it at one other place in the code in the way it is used here. 5. - * We allow any worker to sleep only if it has performed the I/O above a + * We force a worker to sleep only if it has performed I/O above a * certain threshold Hmm, again I am not sure if we should use 'force' here instead of 'allow'. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com