Yamada-san, On Fri, Dec 6, 2019 at 3:24 PM Tatsuro Yamada <tatsuro.yamada...@nttcom.co.jp> wrote: >>> 1) > >>> For now, I'm not sure it should be set current_child_table_relid to zero > >>> when the current phase is changed from "acquiring inherited sample rows" > >>> to > >>> "computing stats". See <Test result> bellow. > >> > >> In the upthread discussion [1], Robert asked to *not* do such things, > >> that is, resetting some values due to phase change. I'm not sure his > >> point applies to this case too though. > > //Below "computing stats" is for the partitioned table hoge, > //therefore the second column from the left side would be > //better to set Zero to easy to understand.
On second thought, maybe we should not reset, because that might be considered loss of information. To avoid confusion, we should simply document that current_child_table_relid is only valid during the "acquiring inherited sample rows" phase. > >>> 2) > >>> There are many "finalizing analyze" phases based on relids in the case > >>> of partitioning tables. Would it better to fix the document? or it > >>> would be better to reduce it to one? > >>> > It would be better to modify the document of "finalizing analyze" phase. > > # Before modify > The command is updating pg_class. When this phase is completed, > <command>ANALYZE</command> will end. > > # Modified > The command is updating pg_class. When this phase is completed, > <command>ANALYZE</command> will end. In the case of partitioned table, > it might be shown on each partitions. > > What do you think that? I'm going to fix it, if you agreed. :) *All* phases are repeated in this case, not not just "finalizing analyze", because ANALYZE repeatedly runs for each partition after the parent partitioned table's ANALYZE finishes. ANALYZE's documentation mentions that analyzing a partitioned table also analyzes all of its partitions, so users should expect to see the progress information for each partition. So, I don't think we need to clarify that if only in one phase's description. Maybe we can add a note after the phase description table which mentions this implementation detail about partitioned tables. Like this: <note> <para> Note that when <command>ANALYZE</command> is run on a partitioned table, all of its partitions are also recursively analyzed as also mentioned on <xref linkend="sql-analyze"/>. In that case, <command>ANALYZE</command> progress is reported first for the parent table, whereby its inheritance statistics are collected, followed by that for each partition. </para> </note> Some more comments on the documentation: + Number of computed extended stats. This counter only advances when the phase + is <literal>computing extended stats</literal>. Number of computer extended stats -> Number of extended stats computed + Number of analyzed child tables. This counter only advances when the phase + is <literal>computing extended stats</literal>. Regarding, "Number of analyzed child table", note that we don't "analyze" child tables in this phase, only scan its blocks to collect samples for parent's ANALYZE. Also, the 2nd sentence is wrong -- you meant "when the phase is <literal>acquiring inherited sample rows</literal>. I suggest to write this as follows: Number of child tables scanned. This counter only advances when the phase is <literal>acquiring inherited sample rows</literal>. + <entry>OID of the child table currently being scanned. + It might be different from relid when analyzing tables that have child tables. I suggest: OID of the child table currently being scanned. This field is only valid when the phase is <literal>computing extended stats</literal>. + The command is currently scanning the <structfield>current_relid</structfield> + to obtain samples. I suggest: The command is currently scanning the the table given by <structfield>current_relid</structfield> to obtain sample rows. + The command is currently scanning the <structfield>current_child_table_relid</structfield> + to obtain samples. I suggest (based on phase description pg_stat_progress_create_index phase descriptions): The command is currently scanning child tables to obtain sample rows. Columns <structfield>child_tables_total</structfield>, <structfield>child_tables_done</structfield>, and <structfield>current_child_table_relid</structfield> contain the progress information for this phase. + <row> + <entry><literal>computing stats</literal></entry> I think the phase name should really be "computing statistics", that is, use the full word. + <entry> + The command is computing stats from the samples obtained during the table scan. + </entry> + </row> So I suggest: The command is computing statistics from the sample rows obtained during the table scan + <entry><literal>computing extended stats</literal></entry> + <entry> + The command is computing extended stats from the samples obtained in the previous phase. + </entry> I suggest: The command is computing extended statistics from the sample rows obtained during the table scan. Thanks, Amit