Hello.
At Tue, 9 Jul 2019 17:38:44 +0900, Tatsuro Yamada
<[email protected]> wrote in
<[email protected]>
> Hi Alvaro, Anthony, Julien and Robert,
>
> On 2019/07/09 3:47, Julien Rouhaud wrote:
> > On Mon, Jul 8, 2019 at 8:44 PM Robert Haas <[email protected]>
> > wrote:
> >>
> >> On Mon, Jul 8, 2019 at 2:18 PM Alvaro Herrera
> >> <[email protected]> wrote:
> >>> Yeah, I got the impression that that was determined to be the
> >>> desirable
> >>> behavior, so I made it do that, but I'm not really happy about it
> >>> either. We're not too late to change the CREATE INDEX behavior, but
> >>> let's discuss what is it that we want.
> >>
> >> I don't think I intended to make any such determination -- which
> >> commit do you think established this as the canonical behavior?
> >>
> >> I propose that once a field is set, we should leave it set until the
> >> end.
> > +1
> > Note that this patch is already behaving like that if the table only
> > contains dead rows.
+1 from me.
> I fixed the patch including:
>
> - Replace "if" to "else if". (Suggested by Julien)
> - Fix typo s/ech/each/. (Suggested by Anthony)
> - Add Phase "analyzing complete" in the pgstat view. (Suggested by
> - Julien, Robert and me)
> It was overlooked to add it in system_views.sql.
>
> I share my re-test result, see below:
>
> ---------------------------------------------------------
> [Session #1]
> create table hoge as select * from generate_series(1, 1000000) a;
> analyze verbose hoge;
>
> [Session #2]
> \a \t
> select * from pg_stat_progress_analyze; \watch 0.001
>
> 3785|13599|postgres|16384|f|16384|scanning table|4425|6
> 3785|13599|postgres|16384|f|16384|scanning table|4425|31
> 3785|13599|postgres|16384|f|16384|scanning table|4425|70
> 3785|13599|postgres|16384|f|16384|scanning table|4425|109
> ...
> 3785|13599|postgres|16384|f|16384|scanning table|4425|4425
> 3785|13599|postgres|16384|f|16384|scanning table|4425|4425
> 3785|13599|postgres|16384|f|16384|scanning table|4425|4425
> 3785|13599|postgres|16384|f|16384|analyzing sample|0|0
> 3785|13599|postgres|16384|f|16384|analyzing complete|0|0 <-- Added and
> fixed. :)
> ---------------------------------------------------------
I have some comments.
+ const int index[] = {
+ PROGRESS_ANALYZE_PHASE,
+ PROGRESS_ANALYZE_TOTAL_BLOCKS,
+ PROGRESS_ANALYZE_BLOCKS_DONE
+ };
+ const int64 val[] = {
+ PROGRESS_ANALYZE_PHASE_ANALYSIS,
+ 0, 0
Do you oppose to leaving the total/done blocks alone here:p?
+ BlockNumber nblocks;
+ double blksdone = 0;
Why is it a double even though blksdone is of the same domain
with nblocks? And finally:
+ pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
+ ++blksdone);
It is converted into int64.
+ WHEN 2 THEN 'analyzing sample'
+ WHEN 3 THEN 'analyzing sample (extended stats)'
I think we should avoid parenthesized phrases as far as
not-necessary. That makes the column unnecessarily long. The
phase is internally called as "compute stats" so *I* would prefer
something like the followings:
+ WHEN 2 THEN 'computing stats'
+ WHEN 3 THEN 'computing extended stats'
+ WHEN 4 THEN 'analyzing complete'
And if you are intending by this that (as mentioned in the
documentation) "working to complete this analyze", this would be:
+ WHEN 4 THEN 'completing analyze'
+ WHEN 4 THEN 'finalizing analyze'
+ <entry>Process ID of backend.</entry>
of "the" backend. ? And period is not attached on all
descriptions consisting of a single sentence.
+ <entry>OID of the database to which this backend is connected.</entry>
+ <entry>Name of the database to which this backend is connected.</entry>
"database to which .. is connected" is phrased as "database this
backend is connected to" in pg_stat_acttivity. (Just for consistency)
+ <entry>Whether the current scan includes legacy inheritance
children.</entry>
This apparently excludes partition tables but actually it is
included.
"Whether scanning through child tables" ?
I'm not sure "child tables" is established as the word to mean
both "partition tables and inheritance children"..
+ The table being scanned (differs from <literal>relid</literal>
+ only when processing partitions or inheritance children).
Is <literal> needed? And I think the parentheses are not needed.
OID of the table currently being scanned. Can differ from relid
when analyzing tables that have child tables.
+ Total number of heap blocks to scan in the current table.
Number of heap blocks on scanning_table to scan?
It might be better that this description describes that this
and the next column is meaningful only while the phase
"scanning table".
+ The command is currently scanning the table.
"sample(s)" comes somewhat abruptly in the next item. Something
like "The command is currently scanning the table
<structfield>scanning_table</structfield> to obtain samples"
might be better.
+ <command>ANALYZE</command> is currently extracting statistical data
+ from the sample obtained.
Something like "The command is computing stats from the samples
obtained in the previous phase" might be better.
regards.
-
Kyotaro Horiguchi
NTT Open Source Software Center