On Tue, Mar 28, 2017 at 1:40 PM, Haribabu Kommi
<kommi.harib...@gmail.com> wrote:
>
>
> On Mon, Mar 27, 2017 at 1:27 PM, Haribabu Kommi <kommi.harib...@gmail.com>
> wrote:
>>
>>
>>
>> On Sat, Mar 25, 2017 at 6:40 AM, Fujii Masao <masao.fu...@gmail.com>
>> wrote:
>>>
>>> On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
>>> <kommi.harib...@gmail.com> wrote:
>>> >
>>> >
>>> > On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapil...@gmail.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>>> >> <kommi.harib...@gmail.com> wrote:
>>> >> > Hi Hackers,
>>> >> >
>>> >> > I just want to discuss adding of a new statistics view that provides
>>> >> > the information of wal writing details as follows
>>> >> >
>>> >>
>>> >> +1.  I think it will be useful to observe WAL activity.
>>> >
>>> >
>>> > Thanks for your opinion.
>>> >
>>> >> > postgres=# \d pg_stat_wal_writer
>>> >> >                         View "pg_catalog.pg_stat_wal_writer"
>>> >> >         Column         |           Type           | Collation |
>>> >> > Nullable
>>> >> > |
>>> >> > Default
>>> >> >
>>> >> >
>>> >> > -----------------------+--------------------------+-----------+----------+---------
>>> >> >  num_backend_writes       | bigint                   |           |
>>> >> > |
>>> >> >  num_total_writes  | bigint                   |           |
>>> >> > |
>>> >> >  num_blocks  | bigint                   |           |          |
>>> >> >  total_write_time   | bigint|           |          |
>>> >> >  stats_reset           | timestamp with time zone |           |
>>> >> > |
>>> >> >
>>> >> > The columns of the view are
>>> >> > 1. Total number of xlog writes that are called from the backend.
>>> >> > 2. Total number of xlog writes that are called from both backend
>>> >> >  and background workers. (This column can be changed to just
>>> >> >  display on the background writes).
>>> >> > 3. The number of the blocks that are written.
>>> >> > 4. Total write_time of the IO operation it took, this variable data
>>> >> > is
>>> >> > filled only when the track_io_timing GUC is enabled.
>>> >>
>>> >> So, here is *write_time* the total time system has spent in WAL
>>> >> writing before the last reset?
>>> >
>>> >
>>> > total write_time spent in WAL writing "after" the last reset in
>>> > milliseconds.
>>> >
>>> >> I think there should be a separate column for write and sync time.
>>> >>
>>> >
>>> > Added.
>>> >
>>> >>
>>> >> > Or it is possible to integrate the new columns into the existing
>>> >> > pg_stat_bgwriter view also.
>>> >> >
>>> >>
>>> >> I feel separate view is better.
>>> >
>>> >
>>> > Ok.
>>> >
>>> > Following the sample out of the view after regress run.
>>> >
>>> > postgres=# select * from pg_stat_walwrites;
>>> > -[ RECORD 1 ]--+------------------------------
>>> > backend_writes | 19092
>>> > writes         | 663
>>> > write_blocks   | 56116
>>> > write_time     | 0
>>> > sync_time      | 3064
>>> > stats_reset    | 2017-02-15 13:37:09.454314+11
>>> >
>>> > Currently, writer, walwriter and checkpointer processes
>>> > are considered as background processes that can do
>>> > the wal write mainly.
>>
>>
>> Thanks for the review.
>>
>>> I'm not sure if this categorization is good. You told that this view is
>>> useful
>>> to tune walwriter parameters at the top of this thread. If so, ISTM that
>>> the information about walwriter's activity should be separated from
>>> others.
>>
>>
>> Yes, that's correct. First I thought of providing the statistics of
>> walwriter, but
>> later in development, it turned into showing statistics of all wal write
>> activity
>> of background processes also to differentiate the actual write by the
>> backends.
>>
>>>
>>> What about other processes which *can* write WAL, for example walsender
>>> (e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
>>> recovery checkpoint) and logical replication worker (Not sure if it
>>> always
>>> works with synchronous_commit=off, though). There might be other
>>> processes
>>> which can write WAL
>>
>>
>> It is possible to add the walsender, stratup and other processes easily,
>> but not
>> background workers that does some wal write operations until unless they
>> report the stats with pgstat_report_stat(). Is it fine to ignore the
>> workers that
>> does not report the stats?
>
>
> Added stats collection for walsender, statrup and autovacuum processes.
> The background workers that call pgstat_report_stat() function will
> automatically
> included.

So you added pgstat_send_walwrites() into several places. But WAL activity
by the background workers which don't call pgstat_report_stat() is still not
considered in pg_stat_walwrites view. This design looks fragile to me.
I think that we should capture all the WAL activities (even by such background
workers) in XLogWrite(), instead.

For example, we can add the "shared" counters (protected by WALWriteLock)
into XLogCtl and make XLogWrite() update those counters if the process is
NOT walwriter. OTOH, if it's walwriter, XLogWrite() updates its local counters.
Then walwriter periodically packs those counters in the message and sends
it to the stats collector.

>
>>
>>
>>>
>>> Why didn't you separate "write_blocks", "write_time" and "sync_time" per
>>> the process category, like "backend_writes" and "writes"?
>>
>>
>> Ok. I will add those columns.
>
>
> Added the new columns to differentiate backend and background processes.
>
>>> This view doesn't seem to take into consideration the WAL writing and
>>> flushing
>>> during creating new WAL segment file.
>>>
>>> I think that it's better to make this view report also the number of WAL
>>> pages
>>> which are written when wal_buffer is full. This information is useful to
>>> tune the size of wal_buffers. This was proposed by Nagayasu before.
>>> https://www.postgresql.org/message-id/4ff824f3.5090...@uptime.jp
>>
>>
>> Ok. But this new column just shows how many times the WAL buffers are
>> flushed
>> because of wal buffers are full. Not the WAL pages that are actually
>> flushed because
>> of wal buffers full as a separate column.
>
>
> Added a new column to display the dirty writes from both backend and
> background processes.
>
> update patch attached.

Thanks for updating the patch!

+      <entry><structfield>writes</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>
+      Number of WAL writes that are carried out by background
processes and workers.
+      </entry>
<snip>
+      <entry><structfield>write_blocks</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL pages written to the disk by the
background processes/workers</entry>
+     </row>

Could you tell me why both numbers of WAL writes and pages written need to
be reported? How can we tune the system by using those information?

Instead of (or in addition to) the number of WAL pages, we should report
the number of *bytes* of WAL records actually written?

Anyway, we need more and more opinions about what information of WAL activity
should be exposed in the view and how. But we have only a few days left for
this CommitFest...

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to