Hi,

On 2023-04-27 11:36:49 -0400, Melanie Plageman wrote:
> > >               /* and finally tell the kernel to write the data to storage 
> > > */
> > >               reln = smgropen(currlocator, InvalidBackendId);
> > >               smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, 
> > > nblocks);
> 
> Yes, as it is currently, IssuePendingWritebacks() is only used for shared
> buffers. My rationale for including IOObject is that localbuf.c calls
> smgr* functions and there isn't anything stopping it from calling
> smgrwriteback() or using WritebackContexts (AFAICT).

I think it's extremely unlikely that we'll ever do that, because it's very
common to have temp tables that are bigger than temp_buffers. We basically
hope that the kernel can do good caching for us there.


> > Or I actually think we might not even need to pass around the io_*
> > parameters and could just pass immediate values to the
> > pgstat_count_io_op_time call. If we ever start using shared buffers
> > for thing other than relation files (for example SLRU?), we'll have to
> > consider the target individually for each buffer block. That being
> > said, I'm fine with how it is either.
> 
> In IssuePendingWritebacks() we don't actually know which IOContext we
> are issuing writebacks for when we call pgstat_count_io_op_time() (we do
> issue pending writebacks for other IOContexts than IOCONTEXT_NORMAL). I
> agree IOObject is not strictly necessary right now. I've kept IOObject a
> member of WritebackContext for the reasons I mention above, however, I
> am open to removing it if it adds confusion.

I don't think it's really worth adding struct members for potential future
safety. We can just add them later if we end up needing them.


> From 7cdd6fc78ed82180a705ab9667714f80d08c5f7d Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplage...@gmail.com>
> Date: Mon, 24 Apr 2023 18:21:54 -0400
> Subject: [PATCH v4] Add writeback to pg_stat_io
> 
> 28e626bde00 added the notion of IOOps but neglected to include
> writeback. With the addition of IO timing to pg_stat_io in ac8d53dae5,
> the omission of writeback caused some confusion. Checkpointer write
> timing in pg_stat_io often differed greatly from the write timing
> written to the log. To fix this, add IOOp IOOP_WRITEBACK and track
> writebacks and writeback timing in pg_stat_io.

For the future: It'd be good to note that catversion needs to be increased.

Off-topic: I've been wondering about computing a "catversion hash" based on
all the files going into initdb. At least to have some tooling to detect
missing catversion increases...


> index 99f7f95c39..27b6f1a0a0 100644
> --- a/doc/src/sgml/monitoring.sgml
> +++ b/doc/src/sgml/monitoring.sgml
> @@ -3867,6 +3867,32 @@ SELECT pid, wait_event_type, wait_event FROM 
> pg_stat_activity WHERE wait_event i
>        </entry>
>       </row>
>  
> +     <row>
> +      <entry role="catalog_table_entry">
> +       <para role="column_definition">
> +        <structfield>writebacks</structfield> <type>bigint</type>
> +       </para>
> +       <para>
> +        Number of units of size <varname>op_bytes</varname> which the backend
> +        requested the kernel write out to permanent storage.
> +       </para>
> +      </entry>
> +     </row>

I think the reference to "backend" here is somewhat misplaced - it could be
checkpointer or bgwriter as well. We don't reference the backend in other
comparable columns of pgsio either...


> diff --git a/src/backend/storage/buffer/buf_init.c 
> b/src/backend/storage/buffer/buf_init.c
> index 0057443f0c..a7182fe95a 100644
> --- a/src/backend/storage/buffer/buf_init.c
> +++ b/src/backend/storage/buffer/buf_init.c
> @@ -145,9 +145,15 @@ InitBufferPool(void)
>       /* Init other shared buffer-management stuff */
>       StrategyInitialize(!foundDescs);
>  
> -     /* Initialize per-backend file flush context */
> -     WritebackContextInit(&BackendWritebackContext,
> -                                              &backend_flush_after);
> +     /*
> +      * Initialize per-backend file flush context. IOContext is initialized 
> to
> +      * IOCONTEXT_NORMAL because this is the most common context. IOObject is
> +      * initialized to IOOBJECT_RELATION because writeback is currently only
> +      * requested for permanent relations in shared buffers. The backend can
> +      * overwrite these as appropriate.
> +      */
> +     WritebackContextInit(&BackendWritebackContext, IOOBJECT_RELATION,
> +                                              IOCONTEXT_NORMAL, 
> &backend_flush_after);
>  }
>

This seems somewhat icky.


>  /*
> diff --git a/src/backend/storage/buffer/bufmgr.c 
> b/src/backend/storage/buffer/bufmgr.c
> index 1fa689052e..116910cdfe 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -1685,6 +1685,8 @@ again:
>               FlushBuffer(buf_hdr, NULL, IOOBJECT_RELATION, io_context);
>               LWLockRelease(content_lock);
>  
> +             BackendWritebackContext.io_object = IOOBJECT_RELATION;
> +             BackendWritebackContext.io_context = io_context;
>               ScheduleBufferTagForWriteback(&BackendWritebackContext,
>                                                                         
> &buf_hdr->tag);
>       }

What about passing the io_context to ScheduleBufferTagForWriteback instead?


> --- a/src/test/regress/sql/stats.sql
> +++ b/src/test/regress/sql/stats.sql

Hm. Could we add a test for this? While it's not implemented everywhere, we
still issue the smgrwriteback() afaics. The default for the _flush_after GUCs
changes, but backend_flush_after is USERSET, so we could just change it for a
single command.

Greetings,

Andres Freund


Reply via email to