Joseph Koshakow 2025-03-21 01:25:
Hi I am working with Aidar to give a review and I am also a beginner
reviewer.

From 813e5ec0da4c65970b4b1ce2ec2918e4652da9ab Mon Sep 17 00:00:00
2001
From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Date: Fri, 20 Dec 2024 14:06:47 +0300
Subject: [PATCH v1 1/2] Add pg_buffercache_evict_all() function for
testing
  */
 bool
-EvictUnpinnedBuffer(Buffer buf)
+EvictUnpinnedBuffer(Buffer buf, bool *flushed)

I think you should update the comment above this function to include
details on the new `flushed` argument.

diff --git a/doc/src/sgml/pgbuffercache.sgml
b/doc/src/sgml/pgbuffercache.sgml
index 802a5112d77..83950ca5cce 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -31,8 +31,10 @@
   This module provides the
<function>pg_buffercache_pages()</function>
   function (wrapped in the <structname>pg_buffercache</structname>
view),
   the <function>pg_buffercache_summary()</function> function, the
-  <function>pg_buffercache_usage_counts()</function> function and
-  the <function>pg_buffercache_evict()</function> function.
+  <function>pg_buffercache_usage_counts()</function> function, the
+  <function>pg_buffercache_evict()</function>, the
+  <function>pg_buffercache_evict_relation()</function>, and the
+  <function>pg_buffercache_evict_all()</function> function.

All the other functions have indexterm tags above this, should we add
indexterms for the new functions?

+ <sect2 id="pgbuffercache-pg-buffercache-evict-relation">
+  <title>The <structname>pg_buffercache_evict_relation</structname>
Function</title>
+  <para>
+   The <function>pg_buffercache_evict_relation()</function>
function is very similar
+   to <function>pg_buffercache_evict()</function> function.  The
difference is that
+   <function>pg_buffercache_evict_relation()</function> takes a
relation
+   identifier instead of buffer identifier.  Then, it tries to
evict all
+   buffers in that relation.  The function is intended for
developer testing only.
+  </para>
+ </sect2>
+
+ <sect2 id="pgbuffercache-pg-buffercache-evict-all">
+  <title>The <structname>pg_buffercache_evict_all</structname>
Function</title>
+  <para>
+   The <function>pg_buffercache_evict_all()</function> function is
very similar
+   to <function>pg_buffercache_evict()</function> function.  The
difference is,
+   the <function>pg_buffercache_evict_all()</function> does not
take argument;
+   instead it tries to evict all buffers in the buffer pool.  The
function is
+   intended for developer testing only.
+  </para>
+ </sect2>

The other difference is that these new functions have an additional
output argument to indicate if the buffer was flushed which we
probably
want to mention here.

Also I think it's more gramatically correct to say "the <fn-name>
function" or just "<fn-name>". A couple of times you say "<fn-name>
function" or "the <fn-name>".

Also I think the third to last sentence should end with "... does not
take **an** argument" or "... does not take argument**s**".

+CREATE FUNCTION pg_buffercache_evict_relation(
+    IN regclass,
+    IN fork text default 'main',
+    OUT buffers_evicted int4,
+    OUT buffers_flushed int4)
+AS 'MODULE_PATHNAME', 'pg_buffercache_evict_relation'
+LANGUAGE C PARALLEL SAFE VOLATILE;
+
+CREATE FUNCTION pg_buffercache_evict_all(
+    OUT buffers_evicted int4,
+    OUT buffers_flushed int4)
+AS 'MODULE_PATHNAME', 'pg_buffercache_evict_all'
+LANGUAGE C PARALLEL SAFE VOLATILE;

Does it make sense to also update pg_buffercache_evict to also return
a bool if the buffer was flushed? Or is that too much of a breaking
change?

+ /* Lock the header and check if it's valid. */
+ buf_state = LockBufHdr(desc);
+ if ((buf_state & BM_VALID) == 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }

EvictUnpinnedBuffer first checks if the buffer is locked before
attempting to lock it. Do we need a similar check here?

   /* Lock the header if it is not already locked. */
   if (!buf_state)
    buf_state = LockBufHdr(desc);

I don't think *flushed is necessarily accurate this way, as it
might detect
that it doesn't need to flush the data (via StartBufferIO()
returning false).

You are right. It seems there is no way to get that information
without editing the FlushBuffer(), right? And I think editing
FlushBuffer() is not worth it. So, I can either remove it or explain
in the docs that #buffers_flushed may not be completely accurate.

There's already a lot of caveats on EvictUnpinnedBuffer that it might
have unexpected results when run concurrently, so I think it's fine to
add one more.

Thanks,
Joe Koshakow

I agree with most of what Joseph said. However, I would like to add some
comments.

At the moment, the "flushed" flag essentially indicates whether the buffer was dirty at the time of eviction and it does not guarantee that it has been written to disk. Therefore, it would be better to rename the buffers_flushed
column in the output of pg_buffer_cache_evict_all() and
pg_buffercache_evict_relation() functions to dirty_buffers mb? This would allow us to avoid the confusion that arises from the fact that not all dirty buffers could have actually been written to disk. In addition, this would
remove the "flushed" parameter from the EvictUnpinnedBuffer() function.
Because if we explicitly call LockBufHdr() outside of EvictUnpinnedBuffer(),
we can already know in advance whether the buffer is dirty or not.

The same applies to the suggestion to retrieve "flushed" count from the
pg_buffercache_evict() call. We cannot say this for certain, but we can
determine whether the buffer was dirty.

regards,
Aidar Imamov


Reply via email to