On Tuesday, September 04, 2012 12:42 AM Jeff Janes wrote:
On Mon, Sep 3, 2012 at 7:15 AM, Amit kapila <amit.kap...@huawei.com> wrote:
>> This patch is based on below Todo Item:
>
>> Consider adding buffers the background writer finds reusable to the free
>> list
>
>
>
>> I have tried implementing it and taken the readings for Select when all the
>> data is in either OS buffers
>
>> or Shared Buffers.
>
>
>
>> The Patch has simple implementation for "bgwriter or checkpoint process
>> moving the unused buffers (unpinned with "ZERO" usage_count buffers) into
>> "freelist".
> I don't think InvalidateBuffer can be safely used in this way. It
> says "We assume
> that no other backend could possibly be interested in using the page",
> which is not true here.
As I understood and anlyzed based on above, that there is problem in attached
patch such that in function
InvalidateBuffer(), after UnlockBufHdr() and before PartitionLock if some
backend uses that buffer and increase the usage count to 1, still
InvalidateBuffer() will remove the buffer from hash table and put it in
Freelist.
I have modified the code to address above by checking refcount & usage_count
inside Partition Lock
, LockBufHdr and only after that move it to freelist which is similar to
InvalidateBuffer.
In actual code we can optimize the current code by using extra parameter in
InvalidateBuffer.
Please let me know if I understood you correctly or you want to say something
else by above comment?
> Also, do we want to actually invalidate the buffers? If someone does
> happen to want one after it is put on the freelist, making it read it
> in again into a different buffer doesn't seem like a nice thing to do,
> rather than just letting it reclaim it.
But even if bgwriter/checkpoint don't do, Backend needing new buffer will do
similar things (remove from hash table) for this buffer as this is nextvictim
buffer.
The main intention of doing the MoveBufferToFreeList is to avoid contention of
Partition Locks and BufFreeListLock among backends, which
has given Performance improvement in high contention scenarios.
One problem I could see with proposed change is that in some cases the usage
count will get decrement for a buffer allocated
from free list immediately as it can be nextvictimbuffer.
However there can be solution to this problem.
Can you suggest some scenario's where I should do more performance test?
With Regards,
Amit Kapila.
diff --git a/src/backend/storage/buffer/bufmgr.c
b/src/backend/storage/buffer/bufmgr.c
index dba19eb..87446cb 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -955,6 +955,88 @@ retry:
StrategyFreeBuffer(buf);
}
+
+/*
+ * MoveBufferToFreeList -- mark a shared buffer invalid and return it to the
+ * freelist. which is similar to InvalidateBuffer function.
+ */
+static void
+MoveBufferToFreeList(volatile BufferDesc *buf)
+{
+ BufferTag oldTag;
+ uint32 oldHash; /* hash value
for oldTag */
+ LWLockId oldPartitionLock; /* buffer partition
lock for it */
+ BufFlags oldFlags;
+
+ /* Save the original buffer tag before dropping the spinlock */
+ oldTag = buf->tag;
+
+ UnlockBufHdr(buf);
+
+ /*
+ * Need to compute the old tag's hashcode and partition lock ID. XXX is
it
+ * worth storing the hashcode in BufferDesc so we need not recompute it
+ * here? Probably not.
+ */
+ oldHash = BufTableHashCode(&oldTag);
+ oldPartitionLock = BufMappingPartitionLock(oldHash);
+
+
+ /*
+ * Acquire exclusive mapping lock in preparation for changing the
buffer's
+ * association.
+ */
+ LWLockAcquire(oldPartitionLock, LW_EXCLUSIVE);
+
+ /* Re-lock the buffer header */
+ LockBufHdr(buf);
+
+ /* If it's changed while we were waiting for lock, do nothing */
+ if (!BUFFERTAGS_EQUAL(buf->tag, oldTag))
+ {
+ UnlockBufHdr(buf);
+ LWLockRelease(oldPartitionLock);
+ return;
+ }
+
+ /*
+ * Validate wheather we can add the buffer into freelist or not
+ */
+ if ((buf->refcount != 0) || (buf->usage_count != 0))
+ {
+ UnlockBufHdr(buf);
+ LWLockRelease(oldPartitionLock);
+ return;
+ }
+
+ /*
+ * Clear out the buffer's tag and flags. We must do this to ensure that
+ * linear scans of the buffer array don't think the buffer is valid.
+ */
+ oldFlags = buf->flags;
+ CLEAR_BUFFERTAG(buf->tag);
+ buf->flags = 0;
+ buf->usage_count = 0;
+
+ UnlockBufHdr(buf);
+
+ /*
+ * Remove the buffer from the lookup hashtable, if it was in there.
+ */
+ if (oldFlags & BM_TAG_VALID)
+ BufTableDelete(&oldTag, oldHash);
+
+ /*
+ * Done with mapping lock.
+ */
+ LWLockRelease(oldPartitionLock);
+
+ /*
+ * Insert the buffer at the head of the list of free buffers.
+ */
+ StrategyFreeBuffer(buf);
+}
+
/*
* MarkBufferDirty
*
@@ -1658,12 +1740,29 @@ SyncOneBuffer(int buf_id, bool skip_recently_used)
return result;
}
- if (!(bufHdr->flags & BM_VALID) || !(bufHdr->flags & BM_DIRTY))
+ if (!(bufHdr->flags & BM_VALID))
{
/* It's clean, so nothing to do */
UnlockBufHdr(bufHdr);
return result;
}
+ else if (!(bufHdr->flags & BM_DIRTY))
+ {
+ /*
+ * It's clean, so nothing to flush;
+ * if buffer is unused then move it to freelist
+ */
+ if ((bufHdr->refcount == 0 && bufHdr->usage_count == 0)
+ && (bufHdr->freeNext == FREENEXT_NOT_IN_LIST))
+ {
+ MoveBufferToFreeList (bufHdr);
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr);
+ }
+ return result;
+ }
/*
* Pin it, share-lock it, write it. (FlushBuffer will do nothing if the
@@ -1677,6 +1776,19 @@ SyncOneBuffer(int buf_id, bool skip_recently_used)
LWLockRelease(bufHdr->content_lock);
UnpinBuffer(bufHdr, true);
+ /*
+ * Buffer is unused then move it to freelist
+ */
+ LockBufHdr(bufHdr);
+ if (bufHdr->refcount == 0 && bufHdr->usage_count == 0)
+ {
+ MoveBufferToFreeList (bufHdr);
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr);
+ }
+
return result | BUF_WRITTEN;
}
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers