On Sat, 2005-01-01 at 17:47, Simon Riggs wrote:
> On Sat, 2005-01-01 at 17:01, Bruce Momjian wrote:
> > Simon Riggs wrote:
> >
> > > Well, I think we're saying: it's not in 8.0 now, and we take our time to
> > > consider patches for 8.1 and accept the situation that the parameter
> > > names/meaning will change in next release.
> >
> > I have no problem doing something for 8.0 if we can find something that
> > meets all the items I mentioned.
> >
> > One idea would be to just remove bgwriter_percent. Beta/RC users would
> > still have it in their postgresql.conf, but it is commented out so it
> > should be OK. If they uncomment it their server would not start but we
> > could just tell testers to remove it. I see that as better than having
> > conflicting parameters.
>
> Can't say I like that at first thought. I'll think some more though...
>
> > Another idea is to have bgwriter_percent be the percent of the buffer it
> > will scan.
>
> Hmmm....well that was my original suggestion (bg2.patch on 12 Dec)
> (...though with a bug, as Neil pointed out)
>
> > We could default that to 50% or 100%, but we then need to
> > make sure all beta/RC users update their postgresql.conf with the new
> > default because the commented-out default will not be correct.
>
> ...we just differ/ed on what the default should be...
>
> > At this point I see these as our only two viable options, aside from
> > doing nothing.
>
> > I realize our current behavior requires a full scan of the buffer cache,
> > but how often is the bgwriter_maxpages limit met? If it is not a full
> > scan is done anyway, right?
>
> Well, if you have a very heavy read workload then that would be a
> problem. I was more worried about concurrency in a heavy write
> situation, but I can see your point, and agree.
>
> (Idea #1 still suffers from this, so we should rule it out...)
>
> > It seems the only way to really add
> > functionality is to change bgwriter_percent to control how much of the
> > buffer is scanned.
>
> OK. I think you've persuaded me on idea #2, if I understand you right:
>
> bgwriter_percent = 50 (default)
> bgwriter_maxpages = 100 (default)
>
> percent is the percentage of shared_buffers we scan, limited by maxpages.
>
> (I'll code it up in a couple of hours when the kids are in bed)

Here's the basic patch - no changes to current default values or docs.
Not sure if this is still interesting or not...
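
To make the intended semantics concrete, here is a minimal standalone sketch of how the two limits are derived in the bufmgr.c hunk. It is an illustration, not part of the patch: `ScanLimits` and `compute_limits` are made-up names, and NBuffers is passed in as a parameter instead of being read from the global.

```c
#include <assert.h>

/* Hypothetical container for the two limits; not a type in the tree. */
typedef struct
{
    int max_scan;   /* how many buffers we are willing to examine */
    int max_dirty;  /* how many dirty buffers we are willing to collect */
} ScanLimits;

/*
 * Mirrors the limit arithmetic in the patched bufmgr.c hunk.
 * nbuffers stands in for NBuffers (assumed > 0).
 */
ScanLimits
compute_limits(int nbuffers, int percent, int maxpages)
{
    ScanLimits lim = {0, 0};

    /* If either limit is zero we are disabled from doing anything */
    if (percent == 0 || maxpages == 0)
        return lim;

    if (percent > 0)
    {
        assert(percent <= 100);
        /* round up, so a small percentage still scans at least one buffer */
        lim.max_scan = (nbuffers * percent + 99) / 100;
    }
    else
        lim.max_scan = nbuffers;    /* negative: scan everything (checkpoint) */

    if (maxpages < 0 || maxpages > nbuffers)
        lim.max_dirty = nbuffers;
    else
        lim.max_dirty = maxpages;

    /* we cannot find more dirty buffers than we scan */
    if (lim.max_dirty > lim.max_scan)
        lim.max_dirty = lim.max_scan;

    return lim;
}
```

With 1000 shared buffers and the settings quoted above (percent = 50, maxpages = 100), this gives max_scan = 500 and max_dirty = 100.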
--
Best Regards, Simon Riggs
Index: src/backend/storage/buffer/bufmgr.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/bufmgr.c,v
retrieving revision 1.182
diff -d -c -r1.182 bufmgr.c
*** src/backend/storage/buffer/bufmgr.c 24 Nov 2004 02:56:17 -0000 1.182
--- src/backend/storage/buffer/bufmgr.c 1 Jan 2005 21:03:16 -0000
***************
*** 682,717 ****
BufferDesc **dirty_buffers;
BufferTag *buftags;
int num_buffer_dirty;
int i;
/* If either limit is zero then we are disabled from doing anything... */
if (percent == 0 || maxpages == 0)
return 0;
/*
! * Get a list of all currently dirty buffers and how many there are.
* We do not flush buffers that get dirtied after we started. They
! * have to wait until the next checkpoint.
*/
! dirty_buffers = (BufferDesc **) palloc(NBuffers * sizeof(BufferDesc *));
! buftags = (BufferTag *) palloc(NBuffers * sizeof(BufferTag));
LWLockAcquire(BufMgrLock, LW_EXCLUSIVE);
- num_buffer_dirty = StrategyDirtyBufferList(dirty_buffers, buftags,
- NBuffers);
! /*
! * If called by the background writer, we are usually asked to only
! * write out some portion of dirty buffers now, to prevent the IO
! * storm at checkpoint time.
! */
! if (percent > 0)
! {
! Assert(percent <= 100);
! num_buffer_dirty = (num_buffer_dirty * percent + 99) / 100;
! }
! if (maxpages > 0 && num_buffer_dirty > maxpages)
! num_buffer_dirty = maxpages;
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
--- 682,728 ----
BufferDesc **dirty_buffers;
BufferTag *buftags;
int num_buffer_dirty;
+ int max_buffer_dirty = 1;
+ int max_buffer_scan = 1;
int i;
/* If either limit is zero then we are disabled from doing anything... */
if (percent == 0 || maxpages == 0)
return 0;
+ /* Set number of buffers we will scan from LRUs of buffer lists */
+ if (percent > 0 ) {
+ Assert(percent <= 100);
+ max_buffer_scan = (NBuffers * percent + 99) / 100;
+ }
+
+ /* at checkpoint time we scan the whole buffer list */
+ if (percent < 0)
+ max_buffer_scan = NBuffers;
+
+ if (maxpages < 0 || maxpages > NBuffers)
+ max_buffer_dirty = NBuffers;
+ else
+ max_buffer_dirty = maxpages;
+
+ /* we cannot find more dirty buffers than we scan */
+ if (max_buffer_dirty > max_buffer_scan)
+ max_buffer_dirty = max_buffer_scan;
+
/*
! * Get a list of dirty buffers to clean and how many there are.
* We do not flush buffers that get dirtied after we started. They
! * have to wait until the next call of this function
*/
! dirty_buffers =
! (BufferDesc **) palloc(max_buffer_dirty * sizeof(BufferDesc *));
! buftags = (BufferTag *) palloc(max_buffer_dirty * sizeof(BufferTag));
LWLockAcquire(BufMgrLock, LW_EXCLUSIVE);
! num_buffer_dirty = StrategyDirtyBufferList(dirty_buffers, buftags,
! max_buffer_dirty,
! max_buffer_scan);
/* Make sure we can handle the pin inside the loop */
ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
Index: src/backend/storage/buffer/freelist.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/storage/buffer/freelist.c,v
retrieving revision 1.48
diff -d -c -r1.48 freelist.c
*** src/backend/storage/buffer/freelist.c 16 Sep 2004 16:58:31 -0000 1.48
--- src/backend/storage/buffer/freelist.c 1 Jan 2005 21:03:17 -0000
***************
*** 735,756 ****
* StrategyDirtyBufferList
*
* Returns a list of dirty buffers, in priority order for writing.
- * Note that the caller may choose not to write them all.
*
* The caller must beware of the possibility that a buffer is no longer dirty,
* or even contains a different page, by the time he reaches it. If it no
* longer contains the same page it need not be written, even if it is (again)
* dirty.
*
! * Buffer pointers are stored into buffers[], and corresponding tags into
! * buftags[], both of size max_buffers. The function returns the number of
! * buffer IDs stored.
*/
int
StrategyDirtyBufferList(BufferDesc **buffers, BufferTag *buftags,
! int max_buffers)
{
int num_buffer_dirty = 0;
int cdb_id_t1;
int cdb_id_t2;
int buf_id;
--- 735,757 ----
* StrategyDirtyBufferList
*
* Returns a list of dirty buffers, in priority order for writing.
*
* The caller must beware of the possibility that a buffer is no longer dirty,
* or even contains a different page, by the time he reaches it. If it no
* longer contains the same page it need not be written, even if it is (again)
* dirty.
*
! * We scan the buffer lists T1 and T2 for at most max_buffer_scan buffers,
! * recording any dirty buffer pointers in buffers[], and corresponding tags into
! * buftags[], both of size max_buffer_dirty. The function returns the number of
! * dirty buffer IDs stored.
*/
int
StrategyDirtyBufferList(BufferDesc **buffers, BufferTag *buftags,
! int max_buffer_dirty, int max_buffer_scan)
{
int num_buffer_dirty = 0;
+ int num_buffer_scan = 0;
int cdb_id_t1;
int cdb_id_t2;
int buf_id;
***************
*** 779,790 ****
buffers[num_buffer_dirty] = buf;
buftags[num_buffer_dirty] = buf->tag;
num_buffer_dirty++;
! if (num_buffer_dirty >= max_buffers)
break;
}
}
cdb_id_t1 = StrategyCDB[cdb_id_t1].next;
}
if (cdb_id_t2 >= 0)
--- 780,794 ----
buffers[num_buffer_dirty] = buf;
buftags[num_buffer_dirty] = buf->tag;
num_buffer_dirty++;
! if (num_buffer_dirty >= max_buffer_dirty)
break;
}
}
cdb_id_t1 = StrategyCDB[cdb_id_t1].next;
+ num_buffer_scan++;
+ if (num_buffer_scan >= max_buffer_scan)
+ break;
}
if (cdb_id_t2 >= 0)
***************
*** 799,810 ****
buffers[num_buffer_dirty] = buf;
buftags[num_buffer_dirty] = buf->tag;
num_buffer_dirty++;
! if (num_buffer_dirty >= max_buffers)
break;
}
}
cdb_id_t2 = StrategyCDB[cdb_id_t2].next;
}
}
--- 803,817 ----
buffers[num_buffer_dirty] = buf;
buftags[num_buffer_dirty] = buf->tag;
num_buffer_dirty++;
! if (num_buffer_dirty >= max_buffer_dirty)
break;
}
}
cdb_id_t2 = StrategyCDB[cdb_id_t2].next;
+ num_buffer_scan++;
+ if (num_buffer_scan >= max_buffer_scan)
+ break;
}
}
Index: src/include/storage/buf_internals.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/storage/buf_internals.h,v
retrieving revision 1.74
diff -d -c -r1.74 buf_internals.h
*** src/include/storage/buf_internals.h 16 Oct 2004 18:05:07 -0000 1.74
--- src/include/storage/buf_internals.h 1 Jan 2005 21:03:18 -0000
***************
*** 184,190 ****
extern void StrategyInvalidateBuffer(BufferDesc *buf);
extern void StrategyHintVacuum(bool vacuum_active);
extern int StrategyDirtyBufferList(BufferDesc **buffers, BufferTag *buftags,
! int max_buffers);
extern void StrategyInitialize(bool init);
/* buf_table.c */
--- 184,190 ----
extern void StrategyInvalidateBuffer(BufferDesc *buf);
extern void StrategyHintVacuum(bool vacuum_active);
extern int StrategyDirtyBufferList(BufferDesc **buffers, BufferTag *buftags,
! int max_buffer_dirty, int max_buffer_scan);
extern void StrategyInitialize(bool init);
/* buf_table.c */
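
For anyone reading the freelist.c hunks without the tree to hand, the two-limit scan can be sketched standalone as below. This is a simplification under stated assumptions: a flat array of dirty flags stands in for walking the T1 and T2 lists through StrategyCDB, and `scan_dirty` is a made-up name, not a function in the tree.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the patched StrategyDirtyBufferList.
 * Walks is_dirty[] from index 0 (the "LRU end" in this sketch), records
 * the index of each dirty entry in out_ids[], and stops as soon as either
 * max_scan entries have been examined or max_dirty entries were recorded.
 * out_ids[] must have room for max_dirty entries.
 * Returns the number of dirty entries recorded.
 */
int
scan_dirty(const bool *is_dirty, int nbuffers,
           int *out_ids, int max_dirty, int max_scan)
{
    int num_dirty = 0;
    int num_scan = 0;
    int i;

    for (i = 0; i < nbuffers; i++)
    {
        /* stop once either limit has been reached */
        if (num_scan >= max_scan || num_dirty >= max_dirty)
            break;

        if (is_dirty[i])
            out_ids[num_dirty++] = i;

        num_scan++;
    }

    return num_dirty;
}
```

The point of the second limit is that a mostly clean buffer pool no longer forces a walk over every buffer just to find out there is little to write.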