On Mon, 2007-03-12 at 13:21 +0000, Simon Riggs wrote:
> So based on those thoughts, sync_scan_offset should be fixed at 16,
> rather than being variable. In addition, ss_report_loc() should only
> report its position every 16 blocks, rather than do this every time,
> which will reduce overhead of this call.
If we fix sync_scan_offset at 16, we might as well just get rid of it.
Sync scans are only useful on large tables, and getting a free 16 pages
over a scan isn't worth the trouble. However, even without
sync_scan_offset, sync scans are still a valuable feature.
I agree that ss_report_loc() doesn't need to report on every call. If
there's any significant overhead, it should report less
often. Do you think that the overhead is significant on such a simple
> To match that, scan_recycle_buffers should be fixed at 32. So GUCs for
> sync_scan_offset and scan_recycle_buffers would not be required at all.
> IMHO we can also remove sync_scan_threshold and just use NBuffers
> instead. That way we get the benefit of both patches or neither, making
> it easier to understand what's going on.
I like the idea of reducing tuning parameters, but we should, at a
minimum, still allow an on/off button for sync scans. My tests revealed
that the wrong combination of OS/FS/IO-Scheduler/Controller could result
in bad I/O behavior.
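As a rough sketch of what "an on/off button plus NBuffers as the threshold" could look like, assuming a table qualifies only when it is larger than the buffer pool. The function name, the sync_scan_enabled flag and the direction of the comparison are my assumptions for illustration, not something the patch defines:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a sequential scan should take part in sync scanning.
 * sync_scan_enabled is a hypothetical on/off switch; nbuffers plays the
 * role that sync_scan_threshold would otherwise have.
 */
static bool
sketch_use_sync_scan(bool sync_scan_enabled,
                     uint32_t table_blocks,
                     uint32_t nbuffers)
{
    assert(nbuffers > 0);

    if (!sync_scan_enabled)
        return false;       /* the off switch always wins */

    /* Assumption: only tables larger than the buffer pool qualify. */
    return table_blocks > nbuffers;
}
```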
> If need be, the value of scan_recycle_buffers can be varied upwards
> should the scans drift apart, as a way of bringing them back together.
If the scans aren't being brought together, that means that one of the
scans is CPU bound or outside the combined cache trail (shared_buffers
+ OS buffer cache).
> We aren't tracking whether they are together or apart, so I would like
> to see some debug output from synch scans to allow us to assess how far
> behind the second scan is as it progresses. e.g.
> LOG: synch scan currently on block N, trailing pathfinder by M blocks
> issued every 128 blocks as we go through the scans.
It's hard to track where all the scans are currently. One of the
advantages of my patch is its simplicity: the scans don't need to know
about other specific scans, and there is no concept in the code of a
"head" scan or a "pack".
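To illustrate what I mean by simplicity, here is a minimal sketch, not the patch itself. It assumes one shared hint per table, a new scan that starts at the hinted block, and a scan that wraps around to cover the whole table. All names (SharedScanHint, SketchScan, sketch_scan_begin, sketch_scan_next) are hypothetical, and locking is left out entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical shared state: a single position, no list of scans. */
typedef struct SharedScanHint
{
    BlockNumber location;   /* block most recently reported by any scan */
} SharedScanHint;

/* Per-backend scan state (illustrative only). */
typedef struct SketchScan
{
    BlockNumber start;      /* block where this scan began */
    BlockNumber nblocks;    /* table size in blocks */
    BlockNumber done;       /* blocks returned so far */
} SketchScan;

/* Start a scan at whatever position the hint currently holds. */
static void
sketch_scan_begin(SketchScan *scan, const SharedScanHint *hint,
                  BlockNumber nblocks)
{
    assert(nblocks > 0);
    scan->start = hint->location % nblocks;
    scan->nblocks = nblocks;
    scan->done = 0;
}

/*
 * Return the next block, wrapping past the end of the table, and
 * overwrite the hint with it. Returns false once every block was visited.
 */
static bool
sketch_scan_next(SketchScan *scan, SharedScanHint *hint, BlockNumber *blkno)
{
    if (scan->done == scan->nblocks)
        return false;

    *blkno = (scan->start + scan->done) % scan->nblocks;
    scan->done++;
    hint->location = *blkno;    /* no knowledge of who else is scanning */
    return true;
}
```

Nothing here records which scan wrote the hint last, which is why there is no notion of a leader in this sketch and also why it cannot say how far one scan trails another.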
There is no easy way to tell which scan is ahead and which is behind.
There was a discussion when I submitted this proposal at the beginning
of 8.3, but I didn't see enough benefit to justify all of the costs and
risks associated with scans communicating with each other. I
certainly can't implement that kind of thing before feature freeze, and
I think there's a risk of lock contention for the communication
required. I'm also concerned that -- if the scans are too
interdependent -- it would make postgres less robust against the
disappearance of a single backend (e.g. what if the backend that is
leading a scan dies?).