On Sun, 2007-03-04 at 11:54 +0000, Simon Riggs wrote:
> > (2) sync_scan_offset: Start a new scan this many pages before a
> > currently running scan to take advantage of the pages
> > that are likely already in cache.
>
> I'm somewhat dubious about this parameter, I have to say, even though I
> am eager for this feature. It seems like a "magic" parameter that works
> only when we have the right knowledge to set it correctly.
>

That was my concern about this parameter also.

> How will we know what to default it to and how will we know whether to
> set it higher or lower for better performance? Does that value vary
> according to the workload on the system? How?
>

Perhaps people would only set this parameter when they know it will
help, and for more complex (or varied) usage patterns they'd set
sync_scan_offset to 0 to be safe.

My thinking on the subject (and this is only backed up by very basic
tests) is that there are basically two situations where setting this
parameter too high can hurt:
(1) It's too close to the limits of your physical memory, and the scans
    end up diverging when they could have been kept together.
(2) You're using a lot of CPU and the backends aren't processing the
    buffers as fast as your I/O system is delivering them. This will
    prevent the scans from converging.

If your CPUs are well below capacity and you choose a size significantly
less than your effective cache size, I don't think it will hurt.

> I'm worried that we get a feature that works well on simple tests and
> not at all in real world circumstances. I don't want to cast doubt on
> what could be a great patch or be negative: I just see that the feature
> relies on the dynamic behaviour of the system. I'd like to see some
> further studies on how this works to make sure that we can realistically
> know how to set this knob, that it's the correct knob and it is the
> only one we need.

I will do some better tests on some better hardware this week and next
week. I hope that sheds some light.

> Further thoughts: It sounds like sync_scan_offset is related to
> effective_cache_size. Can you comment on whether that might be
> something we can use as well/instead? (i.e. set the scan offset to say K
> * effective_cache_size, 0.1 <= K <= 0.5)???
>
> Might we do roughly the same thing with sync_scan_threshold as well, and
> just have enable_sync_scan instead? i.e. sync_scan_threshold =
> effective_cache_size?
> When would those two parameters not be connected
> directly to each other?
>

Originally, these parameters were in terms of the effective_cache_size.
Somebody else convinced me that it was too confusing to have the
variables dependent on each other, so I made them independent. I don't
have a strong opinion either way.

Regards,
	Jeff Davis
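
For illustration only, the suggestion quoted above (scan offset set to
K * effective_cache_size, with 0.1 <= K <= 0.5) could be sketched roughly
as follows. This is not code from the patch: the function name, the
default K of 0.25, and the assumption that both values are measured in
pages are all hypothetical.

```python
def derived_sync_scan_offset(effective_cache_size_pages: int, k: float = 0.25) -> int:
    """Hypothetical helper: derive a scan offset (in pages) as a fraction K
    of effective_cache_size, instead of exposing it as an independent knob.

    Assumptions (not from the patch): both values are in pages, and K is
    restricted to the 0.1..0.5 range floated in the quoted suggestion.
    """
    if effective_cache_size_pages < 0:
        raise ValueError("effective_cache_size_pages must be non-negative")
    if not 0.1 <= k <= 0.5:
        raise ValueError("K is assumed to lie in [0.1, 0.5]")
    # Truncate to whole pages; an offset of 0 means "no head start".
    return int(k * effective_cache_size_pages)


if __name__ == "__main__":
    # Example: a cache estimate of 16384 pages with K = 0.25.
    print(derived_sync_scan_offset(16384, 0.25))
```

A derivation like this would keep the offset below the cache estimate by
construction, which addresses the "too close to physical memory" case,
but it would do nothing for the CPU-bound case, where the scans may not
converge regardless of the offset.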