On Mon, 2007-03-05 at 21:03 +0000, Heikki Linnakangas wrote:
> Another approach I proposed back in December is to not have a variable
> like that at all, but scan the buffer cache for pages belonging to the
> table you're scanning to initialize the scan. Scanning all the
> BufferDescs is a fairly CPU and lock heavy operation, but it might be ok
> given that we're talking about large I/O bound sequential scans. It
> would require no DBA tuning and would work more robustly in varying
> conditions. I'm not sure where you would continue after scanning the
> in-cache pages. At the highest in-cache block number, perhaps.
>

I assume you're referring to this:

"each backend keeps a bitmap of pages it has processed during the scan,
and read the pages in the order they're available in cache."

which I think is a great idea. However, I was unable to devise a good
answer to all these questions at once:

* How do we attempt to maintain sequential reads on the underlying I/O
  layer?

* My current implementation takes advantage of the OS buffer cache; how
  could we maintain that advantage from PostgreSQL-specific cache logic?

* How do I test to see whether it actually helps in a realistic
  scenario? It seems like it would help the most when scans are
  progressing at different rates, but how often do people have CPU-bound
  queries on tables that don't fit into physical memory (and how long
  would it take for me to benchmark such a query)?

It seems like your idea is more analytical, and my current
implementation is more guesswork. I like the analytical approach, but I
don't know that we have enough information to pull it off because we're
missing what's in the OS buffer cache. The OS buffer cache is crucial to
Synchronized Scanning, because shared buffers are evicted based on a
more complex set of circumstances, whereas the OS buffer cache is
usually LRU and forms a nicer "cache trail" (upon which Synchronized
Scanning is largely based).

If you have some tests you'd like me to run, I'm planning to do some
benchmarks this week and next. I can see if my current patch holds up
under the scenarios you're worried about.

Regards,
	Jeff Davis
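
As a rough illustration of the bitmap idea quoted above (not code from
either patch), the sketch below tracks processed pages in a per-scan
bitmap, visits the pages that are already cached first, and then reads
the rest sequentially. The name scan_order and the cached_pages snapshot
are hypothetical. A real implementation would have to re-check shared
buffers as the scan progresses, and it would still see nothing of the OS
buffer cache.

```python
def scan_order(num_pages, cached_pages):
    """Return the order in which one scan visits pages 0..num_pages-1.

    cached_pages is a hypothetical snapshot of the block numbers that are
    in the buffer cache when the scan starts.
    """
    # One flag per page, standing in for the per-backend bitmap.
    processed = bytearray(num_pages)
    order = []

    # Pass 1: visit pages that are already cached, in ascending block order.
    for page in sorted(set(cached_pages)):
        if 0 <= page < num_pages and not processed[page]:
            processed[page] = 1
            order.append(page)

    # Pass 2: read every page not yet processed, sequentially from disk.
    for page in range(num_pages):
        if not processed[page]:
            processed[page] = 1
            order.append(page)

    return order


if __name__ == "__main__":
    # Pages 2, 5 and 6 are assumed to be cached when the scan starts.
    print(scan_order(8, [5, 2, 6]))
```

The second pass stays in ascending block order, so the uncached part of
the table is still read mostly sequentially, which is the first of the
concerns listed above.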