Greg Stark wrote:
> On Tue, Dec 1, 2009 at 9:57 PM, Richard Huxton <d...@archonet.com> wrote:
>> Why are we writing out the hint bits to disk anyway? Is it really so
>> slow to calculate them on read + cache them that it's worth all this
>> trouble? Are they not also to blame for the "write my import data twice"
>> feature?
>
> It would be interesting to experiment with different strategies. But
> the results would depend a lot on workloads and I doubt one strategy
> is best for everyone.
>
> It has often been suggested that we could set the hint bits but not
> dirty the page, so they would never be written out unless some other
> update hit the page. In most use cases that would probably result in
> the right thing happening where we avoid half the writes but still
> stop doing transaction status lookups relatively promptly. The scary
> thing is that there might be use cases such as static data loaded
> where the hint bits never get set and every scan of the page has to
> recheck those statuses until the tuples are frozen.
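
To put rough numbers on that trade-off, here is a toy model. It is only a sketch under my own assumptions, not how the backend actually behaves: the function name and parameters are made up, the page holds static data that nothing else ever updates, and it is evicted from cache between every scan (the worst case described above).

```python
# Toy model only. simulate() and its parameters are hypothetical names and
# do not correspond to anything in the PostgreSQL source.
def simulate(dirty_on_hint: bool, scans: int, tuples_per_page: int):
    """Return (status_lookups, page_writes) for one static page.

    Assumptions: the page is written once by the initial load, is never
    updated afterwards, and is evicted from cache between every scan.
    """
    lookups = 0
    writes = 1              # the initial data load writes the page once
    hints_on_disk = False   # have the hint bits ever reached disk?

    for _ in range(scans):
        hints_in_memory = hints_on_disk   # page is read back from disk
        dirty = False
        if not hints_in_memory:
            lookups += tuples_per_page    # one status lookup per tuple
            hints_in_memory = True
            dirty = dirty_on_hint         # strategy: dirty the page or not
        # Eviction before the next scan: only a dirty page is written out.
        if dirty:
            writes += 1
            hints_on_disk = True
    return lookups, writes


if __name__ == "__main__":
    print("dirty on hint:   ", simulate(True, scans=10, tuples_per_page=100))
    print("no dirty on hint:", simulate(False, scans=10, tuples_per_page=100))
```

With these made-up inputs, dirtying the page costs 100 lookups and 2 writes, while not dirtying it costs 1000 lookups and 1 write. A page that stays in cache, or that some other update dirties anyway, would fall somewhere in between.
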

And how scary is that? Assuming we cache the hints...
1. With the page itself, so same lifespan
2. Separately, perhaps with a different (longer) lifespan.

Separately would then let you trade complexity for compactness - "all of
block B is deleted", "all of table T is visible".

So what is the cost of calculating the hint-bits for a whole block of
tuples in one go vs reading that block from actual spinning disk?

> There does need to be something like the hint bits which does
> eventually have to be set because we can't keep transaction
> information around forever. Even if you keep the transaction
> information all the way back to the last freeze date (up to about 1GB
> and change I think) then the data has to be written twice, the second
> time is to freeze the transactions. In the worst case then reading a
> page requires a random page access (or two) from anywhere in that 1GB+
> file for each tuple on the page (whether visible to us or not).

While on that topic - I'm assuming freezing requires substantially more
effort than updating hint bits?

--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)