On 7 June 2013 20:16, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-06-07 20:10:55 +0100, Simon Riggs wrote:
>> On 7 June 2013 19:56, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>> > On 07.06.2013 21:33, Simon Riggs wrote:
>> >>
>> >> Now that I consider Greg's line of thought, the idea we focused on
>> >> here was about avoiding freezing. But Greg makes me think that we may
>> >> also wish to look at allowing queries to run longer than one epoch as
>> >> well, if the epoch wrap time is likely to come down substantially.
>> >>
>> >> To do that I think we'll need to hold epoch for relfrozenxid as well,
>> >> amongst other things.
>>
>> > The biggest problem I see with that is that if a snapshot can be older
>> > than 2 billion XIDs, it must be possible to store XIDs on the same page
>> > that are more than 2 billion XIDs apart. All the discussed schemes where
>> > we store the epoch at the page level, either explicitly or derived from
>> > the LSN, rely on the fact that it's not currently necessary to do that.
>> > Currently, when one XID on a page is older than 2 billion XIDs, that old
>> > XID can always be replaced with FrozenXid, because there cannot be a
>> > snapshot old enough to not see it.
>>
>> It does seem that there are two problems: avoiding freezing AND long
>> running queries
>>
>> The long running query problem hasn't ever been looked at, it seems,
>> until here and now.
>
> I'd say that's because it's prohibitive to run so long transactions
> anyway since it causes too much unremovable bloat. 2bio transactions
> really is a quite a bit, I don't think it's a relevant restriction. Yet.
>
> Let's discuss it if we have solved the other problems ;)

Let me say that I think that problem is solvable as well.

At the moment we allow all visible tuple versions to be linked
together, so that the last visible version and the latest update are
linked by a chain. If we break that assumption and say that we will
never follow an update chain from a snapshot in the distant past, then
we can remove the intermediate dead rows. We currently regard those as
recently dead, so changing that just requires some extra thought. We
would still keep all *visible* tuple versions; we just wouldn't bother
to keep all the intermediate ones as well.

Perhaps another day, but one day.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
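
To illustrate the update-chain idea above, here is a toy sketch in Python.
It is not PostgreSQL code. The names (TupleVersion, visible,
keep_by_horizon, keep_by_visibility) are hypothetical, XIDs are plain
integers with no wraparound, and a snapshot is reduced to a single XID
below which every transaction counts as committed. It compares keeping
every version that is not older than the oldest snapshot with keeping only
the versions that some snapshot can actually see.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TupleVersion:
    xmin: int             # xid that created this version
    xmax: Optional[int]   # xid that replaced it, or None if it is the latest


def visible(version: TupleVersion, snapshot: int) -> bool:
    """Toy visibility rule: every xid below `snapshot` counts as committed."""
    created = version.xmin < snapshot
    replaced = version.xmax is not None and version.xmax < snapshot
    return created and not replaced


def keep_by_horizon(chain: List[TupleVersion],
                    snapshots: List[int]) -> List[TupleVersion]:
    """Horizon rule: a dead version goes only if it died before the oldest
    snapshot. Expects at least one snapshot."""
    horizon = min(snapshots)
    return [v for v in chain if v.xmax is None or v.xmax >= horizon]


def keep_by_visibility(chain: List[TupleVersion],
                       snapshots: List[int]) -> List[TupleVersion]:
    """Sketched rule: keep the latest version plus any version that at least
    one snapshot can see; intermediate dead versions are dropped."""
    return [v for v in chain
            if v.xmax is None or any(visible(v, s) for s in snapshots)]


# One row updated three times, with one very old snapshot and one recent one.
chain = [TupleVersion(10, 20), TupleVersion(20, 30),
         TupleVersion(30, 40), TupleVersion(40, None)]
snapshots = [15, 45]

print(len(keep_by_horizon(chain, snapshots)))     # all versions survive
print(len(keep_by_visibility(chain, snapshots)))  # only the visible ones
```

In this toy model the old snapshot pins only the one version it can see,
not every version created after it. The price is the one named above: the
surviving versions are no longer linked, so a snapshot from the distant
past could not follow the chain forward to the latest update.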