Hi, sorry for asking this really old commit.
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=7ab9b2f3b79177e501a1ef90ed004cc68788abaf I could not understand why "releases the lock on the buffer" is a problem since vacuum will lock and check the page bit again before set the vm. Did I missed something? Thanks for your help. On Thu, Apr 19, 2012 at 4:09 AM, Robert Haas <robertmh...@gmail.com> wrote: > I discovered when researching the issue of index-only scans and Hot > Standby that there's a bug (for which I'm responsible) in > lazy_scan_heap[1]. Actually, the code has been like this for a long > time, but I needed to change it when I did the crash-safe visibility > map work, and I didn't. The problem is that lazy_scan_heap() releases > the lock on the buffer it's whacking around before it sets the > visibility map bit. Thus, it's possible for the page-level bit to be > cleared by some other backend before the visibility map bit gets set. > In previous releases that was arguably OK, since the worst thing that > could happen is a postponement of vacuuming on that page until the > next anti-wraparound cycle; but now that we have index-only scans, it > can cause queries to return wrong answers. > > Attached is a patch to fix the problem, which rearranges things so > that we set the bit in the visibility map before releasing the buffer > lock. Similar work has already been done for inserts, updates, and > deletes, which in previous releases would *clear* the visibility map > bit after releasing the page lock, and no longer done. But the vacuum > case, which *sets* the bit, was overlooked. Our convention in those > related cases is that it's acceptable to hold the buffer lock while > setting the visibility map bit and issuing the WAL log record, but > there must be no possibility of an I/O to read in the visibility map > page while the buffer lock is held. This turned out to be pretty > simple because, as it turns out, lazy_scan_heap() is almost always > holding a pin on the correct page anyway, so I only had to tweak > things slightly to guarantee it. As a pleasant side effect, the new > code is actually quite a bit simpler and more readable than the old > code, at least IMHO. > > While I was at it, I made a couple of minor, related changes which I > believe to be improvements. First, I arranged things so that, when we > pause the first vacuum pass to do an index vac cycle, we release any > buffer pin we're holding on the visibility map page. The fact that we > haven't done that in the past is probably harmless, but I think > there's something to be said for not holding onto buffer pins for long > periods of time when it isn't really necessary. Second, I adjusted > things so that we print a warning if the visibility map bit is set and > the page-level bit is clear, since that should never happen in 9.2+. > It is similar to the existing warning which catches the case where a > page is marked all-visible despite containing dead tuples. > > Comments? > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > > http://archives.postgresql.org/pgsql-hackers/2012-04/msg00682.php > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > -- GaoZengqi pgf...@gmail.com zengqi...@gmail.com