Re: [HACKERS] crash-safe visibility map, take three

2011-01-07 Thread Robert Haas
On Fri, Jan 7, 2011 at 1:28 PM, Jim Nasby wrote: > On Jan 5, 2011, at 8:10 PM, Robert Haas wrote: >> On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote: >>> Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit >>> serve? >> >> If we modify a page on which PD_ALL_VISIBLE isn

Re: [HACKERS] crash-safe visibility map, take three

2011-01-07 Thread Jim Nasby
On Jan 5, 2011, at 8:10 PM, Robert Haas wrote: > On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote: >> Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit >> serve? > > If we modify a page on which PD_ALL_VISIBLE isn't set, we don't > attempt to update the visibility map.

Re: [HACKERS] crash-safe visibility map, take three

2011-01-05 Thread Jesper Krogh
On 2011-01-06 03:10, Robert Haas wrote: On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote: Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit serve? If we modify a page on which PD_ALL_VISIBLE isn't set, we don't attempt to update the visibility map. In theory, this

Re: [HACKERS] crash-safe visibility map, take three

2011-01-05 Thread Robert Haas
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh wrote: > Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit > serve? If we modify a page on which PD_ALL_VISIBLE isn't set, we don't attempt to update the visibility map. In theory, this is an important optimization to reduce

Re: [HACKERS] crash-safe visibility map, take three

2011-01-05 Thread Jesper Krogh
On 2010-11-30 05:57, Robert Haas wrote: Last week, I posted a couple of possible designs for making the visibility map crash-safe, which did not elicit much comment. Since this is an important prerequisite to index-only scans, I'm trying again. The logic seems to be: * If the visibility map

Re: [HACKERS] crash-safe visibility map, take three

2010-12-03 Thread Florian Weimer
* Robert Haas: > Those hint bit tests are a single machine instruction. It's tough > to beat that. It's tough to get within two orders of magnitude. > I'd like to, but I don't see how. For some scans, it might be possible to hoist the checks out of inner loops. (At least in principle, I'm not

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Jeff Davis
On Thu, 2010-12-02 at 19:06 -0500, Robert Haas wrote: > I don't think that you can seriously suggest that emitting that volume > of FPIs isn't going to be a problem immediately. We have to have some > solution to that problem out of the gate. Fair enough. I think you understand my point, and it's

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Robert Haas
On Thu, Dec 2, 2010 at 6:37 PM, Jeff Davis wrote: >> It seems to me that a COPY command executed in a transaction with no >> other open snapshots writing to a table created or truncated within >> the same transaction should be able to write frozen tuples from the >> get-go, regardless of anything

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Jeff Davis
On Thu, 2010-12-02 at 17:00 -0500, Robert Haas wrote: > I'm not really convinced that this problem is confined to bulk > loading. Every INSERT or UPDATE results in a new tuple that may need > hint bits set and eventually to be frozen. A bulk load is just a time > when you do lots of inserts all at

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Robert Haas
On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis wrote: > * We don't get an exclusive lock when dirtying a page with hint bits > - Why: we write while reading, and we want good concurrency. > - Why': because after a bulk load, we don't have any hint bits, and the > only way to get them set without VACUU

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Kevin Grittner
Jeff Davis wrote: > And, if we had a bulk loading path, we could probably get away > with writing the data only twice (today, we write it 3 times > including the hint bits) or maybe once if WAL archiving is off. If you're counting WAL writes, you're low. If you don't go out of your way to avo

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Jeff Davis
On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote: > Well, let's think about what we'd need to do to make CRCs work > reliably. There are two problems. > > 1. [...] If we CRC the entire page, the torn pages are never > acceptable, so every action that modifies the page must be WAL-logged. >

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Robert Haas
On Thu, Dec 2, 2010 at 6:37 AM, Dimitri Fontaine wrote: > Robert Haas writes: >> Or maybe I do.  One other thing I've been thinking about with regard >> to hint bit updates is that we might choose to mark that are >> hint-bit-updated as "untidy" rather than "dirty".  The background > > Please rev

Re: [HACKERS] crash-safe visibility map, take three

2010-12-02 Thread Dimitri Fontaine
Robert Haas writes: > Or maybe I do. One other thing I've been thinking about with regard > to hint bit updates is that we might choose to mark that are > hint-bit-updated as "untidy" rather than "dirty". The background Please review archives, you'll find the idea discussed and some patches to

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 6:41 PM, Jim Nasby wrote: > On Dec 1, 2010, at 2:59 PM, Robert Haas wrote: >> 2. Hint bits are necessary because an old XID can't be viewed as >> guaranteed committed. > > Hmm... I thought hint bits were necessary because it's too expensive to query > CLOG for every tuple.

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 5:24 PM, Jeff Davis wrote: > On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote: >> As for CRCs, there's a pretty direct chain of inference here: >> >> 1. CRCs are hard (really impossible) because we have hint bits. > > I would disagree with "impossible". If we don't set h

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Jim Nasby
On Dec 1, 2010, at 2:59 PM, Robert Haas wrote: > 2. Hint bits are necessary because an old XID can't be viewed as > guaranteed committed. Hmm... I thought hint bits were necessary because it's too expensive to query CLOG for every tuple. If my understanding is correct then if we fix the CLOG per

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Jeff Davis
On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote: > As for CRCs, there's a pretty direct chain of inference here: > > 1. CRCs are hard (really impossible) because we have hint bits. I would disagree with "impossible". If we don't set hint bits during reading; and when we do set them, we log t

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Tom Lane
Robert Haas writes: > If we switched from per-tuple MVCC based on XIDs to per-page MVCC > based on LSNs and a rollback segment, all of this stuff would go out > the window. Hint bits, gone. Anti-wraparound VACUUM, gone. CRCs, > feasible. Visibility map... we might still need that, but the > p

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 3:31 PM, Jeff Davis wrote: > On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote: >> 1. Every time we observe a page as all-visible, (a) set the >> PD_ALL_VISIBLE bit on the page, without bumping the LSN; > > ... > >> 2. Every time we observe a page as all-visible, (a) set

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Jeff Davis
On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote: > 1. Every time we observe a page as all-visible, (a) set the > PD_ALL_VISIBLE bit on the page, without bumping the LSN; ... > 2. Every time we observe a page as all-visible, (a) set the > PD_ALL_VISIBLE bit on the page, without bumping the LS

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 12:22 PM, Tom Lane wrote: > Robert Haas writes: >> I think we can improve this a bit further by also introducing a >> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with >> FrozenXID.  This allows us to freeze tuples aggressively - if we want >> - without losi

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 11:40 AM, Heikki Linnakangas wrote: > On 01.12.2010 18:25, Robert Haas wrote: >> >> I think we can improve this a bit further by also introducing a >> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with >> FrozenXID.  This allows us to freeze tuples aggressivel

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Tom Lane
Robert Haas writes: > I think we can improve this a bit further by also introducing a > HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with > FrozenXID. This allows us to freeze tuples aggressively - if we want > - without losing any forensic information. So far so good ... > We c

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Tom Lane
Heikki Linnakangas writes: > On 01.12.2010 18:40, Tom Lane wrote: >> Um, no it isn't. Suppose the heap page gets to disk but we crash before >> the WAL record does. Now we have a persistent state where the heap page >> is marked PD_ALL_VISIBLE but the corresponding VM bit is not set. The >> VM

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 18:40, Tom Lane wrote: Robert Haas writes: As far as I can tell, there are basically two viable solutions on the table here. 1. Every time we observe a page as all-visible, (a) set the PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the bit in the visibility ma

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Tom Lane
Heikki Linnakangas writes: > Hmm, actually, if we're willing to believe PD_ALL_VISIBLE in the page > header over the xmin/xmax on the tuples, we could simply not bother > doing anti-wraparound vacuums for pages that have the flag set. I'm not > sure what changes that would require outside heapa

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Tom Lane
Robert Haas writes: > As far as I can tell, there are basically two viable solutions on the > table here. > 1. Every time we observe a page as all-visible, (a) set the > PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the > bit in the visibility map page, bumping the LSN as usual

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 18:25, Robert Haas wrote: I think we can improve this a bit further by also introducing a HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with FrozenXID. This allows us to freeze tuples aggressively - if we want - without losing any forensic information. We can then m

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 10:36 AM, Bruce Momjian wrote: > Oh, we don't update the LSN when we set the PD_ALL_VISIBLE flag?  OK, > please let me think some more.  Thanks. As far as I can tell, there are basically two viable solutions on the table here. 1. Every time we observe a page as all-visible

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Robert Haas
On Wed, Dec 1, 2010 at 9:57 AM, Kevin Grittner wrote: > Heikki Linnakangas wrote: > >> it would be annoying to have to checkpoint after a data load > > Heck, in my world it's currently pretty much a necessity to run > VACUUM FREEZE ANALYZE on a table after a data load before it's > reasonable to

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Bruce Momjian
Heikki Linnakangas wrote: > On 01.12.2010 15:39, Bruce Momjian wrote: > > Heikki Linnakangas wrote: > >> On 01.12.2010 03:35, Bruce Momjian wrote: > >>> Heikki Linnakangas wrote: > Let's recap what happens when a VM bit is set: You set the > PD_ALL_VISIBLE flag on the heap page (assuming

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Kevin Grittner
Heikki Linnakangas wrote: > it would be annoying to have to checkpoint after a data load Heck, in my world it's currently pretty much a necessity to run VACUUM FREEZE ANALYZE on a table after a data load before it's reasonable to expose the table to production use. It would hardly be an incon

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 15:39, Bruce Momjian wrote: Heikki Linnakangas wrote: On 01.12.2010 03:35, Bruce Momjian wrote: Heikki Linnakangas wrote: Let's recap what happens when a VM bit is set: You set the PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it usually isn't), and then se

Re: [HACKERS] crash-safe visibility map, take three

2010-12-01 Thread Bruce Momjian
Heikki Linnakangas wrote: > On 01.12.2010 03:35, Bruce Momjian wrote: > > Heikki Linnakangas wrote: > >> Let's recap what happens when a VM bit is set: You set the > >> PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it > >> usually isn't), and then set the bit in the VM while

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 01.12.2010 03:35, Bruce Momjian wrote: Heikki Linnakangas wrote: Let's recap what happens when a VM bit is set: You set the PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it usually isn't), and then set the bit in the VM while keeping the heap page locked. What if we s

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Bruce Momjian
Heikki Linnakangas wrote: > On 30.11.2010 18:33, Tom Lane wrote: > > Robert Haas writes: > >> Oh, but it's worse than that. When you XLOG a WAL record for each of > >> those pages, you're going to trigger full-page writes for all of them. > >> So now you've turned 1GB of data to write into 2+ G

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > On 30.11.2010 19:22, Tom Lane wrote: >> But having said that, I wonder whether we need a full-page image for >> a WAL-logged action that is known to involve only setting a single bit >> and updating LSN. > You have to write a full-page image if you update the LSN, bec

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 12:25 PM, Robert Haas wrote: > On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane wrote: >> But having said that, I wonder whether we need a full-page image for >> a WAL-logged action that is known to involve only setting a single bit >> and updating LSN.  Would omitting the FPI b

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane wrote: > But having said that, I wonder whether we need a full-page image for > a WAL-logged action that is known to involve only setting a single bit > and updating LSN.  Would omitting the FPI be any more risky than what > happens now (ie, the page does

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 19:22, Tom Lane wrote: But having said that, I wonder whether we need a full-page image for a WAL-logged action that is known to involve only setting a single bit and updating LSN. Would omitting the FPI be any more risky than what happens now (ie, the page does get written back to

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Robert Haas writes: > On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane wrote: >> It's ridiculous to claim that that "doubles the cost of VACUUM".  In the >> worst case, it will add 25% to the cost of setting an all-visible bit on >> a page where there is no other work to do.  (You already are writing o

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane wrote: > Robert Haas writes: >> We're not going to double the cost of VACUUM to get index-only scans. >> And that's exactly what will happen if you do full-page writes of >> every heap page to set a single bit. > > It's ridiculous to claim that that "dou

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Robert Haas writes: > We're not going to double the cost of VACUUM to get index-only scans. > And that's exactly what will happen if you do full-page writes of > every heap page to set a single bit. It's ridiculous to claim that that "doubles the cost of VACUUM". In the worst case, it will add 2

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:59 AM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote: >>> Ouch.  That seems like it could shoot down all these proposals.  There >>> definitely isn't any way to make VM crash-safe if there is no WAL-driven >>> mechanism for s

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:55 AM, Tom Lane wrote: > Heikki Linnakangas writes: >> Can we get away with not setting the LSN on the heap page, even though >> we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page >> can be flushed to disk before the WAL record, but I think that's fi

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:49 AM, Heikki Linnakangas wrote: > On 30.11.2010 18:33, Tom Lane wrote: >> >> Robert Haas  writes: >>> >>> Oh, but it's worse than that.  When you XLOG a WAL record for each of >>> those pages, you're going to trigger full-page writes for all of them. >>>  So now you've

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Robert Haas writes: > On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote: >> Ouch.  That seems like it could shoot down all these proposals.  There >> definitely isn't any way to make VM crash-safe if there is no WAL-driven >> mechanism for setting the bits. > Heikki's intent method works fine, be

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > Can we get away with not setting the LSN on the heap page, even though > we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page > can be flushed to disk before the WAL record, but I think that's fine > because it's OK to have the flag set in the heap

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane wrote: > Robert Haas writes: >> That's definitely sucky, but in some ways it would be more complicated >> if they did, because I don't think all-visible on the master implies >> all-visible on the standby. > > Ouch.  That seems like it could shoot down a

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 18:33, Tom Lane wrote: Robert Haas writes: Oh, but it's worse than that. When you XLOG a WAL record for each of those pages, you're going to trigger full-page writes for all of them. So now you've turned 1GB of data to write into 2+ GB of data to write. No, because only the f

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 18:40, Tom Lane wrote: Robert Haas writes: That's definitely sucky, but in some ways it would be more complicated if they did, because I don't think all-visible on the master implies all-visible on the standby. Ouch. That seems like it could shoot down all these proposals. The

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:33 AM, Tom Lane wrote: > Robert Haas writes: >> Oh, but it's worse than that.  When you XLOG a WAL record for each of >> those pages, you're going to trigger full-page writes for all of them. >>  So now you've turned 1GB of data to write into 2+ GB of data to >> write.

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Robert Haas writes: > That's definitely sucky, but in some ways it would be more complicated > if they did, because I don't think all-visible on the master implies > all-visible on the standby. Ouch. That seems like it could shoot down all these proposals. There definitely isn't any way to make

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > On 30.11.2010 18:10, Tom Lane wrote: >> I'm not convinced it works at all. Consider write intent record, >> checkpoint, set bit, crash before completing vacuum. There will be >> no second intent record at which you could clean up if things are >> inconsistent. > Tha

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Robert Haas writes: > Oh, but it's worse than that. When you XLOG a WAL record for each of > those pages, you're going to trigger full-page writes for all of them. > So now you've turned 1GB of data to write into 2+ GB of data to > write. No, because only the first mod of each VM page would tri

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 18:10, Tom Lane wrote: Heikki Linnakangas writes: Yeah, I'm not terribly excited about any of these schemes. The "intent" record seems like the simplest one, but even that is different enough from the traditional WAL-logging we do that it makes me slightly nervous. I'm not convin

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 18:22, Robert Haas wrote: On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote: How much is "quite a lot"? Do we have any real reason to think that this solution is unacceptable performance-wise? Well, let's imagine a 1GB insert-only table. It has 128K pages. If you XLOG setting

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:22 AM, Robert Haas wrote: > On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote: >> How much is "quite a lot"?  Do we have any real reason to think that >> this solution is unacceptable performance-wise? > > Well, let's imagine a 1GB insert-only table.  It has 128K pages.

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane wrote: > How much is "quite a lot"?  Do we have any real reason to think that > this solution is unacceptable performance-wise? Well, let's imagine a 1GB insert-only table. It has 128K pages. If you XLOG setting the bit on each page, you'll need to wri
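
The "128K pages" figure in the message above can be checked with a little arithmetic. This is only a rough sketch: it assumes PostgreSQL's default 8 kB block size, and the names BLCKSZ, TABLE_BYTES and heap_pages are used here purely for illustration. The last line computes an upper bound on full-page-image volume under the assumption that every heap page triggers one, which is the scenario behind the "1GB into 2+ GB" concern raised elsewhere in the thread, not a measured result.

```python
# Assumption: the default PostgreSQL block size of 8 kB.
BLCKSZ = 8 * 1024
# The hypothetical 1GB insert-only table from the message.
TABLE_BYTES = 1 * 1024**3


def heap_pages(table_bytes: int, block_size: int = BLCKSZ) -> int:
    """Number of heap pages needed to hold table_bytes, rounded up."""
    return -(-table_bytes // block_size)


pages = heap_pages(TABLE_BYTES)
print(pages)          # 131072
print(pages // 1024)  # 128, i.e. "128K pages"

# Upper bound, assuming one full-page image per heap page:
# the extra WAL volume equals the table size itself.
fpi_bytes = pages * BLCKSZ
print(fpi_bytes == TABLE_BYTES)  # True
```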

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > The trivial solution to this is to WAL-log setting the visibility map > bit, like we WAL-log any other operation. Lock the heap page, lock the > visibility map page, write WAL-record, and release locks. That works, > but the problem is that it creates quite a lot of

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > On 30.11.2010 17:38, Tom Lane wrote: >> Wouldn't it be easier and more robust to just consider VM bit changes to >> be part of the WAL-logged actions? That would include updating LSNs on >> VM pages and flushing VM pages to disk during checkpoint based on their >> LSN

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
Here's one more idea: The trivial solution to this is to WAL-log setting the visibility map bit, like we WAL-log any other operation. Lock the heap page, lock the visibility map page, write WAL-record, and release locks. That works, but the problem is that it creates quite a lot of new WAL tra

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 10:43 AM, Heikki Linnakangas wrote: >> It seems like you'll need to hold some kind of lock between the time >> you examine RedoRecPtr and the time you actually examine the bit. >> WALInsertLock in shared mode, maybe? > > It's enough to hold an exclusive lock on the visibili

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 17:38, Tom Lane wrote: Heikki Linnakangas writes: On 30.11.2010 06:57, Robert Haas wrote: I can't say I'm totally in love with any of these designs. Anyone else have any ideas, or any opinions about which one is best? Well, the design I've been pondering goes like this: Wou

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 10:38 AM, Tom Lane wrote: > Heikki Linnakangas writes: >> On 30.11.2010 06:57, Robert Haas wrote: >>> I can't say I'm totally in love with any of these designs.  Anyone >>> else have any ideas, or any opinions about which one is best? > >> Well, the design I've been ponder

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Heikki Linnakangas
On 30.11.2010 17:32, Robert Haas wrote: On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas wrote: Some care is needed with checkpoints. Setting visibility map bits in step 2 is safe because crash recovery will replay the intent XLOG record and clear any incorrectly set bits. But if a checkpoi

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Tom Lane
Heikki Linnakangas writes: > On 30.11.2010 06:57, Robert Haas wrote: >> I can't say I'm totally in love with any of these designs. Anyone >> else have any ideas, or any opinions about which one is best? > Well, the design I've been pondering goes like this: Wouldn't it be easier and more robust

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Robert Haas
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas wrote: > Some care is needed with checkpoints. Setting visibility map bits in step 2 > is safe because crash recovery will replay the intent XLOG record and clear > any incorrectly set bits. But if a checkpoint has happened after the intent > XLO

Re: [HACKERS] crash-safe visibility map, take three

2010-11-30 Thread Rob Wultsch
On Mon, Nov 29, 2010 at 9:57 PM, Robert Haas wrote: > 1. Pin each visibility map page.  If any VM_BECOMING_ALL_VISIBLE bits > are set, take the exclusive content lock for long enough to clear > them. I wonder what the performance hit will be to workloads with contention and if this feature should

Re: [HACKERS] crash-safe visibility map, take three

2010-11-29 Thread Heikki Linnakangas
On 30.11.2010 06:57, Robert Haas wrote: I can't say I'm totally in love with any of these designs. Anyone else have any ideas, or any opinions about which one is best? Well, the design I've been pondering goes like this: At vacuum: 1. Write an "intent" XLOG record listing a chunk of visibili

[HACKERS] crash-safe visibility map, take three

2010-11-29 Thread Robert Haas
Last week, I posted a couple of possible designs for making the visibility map crash-safe, which did not elicit much comment. Since this is an important prerequisite to index-only scans, I'm trying again. http://archives.postgresql.org/pgsql-hackers/2010-11/msg01474.php http://archives.postgresql