On Jan 5, 2011, at 8:10 PM, Robert Haas wrote:
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh jes...@krogh.cc wrote:
Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
serve?
If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
attempt to update the
On Fri, Jan 7, 2011 at 1:28 PM, Jim Nasby j...@nasby.net wrote:
On Jan 5, 2011, at 8:10 PM, Robert Haas wrote:
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh jes...@krogh.cc wrote:
Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
serve?
If we modify a page on which
On 2010-11-30 05:57, Robert Haas wrote:
Last week, I posted a couple of possible designs for making the
visibility map crash-safe, which did not elicit much comment. Since
this is an important prerequisite to index-only scans, I'm trying
again.
The logic seems to be:
* If the visibility map
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh jes...@krogh.cc wrote:
Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
serve?
If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
attempt to update the visibility map. In theory, this is an important
On 2011-01-06 03:10, Robert Haas wrote:
On Wed, Jan 5, 2011 at 3:22 PM, Jesper Krogh jes...@krogh.cc wrote:
Given a crash-safe visibility map, what purpose does the PD_ALL_VISIBLE bit
serve?
If we modify a page on which PD_ALL_VISIBLE isn't set, we don't
attempt to update the visibility map.
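The fast path described in this reply can be sketched as follows. This is a toy Python model, not PostgreSQL code: the `Page` class, the `modify_page` function, and the set standing in for the visibility map are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy heap page holding only the fields this sketch needs."""
    blkno: int
    all_visible: bool = False          # stands in for PD_ALL_VISIBLE
    tuples: list = field(default_factory=list)


def modify_page(page, new_tuple, vm_bits, vm_touches):
    """Add a tuple; touch the visibility map only if the page flag is set.

    vm_bits is a set of block numbers whose map bit is set (hypothetical).
    vm_touches records each block whose map entry had to be cleared.
    """
    if page.all_visible:
        # Slow path: the page stops being all-visible, so clear both
        # the page-level flag and the corresponding map bit.
        page.all_visible = False
        vm_bits.discard(page.blkno)
        vm_touches.append(page.blkno)
    # Fast path: flag not set, so the map is never consulted.
    page.tuples.append(new_tuple)


vm_bits = {7}
touches = []
hot = Page(blkno=3)                    # flag clear: map is left alone
cold = Page(blkno=7, all_visible=True) # flag set: map bit must be cleared
modify_page(hot, "a", vm_bits, touches)
modify_page(cold, "b", vm_bits, touches)
```

In this model the page-level flag acts as a cheap local check that spares most modifications a lookup in a separate map page.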
* Robert Haas:
Those hint bit tests are a single machine instruction. It's tough
to beat that. It's tough to get within two orders of magnitude.
I'd like to, but I don't see how.
For some scans, it might be possible to hoist the checks out of inner
loops. (At least in principle, I'm not
Robert Haas robertmh...@gmail.com writes:
Or maybe I do. One other thing I've been thinking about with regard
to hint bit updates is that we might choose to mark pages that are
hint-bit-updated as untidy rather than dirty. The background
Please review archives, you'll find the idea discussed and
On Thu, Dec 2, 2010 at 6:37 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
Robert Haas robertmh...@gmail.com writes:
Or maybe I do. One other thing I've been thinking about with regard
to hint bit updates is that we might choose to mark pages that are
hint-bit-updated as untidy rather than
On Wed, 2010-12-01 at 23:22 -0500, Robert Haas wrote:
Well, let's think about what we'd need to do to make CRCs work
reliably. There are two problems.
1. [...] If we CRC the entire page, then torn pages are never
acceptable, so every action that modifies the page must be WAL-logged.
2.
Jeff Davis pg...@j-davis.com wrote:
And, if we had a bulk loading path, we could probably get away
with writing the data only twice (today, we write it 3 times
including the hint bits) or maybe once if WAL archiving is off.
If you're counting WAL writes, you're low. If you don't go out of
On Thu, Dec 2, 2010 at 2:01 PM, Jeff Davis pg...@j-davis.com wrote:
* We don't get an exclusive lock when dirtying a page with hint bits
- Why: we write while reading, and we want good concurrency.
- Why': because after a bulk load, we don't have any hint bits, and the
only way to get them set
On Thu, 2010-12-02 at 17:00 -0500, Robert Haas wrote:
I'm not really convinced that this problem is confined to bulk
loading. Every INSERT or UPDATE results in a new tuple that may need
hint bits set and eventually to be frozen. A bulk load is just a time
when you do lots of inserts all at
On Thu, Dec 2, 2010 at 6:37 PM, Jeff Davis pg...@j-davis.com wrote:
It seems to me that a COPY command executed in a transaction with no
other open snapshots writing to a table created or truncated within
the same transaction should be able to write frozen tuples from the
get-go, regardless of
On Thu, 2010-12-02 at 19:06 -0500, Robert Haas wrote:
I don't think that you can seriously suggest that emitting that volume
of FPIs isn't going to be a problem immediately. We have to have some
solution to that problem out of the gate.
Fair enough. I think you understand my point, and it's
Heikki Linnakangas wrote:
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
usually isn't), and then set the bit in the VM while keeping the
On 01.12.2010 15:39, Bruce Momjian wrote:
Heikki Linnakangas wrote:
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
usually isn't), and then
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
it would be annoying to have to checkpoint after a data load
Heck, in my world it's currently pretty much a necessity to run
VACUUM FREEZE ANALYZE on a table after a data load before it's
reasonable to expose the table to
Heikki Linnakangas wrote:
On 01.12.2010 15:39, Bruce Momjian wrote:
Heikki Linnakangas wrote:
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set
On Wed, Dec 1, 2010 at 9:57 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
it would be annoying to have to checkpoint after a data load
Heck, in my world it's currently pretty much a necessity to run
VACUUM FREEZE ANALYZE
On Wed, Dec 1, 2010 at 10:36 AM, Bruce Momjian br...@momjian.us wrote:
Oh, we don't update the LSN when we set the PD_ALL_VISIBLE flag? OK,
please let me think some more. Thanks.
As far as I can tell, there are basically two viable solutions on the
table here.
1. Every time we observe a page
On 01.12.2010 18:25, Robert Haas wrote:
I think we can improve this a bit further by also introducing a
HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
FrozenXID. This allows us to freeze tuples aggressively - if we want
- without losing any forensic information. We can then
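The difference between overwriting XMIN and setting a flag bit can be illustrated with a small Python sketch. It is a toy model under stated assumptions: `TupleHeader`, the helper functions, and the numeric value chosen for `HEAP_XMIN_FROZEN` are hypothetical; only the value 2 for the frozen XID matches PostgreSQL's `FrozenTransactionId`.

```python
from dataclasses import dataclass

FROZEN_XID = 2                 # FrozenTransactionId in PostgreSQL
HEAP_XMIN_FROZEN = 0x0001      # hypothetical bit value, for illustration only


@dataclass
class TupleHeader:
    xmin: int
    infomask: int = 0


def freeze_by_overwrite(tup):
    """Freeze by replacing xmin; the inserting XID is lost."""
    tup.xmin = FROZEN_XID


def freeze_by_flag(tup):
    """Freeze by setting a flag bit; xmin is left untouched."""
    tup.infomask |= HEAP_XMIN_FROZEN


def xmin_is_frozen(tup):
    """A tuple counts as frozen under either representation."""
    return tup.xmin == FROZEN_XID or bool(tup.infomask & HEAP_XMIN_FROZEN)


a = TupleHeader(xmin=123456)
b = TupleHeader(xmin=123456)
freeze_by_overwrite(a)
freeze_by_flag(b)
```

Both tuples test as frozen afterwards, but only `b` still carries the original inserting XID, which is the forensic information the proposal wants to keep.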
Robert Haas robertmh...@gmail.com writes:
As far as I can tell, there are basically two viable solutions on the
table here.
1. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the
bit in the visibility map page, bumping
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Hmm, actually, if we're willing to believe PD_ALL_VISIBLE in the page
header over the xmin/xmax on the tuples, we could simply not bother
doing anti-wraparound vacuums for pages that have the flag set. I'm not
sure what changes
On 01.12.2010 18:40, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
As far as I can tell, there are basically two viable solutions on the
table here.
1. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN; (b) set the
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 01.12.2010 18:40, Tom Lane wrote:
Um, no it isn't. Suppose the heap page gets to disk but we crash before
the WAL record does. Now we have a persistent state where the heap page
is marked PD_ALL_VISIBLE but the corresponding
Robert Haas robertmh...@gmail.com writes:
I think we can improve this a bit further by also introducing a
HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
FrozenXID. This allows us to freeze tuples aggressively - if we want
- without losing any forensic information.
So far
On Wed, Dec 1, 2010 at 11:40 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
On 01.12.2010 18:25, Robert Haas wrote:
I think we can improve this a bit further by also introducing a
HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
FrozenXID. This allows us
On Wed, Dec 1, 2010 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
I think we can improve this a bit further by also introducing a
HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
FrozenXID. This allows us to freeze tuples
On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote:
1. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN;
...
2. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN,
On Wed, Dec 1, 2010 at 3:31 PM, Jeff Davis pg...@j-davis.com wrote:
On Wed, 2010-12-01 at 11:25 -0500, Robert Haas wrote:
1. Every time we observe a page as all-visible, (a) set the
PD_ALL_VISIBLE bit on the page, without bumping the LSN;
...
2. Every time we observe a page as all-visible,
Robert Haas robertmh...@gmail.com writes:
If we switched from per-tuple MVCC based on XIDs to per-page MVCC
based on LSNs and a rollback segment, all of this stuff would go out
the window. Hint bits, gone. Anti-wraparound VACUUM, gone. CRCs,
feasible. Visibility map... we might still need
On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote:
As for CRCs, there's a pretty direct chain of inference here:
1. CRCs are hard (really impossible) because we have hint bits.
I would disagree with impossible. If we don't set hint bits during
reading; and when we do set them, we log them
On Dec 1, 2010, at 2:59 PM, Robert Haas wrote:
2. Hint bits are necessary because an old XID can't be viewed as
guaranteed committed.
Hmm... I thought hint bits were necessary because it's too expensive to query
CLOG for every tuple. If my understanding is correct then if we fix the CLOG
On Wed, Dec 1, 2010 at 5:24 PM, Jeff Davis pg...@j-davis.com wrote:
On Wed, 2010-12-01 at 15:59 -0500, Robert Haas wrote:
As for CRCs, there's a pretty direct chain of inference here:
1. CRCs are hard (really impossible) because we have hint bits.
I would disagree with impossible. If we
On Wed, Dec 1, 2010 at 6:41 PM, Jim Nasby j...@nasby.net wrote:
On Dec 1, 2010, at 2:59 PM, Robert Haas wrote:
2. Hint bits are necessary because an old XID can't be viewed as
guaranteed committed.
Hmm... I thought hint bits were necessary because it's too expensive to query
CLOG for every
On Mon, Nov 29, 2010 at 9:57 PM, Robert Haas robertmh...@gmail.com wrote:
1. Pin each visibility map page. If any VM_BECOMING_ALL_VISIBLE bits
are set, take the exclusive content lock for long enough to clear
them.
I wonder what the performance hit will be to workloads with contention
and if
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Some care is needed with checkpoints. Setting visibility map bits in step 2
is safe because crash recovery will replay the intent XLOG record and clear
any incorrectly set bits. But if a checkpoint
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is best?
Well, the design I've been pondering goes like this:
On 30.11.2010 17:32, Robert Haas wrote:
On Tue, Nov 30, 2010 at 2:34 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Some care is needed with checkpoints. Setting visibility map bits in step 2
is safe because crash recovery will replay the intent XLOG record and clear
any
On Tue, Nov 30, 2010 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is
On 30.11.2010 17:38, Tom Lane wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is best?
Well, the design I've
On Tue, Nov 30, 2010 at 10:43 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
It seems like you'll need to hold some kind of lock between the time
you examine RedoRecPtr and the time you actually examine the bit.
WALInsertLock in shared mode, maybe?
It's enough to hold an
Here's one more idea:
The trivial solution to this is to WAL-log setting the visibility map
bit, like we WAL-log any other operation. Lock the heap page, lock the
visibility map page, write WAL-record, and release locks. That works,
but the problem is that it creates quite a lot of new WAL
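The lock, log, release sequence named in this snippet can be sketched as a toy Python model. None of this is PostgreSQL code: `Buffer`, `set_all_visible`, the record tuple, and the use of the log length as an LSN are assumptions made purely for illustration.

```python
import threading


class Buffer:
    """Toy shared buffer with a content lock, a page LSN, and some bits."""
    def __init__(self):
        self.lock = threading.Lock()
        self.lsn = 0
        self.bits = set()


def set_all_visible(heap_buf, vm_buf, blkno, wal):
    """Lock heap page, lock VM page, write a WAL record, release locks."""
    with heap_buf.lock:
        with vm_buf.lock:
            # Apply the change to both pages while they are locked.
            heap_buf.bits.add("PD_ALL_VISIBLE")
            vm_buf.bits.add(blkno)
            # Log the change; in this toy the record's LSN is simply
            # its 1-based position in the log.
            wal.append(("SET_ALL_VISIBLE", blkno))
            lsn = len(wal)
            # Stamp both pages with the record's LSN. The toy only
            # records the value; it does not enforce any write ordering.
            heap_buf.lsn = lsn
            vm_buf.lsn = lsn
    return lsn


wal = []
heap = Buffer()
vm = Buffer()
lsn = set_all_visible(heap, vm, 42, wal)
```

Each call appends one record, so marking N pages all-visible appends N records in this model, which is the extra WAL volume the snippet goes on to raise as a concern.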
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 17:38, Tom Lane wrote:
Wouldn't it be easier and more robust to just consider VM bit changes to
be part of the WAL-logged actions? That would include updating LSNs on
VM pages and flushing VM pages to disk during
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
The trivial solution to this is to WAL-log setting the visibility map
bit, like we WAL-log any other operation. Lock the heap page, lock the
visibility map page, write WAL-record, and release locks. That works,
but the problem
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
How much is quite a lot? Do we have any real reason to think that
this solution is unacceptable performance-wise?
Well, let's imagine a 1GB insert-only table. It has 128K pages. If
you XLOG setting the bit on each page,
On Tue, Nov 30, 2010 at 11:22 AM, Robert Haas robertmh...@gmail.com wrote:
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
How much is quite a lot? Do we have any real reason to think that
this solution is unacceptable performance-wise?
Well, let's imagine a 1GB
On 30.11.2010 18:22, Robert Haas wrote:
On Tue, Nov 30, 2010 at 11:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
How much is quite a lot? Do we have any real reason to think that
this solution is unacceptable performance-wise?
Well, let's imagine a 1GB insert-only table. It has 128K pages. If
On 30.11.2010 18:10, Tom Lane wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Yeah, I'm not terribly excited about any of these schemes. The intent
record seems like the simplest one, but even that is so different
from the traditional WAL-logging we do that it makes me
Robert Haas robertmh...@gmail.com writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page writes for all of them.
So now you've turned 1GB of data to write into 2+ GB of data to
write.
No, because only the first mod of each
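The numbers in this exchange can be checked with a few lines of Python. The sketch assumes PostgreSQL's default 8 kB block size and ignores WAL record headers; `fpi_bytes` and its `already_imaged` parameter are hypothetical names. It relies on the fact that PostgreSQL emits a full-page image only for the first modification of a page after a checkpoint.

```python
BLCKSZ = 8192                      # PostgreSQL's default block size, in bytes
TABLE_BYTES = 1 << 30              # the 1GB table from the example

pages = TABLE_BYTES // BLCKSZ      # 131072 pages, i.e. 128K


def fpi_bytes(pages_touched, already_imaged=0):
    """Upper bound on full-page-image volume, ignoring record headers.

    Only pages not yet modified since the last checkpoint need an image.
    """
    return max(pages_touched - already_imaged, 0) * BLCKSZ


# Every page needs an image: the extra WAL equals the table size.
worst_case = fpi_bytes(pages)
# Every page was already modified since the checkpoint: no images at all.
best_case = fpi_bytes(pages, already_imaged=pages)
```

The worst case adds a full 1GB of page images on top of the 1GB of heap writes, while the best case adds none, so the real cost depends on how many pages were already modified since the last checkpoint.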
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 18:10, Tom Lane wrote:
I'm not convinced it works at all. Consider write intent record,
checkpoint, set bit, crash before completing vacuum. There will be
no second intent record at which you could clean up if
Robert Haas robertmh...@gmail.com writes:
That's definitely sucky, but in some ways it would be more complicated
if they did, because I don't think all-visible on the master implies
all-visible on the standby.
Ouch. That seems like it could shoot down all these proposals. There
definitely
On Tue, Nov 30, 2010 at 11:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page writes for all of them.
So now you've turned 1GB of data to write
On 30.11.2010 18:40, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
That's definitely sucky, but in some ways it would be more complicated
if they did, because I don't think all-visible on the master implies
all-visible on the standby.
Ouch. That seems like it could shoot down all
On 30.11.2010 18:33, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page writes for all of them.
So now you've turned 1GB of data to write into 2+ GB of data to
write.
On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
That's definitely sucky, but in some ways it would be more complicated
if they did, because I don't think all-visible on the master implies
all-visible on the standby.
Ouch. That
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Can we get away with not setting the LSN on the heap page, even though
we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page
can be flushed to disk before the WAL record, but I think that's fine
because it's OK
Robert Haas robertmh...@gmail.com writes:
On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Ouch. That seems like it could shoot down all these proposals. There
definitely isn't any way to make VM crash-safe if there is no WAL-driven
mechanism for setting the bits.
On Tue, Nov 30, 2010 at 11:49 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
On 30.11.2010 18:33, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page
On Tue, Nov 30, 2010 at 11:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Can we get away with not setting the LSN on the heap page, even though
we set the PD_ALL_VISIBLE flag? If we don't set the LSN, the heap page
can be flushed to
On Tue, Nov 30, 2010 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
On Tue, Nov 30, 2010 at 11:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
Ouch. That seems like it could shoot down all these proposals. There
definitely isn't any way to make VM
Robert Haas robertmh...@gmail.com writes:
We're not going to double the cost of VACUUM to get index-only scans.
And that's exactly what will happen if you do full-page writes of
every heap page to set a single bit.
It's ridiculous to claim that that doubles the cost of VACUUM. In the
worst
On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Robert Haas robertmh...@gmail.com writes:
We're not going to double the cost of VACUUM to get index-only scans.
And that's exactly what will happen if you do full-page writes of
every heap page to set a single bit.
It's
Robert Haas robertmh...@gmail.com writes:
On Tue, Nov 30, 2010 at 12:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
It's ridiculous to claim that that doubles the cost of VACUUM. In the
worst case, it will add 25% to the cost of setting an all-visible bit on
a page where there is no other work to
On 30.11.2010 19:22, Tom Lane wrote:
But having said that, I wonder whether we need a full-page image for
a WAL-logged action that is known to involve only setting a single bit
and updating LSN. Would omitting the FPI be any more risky than what
happens now (ie, the page does get written back
On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
But having said that, I wonder whether we need a full-page image for
a WAL-logged action that is known to involve only setting a single bit
and updating LSN. Would omitting the FPI be any more risky than what
happens now
On Tue, Nov 30, 2010 at 12:25 PM, Robert Haas robertmh...@gmail.com wrote:
On Tue, Nov 30, 2010 at 12:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
But having said that, I wonder whether we need a full-page image for
a WAL-logged action that is known to involve only setting a single bit
and
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
On 30.11.2010 19:22, Tom Lane wrote:
But having said that, I wonder whether we need a full-page image for
a WAL-logged action that is known to involve only setting a single bit
and updating LSN.
You have to write a full-page
Heikki Linnakangas wrote:
On 30.11.2010 18:33, Tom Lane wrote:
Robert Haas robertmh...@gmail.com writes:
Oh, but it's worse than that. When you XLOG a WAL record for each of
those pages, you're going to trigger full-page writes for all of them.
So now you've turned 1GB of data to write
On 01.12.2010 03:35, Bruce Momjian wrote:
Heikki Linnakangas wrote:
Let's recap what happens when a VM bit is set: You set the
PD_ALL_VISIBLE flag on the heap page (assuming it's not set already, it
usually isn't), and then set the bit in the VM while keeping the heap
page locked.
What if we
Last week, I posted a couple of possible designs for making the
visibility map crash-safe, which did not elicit much comment. Since
this is an important prerequisite to index-only scans, I'm trying
again.
http://archives.postgresql.org/pgsql-hackers/2010-11/msg01474.php
On 30.11.2010 06:57, Robert Haas wrote:
I can't say I'm totally in love with any of these designs. Anyone
else have any ideas, or any opinions about which one is best?
Well, the design I've been pondering goes like this:
At vacuum:
1. Write an intent XLOG record listing a chunk of