All,
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
Given that all of the fixes we recommend involve extensive downtimes,
nobody is going to want to do them just in case. How can they test
Josh Berkus j...@agliodbs.com writes:
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
I think it would work to do a REINDEX on each table (doesn't matter which
index you select) and see if
Josh Berkus wrote:
All,
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
Obviously there are queries you can run to check each FK -- the same
queries that ri_triggers.c would run when you
On 03/19/2014 02:01 PM, Alvaro Herrera wrote:
Josh Berkus wrote:
All,
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
Obviously there are queries you can run to check each FK -- the
Josh Berkus wrote:
On 03/19/2014 02:01 PM, Alvaro Herrera wrote:
Josh Berkus wrote:
All,
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
Some notes:
1. if there's been no crash
Folks:
So another question, which I've already received from the field, is how
can you detect this kind of corruption in the first place, if it's not
causing a user-visible error?
Got that question from someone who failed over between master and
replica on 9.3.2 last weekend. They're not seeing
Greg Stark st...@mit.edu writes:
This is not really accurate:
This error allowed multiple versions of the same row to become
visible to queries, resulting in apparent duplicates. Since the error
is in WAL replay, it would only manifest during crash recovery or on
standby servers.
I think
On 03/17/2014 08:28 AM, Tom Lane wrote:
Greg Stark st...@mit.edu writes:
The error causes some rows to disappear from indexes resulting in
inconsistent query results on a hot standby depending on whether
indexes are used. If the standby is subsequently activated or if it
occurs during
On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
On 03/17/2014 08:28 AM, Tom Lane wrote:
Greg Stark st...@mit.edu writes:
The error causes some rows to disappear from indexes resulting in
inconsistent query results on a hot standby depending on whether
indexes are used. If the standby is
On 2014-03-15 16:02:19 -0400, Tom Lane wrote:
First-draft release notes are committed, and should be visible at
http://www.postgresql.org/docs/devel/static/release-9-3-4.html
once guaibasaurus does its next buildfarm run a few minutes from
now. Any suggestions?
So, the current text is:
This
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
First, see suggested text in my first-draft release announcement.
I don't think that text is any better, it's imo even wrong:
The bug causes rows to vanish from indexes during recovery due to
On 2014-03-17 13:42:59 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
First, see suggested text in my first-draft release announcement.
I don't think that text is any better, it's imo even wrong:
The bug causes rows
On 2014-03-17 11:28:45 -0400, Tom Lane wrote:
Hm ... rows disappearing from indexes might make people think that
they could fix or mitigate the damage via REINDEX.
Good point. I guess in some cases it will end up working because
VACUUM/hot pruning have cleaned up the mess, but that's certainly
Andres Freund and...@2ndquadrant.com writes:
That's much better, yes. Two things:
* I'd change the warning about unique key violations into a more general
one about constraints. Foreign key and exclusion constraint are also
affected...
I'll see what I can do.
* I wonder if we should
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
* I wonder if we should make the possible origins a bit more
general as it's perfectly possible to trigger the problem without
foreign keys. Maybe: can arise when a table row that has been updated
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
IIUC, this case only occurs when using the new-in-9.3 types of
nonexclusive row locks. I'm willing to bet that the number of
applications using those is negligible; so I think it's all right to not
On 2014-03-17 14:16:41 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
IIUC, this case only occurs when using the new-in-9.3 types of
nonexclusive row locks. I'm willing to bet that the number of
applications using those
Andres Freund and...@2ndquadrant.com writes:
To me that looks sufficient to trigger the bug, because we're issuing a
wal record about the row that was passed to heap_lock_update(), not the
latest one in the ctid chain. When replaying that record, it will reset
the t_ctid field, thus breaking
On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
To me that looks sufficient to trigger the bug, because we're issuing a
wal record about the row that was passed to heap_lock_update(), not the
latest one in the ctid chain. When replaying that
On 2014-03-17 14:52:25 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
[ scratches head ... ] If that's what's happening, isn't it a bug in
itself? Surely the WAL record ought to point at the tuple that was
locked.
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
[ scratches head ... ] If that's what's happening, isn't it a bug in
itself? Surely the WAL record ought to point at the tuple that was
locked.
There's a separate XLOG_HEAP2_LOCK_UPDATED record, for
Andres Freund wrote:
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
* I wonder if we should make the possible origins a bit more
general as it's perfectly possible to trigger the problem without
foreign keys. Maybe: can arise when a table
Alvaro Herrera alvhe...@2ndquadrant.com writes:
Uhm. But at the bottom of that block, right above the failed: label
(heapam.c line 4527 in current master), we recheck the tuple for
locked-only-ness; and fail the whole operation by returning
HeapTupleUpdated, if it's not locked-only, no?
On 2014-03-17 16:17:35 -0300, Alvaro Herrera wrote:
Andres Freund wrote:
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
Andres Freund and...@2ndquadrant.com writes:
* I wonder if we should make the possible origins a bit more
general as it's perfectly possible to trigger the problem
On Mon, Mar 17, 2014 at 3:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
I'm thinking we'd better promote that Assert to a normal runtime elog.
I wasn't sure about this but on further thought I think it's a really
good idea and should be mentioned in the release notes. One of the
things that's been
On 2014-03-17 21:09:10 +, Greg Stark wrote:
That said, it would be nice to actually fix the problem, not just
detect it. Eventually vacuum would fix the problem. I think. I'm not
really sure what will happen actually.
Indexes will quite possibly stay corrupted. I think. If there was a
Andres Freund wrote:
On 2014-03-17 21:09:10 +, Greg Stark wrote:
That said, it would be nice to actually fix the problem, not just
detect it. Eventually vacuum would fix the problem. I think. I'm not
really sure what will happen actually.
Indexes will quite possibly stay corrupted. I
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 21:09:10 +, Greg Stark wrote:
That said, it would be nice to actually fix the problem, not just
detect it. Eventually vacuum would fix the problem. I think. I'm not
really sure what will happen actually.
Indexes will quite
On 2014-03-17 20:51:31 -0300, Alvaro Herrera wrote:
I think the best way to really cleanup a table is to use something like:
ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
where text is the previous type of the column. That should trigger a
full table rewrite, without any
On 2014-03-17 19:55:01 -0400, Tom Lane wrote:
I think the best way to really cleanup a table is to use something like:
ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
where text is the previous type of the column. That should trigger a
full table rewrite, without any finesse
Alvaro Herrera alvhe...@2ndquadrant.com writes:
Andres Freund wrote:
I think the best way to really cleanup a table is to use something like:
ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
where text is the previous type of the column. That should trigger a
full table rewrite,
On 2014-03-17 19:58:18 -0400, Tom Lane wrote:
Alvaro Herrera alvhe...@2ndquadrant.com writes:
Andres Freund wrote:
I think the best way to really cleanup a table is to use something like:
ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
where text is the previous type of the
Andres Freund and...@2ndquadrant.com writes:
On 2014-03-17 19:55:01 -0400, Tom Lane wrote:
I think the best way to really cleanup a table is to use something like:
ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
Um... don't we have logic in there that's smart enough to short-circuit
On 03/15/2014 01:02 PM, Tom Lane wrote:
First-draft release notes are committed, and should be visible at
http://www.postgresql.org/docs/devel/static/release-9-3-4.html
once guaibasaurus does its next buildfarm run a few minutes from
now. Any suggestions?
Hmmm, not sure I like this. It's
This is not really accurate:
This error allowed multiple versions of the same row to become
visible to queries, resulting in apparent duplicates. Since the error
is in WAL replay, it would only manifest during crash recovery or on
standby servers.
I think the idea is coming from what the second
On 03/16/2014 12:32 PM, Greg Stark wrote:
I would consider adding something like For the problem to occur a
foreign key from another table must exist and a new row must be added
to that other table around the same time (possibly in the same
transaction) as an update to the referenced row That
First-draft release notes are committed, and should be visible at
http://www.postgresql.org/docs/devel/static/release-9-3-4.html
once guaibasaurus does its next buildfarm run a few minutes from
now. Any suggestions?
regards, tom lane
--
Sent via pgsql-hackers mailing
37 matches
Mail list logo