Re: [HACKERS] First-draft release notes for next week's releases

2014-03-19 Thread Josh Berkus
All, So, I'll ask again (because I didn't see a reply): is there any way users can *check* if they've been corrupted? Short of waiting for PK/FK violations? Given that all of the fixes we recommend involve extensive downtimes, nobody is going to want to do them just in case. How can they test

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-19 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes: So, I'll ask again (because I didn't see a reply): is there any way users can *check* if they've been corrupted? Short of waiting for PK/FK violations? I think it would work to do a REINDEX on each table (doesn't matter which index you select) and see if

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-19 Thread Alvaro Herrera
Josh Berkus wrote: All, So, I'll ask again (because I didn't see a reply): is there any way users can *check* if they've been corrupted? Short of waiting for PK/FK violations? Obviously there are queries you can run to check each FK -- the same queries that ri_triggers.c would run when you

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-19 Thread Josh Berkus
On 03/19/2014 02:01 PM, Alvaro Herrera wrote: Josh Berkus wrote: All, So, I'll ask again (because I didn't see a reply): is there any way users can *check* if they've been corrupted? Short of waiting for PK/FK violations? Obviously there are queries you can run to check each FK -- the

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-19 Thread Alvaro Herrera
Josh Berkus wrote: On 03/19/2014 02:01 PM, Alvaro Herrera wrote: Josh Berkus wrote: All, So, I'll ask again (because I didn't see a reply): is there any way users can *check* if they've been corrupted? Short of waiting for PK/FK violations? Some notes: 1. if there's been no crash

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-18 Thread Josh Berkus
Folks: So another question, which I've already received from the field, is how can you detect this kind of corruption in the first place, if it's not causing a user-visible error? Got that question from someone who failed over between master and replica on 9.3.2 last weekend. They're not seeing

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Greg Stark st...@mit.edu writes: This is not really accurate: This error allowed multiple versions of the same row to become visible to queries, resulting in apparent duplicates. Since the error is in WAL replay, it would only manifest during crash recovery or on standby servers. I think

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Josh Berkus
On 03/17/2014 08:28 AM, Tom Lane wrote: Greg Stark st...@mit.edu writes: The error causes some rows to disappear from indexes resulting in inconsistent query results on a hot standby depending on whether indexes are used. If the standby is subsequently activated or if it occurs during

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 10:03:52 -0700, Josh Berkus wrote: On 03/17/2014 08:28 AM, Tom Lane wrote: Greg Stark st...@mit.edu writes: The error causes some rows to disappear from indexes resulting in inconsistent query results on a hot standby depending on whether indexes are used. If the standby is

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-15 16:02:19 -0400, Tom Lane wrote: First-draft release notes are committed, and should be visible at http://www.postgresql.org/docs/devel/static/release-9-3-4.html once guaibasaurus does its next buildfarm run a few minutes from now. Any suggestions? So, the current text is: This

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 10:03:52 -0700, Josh Berkus wrote: First, see suggested text in my first-draft release announcement. I don't think that text is any better, it's imo even wrong: The bug causes rows to vanish from indexes during recovery due to

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 13:42:59 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 10:03:52 -0700, Josh Berkus wrote: First, see suggested text in my first-draft release announcement. I don't think that text is any better, it's imo even wrong: The bug causes rows

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 11:28:45 -0400, Tom Lane wrote: Hm ... rows disappearing from indexes might make people think that they could fix or mitigate the damage via REINDEX. Good point. I guess in some cases it will end up working because VACUUM/hot pruning have cleaned up the mess, but that's certainly

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: That's much better, yes. Two things: * I'd change the warning about unique key violations into a more general one about constraints. Foreign key and exclusion constraint are also affected... I'll see what I can do. * I wonder if we should

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 14:01:03 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: * I wonder if we should make the possible origins a bit more general as it's perfectly possible to trigger the problem without foreign keys. Maybe: can arise when a table row that has been updated

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 14:01:03 -0400, Tom Lane wrote: IIUC, this case only occurs when using the new-in-9.3 types of nonexclusive row locks. I'm willing to bet that the number of applications using those is negligible; so I think it's all right to not

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 14:16:41 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 14:01:03 -0400, Tom Lane wrote: IIUC, this case only occurs when using the new-in-9.3 types of nonexclusive row locks. I'm willing to bet that the number of applications using those

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: To me that looks sufficient to trigger the bug, because we're issuing a wal record about the row that was passed to heap_lock_update(), not the latest one in the ctid chain. When replaying that record, it will reset the t_ctid field, thus breaking

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 14:29:56 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: To me that looks sufficient to trigger the bug, because we're issuing a wal record about the row that was passed to heap_lock_update(), not the latest one in the ctid chain. When replaying that

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 14:52:25 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 14:29:56 -0400, Tom Lane wrote: [ scratches head ... ] If that's what's happening, isn't it a bug in itself? Surely the WAL record ought to point at the tuple that was locked.

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 14:29:56 -0400, Tom Lane wrote: [ scratches head ... ] If that's what's happening, isn't it a bug in itself? Surely the WAL record ought to point at the tuple that was locked. There's a separate XLOG_HEAP2_LOCK_UPDATED record, for

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-17 14:01:03 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: * I wonder if we should make the possible origins a bit more general as it's perfectly possible to trigger the problem without foreign keys. Maybe: can arise when a table

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes: Uhm. But at the bottom of that block, right above the failed: label (heapam.c line 4527 in current master), we recheck the tuple for locked-only-ness; and fail the whole operation by returning HeapTupleUpdated, if it's not locked-only, no?

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 16:17:35 -0300, Alvaro Herrera wrote: Andres Freund wrote: On 2014-03-17 14:01:03 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: * I wonder if we should make the possible origins a bit more general as it's perfectly possible to trigger the problem

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Greg Stark
On Mon, Mar 17, 2014 at 3:28 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm thinking we'd better promote that Assert to a normal runtime elog. I wasn't sure about this but on further thought I think it's a really good idea and should be mentioned in the release notes. One of the things that's been

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 21:09:10 +, Greg Stark wrote: That said, it would be nice to actually fix the problem, not just detect it. Eventually vacuum would fix the problem. I think. I'm not really sure what will happen actually. Indexes will quite possibly stay corrupted. I think. If there was a

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Alvaro Herrera
Andres Freund wrote: On 2014-03-17 21:09:10 +, Greg Stark wrote: That said, it would be nice to actually fix the problem, not just detect it. Eventually vacuum would fix the problem. I think. I'm not really sure what will happen actually. Indexes will quite possibly stay corrupted. I

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 21:09:10 +, Greg Stark wrote: That said, it would be nice to actually fix the problem, not just detect it. Eventually vacuum would fix the problem. I think. I'm not really sure what will happen actually. Indexes will quite

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 20:51:31 -0300, Alvaro Herrera wrote: I think the best way to really cleanup a table is to use something like: ALTER TABLE rew ALTER COLUMN data TYPE text USING (data); where text is the previous type of the column. That should trigger a full table rewrite, without any

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 19:55:01 -0400, Tom Lane wrote: I think the best way to really cleanup a table is to use something like: ALTER TABLE rew ALTER COLUMN data TYPE text USING (data); where text is the previous type of the column. That should trigger a full table rewrite, without any finesse

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes: Andres Freund wrote: I think the best way to really cleanup a table is to use something like: ALTER TABLE rew ALTER COLUMN data TYPE text USING (data); where text is the previous type of the column. That should trigger a full table rewrite,

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Andres Freund
On 2014-03-17 19:58:18 -0400, Tom Lane wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: Andres Freund wrote: I think the best way to really cleanup a table is to use something like: ALTER TABLE rew ALTER COLUMN data TYPE text USING (data); where text is the previous type of the

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes: On 2014-03-17 19:55:01 -0400, Tom Lane wrote: I think the best way to really cleanup a table is to use something like: ALTER TABLE rew ALTER COLUMN data TYPE text USING (data); Um... don't we have logic in there that's smart enough to short-circuit

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-16 Thread Josh Berkus
On 03/15/2014 01:02 PM, Tom Lane wrote: First-draft release notes are committed, and should be visible at http://www.postgresql.org/docs/devel/static/release-9-3-4.html once guaibasaurus does its next buildfarm run a few minutes from now. Any suggestions? Hmmm, not sure I like this. It's

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-16 Thread Greg Stark
This is not really accurate: This error allowed multiple versions of the same row to become visible to queries, resulting in apparent duplicates. Since the error is in WAL replay, it would only manifest during crash recovery or on standby servers. I think the idea is coming from what the second

Re: [HACKERS] First-draft release notes for next week's releases

2014-03-16 Thread Josh Berkus
On 03/16/2014 12:32 PM, Greg Stark wrote: I would consider adding something like For the problem to occur a foreign key from another table must exist and a new row must be added to that other table around the same time (possibly in the same transaction) as an update to the referenced row That