Josh Berkus wrote:
> On 03/19/2014 02:01 PM, Alvaro Herrera wrote:
> > Josh Berkus wrote:
> >> All,
> >>
> >> So, I'll ask again (because I didn't see a reply): is there any way
> >> users can *check* if they've been corrupted? Short of waiting for PK/FK
> >> violations?
Some notes:
1. if there'
On 03/19/2014 02:01 PM, Alvaro Herrera wrote:
> Josh Berkus wrote:
>> All,
>>
>> So, I'll ask again (because I didn't see a reply): is there any way
>> users can *check* if they've been corrupted? Short of waiting for PK/FK
>> violations?
>
> Obviously there are queries you can run to check each
Josh Berkus wrote:
> All,
>
> So, I'll ask again (because I didn't see a reply): is there any way
> users can *check* if they've been corrupted? Short of waiting for PK/FK
> violations?
Obviously there are queries you can run to check each FK -- the same
queries that ri_triggers.c would run when
Josh Berkus writes:
> So, I'll ask again (because I didn't see a reply): is there any way
> users can *check* if they've been corrupted? Short of waiting for PK/FK
> violations?
I think it would work to do a REINDEX on each table (doesn't matter which
index you select) and see if it complains ab
All,
So, I'll ask again (because I didn't see a reply): is there any way
users can *check* if they've been corrupted? Short of waiting for PK/FK
violations?
Given that all of the fixes we recommend involve extensive downtimes,
nobody is going to want to do them "just in case". How can they test
Folks:
So another question, which I've already received from the field, is how
can you detect this kind of corruption in the first place, if it's not
causing a user-visible error?
Got that question from someone who failed over between master and
replica on 9.3.2 last weekend. They're not seeing
Andres Freund writes:
> On 2014-03-17 19:55:01 -0400, Tom Lane wrote:
>>> I think the best way to really cleanup a table is to use something like:
>>> ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
>> Um... don't we have logic in there that's smart enough to short-circuit
>> that?
> N
On 2014-03-17 19:58:18 -0400, Tom Lane wrote:
> Alvaro Herrera writes:
> > Andres Freund wrote:
> >> I think the best way to really cleanup a table is to use something like:
> >> ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
> >> where text is the previous type of the column. That shou
Alvaro Herrera writes:
> Andres Freund wrote:
>> I think the best way to really cleanup a table is to use something like:
>> ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
>> where text is the previous type of the column. That should trigger a
>> full table rewrite, without any finesse
On 2014-03-17 19:55:01 -0400, Tom Lane wrote:
> > I think the best way to really cleanup a table is to use something like:
> > ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
> > where text is the previous type of the column. That should trigger a
> > full table rewrite, without any fines
On 2014-03-17 20:51:31 -0300, Alvaro Herrera wrote:
> > I think the best way to really cleanup a table is to use something like:
> > ALTER TABLE rew ALTER COLUMN data TYPE text USING (data);
> > where text is the previous type of the column. That should trigger a
> > full table rewrite, without any
Andres Freund writes:
> On 2014-03-17 21:09:10 +, Greg Stark wrote:
>> That said, it would be nice to actually fix the problem, not just
>> detect it. Eventually vacuum would fix the problem. I think. I'm not
>> really sure what will happen actually.
> Indexes will quite possibly stay corrupt
Andres Freund wrote:
> On 2014-03-17 21:09:10 +, Greg Stark wrote:
> > That said, it would be nice to actually fix the problem, not just
> > detect it. Eventually vacuum would fix the problem. I think. I'm not
> > really sure what will happen actually.
>
> Indexes will quite possibly stay corr
On 2014-03-17 21:09:10 +, Greg Stark wrote:
> That said, it would be nice to actually fix the problem, not just
> detect it. Eventually vacuum would fix the problem. I think. I'm not
> really sure what will happen actually.
Indexes will quite possibly stay corrupted. I think. If there was a
in
On Mon, Mar 17, 2014 at 3:28 PM, Tom Lane wrote:
> I'm thinking we'd better promote that Assert to a normal runtime elog.
I wasn't sure about this but on further thought I think it's a really
good idea and should be mentioned in the release notes. One of the
things that's been bothering me about
On 2014-03-17 16:17:35 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
> > > Andres Freund writes:
> > > > * I wonder if we should make the possible origins a bit more
> > > > general as it's perfectly possible to trigger the problem without
Alvaro Herrera writes:
> Uhm. But at the bottom of that block, right above the "failed:" label
> (heapam.c line 4527 in current master), we recheck the tuple for
> "locked-only-ness"; and fail the whole operation by returning
> HeapTupleUpdated, if it's not locked-only, no? Which would cause
> E
Andres Freund wrote:
> On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
> > Andres Freund writes:
> > > * I wonder if we should make the possible origins a bit more
> > > general as it's perfectly possible to trigger the problem without
> > > foreign keys. Maybe: "can arise when a table row that
Andres Freund writes:
> On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
>> [ scratches head ... ] If that's what's happening, isn't it a bug in
>> itself? Surely the WAL record ought to point at the tuple that was
>> locked.
> There's a separate XLOG_HEAP2_LOCK_UPDATED record, for every later tup
On 2014-03-17 14:52:25 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
> >> [ scratches head ... ] If that's what's happening, isn't it a bug in
> >> itself? Surely the WAL record ought to point at the tuple that was
> >> locked.
>
> > There's a
On 2014-03-17 14:29:56 -0400, Tom Lane wrote:
> Andres Freund writes:
> > To me that looks sufficient to trigger the bug, because we're issuing a
> > wal record about the row that was passed to heap_lock_update(), not the
> > latest one in the ctid chain. When replaying that record, it will reset
Andres Freund writes:
> To me that looks sufficient to trigger the bug, because we're issuing a
> wal record about the row that was passed to heap_lock_update(), not the
> latest one in the ctid chain. When replaying that record, it will reset
> the t_ctid field, thus breaking the chain.
[ scratc
On 2014-03-17 14:16:41 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
> >> IIUC, this case only occurs when using the new-in-9.3 types of
> >> nonexclusive row locks. I'm willing to bet that the number of
> >> applications using those is negligibl
Andres Freund writes:
> On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
>> IIUC, this case only occurs when using the new-in-9.3 types of
>> nonexclusive row locks. I'm willing to bet that the number of
>> applications using those is negligible; so I think it's all right to not
>> mention that case
On 2014-03-17 14:01:03 -0400, Tom Lane wrote:
> Andres Freund writes:
> > * I wonder if we should make the possible origins a bit more
> > general as it's perfectly possible to trigger the problem without
> > foreign keys. Maybe: "can arise when a table row that has been updated
> > is row l
Andres Freund writes:
> That's much better, yes. Two things:
> * I'd change the warning about unique key violations into a more general
> one about constraints. Foreign key and exclusion constraint are also
> affected...
I'll see what I can do.
> * I wonder if we should make the possible or
On 2014-03-17 11:28:45 -0400, Tom Lane wrote:
> Hm ... "rows disappearing from indexes" might make people think that
> they could fix or mitigate the damage via REINDEX.
Good point. I guess in some cases it will end up working because
VACUUM/hot pruning have cleaned up the mess, but that's certain
On 2014-03-17 13:42:59 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
> >> First, see suggested text in my first-draft release announcement.
>
> > I don't think that text is any better, it's imo even wrong:
> > "The bug causes rows to vanish fr
Andres Freund writes:
> On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
>> First, see suggested text in my first-draft release announcement.
> I don't think that text is any better, it's imo even wrong:
> "The bug causes rows to vanish from indexes during recovery due to
> simultaneous updates o
On 2014-03-15 16:02:19 -0400, Tom Lane wrote:
> First-draft release notes are committed, and should be visible at
> http://www.postgresql.org/docs/devel/static/release-9-3-4.html
> once guaibasaurus does its next buildfarm run a few minutes from
> now. Any suggestions?
So, the current text is:
"T
On 2014-03-17 10:03:52 -0700, Josh Berkus wrote:
> On 03/17/2014 08:28 AM, Tom Lane wrote:
> > Greg Stark writes:
> >> The error causes some rows to disappear from indexes resulting in
> >> inconsistent query results on a hot standby depending on whether
> >> indexes are used. If the standby is su
On 03/17/2014 08:28 AM, Tom Lane wrote:
> Greg Stark writes:
>> The error causes some rows to disappear from indexes resulting in
>> inconsistent query results on a hot standby depending on whether
>> indexes are used. If the standby is subsequently activated or if it
>> occurs during recovery aft
Greg Stark writes:
> This is not really accurate:
> "This error allowed multiple versions of the same row to become
> visible to queries, resulting in apparent duplicates. Since the error
> is in WAL replay, it would only manifest during crash recovery or on
> standby servers."
> I think the idea
On 03/16/2014 12:32 PM, Greg Stark wrote:
> I would consider adding something like "For the problem to occur a
> foreign key from another table must exist and a new row must be added
> to that other table around the same time (possibly in the same
> transaction) as an update to the referenced row"
This is not really accurate:
"This error allowed multiple versions of the same row to become
visible to queries, resulting in apparent duplicates. Since the error
is in WAL replay, it would only manifest during crash recovery or on
standby servers."
I think the idea is coming from what the second
On 03/15/2014 01:02 PM, Tom Lane wrote:
> First-draft release notes are committed, and should be visible at
> http://www.postgresql.org/docs/devel/static/release-9-3-4.html
> once guaibasaurus does its next buildfarm run a few minutes from
> now. Any suggestions?
Hmmm, not sure I like this. It's
36 matches
Mail list logo