Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-15 Thread Robert Haas
On Tue, Sep 15, 2020 at 2:04 PM Andres Freund wrote: > > How is it possible? Because tuple which has a committed xmax and the > > xmax is older than the oldestXmin, should not come for freezing unless > > it is lock_only xid (because those tuples are already gone). > > There've been several

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-15 Thread Andres Freund
Hi, On 2020-09-15 12:52:25 +0530, Dilip Kumar wrote: > On Tue, Sep 15, 2020 at 11:14 AM Andres Freund wrote: > > > > On 2020-09-15 10:54:29 +0530, Dilip Kumar wrote: > > > What problem do you see if we set xmax to the InvalidTransactionId and > > > HEAP_XMAX_INVALID flag in the infomask ? > > >

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-15 Thread Dilip Kumar
On Tue, Sep 15, 2020 at 11:14 AM Andres Freund wrote: > > On 2020-09-15 10:54:29 +0530, Dilip Kumar wrote: > > What problem do you see if we set xmax to the InvalidTransactionId and > > HEAP_XMAX_INVALID flag in the infomask ? > > 1) It'll make a dead tuple appear live. You cannot do this for

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Andres Freund
On 2020-09-15 10:54:29 +0530, Dilip Kumar wrote: > What problem do you see if we set xmax to the InvalidTransactionId and > HEAP_XMAX_INVALID flag in the infomask ? 1) It'll make a dead tuple appear live. You cannot do this for tuples with an xid below the horizon. 2) it'll break HOT chain

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Dilip Kumar
On Tue, Sep 15, 2020 at 2:35 AM Andres Freund wrote: > > Hi, > > On 2020-09-14 17:00:48 -0400, Robert Haas wrote: > > On Mon, Sep 14, 2020 at 4:13 PM Andres Freund wrote: > > > My understanding of the case we're discussing is that it's corruption > > > (e.g. relfrozenxid being different than

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Andres Freund
Hi, On 2020-09-14 17:00:48 -0400, Robert Haas wrote: > On Mon, Sep 14, 2020 at 4:13 PM Andres Freund wrote: > > My understanding of the case we're discussing is that it's corruption > > (e.g. relfrozenxid being different than table contents) affecting a HOT > > chain. I.e. by definition all

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Robert Haas
On Mon, Sep 14, 2020 at 4:13 PM Andres Freund wrote: > My understanding of the case we're discussing is that it's corruption > (e.g. relfrozenxid being different than table contents) affecting a HOT > chain. I.e. by definition all within a single page. We won't have > modified part of it

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Andres Freund
Hi, On 2020-09-14 15:50:49 -0400, Robert Haas wrote: > On Mon, Sep 14, 2020 at 3:00 PM Alvaro Herrera > wrote: > > FWIW I agree with Andres' stance on this. The current system is *very* > > complicated and bugs are obscure already. If we hide them, what we'll > > be getting is a system where

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Robert Haas
On Mon, Sep 14, 2020 at 3:00 PM Alvaro Herrera wrote: > FWIW I agree with Andres' stance on this. The current system is *very* > complicated and bugs are obscure already. If we hide them, what we'll > be getting is a system where data can become corrupted for no apparent > reason. I think I

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Alvaro Herrera
On 2020-Sep-14, Andres Freund wrote: > It seems pretty dangerous to me. What exactly are you going to put into > xmin/xmax here? And how would anything you put into the first tuple not > break index lookups? There's no such thing as a frozen xmax (so far), so > what are you going to put in there?

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Andres Freund
Hi, On 2020-09-14 13:26:27 -0400, Robert Haas wrote: > On Sat, Aug 29, 2020 at 4:36 AM Dilip Kumar wrote: > > One example is, suppose during vacuum, there are 2 tuples in the hot > > chain, and the xmin of the first tuple is corrupted (i.e. smaller > > than relfrozenxid). And the xmax of this

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-09-14 Thread Robert Haas
On Sat, Aug 29, 2020 at 4:36 AM Dilip Kumar wrote: > One example is, suppose during vacuum, there are 2 tuples in the hot > chain, and the xmin of the first tuple is corrupted (i.e. smaller > than relfrozenxid). And the xmax of this tuple (which is same as the > xmin of the second tuple) is

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-29 Thread Dilip Kumar
On Sat, Aug 29, 2020 at 1:46 AM Robert Haas wrote: > > On Fri, Aug 28, 2020 at 1:29 PM Andres Freund wrote: > > It can break HOT chains, plain ctid chains etc, for example. Which, if > > earlier / follower tuples are removed can't be detected anymore at a > > later time. > > I think I need a

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-29 Thread Dilip Kumar
On Fri, Aug 28, 2020 at 9:49 PM Robert Haas wrote: > > On Tue, Jul 21, 2020 at 9:21 AM Dilip Kumar wrote: > > In the previous version, the feature was enabled for cluster/vacuum > > full command as well. In the attached patch I have enabled it only > > if we are running vacuum command. It

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-28 Thread Robert Haas
On Fri, Aug 28, 2020 at 1:29 PM Andres Freund wrote: > It can break HOT chains, plain ctid chains etc, for example. Which, if > earlier / follower tuples are removed can't be detected anymore at a > later time. I think I need a more specific example here to understand the problem. If the xmax of

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-28 Thread Andres Freund
Hi, On 2020-08-28 12:37:17 -0400, Robert Haas wrote: > On Mon, Jul 20, 2020 at 4:30 PM Andres Freund wrote: > > If we really were to do something like this the option would need to be > > called vacuum_allow_making_corruption_worse or such. It needs to be > > *exceedingly* clear that it will

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-28 Thread Robert Haas
On Mon, Jul 20, 2020 at 4:30 PM Andres Freund wrote: > If we really were to do something like this the option would need to be > called vacuum_allow_making_corruption_worse or such. It needs to be > *exceedingly* clear that it will likely lead to making everything much > worse. I don't really

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-28 Thread Robert Haas
On Tue, Jul 21, 2020 at 9:21 AM Dilip Kumar wrote: > In the previous version, the feature was enabled for cluster/vacuum > full command as well. In the attached patch I have enabled it only > if we are running vacuum command. It will not be enabled during a > table rewrite. If we think that

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-08-03 Thread Robert Haas
On Mon, Jul 20, 2020 at 4:30 PM Andres Freund wrote: > I'm extremely doubtful this is a good idea. In all likelihood this will > just exacerbate corruption. > > You cannot just stop freezing tuples, that'll lead to relfrozenxid > getting *further* out of sync with the actual table contents. And

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-21 Thread Dilip Kumar
On Tue, Jul 21, 2020 at 4:08 PM Dilip Kumar wrote: > > On Tue, Jul 21, 2020 at 11:00 AM Dilip Kumar wrote: > > > > On Tue, Jul 21, 2020 at 2:00 AM Andres Freund wrote: > > > > > > Hi, > > > > > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > > > > The attached patch allows the vacuum to

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-21 Thread Dilip Kumar
On Tue, Jul 21, 2020 at 11:00 AM Dilip Kumar wrote: > > On Tue, Jul 21, 2020 at 2:00 AM Andres Freund wrote: > > > > Hi, > > > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > > > The attached patch allows the vacuum to continue by emitting WARNING > > > for the corrupted tuple instead of

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Dilip Kumar
On Tue, Jul 21, 2020 at 2:00 AM Andres Freund wrote: > > Hi, > > On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > > The attached patch allows the vacuum to continue by emitting WARNING > > for the corrupted tuple instead of immediately erroring out as discussed > > at [1]. > > > > Basically, it

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Dilip Kumar
On Mon, Jul 20, 2020 at 10:14 PM Alvaro Herrera wrote: > > On 2020-Jul-20, Dilip Kumar wrote: > > > On Fri, Jul 17, 2020 at 4:16 PM Dilip Kumar wrote: > > > > So if the vacuum_tolerate_damage is set then in > > > all the cases in heap_prepare_freeze_tuple where the corrupted xid is > > >

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Andrey M. Borodin
> On 21 July 2020, at 00:36, Alvaro Herrera > wrote: > > >> FWIW we coped with this by actively monitoring this kind of corruption >> with this amcheck patch [0]. One can observe these lost page updates >> cheaply in indexes and act on first sight of corruption: identify >> source of the

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Andres Freund
Hi, On 2020-07-17 16:16:23 +0530, Dilip Kumar wrote: > The attached patch allows the vacuum to continue by emitting WARNING > for the corrupted tuple instead of immediately erroring out as discussed > at [1]. > > Basically, it provides a new GUC called vacuum_tolerate_damage, to > control whether

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Alvaro Herrera
On 2020-Jul-20, Andrey M. Borodin wrote: > I think the point here is to actually move relfrozenxid back. But the > mince can't be turned back. If CLOG is rotated - the table is > corrupted beyond easy repair. Oh, I see. Hmm. Well, if you discover relfrozenxid that's newer and the pg_clog files

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Andrey M. Borodin
> On 20 July 2020, at 21:44, Alvaro Herrera > wrote: > >> I think we shall do that in some cases >> but IMHO it's not a very good idea in all the cases. Basically, if >> the xmin precedes the relfrozenxid then probably we should allow to >> update the relfrozenxid whereas if the xmin

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Alvaro Herrera
On 2020-Jul-20, Dilip Kumar wrote: > On Fri, Jul 17, 2020 at 4:16 PM Dilip Kumar wrote: > > So if the vacuum_tolerate_damage is set then in > > all the cases in heap_prepare_freeze_tuple where the corrupted xid is > > detected, it will emit a warning and return that nothing is changed in > >

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Dilip Kumar
On Sun, Jul 19, 2020 at 4:56 PM Andrey M. Borodin wrote: > > Hi Dilip! > > > > On 17 July 2020, at 15:46, Dilip Kumar wrote: > > > > The attached patch allows the vacuum to continue by emitting WARNING > > for the corrupted tuple instead of immediately erroring out as discussed > > at [1]. > >

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-20 Thread Dilip Kumar
On Fri, Jul 17, 2020 at 4:16 PM Dilip Kumar wrote: > > The attached patch allows the vacuum to continue by emitting WARNING > for the corrupted tuple instead of immediately erroring out as discussed > at [1]. > > Basically, it provides a new GUC called vacuum_tolerate_damage, to > control whether to

Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-19 Thread Andrey M. Borodin
Hi Dilip! > On 17 July 2020, at 15:46, Dilip Kumar wrote: > > The attached patch allows the vacuum to continue by emitting WARNING > for the corrupted tuple instead of immediately erroring out as discussed > at [1]. > > Basically, it provides a new GUC called vacuum_tolerate_damage, to >

Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

2020-07-17 Thread Dilip Kumar
The attached patch allows the vacuum to continue by emitting WARNING for the corrupted tuple instead of immediately erroring out as discussed at [1]. Basically, it provides a new GUC called vacuum_tolerate_damage, to control whether to continue the vacuum or to stop on the occurrence of a corrupted
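
For readers following the thread without the patch at hand, the behaviour under discussion can be pictured with a small standalone model. This is only an illustrative sketch in Python, not the patch and not PostgreSQL source: the names `xid_precedes`, `check_xmin`, `CorruptTupleError` and the `tolerate_damage` parameter are hypothetical stand-ins for the wraparound-aware XID comparison, the xmin-versus-relfrozenxid check, and the proposed vacuum_tolerate_damage setting. It assumes 32-bit transaction IDs with values below 3 reserved as special.

```python
import warnings

FIRST_NORMAL_XID = 3          # assumption: XIDs 0..2 are reserved/special
XID_MASK = 0xFFFFFFFF         # transaction IDs are modelled as 32-bit values


class CorruptTupleError(Exception):
    """Hypothetical stand-in for the ERROR raised on a corrupted tuple."""


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    Normal XIDs are compared modulo 2^32, so the result stays correct
    across wraparound; special XIDs fall back to plain comparison.
    """
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        return a < b
    # A difference with the top bit set is negative as a signed 32-bit value.
    return ((a - b) & XID_MASK) >= 0x80000000


def check_xmin(xmin: int, relfrozenxid: int, tolerate_damage: bool = False) -> bool:
    """Return True if the tuple passes the check, False if it was skipped.

    With tolerate_damage unset, an xmin older than relfrozenxid raises;
    with it set, a warning is emitted and the tuple is left unchanged.
    """
    if xmin >= FIRST_NORMAL_XID and xid_precedes(xmin, relfrozenxid):
        msg = f"found xmin {xmin} from before relfrozenxid {relfrozenxid}"
        if tolerate_damage:
            warnings.warn(msg)
            return False
        raise CorruptTupleError(msg)
    return True
```

Calling `check_xmin(100, 400)` raises in this model, while `check_xmin(100, 400, tolerate_damage=True)` warns and reports the tuple as skipped. The objection raised in the replies above is about exactly that skipped case: leaving the tuple unfrozen lets relfrozenxid drift further from the actual table contents.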