Re: [HACKERS] Use of ActiveSnapshot
On 5/14/2007 4:26 PM, Tom Lane wrote: Jan Wieck <[EMAIL PROTECTED]> writes: Which means that the 8.3 fix for the reproducible backend crash, I posted earlier, is to have SPI_cursor_open() save and restore ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check that this fixes this symptom and commit later today. No, the correct fix is to do that inside RevalidateCachedPlan ... and I already did it. Works for me. It fixed the Slony test that actually tripped over the bug. Thanks. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Use of ActiveSnapshot
Jan Wieck <[EMAIL PROTECTED]> writes: > Which means that the 8.3 fix for the reproducible backend crash, I > posted earlier, is to have SPI_cursor_open() save and restore > ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check > that this fixes this symptom and commit later today. No, the correct fix is to do that inside RevalidateCachedPlan ... and I already did it. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Use of ActiveSnapshot
On 5/14/2007 3:35 PM, Tom Lane wrote: Jan Wieck <[EMAIL PROTECTED]> writes: The only problem with that is that there are code paths that set ActiveSnapshot to palloc()'d memory that is released due to a MemoryContextDelete() without resetting ActiveSnapshot to NULL. Only at the very end of a transaction (where ActiveSnapshot *is* reset to null, in FreeXactSnapshot); otherwise we'd have bugs unrelated to RevalidateCachedPlan. Eventually I would like to have reference-counted snapshots managed by a centralized module, as was discussed a month or two back; but right at the moment I don't think it's broken and I don't want to spend time on intermediate solutions. Which means that the 8.3 fix for the reproducible backend crash, I posted earlier, is to have SPI_cursor_open() save and restore ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check that this fixes this symptom and commit later today. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Use of ActiveSnapshot
On 5/14/2007 1:29 PM, Tom Lane wrote: Jan Wieck <[EMAIL PROTECTED]> writes: The comment for the call of pg_plan_queries in util/cache/plancache.c line 469 for example is fatally wrong. Not only should the snapshot be set by all callers at this point, but if the call actually does replan the queries, the existing ActiveSnapshot is replaced with one allocated on the current memory context. If this happens to be inside of a nested SPI call sequence, the innermost SPI stack frame will free the snapshot data without restoring ActiveSnapshot to the one from the caller. Yeah, I'd been meaning to go back and recheck that point after the code settled down, but forgot :-(. It is possible for RevalidateCachedPlan to be called with no snapshot yet set --- at least the protocol Describe messages can do that. I don't want Describe to force a snapshot because that would be bad for cases like LOCK TABLE at the start of a serializable transaction, so RevalidateCachedPlan had better be able to cope with this case. Since the "typical" case in which no replan is necessary won't touch the snapshot, I think we'd better adopt the rule that RevalidateCachedPlan never causes any caller-visible change in ActiveSnapshot, else we'll be risking very-hard-to-reproduce bugs. So my proposal is that RevalidateCachedPlan should set a snapshot for itself if it needs to replan and ActiveSnapshot is NULL (else it might as well just use the existing snap); and that it should save and restore ActiveSnapshot when it does this. The only problem with that is that there are code paths that set ActiveSnapshot to palloc()'d memory that is released due to a MemoryContextDelete() without resetting ActiveSnapshot to NULL. So it might be possible for RevalidateCachedPlan to go ahead with an ActiveSnapshot pointing to garbage. I think it would be cleaner if RevalidateCachedPlan()'s API would have a Snapshot argument. If it needs a snapshot and the argument is NULL, it can create (and free) one itself, otherwise it'd use the one given. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Use of ActiveSnapshot
Jan Wieck <[EMAIL PROTECTED]> writes: > The only problem with that is that there are code paths that set > ActiveSnapshot to palloc()'d memory that is released due to a > MemoryContextDelete() without resetting ActiveSnapshot to NULL. Only at the very end of a transaction (where ActiveSnapshot *is* reset to null, in FreeXactSnapshot); otherwise we'd have bugs unrelated to RevalidateCachedPlan. Eventually I would like to have reference-counted snapshots managed by a centralized module, as was discussed a month or two back; but right at the moment I don't think it's broken and I don't want to spend time on intermediate solutions. > I think it would be cleaner if RevalidateCachedPlan()'s API would have a > Snapshot argument. How does that improve anything? AFAICS the only thing that would ever get passed is ActiveSnapshot, so this is just more notation to do exactly the same thing. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Use of ActiveSnapshot
Jan Wieck <[EMAIL PROTECTED]> writes: > The comment for the call of pg_plan_queries in util/cache/plancache.c > line 469 for example is fatally wrong. Not only should the snapshot be > set by all callers at this point, but if the call actually does replan > the queries, the existing ActiveSnapshot is replaced with one allocated > on the current memory context. If this happens to be inside of a nested > SPI call sequence, the innermost SPI stack frame will free the snapshot > data without restoring ActiveSnapshot to the one from the caller. Yeah, I'd been meaning to go back and recheck that point after the code settled down, but forgot :-(. It is possible for RevalidateCachedPlan to be called with no snapshot yet set --- at least the protocol Describe messages can do that. I don't want Describe to force a snapshot because that would be bad for cases like LOCK TABLE at the start of a serializable transaction, so RevalidateCachedPlan had better be able to cope with this case. Since the "typical" case in which no replan is necessary won't touch the snapshot, I think we'd better adopt the rule that RevalidateCachedPlan never causes any caller-visible change in ActiveSnapshot, else we'll be risking very-hard-to-reproduce bugs. So my proposal is that RevalidateCachedPlan should set a snapshot for itself if it needs to replan and ActiveSnapshot is NULL (else it might as well just use the existing snap); and that it should save and restore ActiveSnapshot when it does this. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Use of ActiveSnapshot
On 5/12/2007 4:53 PM, Jan Wieck wrote: Either calling pg_plan_queries() with needSnapshot=false or saving and restoring ActiveSnapshot will prevent the backend from dumping core in the mentioned example, but I am not entirely sure as to which one is the right solution. Attached is a self contained example that crashes the current backend. It took me a moment to figure out exactly how to reproduce it. The problem occurs when the query that needs replanning is actually a FOR record IN SELECT ... that is inside of a nested function call. In that case, the revalidation of the saved plan actually happens inside of SPI_cursor_open(), which does not save and restore the ActiveSnapshot. Shouldn't it? Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # create table t1 (a integer, b text, primary key (a)); create function f1 (integer) returns text as ' declare key alias for $1; row record; begin for row in select a, b from t1 loop if row.a = key then return row.b; end if; end loop; return null; end; ' language plpgsql; create function f2 (integer) returns text as ' declare key alias for $1; resultrecord; tmp record; begin select 5 as a, f1 as b into result from f1(key); return result.b; end; ' language plpgsql; insert into t1 values (1, 'one'); insert into t1 values (2, 'two'); select f2(1); alter table t1 add column c text; select f2(2); ---(end of broadcast)--- TIP 6: explain analyze is your friend