Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Jan Wieck

On 5/12/2007 4:53 PM, Jan Wieck wrote:
Either calling pg_plan_queries() with needSnapshot=false or saving and 
restoring ActiveSnapshot will prevent the backend from dumping core in 
the mentioned example, but I am not entirely sure as to which one is the 
right solution.


Attached is a self contained example that crashes the current backend. 
It took me a moment to figure out exactly how to reproduce it. The 
problem occurs when the query that needs replanning is actually a


  FOR record IN SELECT ...

that is inside of a nested function call. In that case, the revalidation 
of the saved plan actually happens inside of SPI_cursor_open(), which 
does not save and restore the ActiveSnapshot. Shouldn't it?



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
create table t1 (a integer, b text, primary key (a));

create function f1 (integer) returns text as '
declare
  key   alias for $1;
  row   record;
begin
  for row in select a, b from t1 loop
if row.a = key then
  return row.b;
end if;
  end loop;
  return null;
end;
' language plpgsql;

create function f2 (integer) returns text as '
declare
  key   alias for $1;
  resultrecord;
  tmp   record;
begin
  select 5 as a, f1 as b into result from f1(key);
  return result.b;
end;
' language plpgsql;

insert into t1 values (1, 'one');
insert into t1 values (2, 'two');

select f2(1);
alter table t1 add column c text;
select f2(2);

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 The comment for the call of pg_plan_queries in util/cache/plancache.c 
 line 469 for example is fatally wrong. Not only should the snapshot be 
 set by all callers at this point, but if the call actually does replan 
 the queries, the existing ActiveSnapshot is replaced with one allocated 
 on the current memory context. If this happens to be inside of a nested 
 SPI call sequence, the innermost SPI stack frame will free the snapshot 
 data without restoring ActiveSnapshot to the one from the caller.

Yeah, I'd been meaning to go back and recheck that point after the code
settled down, but forgot :-(.

It is possible for RevalidateCachedPlan to be called with no snapshot
yet set --- at least the protocol Describe messages can do that.  I
don't want Describe to force a snapshot because that would be bad for
cases like LOCK TABLE at the start of a serializable transaction, so
RevalidateCachedPlan had better be able to cope with this case.

Since the typical case in which no replan is necessary won't touch
the snapshot, I think we'd better adopt the rule that
RevalidateCachedPlan never causes any caller-visible change in
ActiveSnapshot, else we'll be risking very-hard-to-reproduce bugs.
So my proposal is that RevalidateCachedPlan should set a snapshot for
itself if it needs to replan and ActiveSnapshot is NULL (else it might
as well just use the existing snap); and that it should save and restore
ActiveSnapshot when it does this.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 The only problem with that is that there are code paths that set 
 ActiveSnapshot to palloc()'d memory that is released due to a 
 MemoryContextDelete() without resetting ActiveSnapshot to NULL.

Only at the very end of a transaction (where ActiveSnapshot *is* reset
to null, in FreeXactSnapshot); otherwise we'd have bugs unrelated to
RevalidateCachedPlan.  Eventually I would like to have reference-counted
snapshots managed by a centralized module, as was discussed a month or
two back; but right at the moment I don't think it's broken and I don't
want to spend time on intermediate solutions.

 I think it would be cleaner if RevalidateCachedPlan()'s API would have a 
 Snapshot argument.

How does that improve anything?  AFAICS the only thing that would ever
get passed is ActiveSnapshot, so this is just more notation to do
exactly the same thing.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Jan Wieck

On 5/14/2007 1:29 PM, Tom Lane wrote:

Jan Wieck [EMAIL PROTECTED] writes:
The comment for the call of pg_plan_queries in util/cache/plancache.c 
line 469 for example is fatally wrong. Not only should the snapshot be 
set by all callers at this point, but if the call actually does replan 
the queries, the existing ActiveSnapshot is replaced with one allocated 
on the current memory context. If this happens to be inside of a nested 
SPI call sequence, the innermost SPI stack frame will free the snapshot 
data without restoring ActiveSnapshot to the one from the caller.


Yeah, I'd been meaning to go back and recheck that point after the code
settled down, but forgot :-(.

It is possible for RevalidateCachedPlan to be called with no snapshot
yet set --- at least the protocol Describe messages can do that.  I
don't want Describe to force a snapshot because that would be bad for
cases like LOCK TABLE at the start of a serializable transaction, so
RevalidateCachedPlan had better be able to cope with this case.

Since the typical case in which no replan is necessary won't touch
the snapshot, I think we'd better adopt the rule that
RevalidateCachedPlan never causes any caller-visible change in
ActiveSnapshot, else we'll be risking very-hard-to-reproduce bugs.
So my proposal is that RevalidateCachedPlan should set a snapshot for
itself if it needs to replan and ActiveSnapshot is NULL (else it might
as well just use the existing snap); and that it should save and restore
ActiveSnapshot when it does this.


The only problem with that is that there are code paths that set 
ActiveSnapshot to palloc()'d memory that is released due to a 
MemoryContextDelete() without resetting ActiveSnapshot to NULL. So it 
might be possible for RevalidateCachedPlan to go ahead with an 
ActiveSnapshot pointing to garbage.


I think it would be cleaner if RevalidateCachedPlan()'s API would have a 
Snapshot argument. If it needs a snapshot and the argument is NULL, it 
can create (and free) one itself, otherwise it'd use the one given.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Jan Wieck

On 5/14/2007 3:35 PM, Tom Lane wrote:

Jan Wieck [EMAIL PROTECTED] writes:
The only problem with that is that there are code paths that set 
ActiveSnapshot to palloc()'d memory that is released due to a 
MemoryContextDelete() without resetting ActiveSnapshot to NULL.


Only at the very end of a transaction (where ActiveSnapshot *is* reset
to null, in FreeXactSnapshot); otherwise we'd have bugs unrelated to
RevalidateCachedPlan.  Eventually I would like to have reference-counted
snapshots managed by a centralized module, as was discussed a month or
two back; but right at the moment I don't think it's broken and I don't
want to spend time on intermediate solutions.


Which means that the 8.3 fix for the reproducible backend crash, I 
posted earlier, is to have SPI_cursor_open() save and restore 
ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check 
that this fixes this symptom and commit later today.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 Which means that the 8.3 fix for the reproducible backend crash, I 
 posted earlier, is to have SPI_cursor_open() save and restore 
 ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check 
 that this fixes this symptom and commit later today.

No, the correct fix is to do that inside RevalidateCachedPlan ... and I
already did it.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Use of ActiveSnapshot

2007-05-14 Thread Jan Wieck

On 5/14/2007 4:26 PM, Tom Lane wrote:

Jan Wieck [EMAIL PROTECTED] writes:
Which means that the 8.3 fix for the reproducible backend crash, I 
posted earlier, is to have SPI_cursor_open() save and restore 
ActiveSnapshot while calling RevalidateCachedPlan(). I'll cross check 
that this fixes this symptom and commit later today.


No, the correct fix is to do that inside RevalidateCachedPlan ... and I
already did it.


Works for me. It fixed the Slony test that actually tripped over the 
bug. Thanks.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] Use of ActiveSnapshot

2007-05-12 Thread Jan Wieck
The use of ActiveSnapshot throughout the code appears rather dangerous 
to me. It is a global pointer, assumed not to be set yet in some places, 
assumed to be saved and restored by the caller in others. The actual 
(context) memory it points to is sometimes explicitly freed, sometimes 
just left in the context and thrown away by MemoryContextDelete() 
without resetting ActiveSnapshot to NULL.


The comment for the call of pg_plan_queries in util/cache/plancache.c 
line 469 for example is fatally wrong. Not only should the snapshot be 
set by all callers at this point, but if the call actually does replan 
the queries, the existing ActiveSnapshot is replaced with one allocated 
on the current memory context. If this happens to be inside of a nested 
SPI call sequence, the innermost SPI stack frame will free the snapshot 
data without restoring ActiveSnapshot to the one from the caller.


Either calling pg_plan_queries() with needSnapshot=false or saving and 
restoring ActiveSnapshot will prevent the backend from dumping core in 
the mentioned example, but I am not entirely sure as to which one is the 
right solution.



Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate