Re: lazy detoasting

2018-08-12 Thread Chapman Flack
On 04/11/18 11:27, Chapman Flack wrote:
> In most cases I can easily imagine, a function that gets an SQLXML
> object is going to read it "pretty soon" ... 
> However, the spec does explicitly provide that you could, for whatever
> reason, sit on the thing for a while, then read it later in the same
> transaction. You should get the same stuff you would have if you had
> read it right away, ...
>Eager detoasting at the time of creating the object, into a
> transaction-scoped memory context, would trivially have that behavior,
> but on the chance that XML values might be large, and a function might
> conceivably never read the thing at all on certain code paths, I'd
> rather defer detoasting until the code holding the SQLXML object
> actually wants to read it.

In the interest of closure, how this idea looks implemented in PL/Java
is here:

  https://github.com/tada/pljava/commit/1a5caf1

It uses GetOldestSnapshot() to choose the snapshot to retain, and it
passes the following test, inspired by Andrew Gierth's in this thread.
(It also fails the test if the snapshot retention isn't done, confirming
that it's needed.)


CREATE TABLE t(x xml);
BEGIN READ WRITE, ISOLATION LEVEL READ COMMITTED;
/*
 * In other session: INSERT INTO t(x)
 * SELECT table_to_xml('pg_operator', true, false, '');
 */
SELECT javatest.echoxmlparameter(x, 0, 5) FROM t; -- 0 => stash x
/*
 * In other session: DELETE FROM t;
 * VACUUM t;
 */
SELECT javatest.echoxmlparameter(null, 5, 5); -- null => unstash
COMMIT;


GetOldestSnapshot() appeared in 9.6. For older PG releases, instead
of a snapshot and toast pointer being retained to detoast lazily,
the ondisk content is eagerly fetched, then decompressed lazily.
XML compresses well, so that can still use 95% less memory while one
of these objects is being held but not read from. (The 9.5-and-earlier
support is added in a separate commit.)

-Chap



Re: lazy detoasting

2018-05-17 Thread Tom Lane
Peter Eisentraut  writes:
> In reviewing the committed patch, I noticed that in ER_get_flat_size()
> you have removed the PG_DETOAST_DATUM() call and let
> expanded_record_set_field_internal() do the de-toasting work.  I had
> considered that too, but my impression is that the purpose of the
> PG_DETOAST_DATUM() is to de-compress for the purpose of size
> computation, whereas expanded_record_set_field_internal() only does the
> inlining of externally stored values and doesn't do any explicit
> decompression.  Is this correct?

Hmm ... yeah, there is a semantics change there, but I think it's fine.
I don't recall that I'd thought specifically about the behavior for
inline-compressed/short-header Datums when I wrote that code to begin
with.  But for its purpose, which is that we're trying to flatten the
expanded record into a tuple, leaving an inline-compressed field in that
form seems OK, perhaps actually preferable.  And short-header is a no-op:
it'd end up with a short header in the emitted tuple regardless of our
choice here.

regards, tom lane



Re: lazy detoasting

2018-05-17 Thread Peter Eisentraut
On 5/12/18 17:25, Tom Lane wrote:
> Anyway, attached is a revised patch.  I found a test case for
> expanded_record_set_fields(), too.

Thank you for fixing this up.

In reviewing the committed patch, I noticed that in ER_get_flat_size()
you have removed the PG_DETOAST_DATUM() call and let
expanded_record_set_field_internal() do the de-toasting work.  I had
considered that too, but my impression is that the purpose of the
PG_DETOAST_DATUM() is to de-compress for the purpose of size
computation, whereas expanded_record_set_field_internal() only does the
inlining of externally stored values and doesn't do any explicit
decompression.  Is this correct?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: lazy detoasting

2018-05-12 Thread Tom Lane
Peter Eisentraut  writes:
> Here is a more complete patch.  I made a call graph to get to the
> bottom, literally, of how variable assignments happen in PL/pgSQL.  (See
> attached.)  There are four leaf functions to patch up.
> Also, I wrote some isolation tests to hit each of these cases.  I wasn't
> able to construct one for expanded_record_set_fields(), but the
> principle there should be the same.

I don't like this patch too much: it leaks memory in some places where
that's not a good idea, and in expanded_record_set_field_internal,
it's actually allocating the new field value in the wrong context.
And it pays no attention to updating comments, even ones that it's
flat out invalidated.

I also realized while poking at it that it'd broken much of the work
we did to optimize array and record variables in plpgsql by storing
them in "expanded" form, because in a non-atomic context it causes
assign_simple_var to forcibly flatten anything that heap_tuple_fetch_attr
knows how to flatten --- including "expanded" datums.

To make that last case work properly, we want assign_simple_var to only
flatten things that are actually ONDISK or INDIRECT external datums.
It's okay to leave "expanded" datums as-is, because they're just in memory
and as long as we haven't screwed up memory management they'll be fine
across a transaction boundary.

I made a new macro in postgres.h to recognize such cases, but I wonder
if anyone thinks we should do that differently.  It seems a bit odd
that we've not needed to make that distinction before.  OTOH, flattening
all external datums does seem like the right semantics for the
expandedrecord.c functions to have, so maybe it's fine.

Anyway, attached is a revised patch.  I found a test case for
expanded_record_set_fields(), too.

regards, tom lane

diff --git a/src/backend/utils/adt/expandedrecord.c b/src/backend/utils/adt/expandedrecord.c
index 0bf5fe8..b1b6883 100644
--- a/src/backend/utils/adt/expandedrecord.c
+++ b/src/backend/utils/adt/expandedrecord.c
@@ -19,6 +19,7 @@
 #include "postgres.h"
 
 #include "access/htup_details.h"
+#include "access/tuptoaster.h"
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "utils/builtins.h"
@@ -41,7 +42,7 @@ static const ExpandedObjectMethods ER_methods =
 
 /* Other local functions */
 static void ER_mc_callback(void *arg);
-static MemoryContext get_domain_check_cxt(ExpandedRecordHeader *erh);
+static MemoryContext get_short_term_cxt(ExpandedRecordHeader *erh);
 static void build_dummy_expanded_header(ExpandedRecordHeader *main_erh);
 static pg_noinline void check_domain_for_new_field(ExpandedRecordHeader *erh,
 		   int fnumber,
@@ -57,8 +58,9 @@ static pg_noinline void check_domain_for_new_tuple(ExpandedRecordHeader *erh,
  *
  * The expanded record is initially "empty", having a state logically
  * equivalent to a NULL composite value (not ROW(NULL, NULL, ...)).
- * Note that this might not be a valid state for a domain type; if the
- * caller needs to check that, call expanded_record_set_tuple(erh, NULL).
+ * Note that this might not be a valid state for a domain type;
+ * if the caller needs to check that, call
+ * expanded_record_set_tuple(erh, NULL, false, false).
  *
  * The expanded object will be a child of parentcontext.
  */
@@ -424,8 +426,11 @@ make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
  *
  * The tuple is physically copied into the expanded record's local storage
  * if "copy" is true, otherwise it's caller's responsibility that the tuple
- * will live as long as the expanded record does.  In any case, out-of-line
- * fields in the tuple are not automatically inlined.
+ * will live as long as the expanded record does.
+ *
+ * Out-of-line field values in the tuple are automatically inlined if
+ * "expand_external" is true, otherwise not.  (The combination copy = false,
+ * expand_external = true is not sensible and not supported.)
  *
  * Alternatively, tuple can be NULL, in which case we just set the expanded
  * record to be empty.
@@ -433,7 +438,8 @@ make_expanded_record_from_exprecord(ExpandedRecordHeader *olderh,
 void
 expanded_record_set_tuple(ExpandedRecordHeader *erh,
 		  HeapTuple tuple,
-		  bool copy)
+		  bool copy,
+		  bool expand_external)
 {
 	int			oldflags;
 	HeapTuple	oldtuple;
@@ -453,6 +459,25 @@ expanded_record_set_tuple(ExpandedRecordHeader *erh,
 		check_domain_for_new_tuple(erh, tuple);
 
 	/*
+	 * If we need to get rid of out-of-line field values, do so, using the
+	 * short-term context to avoid leaking whatever cruft the toast fetch
+	 * might generate.
+	 */
+	if (expand_external && tuple)
+	{
+		/* Assert caller didn't ask for unsupported case */
+		Assert(copy);
+		if (HeapTupleHasExternal(tuple))
+		{
+			oldcxt = MemoryContextSwitchTo(get_short_term_cxt(erh));
+			tuple = toast_flatten_tuple(tuple, erh->er_tupdesc);
+			MemoryContextSwitchTo(oldcxt);
+		}
+		else
+			

Re: lazy detoasting

2018-05-03 Thread Andrew Gierth
> "Peter" == Peter Eisentraut  writes:

 Peter> The attached test fixes this issue by flattening the toast
 Peter> values before storing them into PL/pgSQL variables. It could use
 Peter> another check to see if there are other code paths that need
 Peter> similar adjustments, but I think it's the right idea in general.

Uhh.

What about:

  somevar := (select blah);

or

  somevar := function_returning_toasted_val(blah);

or

  call someproc(function_returning_toasted_val(blah));

or or or ...

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-05-03 Thread Peter Eisentraut
On 5/1/18 19:56, Andrew Gierth wrote:
>  Peter> insert into test1 values (1, repeat('foo', 2000));
> 
> That value is no good because it's too compressible; it'll be left
> inline in the main table rather than being externalized, so the value of
> 'x' in the DO-block is still self-contained (though it's still toasted
> in the sense of being VARATT_IS_EXTENDED).

Right.  I added

alter table test1 alter column b set storage external;

then I can see the error.

The attached test fixes this issue by flattening the toast values before
storing them into PL/pgSQL variables.  It could use another check to see
if there are other code paths that need similar adjustments, but I think
it's the right idea in general.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 294b68508cade1dd0df5b78c0f0ec12d7ce761b8 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 3 May 2018 10:54:13 -0400
Subject: [PATCH v1] PL/pgSQL: Flatten TOAST data in nonatomic context

When in a nonatomic execution context, when storing a potentially
toasted datum into a PL/pgSQL variable, we need to flatten the
datum (i.e., remove any references to TOAST data).  Otherwise, a
transaction end combined with, say, a concurrent VACUUM, between storing
the datum and reading it, could remove the TOAST data, and then the data
in the variable would no longer be readable.
---
 src/pl/plpgsql/src/pl_exec.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 228d1c0d00..28a6957286 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -20,6 +20,7 @@
 #include "access/htup_details.h"
 #include "access/transam.h"
 #include "access/tupconvert.h"
+#include "access/tuptoaster.h"
 #include "catalog/pg_proc.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -6621,6 +6622,16 @@ exec_move_row(PLpgSQL_execstate *estate,
 {
ExpandedRecordHeader *newerh = NULL;
 
+   /*
+* If not in atomic context, flatten TOAST references.  Otherwise the
+* TOAST data might disappear if a transaction is committed.
+*/
+   if (!estate->atomic)
+   {
+   if (tup)
+   tup = toast_flatten_tuple(tup, tupdesc);
+   }
+
/*
 * If target is RECORD, we may be able to avoid field-by-field 
processing.
 */

base-commit: 30c66e77be1d890c3cca766259c0bec80bcac1b5
-- 
2.17.0



Re: lazy detoasting

2018-05-01 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:

 Andrew> ERROR:  no known snapshots
 Andrew> CONTEXT:  PL/pgSQL function inline_code_block line 1 at RAISE

 Andrew> This is another issue that was mentioned before in relation to
 Andrew> procedures.

See  https://www.postgresql.org/message-id/29608.1518533...@sss.pgh.pa.us

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-05-01 Thread Andrew Gierth
> "Peter" == Peter Eisentraut  writes:

 Peter> Is there a more self-contained way to test this? I have been
 Peter> trying with something like

 Peter> create table test1 (a int, b text);

 Peter> insert into test1 values (1, repeat('foo', 2000));

That value is no good because it's too compressible; it'll be left
inline in the main table rather than being externalized, so the value of
'x' in the DO-block is still self-contained (though it's still toasted
in the sense of being VARATT_IS_EXTENDED).

I tend to use something like this:

insert into test1
  values (1, (select string_agg(chr(32+floor(95*random())::integer),'')
from generate_series(1,1)));

If I do that, I get a different error from your test (whether or not the
vacuum is done):

ERROR:  no known snapshots
CONTEXT:  PL/pgSQL function inline_code_block line 1 at RAISE

This is another issue that was mentioned before in relation to
procedures.

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-05-01 Thread Peter Eisentraut
On 4/25/18 07:50, Andrew Gierth wrote:
> do $$
>   declare a text;
>   begin
> select f1.a into a from f1;
> delete from f1;
> commit;
> perform pg_sleep(10);  -- vacuum f1 in another session while it sleeps
> call p1(a);
>   end; $$;
> INFO:  a: (t,t,f,"missing chunk number 0",,)
> 
> (p1 in this case is using toast_item_detail() from the module I just put
> up at https://github.com/RhodiumToad/pg-toastutils to examine the value)

Is there a more self-contained way to test this?  I have been trying
with something like

create table test1 (a int, b text);

insert into test1 values (1, repeat('foo', 2000));

do $$
  declare
x text;
  begin
select test1.b into x from test1;
delete from test1;
commit;
perform pg_sleep(10);  -- vacuum test1 in another session
raise notice 'x = %', x;  -- should fail
  end;
$$;

But it doesn't fail.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: lazy detoasting

2018-04-25 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:

> "Peter" == Peter Eisentraut  writes:
 >> On 4/11/18 11:33, Tom Lane wrote:
 >>> (Wanders away wondering what Peter has done about toasted parameter
 >>> values for procedures in general ...)

 Peter> I'm not sure.  How can a procedure have a toasted parameter value?

 Andrew> do $$ declare a text; begin select f1.a into a from f1; call p1(a); 
end; $$;

And behold:

do $$
  declare a text;
  begin
select f1.a into a from f1;
delete from f1;
commit;
perform pg_sleep(10);  -- vacuum f1 in another session while it sleeps
call p1(a);
  end; $$;
INFO:  a: (t,t,f,"missing chunk number 0",,)

(p1 in this case is using toast_item_detail() from the module I just put
up at https://github.com/RhodiumToad/pg-toastutils to examine the value)

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-04-24 Thread Peter Eisentraut
On 4/11/18 11:33, Tom Lane wrote:
> (Wanders away wondering what Peter has done about toasted parameter
> values for procedures in general ...)

I'm not sure.  How can a procedure have a toasted parameter value?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: lazy detoasting

2018-04-11 Thread Andrew Gierth
> "Chapman" == Chapman Flack  writes:

 Chapman> There's precedent for that kind of thing in PL/Java already
 Chapman> ... objects that Java considers alive as long as some code
 Chapman> holds a reference to them, but proxy for things in PG that may
 Chapman> only have function-call lifetime or cursor-row lifetime, etc.
 Chapman> If they are closed by Java code (or the Java GC finds them
 Chapman> unreachable) first, they have to remember to release their PG
 Chapman> stuff; if the PG stuff goes first, they have to update
 Chapman> themselves to throw a suitable "you've kept me past my sell-by
 Chapman> date" exception if the Java code tries to use them again.

How's the code doing this? I couldn't find it in a cursory scan.

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-04-11 Thread Chapman Flack
On 04/11/2018 03:04 PM, Tom Lane wrote:
> Chapman Flack  writes:
>> that it might *not* be sufficient to find an applicable snapshot at
>> the time of constructing the object, and register that snapshot
>> on TopTransactionResourceOwner?
> 
> The problem is to know which snapshot is applicable; if the transaction
> has more than one, you don't know which was used to read the row of
> interest.  I suppose you could be conservative and use the oldest one,
> if snapmgr lets you find that.

There does seem to be GetOldestSnapshot(), returning
older( oldest on active stack, oldest on registered heap ).

And it seems to be the very thing called by tuptoaster itself,
right after the comment "since we don't know which one to use,
just use the oldest one".

-Chap



Re: lazy detoasting

2018-04-11 Thread Tom Lane
Chapman Flack  writes:
> But let me return to the earlier idea for a moment: are you saying
> that it might *not* be sufficient to find an applicable snapshot at
> the time of constructing the object, and register that snapshot
> on TopTransactionResourceOwner?

The problem is to know which snapshot is applicable; if the transaction
has more than one, you don't know which was used to read the row of
interest.  I suppose you could be conservative and use the oldest one,
if snapmgr lets you find that.

regards, tom lane



Re: lazy detoasting

2018-04-11 Thread Chapman Flack
On 04/11/2018 01:55 PM, Tom Lane wrote:
> Chapman Flack  writes:
>> Well, the devilsAdvocate() function would stash the object
>> in a static, then try to look at it some time in a later call
>> in the same transaction.
> 
> If you're worried about that, you should also worry about what happens
> if the function uses the static variable in some later transaction.
> The spec grants you license to throw an error, but it still needs to
> be a clean error (not something about "can't find toast value", IMO).

There's precedent for that kind of thing in PL/Java already ... objects
that Java considers alive as long as some code holds a reference
to them, but proxy for things in PG that may only have function-call
lifetime or cursor-row lifetime, etc. If they are closed by Java code
(or the Java GC finds them unreachable) first, they have to remember
to release their PG stuff; if the PG stuff goes first, they have to
update themselves to throw a suitable "you've kept me past my sell-by
date" exception if the Java code tries to use them again.

Thomas implemented most of those things ages ago; this is the first
I've added myself, with a little adjustment of technique because his
were for lifetimes shorter than transaction. I'm using the
TopTransactionResourceOwner to learn when the transaction is finished.
Resource owners have been around as long as any PG version PL/Java
supports, so that seems ok.

> Can you detect that the value is being stored in a long-lived variable
> and detoast at that point?

Not easily, I don't think. The question resembles "is this object
still reachable, or unreachable, now as I exit this function call?"
or at some other specific time. The Java garbage collector eventually
learns what's become unreachable, but it doesn't promise *when*.

But let me return to the earlier idea for a moment: are you saying
that it might *not* be sufficient to find an applicable snapshot at
the time of constructing the object, and register that snapshot
on TopTransactionResourceOwner?

It would obviously not be kept around any longer than the transaction,
and would be released earlier whenever the Java code reads/closes it,
or lets go of the last reference and GC finds it. In all the typical
cases I can imagine, it would be registered very briefly, with the
object being constructed/read/freed in quick succession.

-Chap



Re: lazy detoasting

2018-04-11 Thread Tom Lane
Chapman Flack  writes:
> On 04/11/2018 11:33 AM, Tom Lane wrote:
>> OK, but if this object only lives within a single function call,
>> what's the problem?  The underlying row must be visible to the
>> calling query, and that condition won't change for the duration
>> of the call.

> Well, the devilsAdvocate() function would stash the object
> in a static, then try to look at it some time in a later call
> in the same transaction.

If you're worried about that, you should also worry about what happens
if the function uses the static variable in some later transaction.
The spec grants you license to throw an error, but it still needs to
be a clean error (not something about "can't find toast value", IMO).

Can you detect that the value is being stored in a long-lived variable
and detoast at that point?

regards, tom lane



Re: lazy detoasting

2018-04-11 Thread Chapman Flack
On 04/11/2018 11:33 AM, Tom Lane wrote:
> Chapman Flack  writes:
>> The mission is to implement java.sql.SQLXML, which has long been
>> missing from PL/Java.
>> This is the class of object your Java code can retrieve from a
>> ResultSet row from a query with an XML column type. (It's also
>> what can be passed to you as a function parameter, if your
>> function's SQL declaration has a parameter type XML.)
> 
> OK, but if this object only lives within a single function call,
> what's the problem?  The underlying row must be visible to the
> calling query, and that condition won't change for the duration
> of the call.

Well, the devilsAdvocate() function would stash the object
in a static, then try to look at it some time in a later call
in the same transaction.

Mind you, I have no use case in mind for a function wanting to do that
(other than as a test case for spec compliance). But the API spec for
the SQLXML class expressly says "object is valid for the duration
of the transaction", and I try not to renege on spec guarantees
if I can find a practical way not to.

> Things could get interesting if you consider a Java *procedure*,

and, yes, that. Though so far, there still are no PL/Java *procedures*.
I haven't even been following that development closely enough yet to
think about what a plan for supporting those would require.

-Chap



Re: lazy detoasting

2018-04-11 Thread Tom Lane
Chapman Flack  writes:
> On 04/11/2018 10:41 AM, Tom Lane wrote:
>> So maybe we need to take two steps back and talk about the semantics
>> of what you're doing.

> The mission is to implement java.sql.SQLXML, which has long been
> missing from PL/Java.
> This is the class of object your Java code can retrieve from a
> ResultSet row from a query with an XML column type. (It's also
> what can be passed to you as a function parameter, if your
> function's SQL declaration has a parameter type XML.)

OK, but if this object only lives within a single function call,
what's the problem?  The underlying row must be visible to the
calling query, and that condition won't change for the duration
of the call.

Things could get interesting if you consider a Java *procedure*,
but I think you could afford to detoast-on-entry for that case.

(Wanders away wondering what Peter has done about toasted parameter
values for procedures in general ...)

regards, tom lane



Re: lazy detoasting

2018-04-11 Thread Chapman Flack
On 04/11/2018 10:41 AM, Tom Lane wrote:
> The core of the problem now is that in a READ COMMITTED transaction, we
> may not be holding any snapshot at all between statements.  So if you're
> hanging onto a toast pointer across statements you're at risk.
> 
> On the other hand, it's also arguable that you shouldn't be interested
> in dereferencing such a pointer later than the statement in which it
> was read, precisely because it no longer represents accessible data.
> So maybe we need to take two steps back and talk about the semantics
> of what you're doing.

The mission is to implement java.sql.SQLXML, which has long been
missing from PL/Java.

https://docs.oracle.com/javase/6/docs/api/index.html?java/sql/SQLXML.html

The API spec includes this: "An SQLXML object is valid for
the duration of the transaction in which it was created."

This is the class of object your Java code can retrieve from a
ResultSet row from a query with an XML column type. (It's also
what can be passed to you as a function parameter, if your
function's SQL declaration has a parameter type XML.)

An SQLXML object represents the XML value. You get to read the
object (exactly once, or not at all) for the content it represents,
in your choice of a half-dozen different ways corresponding to
available Java XML-processing APIs. (The SQLXML implementation knows
what the underlying stored form is, so it encapsulates the knowledge
of how most efficiently to get it from there to the form the program
wants to work with.)

In most cases I can easily imagine, a function that gets an SQLXML
object is going to read it "pretty soon" ... surely some time during
the function call, if it was passed as a parameter, and probably
within the same row loop iteration, for one retrieved from a query.
However, the spec does explicitly provide that you could, for whatever
reason, sit on the thing for a while, then read it later in the same
transaction. You should get the same stuff you would have if you had
read it right away, whether or not stuff has changed in the database
since. Eager detoasting at the time of creating the object, into a
transaction-scoped memory context, would trivially have that behavior,
but on the chance that XML values might be large, and a function might
conceivably never read the thing at all on certain code paths, I'd
rather defer detoasting until the code holding the SQLXML object
actually wants to read it.

-Chap



Re: lazy detoasting

2018-04-11 Thread Tom Lane
Chapman Flack <c...@anastigmatix.net> writes:
> On 04/10/2018 10:17 PM, Jan Wieck wrote:
>> If your session has a transaction snapshot that protects the old toast
>> slices from being vacuumed away, you are fine.

> That harks back to my earlier (naïve?) thought that, as long as
> my lazy datum doesn't have to outlive the transaction, lazy
> detoasting might Just Work.

> Tom seems to say otherwise, in
> https://www.postgresql.org/message-id/23711.1522559987%40sss.pgh.pa.us

> The message of the commit he mentions there includes:

>   I believe this code was all right when written, because our
>   management of a session's exposed xmin was such that the TOAST
>   references were safe until end of transaction.  But that's
>   no longer true now that we can advance or clear our PGXACT.xmin
>   intra-transaction.

> So perhaps my original thought really would have Just Worked,
> in PG versions before that changed? When did that change, btw?

When snapmgr.c came in, which seems to have been 8.4.

The core of the problem now is that in a READ COMMITTED transaction, we
may not be holding any snapshot at all between statements.  So if you're
hanging onto a toast pointer across statements you're at risk.

On the other hand, it's also arguable that you shouldn't be interested
in dereferencing such a pointer later than the statement in which it
was read, precisely because it no longer represents accessible data.
So maybe we need to take two steps back and talk about the semantics
of what you're doing.

regards, tom lane



Re: lazy detoasting

2018-04-11 Thread Chapman Flack
On 04/10/2018 10:17 PM, Jan Wieck wrote:
> If your session has a transaction snapshot that protects the old toast
> slices from being vacuumed away, you are fine.

That harks back to my earlier (naïve?) thought that, as long as
my lazy datum doesn't have to outlive the transaction, lazy
detoasting might Just Work.

Tom seems to say otherwise, in
https://www.postgresql.org/message-id/23711.1522559987%40sss.pgh.pa.us

The message of the commit he mentions there includes:

  I believe this code was all right when written, because our
  management of a session's exposed xmin was such that the TOAST
  references were safe until end of transaction.  But that's
  no longer true now that we can advance or clear our PGXACT.xmin
  intra-transaction.

So perhaps my original thought really would have Just Worked,
in PG versions before that changed? When did that change, btw?

On 04/11/2018 09:28 AM, Amit Kapila wrote:
> Before query execution, we push the active snapshot, so any time
> during execution, if you get the active snapshot via
> GetActiveSnapshot(), you can access it.  So, I think during a function
> execution, if you use GetActiveSnapshot, you should get the snapshot
> used by enclosing query.

So perhaps GetActiveSnapshot is the right choice to accompany a datum
that came as a function parameter.

For something retrieved from an SPI query within the function, it
looks like I will have a portal->holdSnapshot in certain cases
(PORTAL_ONE_RETURNING, PORTAL_ONE_MOD_WITH, PORTAL_UTIL_SELECT).

The comments added to portal.h in that commit say:

* Snapshot under which tuples in the holdStore were read.  We must keep
* a reference to this snapshot if there is any possibility that the
* tuples contain TOAST references, because releasing the snapshot ...

Soo, is that telling me that the three cases named above, where
holdSnapshot gets set, are in fact the only cases where "there is any
possibility that the tuples contain TOAST references", and therefore
I could count on holdSnapshot to be nonnull whenever it matters?

-Chap



Re: lazy detoasting

2018-04-11 Thread Amit Kapila
On Wed, Apr 11, 2018 at 2:34 AM, Chapman Flack  wrote:
> On 04/10/2018 04:03 PM, Robert Haas wrote:
>
>> I suspect you want, or maybe need, to use the same snapshot as the
>> scan that retrieved the tuple containing the toasted datum.
>
> I'm sure it's worth more than that, but I don't know if it's
> implementable.
>
> If I'm a function, and the datum came to me as a parameter, I may
> have no way to determine what snapshot the enclosing query used to
> obtain the thing passed to me.
>

Before query execution, we push the active snapshot, so any time
during execution, if you get the active snapshot via
GetActiveSnapshot(), you can access it.  So, I think during a function
execution, if you use GetActiveSnapshot, you should get the snapshot
used by enclosing query.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: lazy detoasting

2018-04-10 Thread Jan Wieck
Maybe I'm missing something here, but let me put $.02 in anyway.

TOAST reuses entries. If a toasted value is unchanged on UPDATE (i.e. the
toast pointer didn't get replaced by a new value), the new tuple points to
the same toast slices as the old. If it is changed, the current transaction
DELETEs the old toast slices and INSERTs new ones (if needed).

If your session has a transaction snapshot that protects the old toast
slices from being vacuumed away, you are fine. Even under READ COMMITTED
(that is why it uses that other visibility, so they don't go away when the
concurrent UPDATE commits and rather behave like REPEATABLE READ).

At least that is what the code was intended to do originally.


Regards, Jan



On Tue, Apr 10, 2018 at 6:24 PM, Chapman Flack 
wrote:

> On 04/10/2018 05:04 PM, Chapman Flack wrote:
> > ... I wonder if
> > there's at least a cheap way to check a particular snapshot
> > for suitability wrt a given toast pointer. Check a couple usual
> > suspects, find one most of the time, fall back to eager detoasting
> > otherwise?
> >
> > Guess I need to go back for a deeper dive on just what constitutes
> > a toast pointer. I was skimming last time
>
> I see all I have in a toast pointer is a relation id and a value
> oid, so probably no way to check that a given snapshot 'works' to
> find it, at least no more cheaply than by opening the toast relation
> and trying to find it.
>
> Welp, what tuptoaster does to construct its SnapshotToast is to
> grab GetOldestSnapshot():
>
> *  since we don't know which one to use, just use the oldest one.
> *  This is safe: at worst, we will get a "snapshot too old" error
> *  that might have been avoided otherwise.
>
> ... which means, I take it, that a more recent snapshot
> might work, but this conservative choice would only fail
> if the oldest snapshot has really been around for many days,
> under typical old_snapshot_threshold settings?
>
> ... and if I'm getting a value from a portal that happens to have
> a non-null holdSnapshot, then that would be the one to use, in
> preference to just conservatively grabbing the oldest?
>
> -Chap
>
>
> ... am I right that the init_toast_snapshot construction is really
> making only one change to the snapshot data, namely changing the
> 'satisfies' condition to HeapTupleSatisfiesToast ?
>
> The lsn and whenTaken seem to be fetched from the original data
> and stored right back without change.
>
>


-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info


Re: lazy detoasting

2018-04-10 Thread Chapman Flack
On 04/10/2018 05:04 PM, Chapman Flack wrote:
> ... I wonder if
> there's at least a cheap way to check a particular snapshot
> for suitability wrt a given toast pointer. Check a couple usual
> suspects, find one most of the time, fall back to eager detoasting
> otherwise?
> 
> Guess I need to go back for a deeper dive on just what constitutes
> a toast pointer. I was skimming last time

I see all I have in a toast pointer is a relation id and a value
oid, so probably no way to check that a given snapshot 'works' to
find it, at least no more cheaply than by opening the toast relation
and trying to find it.

Welp, what tuptoaster does to construct its SnapshotToast is to
grab GetOldestSnapshot():

*  since we don't know which one to use, just use the oldest one.
*  This is safe: at worst, we will get a "snapshot too old" error
*  that might have been avoided otherwise.

... which means, I take it, that a more recent snapshot
might work, but this conservative choice would only fail
if the oldest snapshot has really been around for many days,
under typical old_snapshot_threshold settings?

... and if I'm getting a value from a portal that happens to have
a non-null holdSnapshot, then that would be the one to use, in
preference to just conservatively grabbing the oldest?

-Chap


... am I right that the init_toast_snapshot construction is really
making only one change to the snapshot data, namely changing the
'satisfies' condition to HeapTupleSatisfiesToast ?

The lsn and whenTaken seem to be fetched from the original data
and stored right back without change.



Re: lazy detoasting

2018-04-10 Thread Chapman Flack
On 04/10/2018 05:04 PM, Chapman Flack wrote:

> If I'm a function, and ... I found [the datum] myself, say by an
> SPI query within the function, usually that's at a level of abstraction
> somewhere above what-snapshot-was-used-in-the-scan.

It looks like for that case (since the commit 08e261cbc that Tom
mentioned earlier), there can sometimes be a portal->holdSnapshot
that will be the right one.

-Chap



Re: lazy detoasting

2018-04-10 Thread Chapman Flack
On 04/10/2018 04:03 PM, Robert Haas wrote:

> I suspect you want, or maybe need, to use the same snapshot as the
> scan that retrieved the tuple containing the toasted datum.

I'm sure it's worth more than that, but I don't know if it's
implementable.

If I'm a function, and the datum came to me as a parameter, I may
have no way to determine what snapshot the enclosing query used to
obtain the thing passed to me. Or, if I found it myself, say by an
SPI query within the function, usually that's at a level of abstraction
somewhere above what-snapshot-was-used-in-the-scan.

But in both cases, it's expected that I could successfully detoast
either datum if I did so right there on the spot, as that's the usual
convention, right? So at that moment, something in the set of
registered || active snapshots is protecting the tuples I need.

If it's impractical to determine which snapshot is needed (or just
enough work to obviate any benefit of lazy detoasting), I wonder if
there's at least a cheap way to check a particular snapshot
for suitability wrt a given toast pointer. Check a couple usual
suspects, find one most of the time, fall back to eager detoasting
otherwise?

Guess I need to go back for a deeper dive on just what constitutes
a toast pointer. I was skimming last time

-Chap



Re: lazy detoasting

2018-04-10 Thread Robert Haas
On Tue, Apr 10, 2018 at 11:26 AM, Chapman Flack  wrote:
> Out of the six GetFooSnapshot()s, would I want to squirrel away
> Active? Oldest? Transaction?

I suspect you want, or maybe need, to use the same snapshot as the
scan that retrieved the tuple containing the toasted datum.

(This advice may be worth what you paid for it.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: lazy detoasting

2018-04-10 Thread Chapman Flack
On 04/10/2018 10:06 AM, Tom Lane wrote:
> Chapman Flack  writes:

>> Am I right in thinking that, for my original purpose of
>> detoasting something later in a transaction, all that matters
>> is that I registered a snapshot from the time at which I copied
>> the toasted datum, and the resource owner I registered it to
>> has not been released yet, ...
> 
> I believe so.

Ok.

I see I have about half a dozen choices for the snapshot that
I choose to squirrel away at the same time as the copied datum.
(That's not counting SnapshotToast, which I didn't know was a thing
until Pavan's thread this morning, but it's not a thing I can get,
just something constructed on the fly in the tuptoaster.)

Out of the six GetFooSnapshot()s, would I want to squirrel away
Active? Oldest? Transaction? This should be happening in a PL
function not doing anything arcane; the datum in question might
be passed to it as a parameter or retrieved from a query done
within the function.

GetOldestTransaction() is what the tuptoaster will eventually use
to construct SnapshotToast when looking for the data, but it's
probably overkill for me to save the oldest one in sight at the
time I save the datum. Or is it? Should I be confident that, at
the time I'm handed this datum, its toasted content must be
retrievable through the (Active? Transaction?) snapshot at that
moment, and it's enough to register that, while to register the
Oldest would be to pin something unnecessarily far back?

> Wouldn't be a bad idea to test this, of course ;-)

Mm-hmm. (Thunderbird has just corrected my spelling of mm-hmm.)
I'm still not sure I know enough to construct a meaningful test...

I assume that, while I'm doing this for a backend PL at the
moment, some of the same considerations will apply if a future
wire protocol is to support Craig's "v4 protocol TODO item -
Lazy fetch/stream of TOASTed valued" suggestion in
https://www.postgresql.org/message-id/53ff0ef8@2ndquadrant.com

-Chap



Re: lazy detoasting

2018-04-10 Thread Tom Lane
Chapman Flack  writes:
> I'm becoming increasingly glad I asked (or less embarrassed that I hadn't
> figured it all out yet).  :)

> Am I right in thinking that, for my original purpose of detoasting something
> later in a transaction, all that matters is that I registered a snapshot
> from the time at which I copied the toasted datum, and the resource owner
> I registered it to has not been released yet, so rows referred to in the
> snapshot haven't been vacuumed away? Is that a sufficient condition for
> detoast to work?

I believe so.

> Or would I need to do something more, like push and pop that snapshot
> around the detoast call?

You shouldn't need that; toast reads intentionally use a non-MVCC-aware
"snapshot" to handle exactly this type of situation, where somebody is
trying to pull data out of a tuple that would no longer be visible to
the transaction's current snapshot.

Wouldn't be a bad idea to test this, of course ;-)

regards, tom lane



Re: lazy detoasting

2018-04-10 Thread Chapman Flack
On 04/10/18 00:30, Andrew Gierth wrote:

> That's not precisely true - ultimately, the routines that do actual
> scans take the snapshot to use as a parameter, and the executor mostly
> references the snapshot from the EState; but a bunch of places do
> require that ActiveSnapshot be set to the currently applicable snapshot
> (e.g. for queries inside stable functions nested inside the main query).

I'm becoming increasingly glad I asked (or less embarrassed that I hadn't
figured it all out yet).  :)

Am I right in thinking that, for my original purpose of detoasting something
later in a transaction, all that matters is that I registered a snapshot
from the time at which I copied the toasted datum, and the resource owner
I registered it to has not been released yet, so rows referred to in the
snapshot haven't been vacuumed away? Is that a sufficient condition for
detoast to work?

Or would I need to do something more, like push and pop that snapshot
around the detoast call?

This would be in a PL function (or the handler for the PL function),
if the context matters.

-Chap



Re: lazy detoasting

2018-04-09 Thread Andrew Gierth
> "Chapman" == Chapman Flack  writes:

 Chapman> AFAICS, that is *all* that comment block has to say about why
 Chapman> there's an active snapshot stack. I believe you are saying it
 Chapman> has another important function, namely that its top element is
 Chapman> what tells the executor what can be seen.

That's not precisely true - ultimately, the routines that do actual
scans take the snapshot to use as a parameter, and the executor mostly
references the snapshot from the EState; but a bunch of places do
require that ActiveSnapshot be set to the currently applicable snapshot
(e.g. for queries inside stable functions nested inside the main query).

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-04-09 Thread Chapman Flack
On 04/08/2018 02:01 AM, Andrew Gierth wrote:

>  Chapman> (d) some other reason I haven't thought of ?
> 
> It has to be pushed as the active snapshot so that it's the snapshot
> that the executor uses to run the query to populate the tuplestore which
> becomes the "held" portal content.

That seems closest to my (b) guess.

> I think you're confusing the stack of active snapshots with the registry
> of snapshots. The snapshot of a portal that's not currently running in
> the executor won't be on the stack, but it will be registered (which is
> enough to maintain the session's reported xmin, which is what prevents
> visible data from being vacuumed away).

That's why I was asking. The first paragraph in snapmgr.c seems to say
that the registry and the active stack both exist as ways to keep the
snapshot alive, with reachability from either one being sufficient:

* We keep track of snapshots in two ways: those "registered" by
* resowner.c, and the "active snapshot" stack.  All snapshots in
* either of them live in persistent memory.  When a snapshot is
* no longer in any of these lists (tracked by separate refcounts
* on each snapshot), its memory can be freed.

AFAICS, that is *all* that comment block has to say about why there's
an active snapshot stack. I believe you are saying it has another
important function, namely that its top element is what tells the
executor what can be seen. That makes sense, and to clarify that
was why I was asking.

I suppose the reason that's not mentioned in the comments is that it
was so obviously the ultimate purpose of the whole scheme that nobody
writing or reviewing the comments could imagine not knowing it. :)

-Chap



Re: lazy detoasting

2018-04-08 Thread Andrew Gierth
> "Chapman" == Chapman Flack  writes:

 Chapman> Please bear with me as I check my understanding of snapshot
 Chapman> management by looking at PersistHoldablePortal(). There's a
 Chapman> PushActiveSnapshot(queryDesc->snapshot) in there. Is that
 Chapman> because:

 Chapman> (d) some other reason I haven't thought of ?

It has to be pushed as the active snapshot so that it's the snapshot
that the executor uses to run the query to populate the tuplestore which
becomes the "held" portal content.

I think you're confusing the stack of active snapshots with the registry
of snapshots. The snapshot of a portal that's not currently running in
the executor won't be on the stack, but it will be registered (which is
enough to maintain the session's reported xmin, which is what prevents
visible data from being vacuumed away).

-- 
Andrew (irc:RhodiumToad)



Re: lazy detoasting

2018-04-07 Thread Amit Kapila
On Thu, Apr 5, 2018 at 8:04 AM, Chapman Flack  wrote:
> On 04/01/18 01:19, Tom Lane wrote:
>> Chapman Flack  writes:
>>> If I copy an out-of-line, on-disk TOAST pointer into a memory context
>>> with transaction lifetime, with an eye to detoasting it later in the
>>> same transaction, are there circumstances where it wouldn't work?
>>
>> Should be safe *as long as you hold onto a snapshot that can see the
>> toast value's parent row*.  Otherwise, it's not.
>>
>> For a counterexample, see the changes we had to make to avoid depending
>> on out-of-line toast values in the catcaches, commit 08e261cbc.
>
> Thanks. I think I see two approaches happening in that commit: retaining
> a snapshot, if the tuples will be used within the transaction, or eagerly
> detoasting, when persisting a holdable portal so tuples can be used after
> the transaction.
>
> Please bear with me as I check my understanding of snapshot management
> by looking at PersistHoldablePortal(). There's a
> PushActiveSnapshot(queryDesc->snapshot) in there. Is that because:
>
> (a) that snapshot might not be on the active stack at this point, and it
> needs to be be there to keep it alive during this executor re-run, or
>
> (b) it's known to be somewhere on the active stack already, but not
> necessarily on top, and needs to be pushed so it is topmost while
> re-running this executor (does the top snapshot on the active stack
> affect which tuples the executor sees? or is this stack's only purpose
> to keep snapshots alive?), or
>
> (c) it's known to be somewhere on the stack already, but needs to be
> be pushed to make sure some stack-depth invariant is preserved
> (perhaps because of an implied pop in case of an error), or
>
> (d) some other reason I haven't thought of ?
>
> I *think* I'm smart enough to rule out choice (a), because it wouldn't
> make sense to have queryDesc->snapshot refer to a snapshot that isn't
> already on the stack (unless it's also been registered, in which case,
> why would it need to be pushed on the stack too?), as then I would be
> reckless to assume it's still alive to *be* pushed. Am I close?
>

I think it is somewhat close to what you have mentioned in (b).
Basically, it will help executor to use that snapshot for tuple
visibility.

> I haven't yet gone to track down the code that assigns a snapshot to
> queryDesc->snapshot.
>

See CreateQueryDesc().


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: lazy detoasting

2018-03-31 Thread Tom Lane
Chapman Flack  writes:
> If I copy an out-of-line, on-disk TOAST pointer into a memory context
> with transaction lifetime, with an eye to detoasting it later in the
> same transaction, are there circumstances where it wouldn't work?

Should be safe *as long as you hold onto a snapshot that can see the
toast value's parent row*.  Otherwise, it's not.

For a counterexample, see the changes we had to make to avoid depending
on out-of-line toast values in the catcaches, commit 08e261cbc.

regards, tom lane



lazy detoasting

2018-03-31 Thread Chapman Flack
Hi,

If I copy an out-of-line, on-disk TOAST pointer into a memory context
with transaction lifetime, with an eye to detoasting it later in the
same transaction, are there circumstances where it wouldn't work?

Thanks,
-Chap