Re: [HACKERS] stack usage in toast_insert_or_update()

2007-02-01 Thread Pavan Deolasee

On 1/31/07, Tom Lane [EMAIL PROTECTED] wrote:



We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I
think that it would be safe to remove the MAXALIGN'ing of the tuple
size in the tests in heapam.c, that is



That would mean that the tuple size in the heap may exceed
TOAST_TUPLE_THRESHOLD which should be OK, but we
may want to have a comment explaining that.

We should do the same for heap_update() as well.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-02-01 Thread Jan Wieck

On 1/31/2007 12:41 PM, Tom Lane wrote:

Pavan Deolasee [EMAIL PROTECTED] writes:

On 1/31/07, Tom Lane [EMAIL PROTECTED] wrote:

The toast code takes pains to ensure that the tuples it creates won't be
subject to re-toasting.  Else it'd be an infinite recursion.



I think I found it. The toast_insert_or_update() function gets into an
unnecessary recursion because of alignment issues. It thus toasts
already toasted data.  This IMHO might be causing unnecessary
overheads for each toast operation.


Interesting --- I'd never seen this because both of my usual development
machines have MAXALIGN 8, and it works out that that makes
TOAST_MAX_CHUNK_SIZE 1986, which makes the actual toasted tuple size
2030, which maxaligns to 2032, which is still less than
TOAST_TUPLE_THRESHOLD.  I think the coding was implicitly assuming that
TOAST_TUPLE_THRESHOLD would itself be a maxalign'd value, but it's not
necessarily (and in fact not, with the current page header size ---
I wonder whether the bug was originally masked because the page header
size was different??)

We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I
think that it would be safe to remove the MAXALIGN'ing of the tuple
size in the tests in heapam.c, that is

if (HeapTupleHasExternal(tup) ||
(MAXALIGN(tup-t_len)  TOAST_TUPLE_THRESHOLD))
heaptup = toast_insert_or_update(relation, tup, NULL);
else
heaptup = tup;

becomes

if (HeapTupleHasExternal(tup) ||
(tup-t_len  TOAST_TUPLE_THRESHOLD))
heaptup = toast_insert_or_update(relation, tup, NULL);
else
heaptup = tup;

which'll save a cycle or two as well as avoid this corner case.
It seems like a number of the uses of MAXALIGN in tuptoaster.c
are useless/bogus as well.  Comments?


Can't we maxalign the page header in the calculations?


Jan



regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq



--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-02-01 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 On 1/31/2007 12:41 PM, Tom Lane wrote:
 We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I
 think that it would be safe to remove the MAXALIGN'ing of the tuple
 size in the tests in heapam.c, that is

 Can't we maxalign the page header in the calculations?

Actually, the page header size *is* maxaligned.  The problem is that
TOAST_TUPLE_THRESHOLD isn't.

We could make it so, but that would change TOAST_MAX_CHUNK_SIZE thus
forcing an initdb.  I think simply removing the MAXALIGN operations
in the code is a better answer, as it eliminates some rather pointless
overhead.  There's no particular reason why the threshold needs to be
maxaligned ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-31 Thread Pavan Deolasee

On 1/31/07, Tom Lane [EMAIL PROTECTED] wrote:


Pavan Deolasee [EMAIL PROTECTED] writes:
 Btw, I noticed that the toast_insert_or_update() is re-entrant.
 toast_save_datum() calls simple_heap_insert() which somewhere down the
 line calls toast_insert_or_update() again.

The toast code takes pains to ensure that the tuples it creates won't be
subject to re-toasting.  Else it'd be an infinite recursion.



I think I found it. The toast_insert_or_update() function gets into an
unnecessary
recursion because of alignment issues. It thus toasts already toasted data.
This
IMHO might be causing unnecessary overheads for each toast operation.

The default value of TOAST_TUPLE_THRESHOLD is 2034 (assuming 8K block size)

TOAST_MAX_CHUNK_SIZE is defined as below:

#define TOAST_MAX_CHUNK_SIZE(TOAST_TUPLE_THRESHOLD -\
MAXALIGN(   \
   MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) +   \
   sizeof(Oid) +   \
   sizeof(int32) + \
   VARHDRSZ))

So the default value of TOAST_MAX_CHUNK_SIZE is set to 1994.

When toast_insert_or_update() returns a tuple for the first chunk, t_len
is set to 2034 (TOAST_MAX_CHUNK_SIZE + tuple header + chunk_id
+ chunk_seqno + VARHDRSZ)

In heap_insert(), we MAXALIGN(tup-t_len) before comparing it with
TOAST_TUPLE_THRESHOLD to decide whether to invoke TOAST or not.
In this corner case, MAXALIGN(2034) = 2036  TOAST_TUPLE_THRESHOLD
and so TOAST is invoked again.

Fortunately, we don't get into infinite recursion because reltoastrelid is
set to
InvalidOid for toast tables and hence TOASTing is not invoked in the second
call.

Attached is a patch which would print the recursion depth for
toast_insert_or_update() before PANICing the server to help us
examine the core.

Let me know if this sounds like an issue and  I can work out a patch.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-31 Thread Pavan Deolasee

On 1/31/07, Pavan Deolasee [EMAIL PROTECTED] wrote:



Attached is a patch which would print the recursion depth for
toast_insert_or_update() before PANICing the server to help us
examine the core.



Here is the attachment.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


toast.patch
Description: Binary data

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-31 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 On 1/31/07, Tom Lane [EMAIL PROTECTED] wrote:
 The toast code takes pains to ensure that the tuples it creates won't be
 subject to re-toasting.  Else it'd be an infinite recursion.

 I think I found it. The toast_insert_or_update() function gets into an
 unnecessary recursion because of alignment issues. It thus toasts
 already toasted data.  This IMHO might be causing unnecessary
 overheads for each toast operation.

Interesting --- I'd never seen this because both of my usual development
machines have MAXALIGN 8, and it works out that that makes
TOAST_MAX_CHUNK_SIZE 1986, which makes the actual toasted tuple size
2030, which maxaligns to 2032, which is still less than
TOAST_TUPLE_THRESHOLD.  I think the coding was implicitly assuming that
TOAST_TUPLE_THRESHOLD would itself be a maxalign'd value, but it's not
necessarily (and in fact not, with the current page header size ---
I wonder whether the bug was originally masked because the page header
size was different??)

We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I
think that it would be safe to remove the MAXALIGN'ing of the tuple
size in the tests in heapam.c, that is

if (HeapTupleHasExternal(tup) ||
(MAXALIGN(tup-t_len)  TOAST_TUPLE_THRESHOLD))
heaptup = toast_insert_or_update(relation, tup, NULL);
else
heaptup = tup;

becomes

if (HeapTupleHasExternal(tup) ||
(tup-t_len  TOAST_TUPLE_THRESHOLD))
heaptup = toast_insert_or_update(relation, tup, NULL);
else
heaptup = tup;

which'll save a cycle or two as well as avoid this corner case.
It seems like a number of the uses of MAXALIGN in tuptoaster.c
are useless/bogus as well.  Comments?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-30 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 The stack usage for toast_insert_or_update() may run into several KBs since
 the MaxHeapAttributeNumber is set to a very large value of 1600. The usage
 could anywhere between 28K to 48K depending on alignment and whether its a
 32-bit or a 64-bit machine.

So?  The routine is not re-entrant so I don't see that the stack space
is a big problem.  It's coded that way to avoid palloc/pfree cycles...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-30 Thread Pavan Deolasee

On 1/30/07, Tom Lane [EMAIL PROTECTED] wrote:


Pavan Deolasee [EMAIL PROTECTED] writes:
 The stack usage for toast_insert_or_update() may run into several KBs
since
 the MaxHeapAttributeNumber is set to a very large value of 1600. The
usage
 could anywhere between 28K to 48K depending on alignment and whether its
a
 32-bit or a 64-bit machine.

So?  The routine is not re-entrant so I don't see that the stack space
is a big problem.  It's coded that way to avoid palloc/pfree cycles...



I always thought that it would be costlier to have a repeated stack
allocation/deallocation
of many KBs than dynamically allocating a small percentage of that. But I
might be wrong.
In fact, a small test I ran showed that mallloc/free is more costly. So may
be are
good.

Btw, I noticed that the toast_insert_or_update() is re-entrant.
toast_save_datum()
calls simple_heap_insert() which somewhere down the line calls
toast_insert_or_update() again. It looks a bit surprising, haven't look into
detail
though.

Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] stack usage in toast_insert_or_update()

2007-01-30 Thread Tom Lane
Pavan Deolasee [EMAIL PROTECTED] writes:
 Btw, I noticed that the toast_insert_or_update() is re-entrant.
 toast_save_datum() calls simple_heap_insert() which somewhere down the
 line calls toast_insert_or_update() again.

The toast code takes pains to ensure that the tuples it creates won't be
subject to re-toasting.  Else it'd be an infinite recursion.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq