Re: [HACKERS] stack usage in toast_insert_or_update()
Jan Wieck <[EMAIL PROTECTED]> writes: > On 1/31/2007 12:41 PM, Tom Lane wrote: >> We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I >> think that it would be safe to remove the MAXALIGN'ing of the tuple >> size in the tests in heapam.c, that is > Can't we maxalign the page header in the calculations? Actually, the page header size *is* maxaligned. The problem is that TOAST_TUPLE_THRESHOLD isn't. We could make it so, but that would change TOAST_MAX_CHUNK_SIZE thus forcing an initdb. I think simply removing the MAXALIGN operations in the code is a better answer, as it eliminates some rather pointless overhead. There's no particular reason why the threshold needs to be maxaligned ... regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] stack usage in toast_insert_or_update()
On 1/31/2007 12:41 PM, Tom Lane wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: On 1/31/07, Tom Lane <[EMAIL PROTECTED]> wrote: The toast code takes pains to ensure that the tuples it creates won't be subject to re-toasting. Else it'd be an infinite recursion. I think I found it. The toast_insert_or_update() function gets into an unnecessary recursion because of alignment issues. It thus toasts already toasted data. This IMHO might be causing unnecessary overheads for each toast operation. Interesting --- I'd never seen this because both of my usual development machines have MAXALIGN 8, and it works out that that makes TOAST_MAX_CHUNK_SIZE 1986, which makes the actual toasted tuple size 2030, which maxaligns to 2032, which is still less than TOAST_TUPLE_THRESHOLD. I think the coding was implicitly assuming that TOAST_TUPLE_THRESHOLD would itself be a maxalign'd value, but it's not necessarily (and in fact not, with the current page header size --- I wonder whether the bug was originally masked because the page header size was different??) We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I think that it would be safe to remove the MAXALIGN'ing of the tuple size in the tests in heapam.c, that is if (HeapTupleHasExternal(tup) || (MAXALIGN(tup->t_len) > TOAST_TUPLE_THRESHOLD)) heaptup = toast_insert_or_update(relation, tup, NULL); else heaptup = tup; becomes if (HeapTupleHasExternal(tup) || (tup->t_len > TOAST_TUPLE_THRESHOLD)) heaptup = toast_insert_or_update(relation, tup, NULL); else heaptup = tup; which'll save a cycle or two as well as avoid this corner case. It seems like a number of the uses of MAXALIGN in tuptoaster.c are useless/bogus as well. Comments? Can't we maxalign the page header in the calculations? Jan regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] stack usage in toast_insert_or_update()
On 1/31/07, Tom Lane <[EMAIL PROTECTED]> wrote: We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I think that it would be safe to remove the MAXALIGN'ing of the tuple size in the tests in heapam.c, that is That would mean that the tuple size in the heap may exceed TOAST_TUPLE_THRESHOLD which should be OK, but we may want to have a comment explaining that. We should do the same for heap_update() as well. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] stack usage in toast_insert_or_update()
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > On 1/31/07, Tom Lane <[EMAIL PROTECTED]> wrote: >> The toast code takes pains to ensure that the tuples it creates won't be >> subject to re-toasting. Else it'd be an infinite recursion. > I think I found it. The toast_insert_or_update() function gets into an > unnecessary recursion because of alignment issues. It thus toasts > already toasted data. This IMHO might be causing unnecessary > overheads for each toast operation. Interesting --- I'd never seen this because both of my usual development machines have MAXALIGN 8, and it works out that that makes TOAST_MAX_CHUNK_SIZE 1986, which makes the actual toasted tuple size 2030, which maxaligns to 2032, which is still less than TOAST_TUPLE_THRESHOLD. I think the coding was implicitly assuming that TOAST_TUPLE_THRESHOLD would itself be a maxalign'd value, but it's not necessarily (and in fact not, with the current page header size --- I wonder whether the bug was originally masked because the page header size was different??) We can't change TOAST_MAX_CHUNK_SIZE without forcing an initdb, but I think that it would be safe to remove the MAXALIGN'ing of the tuple size in the tests in heapam.c, that is if (HeapTupleHasExternal(tup) || (MAXALIGN(tup->t_len) > TOAST_TUPLE_THRESHOLD)) heaptup = toast_insert_or_update(relation, tup, NULL); else heaptup = tup; becomes if (HeapTupleHasExternal(tup) || (tup->t_len > TOAST_TUPLE_THRESHOLD)) heaptup = toast_insert_or_update(relation, tup, NULL); else heaptup = tup; which'll save a cycle or two as well as avoid this corner case. It seems like a number of the uses of MAXALIGN in tuptoaster.c are useless/bogus as well. Comments? regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] stack usage in toast_insert_or_update()
On 1/31/07, Pavan Deolasee <[EMAIL PROTECTED]> wrote: Attached is a patch which would print the recursion depth for toast_insert_or_update() before PANICing the server to help us examine the core. Here is the attachment. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com toast.patch Description: Binary data ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] stack usage in toast_insert_or_update()
On 1/31/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: > Btw, I noticed that the toast_insert_or_update() is re-entrant. > toast_save_datum() calls simple_heap_insert() which somewhere down the > line calls toast_insert_or_update() again. The toast code takes pains to ensure that the tuples it creates won't be subject to re-toasting. Else it'd be an infinite recursion. I think I found it. The toast_insert_or_update() function gets into an unnecessary recursion because of alignment issues. It thus toasts already toasted data. This IMHO might be causing unnecessary overheads for each toast operation. The default value of TOAST_TUPLE_THRESHOLD is 2034 (assuming 8K block size) TOAST_MAX_CHUNK_SIZE is defined as below: #define TOAST_MAX_CHUNK_SIZE(TOAST_TUPLE_THRESHOLD -\ MAXALIGN( \ MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) + \ sizeof(Oid) + \ sizeof(int32) + \ VARHDRSZ)) So the default value of TOAST_MAX_CHUNK_SIZE is set to 1994. When toast_insert_or_update() returns a tuple for the first chunk, t_len is set to 2034 (TOAST_MAX_CHUNK_SIZE + tuple header + chunk_id + chunk_seqno + VARHDRSZ) In heap_insert(), we MAXALIGN(tup->t_len) before comparing it with TOAST_TUPLE_THRESHOLD to decide whether to invoke TOAST or not. In this corner case, MAXALIGN(2034) = 2036 > TOAST_TUPLE_THRESHOLD and so TOAST is invoked again. Fortunately, we don't get into infinite recursion because reltoastrelid is set to InvalidOid for toast tables and hence TOASTing is not invoked in the second call. Attached is a patch which would print the recursion depth for toast_insert_or_update() before PANICing the server to help us examine the core. Let me know if this sounds like an issue and I can work out a patch. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] stack usage in toast_insert_or_update()
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > Btw, I noticed that the toast_insert_or_update() is re-entrant. > toast_save_datum() calls simple_heap_insert() which somewhere down the > line calls toast_insert_or_update() again. The toast code takes pains to ensure that the tuples it creates won't be subject to re-toasting. Else it'd be an infinite recursion. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] stack usage in toast_insert_or_update()
On 1/30/07, Tom Lane <[EMAIL PROTECTED]> wrote: "Pavan Deolasee" <[EMAIL PROTECTED]> writes: > The stack usage for toast_insert_or_update() may run into several KBs since > the MaxHeapAttributeNumber is set to a very large value of 1600. The usage > could anywhere between 28K to 48K depending on alignment and whether its a > 32-bit or a 64-bit machine. So? The routine is not re-entrant so I don't see that the stack space is a big problem. It's coded that way to avoid palloc/pfree cycles... I always thought that it would be costlier to have a repeated stack allocation/deallocation of many KBs than dynamically allocating a small percentage of that. But I might be wrong. In fact, a small test I ran showed that mallloc/free is more costly. So may be are good. Btw, I noticed that the toast_insert_or_update() is re-entrant. toast_save_datum() calls simple_heap_insert() which somewhere down the line calls toast_insert_or_update() again. It looks a bit surprising, haven't look into detail though. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] stack usage in toast_insert_or_update()
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > The stack usage for toast_insert_or_update() may run into several KBs since > the MaxHeapAttributeNumber is set to a very large value of 1600. The usage > could anywhere between 28K to 48K depending on alignment and whether its a > 32-bit or a 64-bit machine. So? The routine is not re-entrant so I don't see that the stack space is a big problem. It's coded that way to avoid palloc/pfree cycles... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings