Added to TODO:
* Consider increasing internal areas when shared buffers is increased
http://archives.postgresql.org/pgsql-hackers/2005-10/msg01419.php
---
Alvaro Herrera wrote:
Jim C. Nasby
Jim C. Nasby wrote:
BTW, is MEMORY_CONTEXT_CHECKING that expensive? It seems like it
shouldn't be, but I'm only guessing at what exactly it does...
Yes, because not only it checks marker bytes at the end of palloc chunks
and similar stuff, but it also overwrites whole contexts with 0x7f when
Alvaro Herrera [EMAIL PROTECTED] writes:
May I propose to make Assert() yield only WARNINGs,
That is a horrid idea --- for one thing, it means that Asserts inside
the elog machinery itself would be instant infinite recursion, and even
elsewhere you'd have to think a bit about whether it's ok to
On Thu, Nov 03, 2005 at 11:11:42AM -0500, Tom Lane wrote:
Perhaps rather than an all-or-nothing debug_assertions GUC variable,
what we want is something that turns on or off expensive assertion
checks at runtime. This could include MEMORY_CONTEXT_CHECKING and
anything else where the actual
Jim C. Nasby [EMAIL PROTECTED] writes:
Seriously, I am wondering about the performance hit of always checking
debug_assertions.
http://archives.postgresql.org/pgsql-hackers/2005-08/msg00389.php
indicates that even with debug_assertions=false, --enable-cassert has a
big performance impact.
Tom Lane [EMAIL PROTECTED] writes:
Greg Stark [EMAIL PROTECTED] writes:
I happen to think that except for the rare assertion that has major
performance impact all the assertions should be on in production builds. The
goal of assertions is to catch corruption quickly and that's something
On Wed, Nov 02, 2005 at 07:03:57AM -0500, Greg Stark wrote:
I would bet that ninety percent of the Asserts in the existing code are on
conditions that could represent, at worst, corruption of backend-local or
even transaction-local data structures. Taking down the entire database
cluster
Jim C. Nasby [EMAIL PROTECTED] writes:
BTW, that's a reversal from what I was originally arguing for, which was
due to the performance penalty associated with --enable-cassert. My
client is now running with Tom's suggestion of commenting out
CLOBBER_FREED_MEMORY and MEMORY_CONTEXT_CHECKING and
Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
AFAIK they're not using subtransactions at all, but I'll check.
Well, yeah, they are ... else you'd never have seen this failure.
Maybe it's in plpgsql EXCEPTION clauses.
--
Alvaro Herrera Valdivia, Chile ICBM: S 39ยบ 49' 17.7,
On Tue, Nov 01, 2005 at 11:23:55AM -0300, Alvaro Herrera wrote:
Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
AFAIK they're not using subtransactions at all, but I'll check.
Well, yeah, they are ... else you'd never have seen this failure.
Maybe it's in plpgsql EXCEPTION
On Tue, Nov 01, 2005 at 11:23:55AM -0300, Alvaro Herrera wrote:
Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
AFAIK they're not using subtransactions at all, but I'll check.
Well, yeah, they are ... else you'd never have seen this failure.
Maybe it's in plpgsql EXCEPTION
I wrote:
Even though the bug seems completely clog.c's fault,
s/clog.c/slru.c/ of course :-(
regards, tom lane
---(end of broadcast)---
TIP 6: explain analyze is your friend
Jim C. Nasby [EMAIL PROTECTED] writes:
Doesn't clog use the same code?
Yeah, but all three of your examples were referencing pg_subtrans,
as proven by the stack traces and the contents of the shared control
block.
Even though the bug seems completely clog.c's fault, this is not a
coincidence:
Bruce Momjian pgman@candle.pha.pa.us writes:
Jim C. Nasby wrote:
My assumption is that the asserts that are currently in place fall into
one of two categories: either they check for something that if false
could result in data corruption in the heap, or they check for something
that
Greg Stark [EMAIL PROTECTED] writes:
I happen to think that except for the rare assertion that has major
performance impact all the assertions should be on in production builds. The
goal of assertions is to catch corruption quickly and that's something that's
just as important in production as
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
I'd like Jim to test this theory by seeing if it helps to reverse the
order of the if-test elements at lines 294/295, ie make it look like
if (shared-page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
Sorry, two more things...
Will increasing shared_buffers make this less likely to occur? Or is
this just something that's likely to happen when there are things like
seqscans that are putting buffers near the front of the LRU? (The 8.0.3
buffer manager does something like that, right?)
Is this
Jim C. Nasby wrote:
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
I'd like Jim to test this theory by seeing if it helps to reverse the
order of the if-test elements at lines 294/295, ie make it look like
if (shared-page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
Jim C. Nasby [EMAIL PROTECTED] writes:
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
This won't do as a permanent patch, because it isn't guaranteed to fix
the problem on machines that don't strongly order writes, but it should
work on Opterons, at least well enough to confirm the
On Mon, Oct 31, 2005 at 01:05:06PM -0500, Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
This won't do as a permanent patch, because it isn't guaranteed to fix
the problem on machines that don't strongly order writes, but it
Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
This won't do as a permanent patch, because it isn't guaranteed to fix
the problem on machines that don't strongly order writes, but it should
work on Opterons, at least well
On Mon, Oct 31, 2005 at 01:01:14PM -0500, Bruce Momjian wrote:
This incident has made me wonder if it's worth creating two classes of
assertions. The (hopefully more common) set of assertions would be for
things that shouldn't happen, but if go un-caught won't result in heap
corruption. A
Jim C. Nasby wrote:
On Mon, Oct 31, 2005 at 01:01:14PM -0500, Bruce Momjian wrote:
This incident has made me wonder if it's worth creating two classes of
assertions. The (hopefully more common) set of assertions would be for
things that shouldn't happen, but if go un-caught won't result
On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote:
There is no way if the system has some incorrect value whether that
would later corrupt the data or not. Anything the system does that it
shouldn't do is a potential corruption problem.
But is it safe to say that there are areas
On 10/31/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote:
There is no way if the system has some incorrect value whether that
would later corrupt the data or not. Anything the system does that it
shouldn't do is a potential corruption
Now that I've got a little better idea of what this code does, I've
noticed something interesting... this issue is happening on an 8-way
machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
greatly increase the odds of buffer conflicts? Bug aside, would it be
better to set
Jim C. Nasby wrote:
Now that I've got a little better idea of what this code does, I've
noticed something interesting... this issue is happening on an 8-way
machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
greatly increase the odds of buffer conflicts? Bug aside, would it
On Mon, Oct 31, 2005 at 09:02:59PM -0300, Alvaro Herrera wrote:
Jim C. Nasby wrote:
Now that I've got a little better idea of what this code does, I've
noticed something interesting... this issue is happening on an 8-way
machine, and NUM_SLRU_BUFFERS is currently defined at 8. Doesn't this
Jim C. Nasby [EMAIL PROTECTED] writes:
AFAIK they're not using subtransactions at all, but I'll check.
Well, yeah, they are ... else you'd never have seen this failure.
regards, tom lane
---(end of broadcast)---
TIP 9: In
OK, I think I see it. The problem is that the code in slru.c is careful
about not modifying state when it doesn't hold the proper lock, but not
so careful about not *inspecting* state without the proper lock. In
particular consider these lines in SimpleLruReadPage (line numbers are
as in CVS
30 matches
Mail list logo