Robert Haas writes:
> On Tue, Jul 5, 2016 at 11:54 AM, Tom Lane wrote:
>> I'm pretty nervous about reducing that materially without any
>> investigation into how much of the slop we actually use.
> To me it seems like using anything based on stack_rlimit/2 is pretty
> risky for the reason that you state, but I also think that not being
> able to start the database at all on some platforms with small stacks
> is bad.
My point was that this is something we should investigate, not just
guess about.
I did some experimentation using the attached quick-kluge patch, which
(1) causes each exiting server process to report its actual ending stack
size, and (2) hacks the STACK_DEPTH_SLOP test so that you can set
max_stack_depth considerably higher than what getrlimit(2) claims.
Unfortunately the way I did (1) only works on systems with pmap; I'm not
sure how to make it more portable.
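One possible alternative to shelling out to pmap is to read the stack mapping directly from /proc. This is only a sketch, and it assumes a Linux-style /proc/self/maps, so it is no more portable than pmap itself. The function name stack_mapping_kb is hypothetical and is not part of the attached patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the attached patch): return the size in kB
 * of the calling process's "[stack]" mapping, or -1 if it cannot be
 * determined.  Assumes a Linux-style /proc/self/maps.
 */
long
stack_mapping_kb(void)
{
    FILE       *fp = fopen("/proc/self/maps", "r");
    char        line[512];
    long        result = -1;

    if (fp == NULL)
        return -1;              /* no /proc here, give up */

    while (fgets(line, sizeof(line), fp) != NULL)
    {
        unsigned long start;
        unsigned long end;

        /* Each entry starts "start-end perms ..."; the main stack is tagged "[stack]" */
        if (strstr(line, "[stack]") != NULL &&
            sscanf(line, "%lx-%lx", &start, &end) == 2)
        {
            result = (long) ((end - start) / 1024);
            break;
        }
    }

    fclose(fp);
    return result;
}
```

A caller could print the result to stderr at process exit, for example with fprintf(stderr, "stack: %ldK\n", stack_mapping_kb()), which avoids forking a shell from a process that is shutting down.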
My results on an x86_64 RHEL6 system were pretty interesting:
1. All but two of the regression test scripts have ending stack sizes
of 188K to 196K. There is one outlier at 296K (most likely the regex
test, though I did not stop to confirm that) and then there's the
errors.sql test, which intentionally provokes a "stack too deep" failure
and will therefore consume approximately max_stack_depth stack if it can.
2. With the RHEL6 default "ulimit -s" setting of 10240kB, you actually
have to increase max_stack_depth to 12275kB before you get a crash in
errors.sql. At the highest passing value, 12274kB, pmap says we end
with
    1 00007ffc51f6e000 12284K rw--- [ stack ]
which is just shy of 2MB more than the alleged limit. I conclude that
at least in this kernel version, the kernel doesn't complain until your
stack would be 2MB *more* than the ulimit -s value.
That result also says that at least for that particular test, the
value of STACK_DEPTH_SLOP could be as little as 10K without a crash,
even without this surprising kernel forgiveness. But of course that
test isn't really pushing the slop factor, since it's only compiling a
trivial expression at each recursion depth.
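The arithmetic behind those two figures can be spelled out as below. The constants are the numbers reported in this message; the macro and function names are made up for illustration only.

```c
/* Figures reported above, all in kB */
#define ULIMIT_S_KB         10240L  /* RHEL6 default "ulimit -s" */
#define HIGHEST_PASSING_KB  12274L  /* largest max_stack_depth that survived errors.sql */
#define PMAP_STACK_KB       12284L  /* stack mapping size reported by pmap */

/* How far the stack grew past the nominal kernel limit */
long
kernel_overshoot_kb(void)
{
    return PMAP_STACK_KB - ULIMIT_S_KB;         /* 2044K, just under 2MB (2048K) */
}

/* How much stack that run used beyond max_stack_depth */
long
observed_slop_kb(void)
{
    return PMAP_STACK_KB - HIGHEST_PASSING_KB;  /* 10K */
}
```

The 10K figure is only approximate as a measure of slop, since the mapping size reported by pmap need not line up exactly with the point from which the server measures its stack depth.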
Given these results I definitely wouldn't have a problem with reducing
STACK_DEPTH_SLOP to 200K, and you could possibly talk me down to less,
at least on x86_64. Other architectures might be more stack-hungry, though.
I'm particularly worried about IA64 --- I wonder if anyone can perform
these same experiments on that?
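To make the trade-off concrete, here is a simplified model of how the slop limits max_stack_depth for a given kernel stack limit. It is a sketch under stated assumptions, not the actual server code: the function names and the SLOP_CURRENT / SLOP_PROPOSED macros are hypothetical, and only the 512K and 200K values come from this message and the patch.

```c
#include <stdbool.h>

/* Slop values discussed in this message, in bytes */
#define SLOP_CURRENT   (512 * 1024L)    /* present STACK_DEPTH_SLOP */
#define SLOP_PROPOSED  (200 * 1024L)    /* value floated above */

/*
 * Hypothetical helper: largest max_stack_depth (in kB) that would be
 * accepted for a given kernel stack limit and slop, both in bytes.
 */
long
max_allowed_depth_kb(long stack_rlimit_bytes, long slop_bytes)
{
    return (stack_rlimit_bytes - slop_bytes) / 1024L;
}

/*
 * Simplified form of the test: a setting is accepted only if the depth
 * plus the slop still fits under the kernel limit.
 */
bool
depth_is_acceptable(long depth_kb, long stack_rlimit_bytes, long slop_bytes)
{
    return depth_kb * 1024L <= stack_rlimit_bytes - slop_bytes;
}
```

Under this model, a 10240kB limit allows max_stack_depth up to 9728kB with a 512K slop and up to 10040kB with a 200K slop. A negative slop, as in the attached test patch, lets the setting exceed the reported limit, which is what made the 12274kB experiment possible.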
regards, tom lane
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index cc36b80..7740120 100644
*** a/src/backend/storage/ipc/ipc.c
--- b/src/backend/storage/ipc/ipc.c
*************** static int on_proc_exit_index,
*** 98,106 ****
--- 98,113 ----
  void
  proc_exit(int code)
  {
+     char        sysbuf[256];
+
      /* Clean up everything that must be cleaned up */
      proc_exit_prepare(code);

+     /* report stack size to stderr */
+     snprintf(sysbuf, sizeof(sysbuf), "pmap %d | grep stack 1>&2",
+              (int) getpid());
+     system(sysbuf);
+
  #ifdef PROFILE_PID_DIR
      {
          /*
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 7254355..009bec2 100644
*** a/src/include/tcop/tcopprot.h
--- b/src/include/tcop/tcopprot.h
***************
*** 27,33 ****
  /* Required daylight between max_stack_depth and the kernel limit, in bytes */
! #define STACK_DEPTH_SLOP (512 * 1024L)
  extern CommandDest whereToSendOutput;
  extern PGDLLIMPORT const char *debug_query_string;
--- 27,33 ----
  /* Required daylight between max_stack_depth and the kernel limit, in bytes */
! #define STACK_DEPTH_SLOP (-100 * 1024L * 1024L)
  extern CommandDest whereToSendOutput;
  extern PGDLLIMPORT const char *debug_query_string;