Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.

Use ulimit.

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Alvaro Herrera
daveg wrote:
 
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use. The problem this addresses is that sometimes
 one needs to set work_mem fairly high to get good query plans for large joins.
 However, some complex queries will then use huge amounts of memory so that
 one or a few of them will consume all the memory on the host and run it deep
 into swap or trigger the OOM killer, or worse.

Oh, BTW, did anyone get interested in adding the bits to disable the OOM
killer for postmaster on the various Linux initscripts?  It needs some
games with /proc/pid/oom_adj and requires root privileges, but I think
an initscript is in an excellent position to do it.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Euler Taveira de Oliveira
Tom Lane wrote:
 daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.
 
 Use ulimit.
 
What about platforms (Windows) that don't have ulimit?


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
Euler Taveira de Oliveira eu...@timbira.com writes:
 Tom Lane wrote:
 daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.
 
 Use ulimit.
 
 What about platforms (Windows) that don't have ulimit?

Get a real operating system ;-)

Seriously, the proposed patch introduces overhead into a place that is
already a known hot spot, in return for not much of anything.  It will
*not* bound backend memory use very accurately, because there is no way
to track raw malloc() calls.  And I think that 99% of users will not
find it useful.

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Bill Moran
In response to Euler Taveira de Oliveira eu...@timbira.com:

 Tom Lane wrote:
  daveg da...@sonic.net writes:
  I'd like to propose adding a new GUC to limit the amount of memory a backend
  can allocate for its own use.
  
  Use ulimit.

 What about platforms (Windows) that don't have ulimit?

I have a hard time believing that Windows doesn't have a ulimit equivalent.

I don't want to degrade this thread into MS-bashing, but if that were the
case, it would make Windows a pretty crippled OS.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Robert Haas
On Thu, Oct 1, 2009 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Euler Taveira de Oliveira eu...@timbira.com writes:
 Tom Lane wrote:
 daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.

 Use ulimit.

 What about platforms (Windows) that don't have ulimit?

 Get a real operating system ;-)

 Seriously, the proposed patch introduces overhead into a place that is
 already a known hot spot, in return for not much of anything.  It will
 *not* bound backend memory use very accurately, because there is no way
 to track raw malloc() calls.  And I think that 99% of users will not
 find it useful.

What WOULD be useful is to find a way to configure
work_mem per backend rather than per executor node.  But that's a much
harder problem.

...Robert



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 What WOULD be useful is to find a way to configure
 work_mem per backend rather than per executor node.  But that's a much
 harder problem.

I think it's mostly a planner problem: how do you deal with the fact
that that would make cost estimates for different sub-problems
interrelated?  The cost of a hash, for instance, depends a lot on how
much memory you assume it can use.
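
To make the dependency concrete, here is a hedged sketch (made-up constants,
not PostgreSQL's actual cost_hashjoin) of how a hash cost estimate swings
with the memory assumption: once the inner relation no longer fits in the
assumed budget, it must be split into batches and spilled, so the same
sub-plan gets a very different cost under a different assumption.

    /* Hypothetical sketch: hashing cost as a function of the assumed
     * memory budget.  If the inner relation fits, one CPU pass; if not,
     * every byte is also written to and re-read from temp files. */
    static double
    sketch_hash_cost(double inner_bytes, double assumed_mem_bytes)
    {
        const double cpu_per_byte = 1e-6;
        const double io_per_byte = 1e-5;    /* spill I/O ~10x CPU here */

        if (inner_bytes <= assumed_mem_bytes)
            return inner_bytes * cpu_per_byte;      /* in-memory hash */

        return inner_bytes * cpu_per_byte
             + 2.0 * inner_bytes * io_per_byte;     /* write + read spill */
    }

Doubling the assumed budget can move a path across the fits-in-memory
cliff, which is why per-sub-problem estimates become interrelated the
moment they draw on one shared budget.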

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Euler Taveira de Oliveira
Robert Haas wrote:
 On Thu, Oct 1, 2009 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Euler Taveira de Oliveira eu...@timbira.com writes:
 Tom Lane wrote:
 daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.
 Use ulimit.

 What about platforms (Windows) that don't have ulimit?
 Get a real operating system ;-)

 Seriously, the proposed patch introduces overhead into a place that is
 already a known hot spot, in return for not much of anything.  It will
 *not* bound backend memory use very accurately, because there is no way
 to track raw malloc() calls.  And I think that 99% of users will not
 find it useful.
 
 What WOULD be useful is to find a way to configure
 work_mem per backend rather than per executor node.  But that's a much
 harder problem.
 
I see. The tough problem is: how do we get per-backend memory usage accurately?
Is relying on an OS-specific API the only way?
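
One OS-level (and thus platform-specific) way to ask, as a minimal sketch:
getrusage() reports the process's peak resident set size on most Unixes.
Units differ by platform (kilobytes on Linux), older Linux kernels may
report 0 here, and shared memory attribution is murky, so treat it as a
rough answer at best.

    #include <stdio.h>
    #include <sys/resource.h>

    /* Sketch: print this process's peak resident set size. */
    int
    main(void)
    {
        struct rusage ru;

        if (getrusage(RUSAGE_SELF, &ru) != 0)
        {
            perror("getrusage");
            return 1;
        }
        printf("peak RSS: %ld kB\n", ru.ru_maxrss);
        return 0;
    }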


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
Euler Taveira de Oliveira eu...@timbira.com writes:
 I see. The tough problem is: how do we get per-backend memory usage accurately?
 Is relying on an OS-specific API the only way?

Given all the third-party libraries (perl, python, libxml2, yadda yadda)
that can be in use and won't go through palloc, I think that this would
have to be done at the OS level to be very meaningful.

The other problem is the one Robert touched on: what you actually *want*
is something entirely different, namely for the backend to actively try
to meet an overall target for its memory usage, rather than having
queries fail ungracefully when they hit an arbitrary limit that the
planner didn't even know about.

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Euler Taveira de Oliveira
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 What WOULD be useful is to find a way to configure
 work_mem per backend rather than per executor node.  But that's a much
 harder problem.
 
 I think it's mostly a planner problem: how do you deal with the fact
 that that would make cost estimates for different sub-problems
 interrelated?  The cost of a hash, for instance, depends a lot on how
 much memory you assume it can use.
 
It could introduce some complexity, but you could track (subtract) the memory
usage as you walk up the tree. Also, you need to decide what to do when
you have more than one node per level. :( How do you deal with priority in
that case?


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Oh, BTW, did anyone get interested in adding the bits to disable the OOM
 killer for postmaster on the various Linux initscripts?  It needs some
 games with /proc/pid/oom_adj and requires root privileges, but I think
 an initscript is in an excellent position to do it.

I was imagining that this would be something for individual distros
to tackle.  It's probably not portable enough to go into the
contrib/start-scripts examples.  On the other hand, it'd make lots
of sense to have the Fedora or Debian or whatever scripts do this,
since they know what kernel version they're targeting.  (If anyone
wants to send me the fixes to make Fedora's script do this ...)

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I was imagining that this would be something for individual distros
 to tackle.  It's probably not portable enough to go into the
 contrib/start-scripts examples.  On the other hand, it'd make lots
 of sense to have the Fedora or Debian or whatever scripts do this,
 since they know what kernel version they're targeting.  (If anyone
 wants to send me the fixes to make Fedora's script do this ...)

I'm not exactly keen on Debian init scripts hacking kernel settings.
Should they hack up the shared memory numbers too?  This is not what I
would consider 'init script' material for specific applications.

Thanks,

Stephen




Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Oh, BTW, did anyone get interested in adding the bits to disable the OOM
  killer for postmaster on the various Linux initscripts?  It needs some
  games with /proc/pid/oom_adj and requires root privileges, but I think
  an initscript is in an excellent position to do it.
 
 I was imagining that this would be something for individual distros
 to tackle.  It's probably not portable enough to go into the
 contrib/start-scripts examples.

Hmm?  I think it should be just (as root)

if [ -f /proc/$pid_of_postmaster/oom_adj ]; then
    echo -17 > /proc/$pid_of_postmaster/oom_adj
fi

This is supported from 2.6.11 onwards AFAIK.  If the kernel is older
than that, the file would not exist and this would be a no-op.

-- 
Alvaro Herrera                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 I'm not exactly keen on Debian init scripts hacking kernel settings.
 Should it hack up the shared memory numbers too?  This is not what I
 would consider 'init script' material for specific applications.

What was suggested was tweaking the oom_adj setting for the postmaster
process only, not messing with any system-wide settings.  Do you really
find that unreasonable?  The default OOM killer behavior is just about
as unreasonable as can be :-(

regards, tom lane



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread daveg
On Thu, Oct 01, 2009 at 10:35:55AM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  I'd like to propose adding a new GUC to limit the amount of memory a backend
  can allocate for its own use.
 
 Use ulimit.

That was my initial thought too. However, ulimit() is documented as superseded
by setrlimit(), which has the option RLIMIT_DATA to limit the size of the data
segment. Perfect!

Except, RLIMIT_DATA does not appear to work on Linux. The call succeeds and
the new value can even be read back with getrlimit(), but it does not seem
to do anything to actually limit the memory allocated. I tested this on
SuSE 11 (kernel 2.6.25) and Ubuntu Intrepid (kernel 2.6.28).

Setting RLIMIT_AS to limit the total address space for a process works as
expected. However, this seems undesirable for PostgreSQL, as it can also cause
stack expansion to fail, which would then force a general restart. Also,
this limit would interact with the buffer cache size setting, as it includes
the shared address space as well.
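
For anyone who wants to reproduce this, a minimal standalone test along
those lines (illustrative; not the actual test program used above) might
look like:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/resource.h>

    /* Set a 64 MB limit, then allocate until malloc() fails or we give
     * up.  Swap RLIMIT_AS in for RLIMIT_DATA to compare behaviors; on
     * the kernels mentioned above only RLIMIT_AS stops the loop. */
    int
    main(void)
    {
        struct rlimit rl = { 64 * 1024 * 1024, 64 * 1024 * 1024 };
        size_t total = 0;

        if (setrlimit(RLIMIT_DATA, &rl) != 0)
            perror("setrlimit");

        while (total < (size_t) 1024 * 1024 * 1024)   /* give up at 1 GB */
        {
            char *p = malloc(1024 * 1024);

            if (p == NULL)
                break;
            memset(p, 1, 1024 * 1024);  /* touch pages so they are real */
            total += 1024 * 1024;
        }
        printf("allocated %zu MB before stopping\n", total / (1024 * 1024));
        return 0;
    }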

-dg

-- 
David Gould   da...@sonic.net   510 536 1443   510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread daveg
On Thu, Oct 01, 2009 at 11:47:43AM -0400, Tom Lane wrote:
 Euler Taveira de Oliveira eu...@timbira.com writes:
  Tom Lane wrote:
  daveg da...@sonic.net writes:
  I'd like to propose adding a new GUC to limit the amount of memory a backend
  can allocate for its own use.
  
  Use ulimit.

 Seriously, the proposed patch introduces overhead into a place that is
 already a known hot spot, in return for not much of anything.  It will

The overhead is simply an integer addition and compare with values that are
likely already in processor caches. And this only occurs when we actually
call malloc() to get a new block, not on every palloc. So I suspect it will
not be noticeable. However, I welcome any suggestions on how to test this
and actually measure the overhead, if any. pgbench? Something else?
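
To make "an integer addition and compare" concrete, a sketch of the kind
of hook being described (names are hypothetical; this is not the actual
patch) could look like:

    #include <stdlib.h>

    /* Hypothetical per-backend accounting, consulted only when a memory
     * context grows by malloc()ing a new block, not on every palloc(). */
    static size_t backend_allocated_bytes = 0;
    static size_t max_allocated_bytes = 0;      /* 0 means no limit */

    static void *
    accounted_malloc(size_t size)
    {
        void *block;

        if (max_allocated_bytes != 0 &&
            backend_allocated_bytes + size > max_allocated_bytes)
            return NULL;        /* caller raises an out-of-memory error */

        block = malloc(size);
        if (block != NULL)
            backend_allocated_bytes += size;    /* the add-and-compare */
        return block;
    }

Since new blocks are far rarer than palloc() calls, the cost is plausibly
in the noise, but only a benchmark would settle it.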

 *not* bound backend memory use very accurately, because there is no way
 to track raw malloc() calls.  And I think that 99% of users will not
 find it useful.

The use case that motivated this is a client that runs many PostgreSQL
instances with a mostly batch/large-query workload. Some of the queries are
code-generated by an application and can be very complex.  A few times a
month one of these will run through 64GB of memory and OOM the host, so it
seriously hurts production. Setting work_mem low enough to prevent this
results in poor query performance.

This client does not use any outside libraries that call malloc() directly.

-dg

-- 
David Gould   da...@sonic.net   510 536 1443   510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread Robert Haas
On Thu, Oct 1, 2009 at 12:15 PM, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 Robert Haas wrote:
 On Thu, Oct 1, 2009 at 11:47 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Euler Taveira de Oliveira eu...@timbira.com writes:
 Tom Lane wrote:
 daveg da...@sonic.net writes:
 I'd like to propose adding a new GUC to limit the amount of memory a backend
 can allocate for its own use.
 Use ulimit.

 What about platforms (Windows) that don't have ulimit?
 Get a real operating system ;-)

 Seriously, the proposed patch introduces overhead into a place that is
 already a known hot spot, in return for not much of anything.  It will
 *not* bound backend memory use very accurately, because there is no way
 to track raw malloc() calls.  And I think that 99% of users will not
 find it useful.

 What WOULD be useful is to find a way to configure
 work_mem per backend rather than per executor node.  But that's a much
 harder problem.

 I see. The tough problem is: how do we get per-backend memory usage accurately?
 Is relying on an OS-specific API the only way?

As I see it, this is really a planning problem, not an executor
problem, so measuring ACTUAL memory usage is not really important: the
problem is taking memory usage into account during planning.  The
difficulty with adjusting work_mem right now is that the correct value
depends not only on the number of queries that are concurrently
executing (which isn't a constant) but also on the number of sort/hash
operations being performed per query (which is also not a constant).
So if your queries become more complex, a value of work_mem that was
previously OK may start to cause swapping, which encourages setting
work_mem conservatively.  But setting it conservatively can cause the
planner to pick plans that save memory at a LARGE performance cost.

Fixing this isn't simple.  Right now, when planning a particular
joinrel, we only keep track of the best plans for each possible set of
path keys, regardless of how much or little memory they use.  So if we
do something naive, like just track the total amount of memory that
each candidate path is forecast to use and avoid letting it go above
some ceiling, query planning might fail altogether, because the
lower-level joinrels use as much memory as they want and the higher
level nodes, which for some reason can't be done without memory, can't
be planned.  Or we might just end up with a badly suboptimal plan,
because we pick a slightly cheaper plan lower down in the tree that
uses a LOT more memory over a slightly more expensive one that uses
much less.  Later we'll wish we hadn't, but by that point it's too
late.

Another possible angle of attack is to try to give the planner a range
for work_mem rather than a hard limit.  The planner would ordinarily
construct paths as though the lower end of the range was the limit,
but for a sufficiently large cost savings it would be willing to adopt
a path that used more memory.  Potentially this willingness could also
be conditioned on the amount of memory used by the path so far,
although that has the same problems described above in kind if not in
degree.  I'm not really sure whether something like this can be made
to work; I'm not sure there's really enough information available when
constructing paths for any sort of local decision-making to prove
fruitful.

The other idea I have is to adopt a strategy where each plan node has
upper and lower bounds on cost, as I previously suggested here with
respect to index-only scans.

http://archives.postgresql.org/pgsql-hackers/2009-09/msg01379.php

The idea would basically be to estimate the lower-bound for the cost
of a sort based on the idea that we'll have the maximum possible
amount of memory to work with (say, the budget for the whole query)
and the upper-bound cost based on the idea that we'll have the minimum
possible amount of memory (zero, or whatever the minimal amount is).
We can also estimate the most memory we think we can usefully use (for
example, a hash join with a smaller inner rel doesn't benefit from
more memory than the amount required to hold the entire hash table in
memory).
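
A sketch of the bookkeeping this implies (types and helpers hypothetical,
not the planner's real API): cost each memory-hungry node twice, once
under the whole query budget and once under a minimal grant, and record
how much memory it could usefully consume for the later funding pass.

    /* Hypothetical cost interval carried by a sort or hash path. */
    typedef struct CostBounds
    {
        double lower_cost;      /* cost if granted all useful memory */
        double upper_cost;      /* cost under the minimal grant */
        double max_useful_mem;  /* e.g. bytes to hold the whole input */
    } CostBounds;

    /* cost_fn(bytes, mem) stands in for whatever cost model applies
     * to the node (sort, hash, ...). */
    static CostBounds
    cost_with_bounds(double input_bytes, double query_budget,
                     double minimal_grant,
                     double (*cost_fn)(double, double))
    {
        CostBounds cb;
        double useful = input_bytes;    /* fully in-memory needs no more */

        cb.max_useful_mem = useful;
        cb.lower_cost = cost_fn(input_bytes,
                                query_budget < useful ? query_budget
                                                      : useful);
        cb.upper_cost = cost_fn(input_bytes, minimal_grant);
        return cb;
    }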

After we complete the first round of planning, we look at the
resulting paths and decide which sorts or hashes will get funded with
how much memory.  I'm hand-waving a little bit here, because there may
be a knapsack problem in here (which is NP-complete), since the cost
as a function of memory probably has sharp cliffs with not much change
in between them - certainly for hashing, and I suspect for sorting as
well, but it might be that in practice N is small enough not to
matter, or we might be able to find an approximation that is good
enough that we can live with it.  Even if we can get past that hurdle,
though, there's still all the caveats from the original email,
principally that it's unclear that the necessary computations can be
done without blowing planning time out of the water.  Plus, if we used
this strategy for