Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Shridhar Daithankar
On Wednesday 20 August 2003 23:57, Andrew Dunstan wrote:
> http://archives.postgresql.org/pgsql-hackers/2003-07/msg00608.php
>
> Subject is "reprise on Linux overcommit handling" - is that too
> deceptive? :-)

I did a little searching on this and found:

http://www.ussg.iu.edu/hypermail/linux/kernel/0306.3/1647.html

Can anybody comment on how much difference 2.6 would make to this situation 
with the OOM killer feature?

If there could be any cross-OS comparisons, that would be great as well. A 
summary would really help.

 Shridhar



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Andrew Dunstan
I see btw that no change has been made to the docs. That's bad IMNSHO. 
The situation with RH is unchanged with today's kernel errata patch, 
too. I propose to submit a doc patch with the following wording, unless 
someone objects or improves it:

---

Linux kernel version 2.4.* has poor default memory overcommit behavior, 
which can result in the postmaster being killed by the kernel due to 
memory demands by another process if the system runs out of memory. To 
avoid this situation, run postgres on a machine where you can be sure 
that other processes will not run the machine out of memory. If your 
kernel supports strict and/or paranoid modes of overcommit handling, you 
can also relieve this problem by altering the system's default 
behaviour. This can be determined by examining the function 
vm_enough_memory in the file mm/mmap.c in the kernel source. If this 
file reveals that strict and/or paranoid modes are supported by your 
kernel, turn one of these modes on by using

   sysctl -w vm.overcommit_memory=2

for strict mode or

   sysctl -w vm.overcommit_memory=3

for paranoid mode.

Warning: using these settings in a kernel which does not support these 
modes will almost certainly increase the danger of the kernel killing 
the postmaster, rather than reducing it. If in any doubt, consult a 
kernel expert or your kernel vendor.

These modes are expected to be supported in all 2.6 and later kernels. 
Some vendor 2.4 kernels may also support these modes. However, it is 
known that some vendor documents suggest that they support them while 
examination of the kernel source reveals that they do not.

---
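
A minimal sketch (not part of the proposed patch) of checking which mode a machine is currently set to before changing anything. It assumes the setting is visible at /proc/sys/vm/overcommit_memory, the usual /proc view of the vm.overcommit_memory sysctl, and it uses the 2 = strict / 3 = paranoid numbering from the wording above. The function names and the label for other values are made up for illustration.

```python
from pathlib import Path

# Mode numbers as given in the proposed wording above (2 = strict,
# 3 = paranoid). Other kernels may number or support modes differently,
# so treat this table as an assumption, not a reference.
MODES_FROM_THREAD = {2: "strict", 3: "paranoid"}


def describe_overcommit(raw: str) -> str:
    """Map the text read from the sysctl file to a label (hypothetical helper)."""
    value = int(raw.strip())
    return MODES_FROM_THREAD.get(value, f"other ({value})")


def read_overcommit(path: str = "/proc/sys/vm/overcommit_memory") -> str:
    """Read the current setting from the given file and describe it."""
    return describe_overcommit(Path(path).read_text())
```

Calling read_overcommit() on a Linux box reports what is currently set. It only shows the configured value; as the warning above says, that does not prove the running kernel actually honours the mode.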

The kernel docs on these modes state this:

Gotchas
-------
The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much but it's a corner case if you really really care.

Does this affect Pg?

andrew


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Jon Jensen
On Thu, 21 Aug 2003, Andrew Dunstan wrote:

> Linux kernel version 2.4.* has poor default memory overcommit behavior,
> which can result in the postmaster being killed by the kernel due to
> memory demands by another process if the system runs out of memory. To
> avoid this situation, run postgres on a machine where you can be sure
> that other processes will not run the machine out of memory.

I would also note that the OOM killer logs its evil deeds:

printk(KERN_ERR "Out of Memory: Killed process %d (%s).\n", p->pid, p->comm);

So there's no need to wonder whether that's a source of trouble for your
PostgreSQL processes or not; just check the logs. I've had the OOM killer
go after large Perl processes and X, but never (yet) PostgreSQL, I'm happy
to say.

Jon
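
A minimal sketch of the "just check the logs" step: scan log text for lines matching the printk format quoted above. Where kernel messages end up varies by system and is not stated in this thread, so the function takes an iterable of lines rather than a path; the function name is made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Pattern built from the printk format string quoted above:
#   "Out of Memory: Killed process %d (%s).\n"
OOM_KILL = re.compile(r"Out of Memory: Killed process (\d+) \((.*)\)\.")


def find_oom_kills(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (pid, command) for each line that records an OOM kill."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

To use it, pass an open file object for whichever file your syslog setup writes kernel messages to, then look for PostgreSQL process names among the results. Kernels that word the message differently would need a different pattern.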


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Josh Berkus
Guys,

> So there's no need to wonder whether that's a source of trouble for your
> PostgreSQL processes or not; just check the logs. I've had the OOM killer
> go after large Perl processes and X, but never (yet) PostgreSQL, I'm happy
> to say.

Well, sadly, the reason I posted is that I (apparently) had a client's 
database fatally corrupted by this problem.  

Downloading Alan Cox's patch now 

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Tom Lane
Josh Berkus writes:
> Well, sadly, the reason I posted is that I (apparently) had a client's
> database fatally corrupted by this problem.

Fatally corrupted?  That should not happen --- at worst, an OOM kill
should lose your current sessions.  We need more details.

regards, tom lane


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Josh Berkus
Tom,

> Fatally corrupted?  That should not happen --- at worst, an OOM kill
> should lose your current sessions.  We need more details.

Joe and I are diagnosing.   Likely the files will come to you before the end 
of the day.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Kurt Roeckx
On Thu, Aug 21, 2003 at 01:04:13PM -0400, Tom Lane wrote:
> Josh Berkus writes:
>> Well, sadly, the reason I posted is that I (apparently) had a client's
>> database fatally corrupted by this problem.
>
> Fatally corrupted?  That should not happen --- at worst, an OOM kill
> should lose your current sessions.  We need more details.

Even if it kills the postmaster?


Kurt


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Josh Berkus
Alan,

> You need to be careful using Alan's patch. The reason RH stopped using
> this part of it in their errata kernels is that it had conflicts with
> other stuff, specifically the rmap stuff (he told me that himself in
> email).

Hmmm ... that leaves us without a workaround for this problem, then, yes?  
Because even with paranoid-mode kernels you can discourage, but not prevent, 
overcommitting.

> For mission critical apps I would advise running the postmaster on a
> dedicated machine, with no X or other nasty stuff running.

Unfortunately, this is frequently not an option ... PostgreSQL is often 
together on a server with Apache, a JVM, and other server software.   As 
happened in the one case of possible (diagnosis pending) failure that we are 
looking into.  Of course, it could be something else 

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Tom Lane
Kurt Roeckx writes:
> On Thu, Aug 21, 2003 at 01:04:13PM -0400, Tom Lane wrote:
>> Fatally corrupted?  That should not happen --- at worst, an OOM kill
>> should lose your current sessions.  We need more details.
>
> Even if it kills the postmaster?

Then you'd have to start a new postmaster.  But it still shouldn't
corrupt anything.

(Since the postmaster has small and essentially constant memory usage,
it's a pretty improbable target for the OOM killer, anyway.)

regards, tom lane


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Andrew Dunstan
(er, that's Andrew :-)

It depends how important your data is to you. A modest server probably 
costs a few thousand dollars. How much does an expert to install a 
custom kernel cost? Probably about the same.

I believe paranoid mode is supposed to prevent any overcommitting that 
can't later be honoured when the process comes to map the memory, and 
strict mode is supposed to do the same in most normal circumstances. 
Maybe someone more expert in kernel hackery than I am can give a better 
answer.

BTW, Alan Cox is going on sabbatical from RH very soon - so there will 
be no more -ac patches. By the time he returns 2.6 should have been 
released and bedded down, with any luck.

andrew


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Josh Berkus
Folks,

> Well, sadly, the reason I posted is that I (apparently) had a client's
> database fatally corrupted by this problem.

OK, diagnosis progresses, it's not the Linux OOM problem, it's something else.  

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Josh Berkus
Andrew,

> I see btw that no change has been made to the docs. That's bad IMNSHO.
> The situation with RH is unchanged with today's kernel errata patch,
> too. I propose to submit a doc patch with the following wording, unless
> someone objects or improves it:

First off, I'm cross-posting this to PGSQL-DOCS, which is the correct list for 
doc patches.

Second, don't you think we should have some mention of the Alan Cox patch?   

Otherwise, I think your doc patch is good and needed before we go final.   
When we settle the second question, I'll submit a diff for you.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-21 Thread Andrew Dunstan
I replied to Josh thus:

---

You need to be careful using Alan's patch. The reason RH stopped using 
this part of it in their errata kernels is that it had conflicts with 
other stuff, specifically the rmap stuff (he told me that himself in 
email).

I am very wary of advising people to use what is essentially an 
experimental patch in a production system. This should be a last resort 
- a better solution is to have better control over what is running on 
your db server, so you can ensure it never gets into an OOM situation. 
For mission critical apps I would advise running the postmaster on a 
dedicated machine, with no X or other nasty stuff running.

---

I do have a doc patch ready (with one sensible addition suggested by Jon 
Jensen).

andrew


[HACKERS] Can't find thread on Linux memory overcommit

2003-08-20 Thread Josh Berkus
Hackers,

I've been searching the archives, but I can't find the thread from last month 
where we discussed the problem with Linux memory overcommits in kernel 2.4.x.

Can someone point me to the right thread?   I think maybe the subject line was 
something deceptive 

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-20 Thread Andrew Dunstan
http://archives.postgresql.org/pgsql-hackers/2003-07/msg00608.php

Subject is "reprise on Linux overcommit handling" - is that too 
deceptive? :-)

andrew


Re: [HACKERS] Can't find thread on Linux memory overcommit

2003-08-20 Thread Thomas Swan
On 8/20/2003 1:02 PM, Josh Berkus wrote:

> Hackers,
>
> I've been searching the archives, but I can't find the thread from last month
> where we discussed the problem with Linux memory overcommits in kernel 2.4.x.
>
> Can someone point me to the right thread?   I think maybe the subject line was
> something deceptive

  

Re: [HACKERS] Pre-allocation of shared memory ...
On 6/11/2003

