Re: [HACKERS] redo error?

2003-01-08 Thread Greg Copeland
On Tue, 2003-01-07 at 22:58, Tom Lane wrote:

> > It also logged that it was killed with signal 9, although I didn't kill it!
> > Is there something weird going on here?
>
> Is this Linux?  The Linux kernel seems to think that killing
> randomly-chosen processes with SIGKILL is an appropriate response to
> running out of memory.  I cannot offhand think of a more brain-dead
> behavior in any OS living or dead, but that's what it does.

Just FYI, I believe the 2.6.x series of kernels will rectify this
situation.


-- 
Greg Copeland [EMAIL PROTECTED]
Copeland Computer Consulting


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] redo error?

2003-01-07 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> My postgres totally messed up again for some reason (there were like 3
> postmasters running, other weirdness).
> I noticed this as it was starting up again:
> 2003-01-07 18:01:34 DEBUG:  ReadRecord: unexpected pageaddr 16/F2794000 in
> log file 22, segment 249, offset 7946240
> 2003-01-07 18:01:34 DEBUG:  redo done at 16/F9791664

This is probably OK --- I believe it just suggests that an XLOG page
header is not what was expected, which is an unsurprising case after a
crash.  The system should recover anyway.  (If you were running with
fsync off, then more paranoia might be appropriate.)
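
To make the numbers in that message concrete, here is a minimal sketch of the arithmetic, assuming 16 MB WAL segments (the usual default; the thread does not state the segment size) and the "xlogid/xrecoff" hex notation the log line uses. The function names are hypothetical helpers for illustration, not PostgreSQL code. It computes the address a page at "log file 22, segment 249, offset 7946240" would be expected to carry, and splits the address actually found back into file, segment, and offset.

```python
# Assumption: 16 MB WAL segments; the segment size is not stated in the thread.
SEG_SIZE = 16 * 1024 * 1024


def expected_pageaddr(log_id, seg, offset):
    """Address a page at this position should carry, as 'xlogid/xrecoff' in hex."""
    xrecoff = seg * SEG_SIZE + offset
    return f"{log_id:X}/{xrecoff:X}"


def split_pageaddr(pageaddr):
    """Split 'xlogid/xrecoff' (hex) into (log_id, segment, offset) as integers."""
    log_hex, off_hex = pageaddr.split("/")
    xrecoff = int(off_hex, 16)
    return int(log_hex, 16), xrecoff // SEG_SIZE, xrecoff % SEG_SIZE


if __name__ == "__main__":
    # Position reported in the log line.
    print(expected_pageaddr(22, 249, 7946240))   # 16/F9794000
    # Address actually found in the page header.
    print(split_pageaddr("16/F2794000"))         # (22, 242, 7946240)
```

Under that assumption, the page header carries the address of segment 242 at the same offset rather than segment 249. One reading, consistent with the explanation above but not confirmed in the thread, is that redo simply ran into a page with older contents just past the last valid record at 16/F9791664.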

> It also logged that it was killed with signal 9, although I didn't kill it!
> Is there something weird going on here?

Is this Linux?  The Linux kernel seems to think that killing
randomly-chosen processes with SIGKILL is an appropriate response to
running out of memory.  I cannot offhand think of a more brain-dead
behavior in any OS living or dead, but that's what it does.

regards, tom lane




Re: [HACKERS] redo error?

2003-01-07 Thread Christopher Kings-Lynne
> > It also logged that it was killed with signal 9, although I
> > didn't kill it!
> > Is there something weird going on here?
>
> Is this Linux?  The Linux kernel seems to think that killing
> randomly-chosen processes with SIGKILL is an appropriate response to
> running out of memory.  I cannot offhand think of a more brain-dead
> behavior in any OS living or dead, but that's what it does.

No, FreeBSD.  It does the same thing as Linux.

I think what happened is that the postmaster got confused by lots of kill
requests from the kernel, so I ended up with 3 of them running.

But then I killed them all manually, ipcclean'd and restarted postmaster
cleanly.  Then, a few minutes later I saw that.  However, I might be getting
mixed up as to the order of events, so it is probably me or the kernel doing
it.

Chris

