Well, it's not a good idea, because SIGTERM is used for ABORT + EXIT
(pg_ctl -m fast stop), but shouldn't ABORT clean up everything?
Er, shouldn't ABORT leave the system in the exact state that it's
in so that one can get a crashdump/traceback on a wedged process
without it trying to
[EMAIL PROTECTED] (Nathan Myers) writes:
The relevance to the issue at hand is that processes dying during
heavy memory load is a documented feature of our supported platforms.
Ugh. Do you know anything about *how* they get killed --- ie, with
what signal?
regards,
Denis Perchine [EMAIL PROTECTED] writes:
Didn't you get my mail with a piece of Linux kernel code? I think all is
clear there.
That was implementing CPU-time-exceeded kill, which is a different
issue.
regards, tom lane
Didn't you get my mail with a piece of Linux kernel code? I think all is
clear there.
That was implementing CPU-time-exceeded kill, which is a different
issue.
Oops... you are talking about the OOM killer.
/* This process has hardware access, be more careful. */
if (cap_t(p->cap_effective)
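The quoted fragment is from the 2.4-era kernel's OOM victim-selection heuristic (badness() in mm/oom_kill.c): bigger processes score more "badness" points, and privileged processes with hardware access are discounted so they are killed less readily. A user-space caricature of that scoring, with invented struct and field names (not the kernel source):

```c
#include <stddef.h>

/* Illustrative stand-in for the kernel's task_struct fields
 * consulted by the OOM killer; names here are invented. */
struct proc {
    long total_vm;      /* size of address space, in pages */
    int  has_rawio_cap; /* stand-in for CAP_SYS_RAWIO in cap_effective */
};

/* Caricature of badness(): higher score = more likely victim. */
static long badness(const struct proc *p)
{
    long points = p->total_vm;

    /* This process has hardware access, be more careful. */
    if (p->has_rawio_cap)
        points /= 4;

    return points;
}
```

The point for the thread: a large, unprivileged backend is exactly the kind of process this heuristic prefers to SIGKILL.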
START_/END_CRIT_SECTION is mostly CritSectionCount++/--.
Recording could be made as
LockedSpinLocks[LockedSpinCounter++] = spinlock
in pre-allocated array.
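Vadim's suggestion can be sketched as follows; this is a minimal illustration using the names from the message above, not the actual 7.x source:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD_SPINLOCKS 16

typedef struct SpinLock SpinLock;   /* opaque here */

static volatile int CritSectionCount = 0;
static SpinLock *LockedSpinLocks[MAX_HELD_SPINLOCKS];
static int LockedSpinCounter = 0;

/* START_/END_CRIT_SECTION is mostly a counter bump;
 * die() would refuse to exit while CritSectionCount > 0. */
#define START_CRIT_SECTION()  (CritSectionCount++)
#define END_CRIT_SECTION()    (CritSectionCount--)

/* Record a locked spinlock in a pre-allocated array, so that
 * error cleanup can release whatever is still held. */
static void record_spinlock(SpinLock *lock)
{
    assert(LockedSpinCounter < MAX_HELD_SPINLOCKS);
    LockedSpinLocks[LockedSpinCounter++] = lock;
}

static void forget_spinlock(void)
{
    assert(LockedSpinCounter > 0);
    LockedSpinCounter--;
}
```

A fixed pre-allocated array is enough precisely because the number of spinlocks held at once is expected to be tiny.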
"Mikheev, Vadim" [EMAIL PROTECTED] writes:
Yeah, I suppose. We already do record locking of all the fixed
spinlocks (BufMgrLock etc), it's just the per-buffer spinlocks that
are missing from that (and CRIT_SECTION calls). Would it be
reasonable to assume that only one buffer spinlock could
Denis Perchine [EMAIL PROTECTED] writes:
You will get SIGKILL in most cases.
Well, a SIGKILL will cause the postmaster to shut down and restart the
other backends, so we should be safe if that happens. (Annoyed as heck,
maybe, but safe.)
Anyway, this is looking more and more like the SIGTERM
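The "safe" part rests on ordinary child reaping: the postmaster can tell a SIGKILL'ed backend apart from a normal exit via waitpid(). A minimal standalone sketch (the helper name is hypothetical; a real postmaster would respond by resetting shared memory and restarting the other backends):

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, SIGKILL it, and report which signal terminated it
 * (or -1 if it exited normally), as seen by the parent. */
int reap_child_signal(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        pause();            /* child: wait to be killed */
        _exit(0);
    }

    kill(pid, SIGKILL);

    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```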
On Wed, Jan 10, 2001 at 12:46:50AM +0600, Denis Perchine wrote:
[EMAIL PROTECTED] (Nathan Myers) writes:
If a backend dies while holding a lock, doesn't that imply that
the shared memory may be in an inconsistent state?
Yup. I had just come to the realization that we'd be best
off to treat the *entire* period from SpinAcquire to SpinRelease
as a critical section for the purposes of die(). That is, response
to SIGTERM will be held off until we have released the spinlock.
Most of the places where we grab
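Tom's proposal amounts to making die() defer while any spinlock is held, and having SpinRelease service the pending exit afterwards. A minimal sketch under assumed names (InCriticalSection, ExitPending, and the exit stub are illustrative, not the actual source):

```c
#include <signal.h>

static volatile sig_atomic_t InCriticalSection = 0; /* inside SpinAcquire..SpinRelease */
static volatile sig_atomic_t ExitPending = 0;       /* SIGTERM arrived while locked */
static int exits_taken = 0;                         /* counts elog(FATAL)-style exits */

/* Stand-in for the elog(FATAL)/proc_exit path. */
static void proc_exit_now(void) { exits_taken++; }

/* SIGTERM handler: exit immediately only if it is safe to do so. */
static void die(int sig)
{
    (void) sig;
    if (InCriticalSection)
        ExitPending = 1;    /* hold off until the spinlock is released */
    else
        proc_exit_now();
}

static void SpinAcquire(void) { InCriticalSection = 1; }

static void SpinRelease(void)
{
    InCriticalSection = 0;
    if (ExitPending) {      /* service the deferred SIGTERM */
        ExitPending = 0;
        proc_exit_now();
    }
}
```

The invariant this buys: a backend never dies between SpinAcquire and SpinRelease, so a SIGTERM can no longer leave a buffer spinlock stuck.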
Denis Perchine [EMAIL PROTECTED] writes:
On Monday 08 January 2001 00:08, Tom Lane wrote:
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Were there any errors before that?
No... just a clean log (I redirect the log from stderr/stdout to a file, and
everything else to syslog).
The error messages would be in the syslog then, not in stderr.
Hmmm... The only strange
Denis Perchine [EMAIL PROTECTED] writes:
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Were there any errors before that?
Actually you can have a look on the logs yourself.
Well, I found a smoking gun:
Jan 7 04:27:51 mx postgres[2501]: FATAL 1: The system is
On Mon, Jan 08, 2001 at 12:21:38PM -0500, Tom Lane wrote:
Well, I found a smoking gun: ...
What seems to have happened is that 2501 curled up and died, leaving
one or more buffer spinlocks locked. ...
There is something pretty fishy about this. You aren't by any chance
running the postmaster under a ulimit setting that might cut off
Denis Perchine [EMAIL PROTECTED] writes:
It's worth noting here that modern Unixes run around killing user-level
processes more or less at random when free swap space (and sometimes
just RAM) runs low.
That's not the case for sure. There is 512MB of RAM on the machine, and when I had
this
On Monday 08 January 2001 23:21, Tom Lane wrote:
Denis Perchine [EMAIL PROTECTED] writes:
Hmmm... actually this is a real problem with lazy vacuum. Sometimes it
just does something for an enormous amount of time (I have mailed a sample
database to Vadim, but did not get any response yet). It is possible
that it was me who killed the backend.
Killing an individual backend with SIGTERM is bad luck. The backend
will assume that it's being killed by the postmaster, and will exit
without a whole lot of concern for cleaning up shared memory --- the
What code will be returned to postmaster in this case?
Vadim
"Mikheev, Vadim" [EMAIL PROTECTED] writes:
SIGTERM -- die() -- elog(FATAL)
Is it true that elog(FATAL) doesn't clean up
"Mikheev, Vadim" [EMAIL PROTECTED] writes:
SIGTERM -- die() -- elog(FATAL)
Is it true that elog(FATAL) doesn't clean up shmem etc?
This would be very bad...
It tries, but I don't think it's possible to make a complete guarantee
* Mikheev, Vadim [EMAIL PROTECTED] [010108 23:08] wrote:
Hi,
Has anyone seen this on PostgreSQL 7.0.3?
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Server process (pid 1008) exited with status 6 at Sun Jan 7 04:29:07 2001
Terminating any active server
Denis Perchine [EMAIL PROTECTED] writes:
Has anyone seen this on PostgreSQL 7.0.3?
FATAL: s_lock(401f7435) at bufmgr.c:2350, stuck spinlock. Aborting.
Were there any errors before that?
I've been suspicious for a while that the system might neglect to release
buffer cntx_lock spinlocks if
On Monday 08 January 2001 00:08, Tom Lane wrote: