Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
> As part of the training class I did, some people tested what happens
> when the client allocates tons of memory to store a result and aborts.
>
> What we found was that though elog was properly called:
>
>   elog(COMMERROR, "pq_recvbuf: recv() failed: %m");
>
> (I think that was the message.)  the backend did not exit and kept
> eating CPU. I think the problem is that the elog code only exits on
> ERROR, not COMMERROR.  Is there some way to fix this?

There's been talk of setting the QueryCancel flag after detecting a
client communication failure ... but no one has ever done the legwork
to see if that works nicely, or what downsides it might have.

regards, tom lane
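
A minimal sketch of the idea floated above: on a client communication failure, only log and set a cancel flag, then let the query stop at the next point where it polls that flag. All names here (query_cancel_pending, report_comm_failure, check_for_cancel, run_query) are hypothetical stand-ins for illustration, not the backend's actual code, and the downsides mentioned above are not explored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the backend's cancel flag. */
static volatile bool query_cancel_pending = false;

/* Hypothetical hook called when recv() on the client socket fails.
 * It logs and records the failure; it does not unwind the stack. */
static void report_comm_failure(const char *msg)
{
    fprintf(stderr, "LOG: %s\n", msg);
    query_cancel_pending = true;
}

/* Hypothetical safe point, polled between units of work. */
static bool check_for_cancel(void)
{
    return query_cancel_pending;
}

/* Process up to nrows rows; the client connection fails while row
 * fail_at is being handled (pass -1 for no failure).
 * Returns the number of rows processed before the query stopped. */
static int run_query(int nrows, int fail_at)
{
    int done = 0;

    assert(nrows >= 0);
    for (int row = 0; row < nrows; row++)
    {
        if (check_for_cancel())
            break;              /* stop at a safe point, not mid-row */
        if (row == fail_at)
            report_comm_failure("pq_recvbuf: recv() failed");
        done++;
    }
    return done;
}
```

In this sketch the row during which the failure occurs still completes; the loop only notices the flag at the start of the next iteration, so the backend stops working soon after the failure instead of running the query to the end.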




Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Bruce Momjian

Why is COMMERROR not doing the longjmp like ERROR?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
> Why is COMMERROR not doing the longjmp like ERROR?

Because it's defined to be like LOG.

A more useful reply might be that I'm not sure it's safe to abort in the
client I/O routines.

regards, tom lane
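
The distinction drawn above can be illustrated with a small sketch, assuming a simplified reporter in which every level logs but only the ERROR-like level unwinds to a recovery point via longjmp. The names (enum level, report, run_command, main_loop_env) are hypothetical and this is not the backend's elog implementation.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical severity levels for this sketch only. */
enum level { LVL_LOG, LVL_COMMERROR, LVL_ERROR };

static jmp_buf main_loop_env;   /* recovery point in the "main loop" */

/* Simplified reporter: every level logs, only LVL_ERROR unwinds. */
static void report(enum level lvl, const char *msg)
{
    fprintf(stderr, "level %d: %s\n", (int) lvl, msg);
    if (lvl == LVL_ERROR)
        longjmp(main_loop_env, 1);
    /* LVL_LOG and LVL_COMMERROR fall through and return to the caller. */
}

/* Returns 1 if the command ran to completion, 0 if it was aborted. */
static int run_command(enum level lvl)
{
    if (setjmp(main_loop_env) != 0)
        return 0;               /* arrived here via longjmp: aborted */

    report(lvl, "something went wrong");
    return 1;                   /* reporter returned: work continues */
}
```

With a LOG-like level the caller simply carries on after the report, which is why a backend that only reports at that level keeps executing the query.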




Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Bruce Momjian

Well, if we get an I/O error, I can't imagine why we would continue
doing anything --- are any of those recoverable?  Do we need a separate
error type for I/O messages?




Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
> Well, if we get an I/O error, I can't imagine why we would continue
> doing anything --- are any of those recoverable?

Well, that's what's not clear --- it's hard to tell if a write failure
is a hard error or just transient.  If we make like elog(ERROR),
returning to the main loop, and then a read from the client *doesn't*
fail, we'll try to continue ... but we've just screwed the pooch,
because we have not sent a complete message and therefore certainly have
messed up frontend/backend synchronization.  I have no idea whether it's
really possible to recover from this situation or not, but that approach
surely won't work.

If you want to take a kamikaze any-comm-error-means-we're-dead approach,
you might think about elog(FATAL).  But that tries to send a message to
the client.  Instant infinite loop, if the error is hard.

Complaints to the postmaster log, and abort at the next safe place
(*not* partway through message output) seem like the way to go to me.

> Do we need a separate error type for I/O messages?

Uh ... see COMMERROR.

regards, tom lane
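
The infinite-loop hazard described above (a fatal report that itself tries to write to a dead client) can be sketched as follows. This is an illustration under stated assumptions, not backend code: client_dead, socket_broken, send_to_client and report_fatal are hypothetical names, and socket_broken is just a knob that simulates a hard send failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static bool client_dead = false;    /* hypothetical: set after a hard send failure */
static bool socket_broken = false;  /* knob simulating a failed connection */
static int  send_attempts = 0;      /* counts real attempts to use the socket */

static void report_fatal(const char *msg);

/* Hypothetical low-level send. Returns false on failure. */
static bool send_to_client(const char *msg)
{
    (void) msg;                 /* payload is irrelevant to the sketch */

    if (client_dead)
        return false;           /* do not touch the socket again */

    send_attempts++;
    if (socket_broken)
    {
        client_dead = true;     /* mark the client dead first, then report */
        report_fatal("send to client failed");
        return false;
    }
    return true;
}

/* Logs locally and also tries to notify the client, as a FATAL-style
 * report would. Without the client_dead guard in send_to_client, a hard
 * failure would make these two functions call each other forever. */
static void report_fatal(const char *msg)
{
    fprintf(stderr, "FATAL: %s\n", msg);
    send_to_client(msg);
}
```

The guard turns the failure into a local log entry plus a recorded state, which the caller can then act on at a message boundary rather than partway through output.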




Re: [HACKERS] Client failure allows backend to continue

2003-01-27 Thread Bruce Momjian

Well, setting query_cancel then seems like a logical solution because it
will exit at a reasonable point, hopefully.  Right now we have
statement_timeout and that exits at a given time, but I suppose it
doesn't exit while data is transferring, so it may be different.

