Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Magnus Hagander
 Ugh.  Is there a way we can insert a wrapper layer without 
modifying the
 call sites?  I'm thinking of some kind of macro hack, say
 [snip]

Sure. Think we've even done this before (also, prevents 
developers needing to remember to use pg_*).

Yup, it's done to redefine kill() to pqkill(). See include/port/win32.h.



The reason I think it was avoided for select(), in preference 
for a thread
to invoke the socket op during the signal/APC, was a fear that 
perhaps the
Windows Sockets internals could get mashed. AFAICS, the 
discussion Magnus
had with the Microsoft guys (and, from memory, those I've had 
with Magnus
off-list) suggests this isn't true. If mashing the internals is still a
possibility, then clearly the patch I've submitted might do 
more harm than
good.

(Magnus, can you confirm?)

Yes, it can be done (for select, so I guess for recv and send). It is
recommended that we do not do it that way, though.

There is also a third option, which I'm leaning more and more towards. I
was in that direction in the beginning, but turned away from it thinking
it would be too much work. Also, I was under the impression that we
would rather have some workarounds checking errors etc than
reimplement code. I'm now getting the idea that we'd rather have it the
other way around, assuming the code is in the port dir. Fair enough.


The third option is to redefine all these functions into our own, and
implement our own emulation layer. This means our own select(), send(),
recv() (more? I don't think so). And have these call the native winsock
APIs (WSAEventSelect(), WSASend(), WSARecv() etc). These functions are
designed to work in an APC environment. 

This is going to be a bit more code that can go wrong, but it carries
the advatages that (1) it's win32 only code, and (2) the code will all
be in the port directory. We implement it along with the standard Unix
semantics, so the main code should not notice. Doing this, we can back
out the changes done so far (the thread fix, the errno fix).

It's the sledgehammer fix, but I think it's probably time to bite the
bullet and do this one. This way, it's *not* a workaround. Any surprises
we end up with should be from our own code.


If so, I'll submit a patch for select/recv/send over the 
weekend, which will
also remove the recent fixes for pgstat.

If you think my suggestion above is not a good one then yes, we can do
this one. But I don't feel all that good about it (and specifically, I
have not had it confirmed about recv() and send() yet - have asked, but
not received a response yet).

If not, I'm going to get started on an implementation using the WSA
functions for you to check over :-)

//Magnus

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Magnus Hagander
 Hopeless, or cute, work-around?

It's possibly workable in the limited context of the postmaster, but
I've got doubts about doing it in libpq where we can't assume we know
what the surrounding application will do.

No need to touch the frontend parts at all. Our APCs are server side
only, so it's not a problem there.

It should be just as with the kill defrine in win32.h - #ifndef
FRONTEND.


To the other point - yes, it's a cute workaround. I'm a bit afraid that
we'll end up with another surprise down the road though, since the
behaviour of the select() function is known to be undefined in this
situation (or at leas tthe return value of it is). I'm not sure we can
rely on the pattern we're seeing now. We also don't know for sure about
send() and recv() (though I'm working on that).

I'm almost done with the our own select()/recv()/send() functions
attempt. I feel that's a more correct solution (not just a workaround),
*if* I can get the code good enough that we feel comfortable with it. If
not, then a workaround seems the way to go, and then Claudios one looks
good (except we might need to go with a global variable instead - but
the general way, and the localization of it looks good)


//Magnus

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Magnus Hagander
 Hopeless, or cute, work-around?
 
 It's possibly workable in the limited context of the postmaster, but
 I've got doubts about doing it in libpq where we can't 
assume we know
 what the surrounding application will do.

 No need to touch the frontend parts at all. Our APCs are server side
 only, so it's not a problem there.

Oh really?  What if the surrounding app uses APCs?

Then I'd say it's that apps problem. And it's clearly nothing new - the
problem has existed before. We'd certainly want to document it, though.

I don't think it's a good idea in general to redefine something as
fundamental as select, send, recv etc from libpq-fe.h (or files included
from there). That will affect any and all socket calls in the program in
question, including those that aren't in use in libpq. If we want to
affect those, we need to find a way to do it without exposing it to the
client apps.


//Magnus

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Magnus Hagander
 I don't think it's a good idea in general to redefine something as
 fundamental as select, send, recv etc from libpq-fe.h (or 
files included
 from there).

Certainly not; the redefinition would have to be in files that are not
part of the exported API.  However this is not difficult.  
port.h is not
included by libpq-fe.h.

I just realised that. I read too many files in parallell and got the
idea that it was included in libpq-fe.h. My mistake, sorry.

Then yes, we probably want to get that thing done for libpq as well.
Though socket operations in APCs arne't exactly common, we know for a
fact that they can give you interesting and incorrect results :-)
Whatever method we choose should probably be applied to the frontend as
well, yes.

//Magnus

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Tom Lane
Claudio Natoli [EMAIL PROTECTED] writes:
 Hopeless, or cute, work-around?

It's possibly workable in the limited context of the postmaster, but
I've got doubts about doing it in libpq where we can't assume we know
what the surrounding application will do.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Hopeless, or cute, work-around?
 
 It's possibly workable in the limited context of the postmaster, but
 I've got doubts about doing it in libpq where we can't assume we know
 what the surrounding application will do.

 No need to touch the frontend parts at all. Our APCs are server side
 only, so it's not a problem there.

Oh really?  What if the surrounding app uses APCs?

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Claudio Natoli

Hi all,

Magnus Hagander wrote:
 The third option is to redefine all these functions into our own, and
 implement our own emulation layer. This means our own select(), send(),
 recv() (more? I don't think so). And have these call the native winsock
 APIs (WSAEventSelect(), WSASend(), WSARecv() etc). These functions are
 designed to work in an APC environment. 

I personally think this is the last possible option; I'd much rather stay as
close to *nix as possible. As you said, and I agree whole-heartedly that
there is a lot more code that can go wrong, and it seems unnecessary if we
have a suitable workaround, which I think we might.


 If you think my suggestion above is not a good one then yes, we can do
 this one. But I don't feel all that good about it (and specifically, I
 have not had it confirmed about recv() and send() yet - have asked, but
 not received a response yet).

Reading back through our discussions, I'm not at all concerned about recv()
+ send(), as they are well behaved. When interrupted, they SetLastError()
reasonably, and return  0.

select(), however, is a different story. On APC interrupt, it returns a
seemingly valid value, doesn't set any of
errno/WSAGetLastError/GetLastError, and leaves the FD_SETs set, which we
concluded was impossible to work-around. However, it just occured to me that
we could wrap select() by augmenting the read_mask with an addition socket,
that we know will never be touched, and checking this socket on a valid
return. If this socket is still set, we know we got bitten by the APC/socket
interaction bug, and can set errno accordingly.

I've attached a proposal for /port. Comments?

Cheers,
Claudio


--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
a
href=http://www.memetrics.com/emailpolicy.html;http://www.memetrics.com/em
ailpolicy.html/a
  



socket.c
Description: Binary data

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 I *think* we can do a better job with a not-so-large amount of effort,
 ...
 I'm willing to give it a try. If it turns out to be a lot of work, or
 way too much code, I'll drop the idea.

Sure, give it a shot and see what you get.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Magnus Hagander
 The third option is to redefine all these functions into our own, and
 implement our own emulation layer. This means our own 
select(), send(),
 recv() (more? I don't think so). And have these call the 
native winsock
 APIs (WSAEventSelect(), WSASend(), WSARecv() etc). These 
functions are
 designed to work in an APC environment. 

Pardon me for not having paid close enough attention, but what versions
of select() and friends are we relying on now?  Is it really reasonable
to think that we can do a better job without a large amount of effort?


oh yeah, I should add that these new-style functions are very similar
to the original functions in syntax and semantics - they're just changed
to interface with the win32 style of blocking/signalling/waiting.


//Magnus

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 The third option is to redefine all these functions into our own, and
 implement our own emulation layer. This means our own select(), send(),
 recv() (more? I don't think so). And have these call the native winsock
 APIs (WSAEventSelect(), WSASend(), WSARecv() etc). These functions are
 designed to work in an APC environment. 

Pardon me for not having paid close enough attention, but what versions
of select() and friends are we relying on now?  Is it really reasonable
to think that we can do a better job without a large amount of effort?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-26 Thread Claudio Natoli

 Your patch has been added to the PostgreSQL unapplied patches list at:

Bruce, please cancel the original patch.

The attached patch is for (possible) application to HEAD, following
meritocratic contest [socket.c for src/port]

Been iterating with Magnus extensively off-list, and he is well on the way
to an engineered approach. Let's wait and see where that gets to, and then
search for community consensus between the two approaches.

Cheers,
Claudio

PS. Tested extensively, but if anyone wants to patch their own source base
and try it out any feedback would be most welcome.

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
a
href=http://www.memetrics.com/emailpolicy.html;http://www.memetrics.com/em
ailpolicy.html/a




diff.patch
Description: Binary data


socket.c
Description: Binary data

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PATCHES] APC/socket fix (final?)

2004-03-25 Thread Magnus Hagander
Thanks. I was getting to that, but hadn't started :-)

Per our discussion off-list, I agree with this method, and the patch
looks fine to me.

//Magnus

 -Original Message-
 From: Claudio Natoli [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, March 25, 2004 2:07 AM
 To: '[EMAIL PROTECTED]'
 Subject: [PATCHES] APC/socket fix (final?)
 
 
 
 For application to HEAD.
 
 This should take care of most, if not all, cases where a 
 backend can be interrupted in a blocking socket operation by 
 a signal which itself performs a socket operation (which 
 interacts badly with our APC-based signal implementation).
 
 --- 
 Certain disclaimers and policies apply to all email sent from 
 Memetrics. For the full text of these disclaimers and policies see 
 a 
 href=http://www.memetrics.com/emailpolicy.html;http://www.me
 metrics.com/em
 ailpolicy.html/a
   
 
 

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-25 Thread Magnus Hagander
Claudio Natoli [EMAIL PROTECTED] writes:
 + #ifdef WIN32
 +/* Interrupted by socket/APC interaction? */
 +if (n  0  GetLastError() == ERROR_IO_PENDING)
 +errno = EINTR;
 + #endif

This seems a bit schizophrenic; if you can assign to errno, 
why can't you
read from it?  Would look more consistent if the code looked like

   if (n  0  errno == ERROR_IO_PENDING)
   errno = EINTR;


The problem is that winsock does not *set* the errno variable in the
case when it's interrupted by an APC. errno is left at 0.
GetLastError() is the win32 replacement for errno. In most cases errno
is set correctly when you use the unix api functions, but not in this
case (which is arguably a bug, but it's there nevertheless). If you use
native win32 functions, you get GetLastError() set only.

//Magnus

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-25 Thread Tom Lane
Claudio Natoli [EMAIL PROTECTED] writes:
 It would be more consistent, but unfortunately GetLastError() != errno.

Yeah, I saw Magnus' explanation.  So essentially this is a workaround
for a bug in Windows' select() emulation.

Rather than hoping that we'll remember to decorate every select() call
with this workaround, would it make sense to use a wrapper function?
I'm loath to invent pg_select() but it might be cleaner than this.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] APC/socket fix (final?)

2004-03-25 Thread Tom Lane
Claudio Natoli [EMAIL PROTECTED] writes:
 I'm loath to invent pg_select() but it might be cleaner than this.

 We'd also need pg_recv() and pg_send(). Chances are it can happen with every
 blocking socket call :-(

Ugh.  Is there a way we can insert a wrapper layer without modifying the
call sites?  I'm thinking of some kind of macro hack, say

#ifdef WIN32
#define select(...)  pg_select(...)
#endif

and then provide a port module that goes roughly like

#undef select

pg_select(...)
{
foo = select(...);
// fix errno here;
return foo;
}

The fewer places that have to know about this sort of thing, the better
off we will be.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] APC/socket fix (final?)

2004-03-25 Thread Claudio Natoli


 Ugh.  Is there a way we can insert a wrapper layer without modifying the
 call sites?  I'm thinking of some kind of macro hack, say
 [snip]

Sure. Think we've even done this before (also, prevents developers needing
to remember to use pg_*).

The reason I think it was avoided for select(), in preference for a thread
to invoke the socket op during the signal/APC, was a fear that perhaps the
Windows Sockets internals could get mashed. AFAICS, the discussion Magnus
had with the Microsoft guys (and, from memory, those I've had with Magnus
off-list) suggests this isn't true. If mashing the internals is still a
possibility, then clearly the patch I've submitted might do more harm than
good.

(Magnus, can you confirm?)

If so, I'll submit a patch for select/recv/send over the weekend, which will
also remove the recent fixes for pgstat.

Cheers,
Claudio

--- 
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see 
a
href=http://www.memetrics.com/emailpolicy.html;http://www.memetrics.com/em
ailpolicy.html/a

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html