On Tuesday 03 March 2009 19:26:57 Steve wrote:
> Michael,
>
> I was reviewing the TCPClient.py code.
Many thanks for this. As a preface to what follows, I've put a different
implementation into TCPClient, in line with my comments yesterday. The
reason is to make sure TCPClient still doesn't cause the system to freeze.
The cost at present is higher CPU usage than would be ideal, but only during
the connection phase, so your example usage (making many, many outbound
connections simultaneously) is an edge case which we can come back to
and optimise. (Personal general viewpoint: get it working, make it work
correctly[1], then optimise.)
[1] e.g. handle edge cases "you" (me in this case) haven't considered :)
> In the runClient method you
> have:
>
> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM);
> yield 0.3
> self.sock = sock # We need this for shutdown later
> try:
> sock.setblocking(0); yield 0.6
> try:
> startConnect = time.time()
> while not self.safeConnect(sock,(self.host,
> self.port)):
Correct. For some background as to why it uses the "raise Finality"
structure, see:
* http://mail.python.org/pipermail/python-list/2003-June/207723.html
&
http://mail.python.org/pipermail/python-list/2003-June/thread.html#207723
> And in safeConnect you have:
>
> sock.connect(*sockArgsList); # Expect socket.error: (115,
> 'Operation now in progress')
>
> In the python socket module docs I see:
>
> s.setblocking(0) is equivalent to s.settimeout(0)
>
> and
>
> Note that the connect() operation is subject to the timeout
> setting, and in general it is recommended to call settimeout()
> before calling connect().
Note that the code takes this form because I was previously used to writing
socket code in C, C++ & Perl, where socket calls don't have a built-in timeout.
Indeed, if you want an idea of the complexity of implementing timeouts
normally, it's perhaps worth looking at this page:
* http://tinyurl.com/bu8tz2
(scroll down to just past 1/2 way - "There are three ways to place a timeout
on an I/O operation involving a socket.")
The timeout you're referring to here is implemented inside
Python/Modules/socketmodule.c, and behind the scenes it uses either poll
or select (depending on platform) in a blocking mode in order to "do the
right thing". ("The right thing" being subjective here, and relative to
blocking sockets.)
However, in this case, setting the timeout to a non-zero value eventually
ends up with this piece of C code being executed:
    tv.tv_sec = (int)s->sock_timeout;
    ...
    if (writing)
        n = select(s->sock_fd+1, NULL, &fds, NULL, &tv);
    else
        n = select(s->sock_fd+1, &fds, NULL, NULL, &tv);
This turns into a blocking call, which then hangs the system. (Which is why
sock.setblocking(0) has to set the timeout to 0 as well :)
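To make the interaction between the two calls concrete, here is a minimal
sketch (Python 3, standard library only; the function name timeout_modes is
made up for illustration and is not part of TCPClient). It only inspects the
socket's timeout setting and never connects anywhere. It assumes nobody has
called socket.setdefaulttimeout(), so a fresh socket starts fully blocking.

```python
import socket

def timeout_modes():
    """Record the socket's timeout setting after each mode change."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        modes = {}
        # Fresh socket: timeout is None (blocking), assuming no default
        # timeout has been set via socket.setdefaulttimeout().
        modes["fresh"] = sock.gettimeout()
        # Non-blocking mode is the same thing as a timeout of zero.
        sock.setblocking(False)
        modes["after setblocking(False)"] = sock.gettimeout()
        # A non-zero timeout replaces the zero timeout, so the socket is
        # no longer non-blocking: calls may now wait up to 20 seconds.
        sock.settimeout(20)
        modes["after settimeout(20)"] = sock.gettimeout()
        return modes
    finally:
        sock.close()

if __name__ == "__main__":
    for step, value in timeout_modes().items():
        print(step, "->", value)
```

The point of the sketch is that the two settings are one and the same knob:
calling settimeout(20) after setblocking(0) doesn't add a timeout to a
non-blocking socket, it turns it back into a socket that can block.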
> So I get that you want the socket operations to be non-blocking. And
> non-blocking operations should fail if they can't complete rather than
> block. But the connect operation is using a timeout of zero because
> of the blocking setting. And it seems like the problem I'm having on
> windows is that the connection attempt never times out.
This conflates two issues really. The real issue is simply that I never
thought of putting timeout handling into the TCPClient code, nor of where
it should go.
> So, would it be reasonable to:
> 1) setblocking(0) in runClient as it is today
> 2) In safeConnect, sock.settimeout(20)
> 3) sock.connect() as it is today
> 4) sock.settimeout(0) after the connection
>
> It seems like this would allow you to have a timeout honored for the
> connect operation without impacting non-blocking data operations post-
> connect.
From the above you should see why this isn't reasonable, but in case it
isn't clear, suppose you start 10 TCPClients as follows:
    for x in range(10):
        Pipeline( TCPClient(dest[x],port[x], connect_timeout=20),
                  OutputHandler() ).activate()
Now suppose every single one is blocked. Rather than all of them timing out
in about 20 seconds (as they would now, given the fix just put in), your
approach would hang the system for 200 seconds, until all 10 connections
had timed out, effectively serialising the connection attempts. 1000
failed/filtered consecutive connections in this manner would take 20,000
seconds, or just over 5 1/2 hours :)
Fundamentally that's why I've not taken this approach here :)
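To put rough numbers on the difference, here is a small sketch in Python 3.
It is not the actual TCPClient code: connect_with_deadline, run_round_robin,
serialised_wait and FakeClock are all made-up names, the "connection attempt"
is just a callable that always fails, and the clock is simulated so nothing
really sleeps. It only illustrates the idea of a generator that polls, checks
its own deadline, and yields control between attempts.

```python
import time

def serialised_wait(n_clients, timeout):
    """Worst case if each blocked connect holds the whole system in turn."""
    return n_clients * timeout

def connect_with_deadline(try_connect, timeout, clock=time.time):
    """Poll try_connect() until it succeeds or `timeout` seconds elapse."""
    start = clock()
    while not try_connect():
        if clock() - start >= timeout:
            yield "timeout"
            return
        yield "waiting"  # hand control back so other clients can run
    yield "connected"

def run_round_robin(tasks, tick=lambda: None):
    """Step every generator in turn; return the last state of each."""
    results = [None] * len(tasks)
    live = dict(enumerate(tasks))
    while live:
        for i, task in list(live.items()):
            try:
                results[i] = next(task)
            except StopIteration:
                del live[i]
        tick()  # one scheduler pass done; advance the (simulated) clock
    return results

class FakeClock:
    """Simulated clock so the demo needs no real waiting."""
    def __init__(self):
        self.now = 0.0
    def __call__(self):
        return self.now
    def advance(self, seconds=1.0):
        self.now += seconds

if __name__ == "__main__":
    clock = FakeClock()
    clients = [connect_with_deadline(lambda: False, 20, clock)
               for _ in range(10)]
    print(run_round_robin(clients, tick=clock.advance))
    print("simulated seconds elapsed:", clock.now)
    print("serialised equivalent:", serialised_wait(10, 20))
```

In the simulation all ten clients give up within a couple of ticks of the
20 second mark, because each one only ever does a quick check before yielding.
The price is that every pass re-polls every pending client, which is the
higher CPU usage during the connection phase mentioned at the top.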
The fix put in, which solves the immediate issue, is here:
* http://tinyurl.com/covwp6
Michael.
--
http://yeoldeclue.com/blog
http://twitter.com/kamaelian
http://www.kamaelia.org/Home