Jeremy Shaw wrote:

On Feb 11, 2010, at 1:57 PM, Bardur Arantsson wrote:

[--snip lots of technical info--]

Thanks for digging so much into this.

Just a couple of comments:


The whole point of the sendfile library is to use sendfile(), so not using sendfile() seems like the wrong solution.

Heh, well, presumably it could still use sendfile() only on platforms where it can actually guarantee correctness :).


There is some evidence that when you are doing select() on a readfds, and the connection is closed, select() will indicate that the fds is ready to be read, but when you read it, you get 0-bytes. That indicates that a disconnect has happened. However, if you are only doing read()/recv(), I expect that only happens in the event of a proper disconnect, because if you are just listening for packets, there is no way to tell the difference between the sender just not saying anything, and the sender dying:

True, but the point here is that the OS has a built-in timeout mechanism (via keepalives) and *can* tell the program when that timeout has elapsed.

That's the timeout we're trying to "get at" instead of having to implement a new one.

Good point about the readfds triggering when the client disconnects. I think that's what I've been seeing in all my other network-related code and I just misremembered the details. All my code is extremely likely to have been both reading from and writing to (roughly) the same set of FDs at the same time.

If this method of detection is correct, then what we need is a threadWaitReadWrite that will notify us when the socket can be read or written. The IO manager does not currently provide a function like that, but we could fake it like this (untested):

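import Control.Concurrent
import Control.Concurrent.MVar
import System.Posix.Types

-- Which kind of readiness woke us up.
data RW = Read | Write

-- Block until fd is readable or writable, whichever happens first.
threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd =
  do m <- newEmptyMVar
     -- One helper thread per direction; the first to wake fills the MVar.
     rid <- forkIO $ threadWaitRead fd  >> putMVar m Read
     wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
     r <- takeMVar m
     -- Clean up both helpers; killing an already-finished thread is harmless.
     killThread rid
     killThread wid
     return r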


I'll try to get the sendfile code to use this instead. AFAICT it shouldn't actually be necessary to "peek" on the read end of the socket to detect that something has gone wrong. We're guaranteed that sendfile() to a connection that's died (according to the OS, either due to proper disconnect or a timeout) will fail.

It might get a bit tricky to use this if the client is actually expecting to send proper data while the sendfile() is in progress -- if there's actual data to be read from the socket, then the naive "replace threadWaitWrite by threadWaitReadWrite" will end up busy-waiting on EAGAIN, since the socket will be readable every time threadWaitReadWrite gets called.

HOWEVER, that's not an issue in my particular scenario, so a simple replacement of threadWaitWrite by threadWaitReadWrite should do fine for testing purposes.
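
Roughly, the retry loop I have in mind looks like the sketch below (untested). Note that Attempt, sendAll and the stubbed-out attempt action are names I made up for illustration; they are not part of the sendfile library. The attempt action stands in for one non-blocking sendfile() call and is assumed to throw if the OS considers the connection dead.

```haskell
import Control.Concurrent (forkIO, killThread, threadWaitRead, threadWaitWrite)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.Posix.Types (Fd (..))

data RW = Read | Write

-- Same helper as quoted above: wake up on readability or writability.
threadWaitReadWrite :: Fd -> IO RW
threadWaitReadWrite fd = do
  m   <- newEmptyMVar
  rid <- forkIO $ threadWaitRead fd  >> putMVar m Read
  wid <- forkIO $ threadWaitWrite fd >> putMVar m Write
  r   <- takeMVar m
  killThread rid
  killThread wid
  return r

-- Hypothetical outcome of one non-blocking send attempt.
data Attempt
  = WouldBlock   -- the call returned EAGAIN
  | Progress     -- some bytes were sent, more remain
  | Finished     -- everything has been sent

-- Keep calling the attempt until it reports Finished. The attempt is
-- assumed to throw an IOError once the OS has declared the connection
-- dead, which is what ends the loop for a vanished client.
sendAll :: Fd -> IO Attempt -> IO ()
sendAll fd attempt = loop
  where
    loop = do
      r <- attempt
      case r of
        Finished   -> return ()
        Progress   -> loop
        -- If the client has sent data we never read, this wait returns
        -- immediately every time and the loop spins on EAGAIN.
        WouldBlock -> threadWaitReadWrite fd >> loop

-- Stub usage: an attempt that finishes immediately, so nothing blocks.
main :: IO ()
main = sendAll (Fd 1) (return Finished)
```

The WouldBlock branch is exactly where the busy-wait described above would show up if the client were sending data of its own.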

Of course, in the case where the client disconnects because someone turns off the power or pulls the ethernet cable, we have no way of knowing what is going on -- so there is still the possibility that dead connections will be left open for a long time.

True, but then it's (properly) left to the OS to decide and timeouts can be controlled via setsockopt -- as they should IMO.
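
For reference, here's a minimal sketch of switching keepalives on from Haskell. It assumes a reasonably recent version of the network package (setSocketOption, getSocketOption, KeepAlive and close from Network.Socket), and it only enables SO_KEEPALIVE; the actual probe intervals are platform-specific settings that this sketch doesn't touch.

```haskell
import Network.Socket

main :: IO ()
main = do
  -- A plain TCP socket; a real server would get this from accept.
  sock <- socket AF_INET Stream defaultProtocol
  -- Ask the OS to probe idle connections (SO_KEEPALIVE).
  setSocketOption sock KeepAlive 1
  -- Read the option back to confirm the OS accepted it.
  v <- getSocketOption sock KeepAlive
  putStrLn ("SO_KEEPALIVE enabled: " ++ show (v /= 0))
  close sock
```

In a server the option would be set on each accepted socket before handing it to sendfile.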

I'll test tomorrow.

What I expect is that I'll still see a few "dead" threads lingering around for ~60 seconds (the OS-based timeout), but that I won't see any threads lingering indefinitely -- something which usually happens after a few hours of persistent use of my media server.
