Ooooo goodie, a network question ...

On Sun, 27 Jan 2002, Allan Whiteford wrote:
> It hangs in a call to fflush() to flush a stream (to send text back to
> one of the clients). It seems that the send queue is filling up, netstat
> shows:
> 
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> tcp        0   9847 217.10.143.92:6715      217.134.219.115:1134    ESTABLISHED
> 
> For all the other connections the Send-Q is zero, I have verified (with
> gdb) that it always hangs on a flush to the stream which has filled up.

Ok, from what I can see, you've got:

[App] <--> [Stream buffer] <--> [Network buffer] <--> ___The_wire_____

You've tried to do an fflush() on your stream buffer, but this has
blocked because the network buffer underneath it (the Send-Q in your
netstat output) isn't draining. For whatever reason, TCP is refusing to
send any more packets.

Possible causes (I can think of) are:

  1. Packet loss (for one reason or another). The local computer can
     attempt to fill the remote computer's window, as advertised by the
     last ACK packet. If it receives no ACKs back, then it will sit
     retransmitting (backing off each time) and wait for further info.

  2. Remote end is busy and has closed the window. What's the Recv-Q on
     the remote host doing? Does the remote software block for a long time
     for any reason?
 

The best thing to do is check what the window is during the session. Fire
up a copy of tcpdump and watch what happens when you have the fault. This 
will give you far more information to diagnose the problem.
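
If you'd also like to watch the backlog from inside the daemon, something
like the following might do. It's only a sketch and it's Linux-specific: it
assumes the SIOCOUTQ ioctl from <linux/sockios.h>, and send_queue_bytes()
is a name I've made up. The number it returns should correspond to the
Send-Q column netstat shows for that connection.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>   /* SIOCOUTQ (Linux-specific assumption) */

/* Hypothetical helper: return the number of bytes still queued in the
 * kernel's send queue for socket fd, or -1 on error (errno is set by
 * ioctl). */
int send_queue_bytes(int fd)
{
    int pending = 0;

    if (ioctl(fd, SIOCOUTQ, &pending) < 0)
        return -1;
    return pending;
}
```

Logging this just before each fflush() would show whether the queue for one
client keeps growing while the others stay at zero.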

> Does this mean the packets aren't being sent out or aren't being
> received by the client?

Could be either, see above. tcpdump will tell you which.

> Not being received shouldn't hang a call to fflush

It will do if the network buffer is full.

> but what if the machine isn't even trying to send them, would
> that cause a problem?

Yes. Again, because the network buffer is full (as per No. 2 above).

> After a while, the daemon will continue running as if nothing has
> happened. Is this a problem with my application or the libraries/kernel?

Chances are it's your code. But, for the record, which kernel(s) are you
running?

(I know, I spent ages tracking down a compiler bug once ... only to
discover the problem was with my code after all)

> Should I be doing some sort of check before flushing in case the client
> has been disconnected in a strange way?

Hmmm, dunno. I've never tried streams w/ sockets. If the remote host dies
unexpectedly, then its end-point (IP num + port) will no longer be
active (loosely speaking). Any TCP traffic to a terminated end-point should
result in a reset packet being returned. From memory, a write(2) to a
socket that has been reset raises SIGPIPE, which kills the process by
default; if you ignore or handle SIGPIPE, the write returns -1 with errno
set to EPIPE. In principle fflush() on a stream should then return EOF with
errno set the same way, but I've not tried it.

Cheers,

Paul.

------------------------------------------------------------------------------
Paul Millar                            yo-yo, n. :
Particle Physics Theory Group              Something that is occasionally
Department of Physics and Astronomy        up but normally down.
University of Glasgow,                     (see also Computer)
Glasgow G12 8QQ,                                       [EMAIL PROTECTED]
Scotland                                               +44 (0)141 330 4717
------------------------------------------------------------------------------

