Hi Cyan,

Yes, there are still a few subscribers.

Do you know if this connection dropping happens mostly when there is a lot of activity, or more frequently when there is very low activity? I recall a few edge cases in the thread pooling where a thread would, in some circumstances, wait until another connection came in before running, and there might have been a related case where a connection could get dropped. IIRC, both generally happened when there was low traffic (or, more specifically, low concurrent traffic). Playing with maxconns might diminish the problem in this case.

You also mention favicon.ico; is it mostly or always that? It's notable for being a small static file, which could point to other causes, like a corrupt interpreter state as Peter suggested. Or there might be some weirdness with mmap if you have that enabled.

One other thought: can you switch to naviserver? The connection handling there has evolved somewhat differently (not to mention more recently) than aolserver's, but programming-wise there are not a lot of differences.

-J

Cyan ogilvie wrote:
> Hi
>
> I'm hoping there are still some subscribers to this list ;)
>
> I'm trying to debug a strange condition we're seeing on a small
> percentage of our connections: connections are being closed by the
> server without any response being sent back on the connection (verified
> by looking at network packet traces and inserting a logging transparent
> proxy between the client and server). The network packet pattern we see is:
>
> <normal TCP setup - SYN, SYN/ACK, ACK>
>
> Request data (in a single frame, or multiple), ACK
>
> Then the connection is closed by the server after 10 - 70 ms, without
> any data being sent, with a FIN/ACK (still getting confirmation on this
> - these logs are from the other side of the man-in-the-middle proxy I'm
> using to get debugging info).
>
> For some of the failed requests, the server processing never gets as far
> as the start of our Tcl code (a preauth filter that starts with an
> ns_log that doesn't show up in the server log).
>
> For others the request is processed normally and an access.log message
> written indicating that a response was generated with HTTP code 200, but
> no packet shows up on the network.
>
> There is no pattern to the failed requests (sometimes requests for
> favicon.ico fail), and retrying the exact request shortly afterwards
> often succeeds.
>
> Has anyone seen anything like this before, or have any advice on how to
> narrow down the cause further?
>
> We're running a slightly patched version of the last 4.5.2 rc, on Ubuntu
> 12.04.5 64bit on Amazon EC2 instances, with Tcl 8.6.1
>
> Thanks
>
> Cyan
>
>
> ------------------------------------------------------------------------------
> Slashdot TV.
> Video for Nerds. Stuff that matters.
> http://tv.slashdot.org/
>
>
>
> _______________________________________________
> aolserver-talk mailing list
> aolserver-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/aolserver-talk
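
P.S. To check whether the drops track low or high concurrent traffic, something like the following rough Python sketch might help. It is only an illustration, not anything from aolserver itself: the function names probe and tally, the default path, and the timeout are all made up for this example, and it assumes a plain HTTP listener. It sends simple GET requests and counts how often the server closes the connection without sending a single byte.

```python
import socket
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe(host, port, path="/favicon.ico", timeout=5.0):
    """Send one HTTP/1.0 GET and classify what came back.

    Returns "response" if any bytes were received, "empty_close" if the
    server closed the connection cleanly without sending anything, or
    "error: <ExceptionName>" for resets, timeouts and connect failures.
    """
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            data = sock.recv(4096)  # b"" means the peer closed with no data
    except OSError as exc:
        return f"error: {type(exc).__name__}"
    return "response" if data else "empty_close"


def tally(host, port, count, workers):
    """Run `count` probes using `workers` concurrent threads and count outcomes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(lambda _: probe(host, port), range(count)))
```

Running tally against a test instance once with workers=1 and once with a larger value (say 20) and comparing the "empty_close" counts should give a hint as to whether low concurrency makes the problem more likely.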