Dear friends, here is another update on naviserver-connthreadqueue:

(1) Observing the traffic on the low-traffic site next-scripting.org still showed occasional surprisingly slow "response times", where the pure runtime (not including queuing time etc.) was 28 seconds. Under normal conditions, the response time is just a fraction of a second. It turned out that these requests came from a mobile broadband service with low bandwidth.

92.40.253.47 - - [12/Dec/2012:22:19:14 +0100] "GET /2.0b3/doc/nx HTTP/1.1" 200 23406 "https://next-scripting.org/2.0b3/doc/xotcl2" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:17.0) Gecko/20100101 Firefox/17.0" 28.700910 "1355347126 .164479 14.759801 0.000122 0.007112 28.693676"

This page is an adp-page, which in the current naviserver has no chance of being delivered via the writer thread (as shown by the call graph posted earlier), so the connection thread stays busy until the slow client has received everything.

Maybe your first reaction is "don't care, bandwidths become better", but the problem is quite serious. It is very easy to start a DoS attack by starting just a few requests with low bandwidth. If the server has e.g. a maximum of 10 connection threads defined, just 10 slow requests to adp-pages bring the server to a halt for arbitrarily long times, although the server is computationally able to handle several hundred or thousand of these adp-requests per second. The attacker just has to accept a few bytes from time to time to keep the write timeout of the server from triggering. Browsing around shows that this is a well-known attack that affects apache 1.x and 2.x as well, but not e.g. nginx, which is fully asynchronous and usually performs request spooling/buffering in front of the back-end. So, asynchronous receives and deliveries are not only a nice feature.

Therefore, I have added an interface between the "string based" delivery API and the writer thread and have had this running on next-scripting.org for a few days. Everything seems to work flawlessly; not a single long blocking request has occurred.
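
For readers who want to try this, the writer thread is set up in the driver section of the config file. The fragment below is only a sketch: the server name "default", the use of the nssock driver and all values are assumptions, not settings taken from next-scripting.org. Whether adp output is actually handed to the writer depends on the interface described above, not on this fragment alone.

```tcl
# Sketch of a NaviServer config fragment; server name and values are assumptions.
set server default

ns_section "ns/server/${server}/module/nssock"
ns_param writerthreads 1       ;# number of writer threads; 0 means no writer thread
ns_param writersize    4096    ;# size threshold in bytes above which delivery is handed to the writer
ns_param writerbufsize 8192    ;# buffer size used by the writer when sending
```

With a threshold like this, a 23406-byte response such as the one in the log line above would be large enough to qualify for the writer.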

(2) We started using naviserver-connthreadqueue on our production site a few days ago (it runs behind nginx, so it profits only in part from the changes). NaviServer currently sees about 1.5 million page views per day there. The experiences so far:

- The config file needs some tweaking to keep queuing times low. I have set minthreads to 7 (before we had 3, but with a different interpretation).
- The new async log writer works nicely, although it might reverse the order of entries in the log file. Will look into that.
- The writer thread was not used so far, and it had some troubles:
  (a) We saw peaks of 600,000 mutex locks/second, most of them from a mutex of the writer thread. Under normal conditions, we see on average 8k mutex locks/second.
  (b) "ns_writer list" was not thread-safe (it crashed).
  While looking into problem (a), it turned out that the writer thread did not have clear EOF handling (added POLLHUP handling), and it was possible that the writer released a socket structure while the driver still depended on it. Now the lifecycle management is all in the driver, and the problem is gone. (b) is fixed by now as well.
- There are far fewer thread creations and the memory consumption seems better, but we need some longer measurements.

The average response time might be slightly better, but it is within the daily variation range. Since we have no data about the queuing time of the old server, this is still hard to compare. We still have to lower the debug output, so it is too early for an assessment.

(3) Concerning zip-delivery of files: I have refactored the code a little to ease the zip handling on the Tcl layer. There is now a new subcommand "ns_conn zipaccepted", which evaluates the rather complex preference rules. The code below could easily be used e.g. in a filter, or it can be extended to "compile" different formats into a target format. It is at least something to play with. As a first step, I will commit the changes asap to naviserver-connthreadqueue, then clean up, document, etc.

all the best
-gustaf neumann

    set file graph.js
    set fn [ns_info pageroot]/$file
    # Guess the mime type from the original name, before switching to the .gz file
    set mime [ns_guesstype $file]
    # Use a compressed copy only if the client accepts it and sent no Range header
    if {[ns_conn zipaccepted] && [ns_set iget [ns_conn headers] Range] eq ""} {
        # (Re)create the .gz file when it is missing or older than the source
        if {![file readable $fn.gz] || [file mtime $fn] > [file mtime $fn.gz]} {
            exec gzip -9 < $fn > $fn.gz
        }
        # Deliver the compressed file and announce the encoding to client and caches
        if {[file readable $fn.gz]} {
            set fn $fn.gz
            ns_set put [ns_conn outputheaders] Vary Accept-Encoding
            ns_set put [ns_conn outputheaders] Content-Encoding gzip
        }
    }
    ns_returnfile 200 $mime $fn

On 09.12.12 19:48, Gustaf Neumann wrote:
> Dear all,
>
> On the link below i have tried to summarize the changes in the
> naviserver-connthreadqueue fork taken from several mails. The summary
> contains as well a few charts showing the stepwise improvements, which
> i hope, someone might find interesting.
>
> https://next-scripting.org/xowiki/docs/misc/naviserver-connthreadqueue
>
> The new version uses TCP_CORK for the most interesting cases. The
> changes from this feature are not dramatical, since NaviServer is
> often aggregating the strings to be written in DStrings, so there are
> apparently not many small writes.
>
> If nobody objects, i would tag the current tip of naviserver with
> 4.99.4 and move the changes over to the main repository in the near
> future .... after i make an iteration of the affected documentation.
>
> -gustaf neumann

_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel