On Mon, Jul 30, 2012 at 2:19 PM, Sonny Karlsson <[email protected]> wrote:
>
> On Jul 30, 2012, at 9:53PM, Eduardo Silva wrote:
>
>> On Mon, Jul 30, 2012 at 1:41 PM, Sonny Karlsson <[email protected]> wrote:
>>> Here's the patch.
>>>
>>> On Jul 30, 2012, at 9:39PM, Sonny Karlsson wrote:
>>>
>>>> Hi
>>>>
>>>> Setting TCP_NODELAY for all connections is not needed, TCP_CORK
>>>> overrides this option according to man page tcp section 7. I've seen no
>>>> change in performance while benchmarking.
>>>>
>>>> This patch will remove an unnecessary system call and fix some warnings
>>>> generated during high load when connections are closed before the system
>>>> call completes. The warnings are rare (takes 10+ min to replicate, but
>>>> then I get a few at the same time) and only occur in a fastcgi module
>>>> I'm currently working on, not while serving static files.
>>>>
>>>> There is however an issue that may arise when serving static files
>>>> without the TCP_NODELAY flag. I'll submit a patch and description of
>>>> the issue.
>>>>
>>>> --
>>>> Sonny Karlsson
>>>
>>
>> This looks an interesting topic, according tcp(7):
>>
>> TCP_CORK (since Linux 2.2)
>>     If set, don't send out partial frames. All queued partial frames
>>     are sent when the option is cleared again. This is useful for
>>     prepending headers before calling sendfile(2), or for throughput
>>     optimization. As currently implemented, there is a 200 millisecond
>>     ceiling on the time for which output is corked by TCP_CORK. If
>>     this ceiling is reached, then queued data is automatically
>>     transmitted. This option can be combined with TCP_NODELAY only
>>     since Linux 2.5.71. This option should not be used in code
>>     intended to be portable.
>>
>> TCP_NODELAY
>>     If set, disable the Nagle algorithm. This means that segments are
>>     always sent as soon as possible, even if there is only a small
>>     amount of data. When not set, data is buffered until there is a
>>     sufficient amount to send out, thereby avoiding the frequent
>>     sending of small packets, which results in poor utilization of
>>     the network. This option is overridden by TCP_CORK; however,
>>     setting this option forces an explicit flush of pending output,
>>     even if TCP_CORK is currently set.
>>
>>
>> In monkey TCP_CORK is used when sending out the response headers plus
>> a fraction of the static file, when the response headers are finally
>> sent, TCP_CORK is disabled. But what happens with the next calls who
>> send the data to the client ?, imagine a file of 1MB, it will need a
>> few calls to sendfile(2) and on that moment the Nagle algorithm is
>> still enable, so removing the TCP_NODELAY code flag will hurt.
>>
>> Those system calls could be reduced, what about when TCP_CORK is
>> disabled, make sure to enable TCP_NODELAY on that moment ?, that could
>> be an improvement.
>>
>> comments ?
>>
>>
>> --
>> Eduardo Silva
>> http://edsiper.linuxchile.cl
>> http://www.monkey-project.com
>
> Only partial frames are delayed, full frames will be sent. So the 1 MB
> file will be sent in as large packages as is allowed by the TCP
> connection, resulting in fewer packages. The headers and first part
> will be sent as soon as one frame can be filled. So if one would look at
> the data transferred, the first part would be the same before and after
> this patch.
>
> The best source I've found is a post by Linus Torvalds, found at
> http://yarchive.net/comp/linux/sendfile.html. Somewhere a bit down,
> linux says this about TCP_CORK:
>
>
>> Now, TCP_CORK is basically me telling David Miller that I refuse to play
>> games to have good packet size distribution, and that I wanted a way for
>> the application to just tell the OS: I want big packets, please wait
>> until you get enough data from me that you can make big packets.
>>
>> Basically, TCP_CORK is a kind of "anti-nagle" flag. It's the reverse of
>> "no-nagle". So you'd "cork" the TCP connection when you know you are
>> going to do bulk transfers, and when you're done with the bulk transfer
>> you just "uncork" it. At which point the normal rules take effect (ie
>> normally "send out any partial packets if you have no packets in
>> flight").
>>
>> This is a _much_ better interface than having to play games with
>> scatter-gather lists etc. You could basically just do
>>
>>     int optval = 1;
>>
>>     setsockopt(sk, SOL_TCP, TCP_CORK, &optval, sizeof(int));
>>     write(sk, ..);
>>     write(sk, ..);
>>     write(sk, ..);
>>     sendfile(sk, ..);
>>     write(..)
>>     printf(...);
>>     ...any kind of output..
>>
>>     optval = 0;
>>     setsockopt(sk, SOL_TCP, TCP_CORK, &optval, sizeof(int));
>>
>> and notice how you don't need to worry about _how_ you output the data
>> any more. It will automatically generate the best packet sizes - waiting
>> for disk if necessary etc.
>>
>> With TCP_CORK, you can obviously and trivially emulate the HP-UX
>> behaviour if you want to. But you can just do _soo_ much more.
>>
>> Imagine, for example, keep-alive http connections. Where you might be
>> doing multiple sendfile()'s of small files over the same connection, one
>> after the other. With Linux and TCP_CORK, what you can basically do is
>> to just cork the connection at the beginning, and then let is stay
>> corked for as long as you don't have any outstanding requests - ie you
>> uncork only when you don't have anything pending any more.
>>
>> (The reason you want to uncork at all, is to obviously let the partial
>> packets out when you don't know if you'll write anything more in the
>> near future. Uncorking is important too.
>>
>> Basically, TCP_CORK is useful whenever the server knows the patterns of
>> its bulk transfers. Which is just about 100% of the time with any kind
>> of file serving.
>>
>> Linus
>>
>
>
> Also on http://baus.net/on-tcp_cork there is some information about the
> workings of the TCP_CORK flag.

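For reference, the cork/uncork pattern described in the quoted post could be sketched in C roughly as below. This is only an illustration under stated assumptions, not code from Monkey or from the patch: set_cork() and send_response() are hypothetical names, the socket is assumed to be a blocking TCP socket on Linux, and a short write of the headers is simply treated as an error.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>    /* TCP_CORK (Linux-specific) */
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set (on = 1) or clear (on = 0) TCP_CORK.
 * Returns 0 on success, -1 on error, like setsockopt(2). */
int set_cork(int sock, int on)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Hypothetical helper: send response headers followed by a file body
 * while the socket is corked, then uncork so the last partial frame
 * is flushed. Assumes a blocking socket. */
int send_response(int sock, const char *hdr, size_t hdr_len,
                  int file_fd, off_t file_len)
{
    off_t offset = 0;

    if (set_cork(sock, 1) < 0)
        return -1;

    /* Headers are queued; only full frames leave while corked. */
    if (write(sock, hdr, hdr_len) != (ssize_t)hdr_len)
        goto fail;

    /* sendfile(2) may send less than requested, so loop until done;
     * it advances offset by the number of bytes sent. */
    while (offset < file_len) {
        ssize_t n = sendfile(sock, file_fd, &offset,
                             (size_t)(file_len - offset));
        if (n <= 0)
            goto fail;
    }

    /* Uncork: any remaining partial frame is sent now. */
    return set_cork(sock, 0);

fail:
    set_cork(sock, 0);
    return -1;
}
```

With a non-blocking socket, as an event-driven server would normally use, the EAGAIN case would have to be handled instead of being treated as a failure, and the uncork would be deferred until the whole file has been queued.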
Thanks for the research on this. Indeed, we don't need TCP_NODELAY if we
are using TCP_CORK for static data. I will apply your patches, thanks!

--
Eduardo Silva
http://edsiper.linuxchile.cl
http://www.monkey-project.com
_______________________________________________
Monkey mailing list
[email protected]
http://lists.monkey-project.com/listinfo/monkey
