On Wed, 17 Mar 2010 16:22:44 +0000 Sad Clouds <cryintotheblue...@googlemail.com> wrote:
> On Wed, 17 Mar 2010 16:01:28 +0000
> Quentin Garnier <c...@cubidou.net> wrote:
>
> > Do you have a real world use for that? For instance, I wouldn't call
> > a web server that sends the same data to all its clients *at the same
> > time* realistic.
>
> Why? Because it never happens? I think it happens quite often. Another
> example is a server that is sending live data, i.e. audio playback,
> video stream, etc. If you can't use multicasting over a WAN, then you
> have a situation where you are streaming the same data to a large
> number of clients.

In the past I wrote a custom httpd which read broadcast security camera
frames from the LAN and rebroadcast them to connected HTTP clients;
since the clients stay connected with keep-alive, it has to iterate
over the connections to push out every new frame.
(http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/tests/bktr_httpd/)

However, clients which cannot cope with the sending speed are
"throttled", meaning that frames are skipped for them, which makes
things a little more complex than a simple "send this message to all
FDs" operation (the throttling sketch below my signature gives the
rough idea)... kqueue(2)/kevent(2) were used for polling; in my case,
however, the available bandwidth was always the bottleneck.

I also have a question: did your test really use non-blocking sockets
for writing, along with an efficient polling mechanism like kqueue or
libevent, disabling write polling when the sendq is empty, re-enabling
it when there is data to send, and only writing when a poll event
indicates that write is allowed (see the kqueue sketch below)?
Otherwise, I assume that the LWP would block in write(2).

If a "broadcast" writev(2)-to-multiple-FDs variant existed, it would
possibly have to present an interface similar to that of kevent(2), or
be tied in as a new protocol over kqueue, because of the FD-specific
errors/events (the hypothetical interface sketch below shows why)...
libevent, for instance, also supports transfer buffer queues and could
possibly be adapted to support such a feature too. However, I'm also
unsure whether this would really help, or just move some userland and
syscall overhead up to kernel overhead and achieve similar overall
performance. A test implementation might indeed be needed, to really
know :(

-- 
Matt
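
For reference, here is a minimal sketch of the kqueue write-polling
pattern I mean (not the actual bktr_httpd code; struct conn and the
conn_*() names are made up for illustration): EVFILT_WRITE is enabled
only while a connection's sendq is non-empty, and write(2) is only
called once kevent(2) reports the socket writable.

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>
#include <errno.h>

struct conn {
	int		fd;		/* non-blocking socket */
	const char	*sendq;		/* pending bytes, if any */
	size_t		sendq_len;
};

/* Enable write polling for this connection. */
static void
conn_want_write(int kq, struct conn *c)
{
	struct kevent kev;

	EV_SET(&kev, c->fd, EVFILT_WRITE, EV_ADD | EV_ENABLE, 0, 0, 0);
	(void)kevent(kq, &kev, 1, NULL, 0, NULL);
}

/* Called when kevent(2) reports c->fd writable. */
static void
conn_write_event(int kq, struct conn *c)
{
	struct kevent kev;
	ssize_t n;

	n = write(c->fd, c->sendq, c->sendq_len);
	if (n == -1) {
		if (errno == EAGAIN)
			return;	/* spurious wakeup; wait for next event */
		/* real error handling (close, cleanup) elided */
		return;
	}
	c->sendq += n;
	c->sendq_len -= n;
	if (c->sendq_len == 0) {
		/* sendq drained: stop polling for write. */
		EV_SET(&kev, c->fd, EVFILT_WRITE, EV_DISABLE, 0, 0, 0);
		(void)kevent(kq, &kev, 1, NULL, 0, NULL);
	}
}

The same enable/disable dance works with poll(2) or libevent; the point
is simply to never poll for writability on sockets with empty sendqs.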
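
And a sketch of the throttling idea, reusing struct conn and
conn_want_write() from the previous sketch (again illustrative only): a
new frame is only queued to clients whose previous sendq has drained,
so slow clients simply miss frames instead of stalling the broadcast.

/*
 * Queue a new frame to every connection that is ready for one; slow
 * clients whose sendq has not drained yet skip this frame. In the
 * real thing the frame buffer would have to be reference-counted (or
 * copied), since all ready clients share it.
 */
static void
broadcast_frame(int kq, struct conn *conns, size_t nconns,
    const char *frame, size_t len)
{
	size_t i;

	for (i = 0; i < nconns; i++) {
		struct conn *c = &conns[i];

		if (c->sendq_len != 0)
			continue;	/* throttled: skip this frame */
		c->sendq = frame;
		c->sendq_len = len;
		conn_want_write(kq, c);
	}
}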
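
Finally, here is what I imagine the interface of a hypothetical
"broadcast writev" could look like (to be clear: no such syscall
exists, and writevb()/struct wbresult are names I just invented). The
per-FD result array is the point: each descriptor can fail or make
partial progress independently, which is why the interface ends up
resembling kevent(2) rather than write(2).

#include <sys/types.h>
#include <sys/uio.h>

struct wbresult {
	int	wb_fd;		/* descriptor this entry refers to */
	int	wb_error;	/* 0, or an errno-style error for that FD */
	ssize_t	wb_nwritten;	/* bytes accepted for that FD */
};

/*
 * Hypothetical: write the same iovec to every FD in fds[], filling
 * res[] with one entry per descriptor.
 */
int
writevb(const int *fds, size_t nfds, const struct iovec *iov, int iovcnt,
    struct wbresult *res);

Partial writes are what would make such a call messy: the kernel would
either have to buffer per-FD (which the socket sendq already does) or
report per-FD progress and make userland retry, which is roughly the
bookkeeping the application already does today.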