Re: Httpd 3.0 or something else
On Fri, Nov 13, 2009 at 14:01, Arturo 'Buanzo' Busleiman wrote:
> Matthieu Estrade wrote:
>> What about the non-http protocols like ftp or smtp, tested during
>> Summer of Code? The temptation is to have a powerful core that we
>> could adapt to any protocol we want...
>
> And Google just released SPDY ("Speedy"), a non-http protocol for web
> transport...

Paul and I briefly discussed adding some stuff to serf that could allow
serf to do SPDY. For example, add the notion of "priority" into the
request system. It would be ignored in a normal connection, but could
then take effect in a SPDY connection.

Cheers,
-g
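[A minimal sketch of the "priority" idea above. None of these types or
functions exist in serf; they are invented purely to illustrate how the
same queued requests could be scheduled FIFO on a plain HTTP connection
but by priority on a SPDY-capable one.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request record with a priority hint attached.
 * A plain HTTP/1.x connection ignores the hint; a SPDY connection
 * could use it to reorder dispatch. */
typedef struct {
    int priority;        /* higher value = more urgent */
    const char *uri;
} request_t;

/* What a normal connection would do: strict FIFO, priority ignored. */
static const request_t *next_fifo(const request_t *q, size_t n)
{
    return n ? &q[0] : NULL;
}

/* What a SPDY connection could do with the very same queue. */
static const request_t *next_by_priority(const request_t *q, size_t n)
{
    const request_t *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (!best || q[i].priority > best->priority)
            best = &q[i];
    return best;
}
```

The point of the sketch is that the priority field can sit in the
request system unconditionally, and only the connection type decides
whether it has any effect.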
Re: Httpd 3.0 or something else
Matthieu Estrade wrote:
> What about the non-http protocols like ftp or smtp, tested during
> Summer of Code? The temptation is to have a powerful core that we
> could adapt to any protocol we want...

And Google just released SPDY ("Speedy"), a non-http protocol for web
transport...

--
Arturo "Buanzo" Busleiman
Independent Linux and Security Consultant - OWASP - SANS - OISSG
http://www.buanzo.com.ar/pro/eng.html
Re: Httpd 3.0 or something else
Woow =) Very nice and interesting thread =)

It's very hard to think about how to design httpd 3.0 before knowing
what the real aim of this new web server is. Much of the feedback here
is about very specific problems. I started at the end of 1.3 and the
beta release of 2.0, and I must say that application architectures and
people's needs have changed a lot.

Imho, the real question is: what do we want to do with it? Still a very
flexible and compatible web server, providing interfaces for many
languages, with a very interesting API for developing modules? Chasing
the raw performance of nginx, haproxy, or some other web
servers/load-balancers/reverse proxies? Providing an event-based design
to support event-driven applications and infrastructure like XMPP? Able
to do SOAP or web-service message routing? What about non-http protocols
like ftp or smtp, tested during Summer of Code? The temptation is to
have a powerful core that we could adapt to any protocol we want...

Imho, I don't see how to stay competitive without an MPM to handle
event-driven applications, which could also solve many
performance/reliability problems. Then maybe have two big categories:
delivery (reverse proxy, load balancing, content caching, gzip/deflate,
etc.) and applications and languages (php, perl, python, ruby, external
filters, etc.).

my 2 cts.

Matthieu

Graham Leggett wrote:
> Jean-Marc Desperrier wrote:
>
>> Last time I heard about a large-scale site thinking about switching
>> from Apache to lighttpd, the one problem that site wanted to solve
>> was a massive number of slow clients simultaneously connected to the
>> server, with the http server mostly just serving as a pipe between
>> the client and php, and where the ideal solution had to consume as
>> little resource per client as possible.
>>
>> Did the admin of that site just miss what the solution should have
>> been to handle this properly with Apache?
>
> Dedicated reverse proxy servers like varnish have appeared to solve
> this problem, and apparently work quite well for the narrow problem
> they are designed to solve (I say apparently because we're still at
> the evaluation stage on this).
>
> I would prefer in the long term that the two-layered approach wasn't
> necessary, which is why I am so keen to make sure httpd v3.0's
> architecture can optionally do what varnish does out of the box.
>
> Regards,
> Graham
Re: Httpd 3.0 or something else
Jean-Marc Desperrier wrote:
> Last time I heard about a large-scale site thinking about switching
> from Apache to lighttpd, the one problem that site wanted to solve was
> a massive number of slow clients simultaneously connected to the
> server, with the http server mostly just serving as a pipe between the
> client and php, and where the ideal solution had to consume as little
> resource per client as possible.
>
> Did the admin of that site just miss what the solution should have
> been to handle this properly with Apache?

Dedicated reverse proxy servers like varnish have appeared to solve this
problem, and apparently work quite well for the narrow problem they are
designed to solve (I say apparently because we're still at the
evaluation stage on this).

I would prefer in the long term that the two-layered approach wasn't
necessary, which is why I am so keen to make sure httpd v3.0's
architecture can optionally do what varnish does out of the box.

Regards,
Graham
--
Re: Httpd 3.0 or something else
Greg Stein wrote:
> we have to take into account that some of those httpd's, like
> lighttpd, are replacing Apache plain and simple. [...]

> [...] I'm just trying to say those aren't necessarily *better* than
> Apache, but that they are *better-suited* to their admin's
> scenarios. [...]

Last time I heard about a large-scale site thinking about switching from
Apache to lighttpd, the one problem that site wanted to solve was a
massive number of slow clients simultaneously connected to the server,
with the http server mostly just serving as a pipe between the client
and php, and where the ideal solution had to consume as little resource
per client as possible.

Did the admin of that site just miss what the solution should have been
to handle this properly with Apache?
Re: Httpd 3.0 or something else
On Thu, Nov 12, 2009 at 09:59, Arturo 'Buanzo' Busleiman wrote:
> Greg Stein wrote:
>> Apache remains the broad solution, but for narrow requirements,
>> people will select something that is easier to handle for their
>> particular situation.
>>
>> I wouldn't say "wrong", but more along the lines of "not as
>> well-suited".
>
> I partially agree, but we have to take into account that some of those
> httpd's, like lighttpd, are replacing Apache plain and simple. Don't
> get me wrong. I love Apache. I've written tons of articles about it
> since the very early days. And I haven't released any mod_openpgp code
> for anything other than Apache for a reason: I love it.

Yeah... I think we're in agreement. I'm just trying to say those aren't
necessarily *better* than Apache, but that they are *better-suited* to
their admin's scenarios.

As the Swiss Army knife of web servers, Apache is very heavy in the
pocket. In many scenarios, one little blade is all you need, and it is
much easier to use and maintain.

I'm not sure that is a solvable problem for us, unfortunately. We would
need a drastic overhaul of how we approach configuration (not to mention
setup/building and module loading/handling). In essence, I think the
project has concentrated on backwards compatibility rather than an
overhaul for usability.

Cheers,
-g
Re: Httpd 3.0 or something else
Greg Stein wrote:
> Apache remains the broad solution, but for narrow requirements, people
> will select something that is easier to handle for their particular
> situation.
>
> I wouldn't say "wrong", but more along the lines of "not as
> well-suited".

I partially agree, but we have to take into account that some of those
httpd's, like lighttpd, are replacing Apache plain and simple. Don't get
me wrong. I love Apache. I've written tons of articles about it since
the very early days. And I haven't released any mod_openpgp code for
anything other than Apache for a reason: I love it.

--
Arturo "Buanzo" Busleiman
Independent Linux and Security Consultant - OWASP - SANS - OISSG
http://www.buanzo.com.ar/pro/eng.html
Re: Httpd 3.0 or something else
On Nov 11, 2009, at 2:14 PM, Akins, Brian wrote:
> On 11/10/09 6:20 PM, "Greg Stein" wrote:
>
>> I'd like to see a few "network" threads multiplexing all the writing
>> to clients.
>
> That's what I meant. I just didn't state it properly.
>
>> Then take all of *that*, and spread it across several processes for
>> solid uptime, with a master monitor process.
>
> And then you have nginx ;)

Well, nginx is, after all, a fork of httpd
Re: Httpd 3.0 or something else
On Wed, Nov 11, 2009 at 15:00, Arturo 'Buanzo' Busleiman wrote:
> Greg Stein wrote:
>> Right. But they don't have the depth/breadth of modules like we do.
>
> ... yet. Keep going, but if there are great things like lighttpd and
> nginx (and even more) http daemons out there, then that means more
> than one thing is wrong with current Apache.

Oh, definitely. HTTP serving is commodity functionality now. Thus, it is
very easy to serve niches with a specialized HTTP server.

Apache remains the broad solution, but for narrow requirements, people
will select something that is easier to handle for their particular
situation.

I wouldn't say "wrong", but more along the lines of "not as
well-suited".

Cheers,
-g
Re: Httpd 3.0 or something else
Greg Stein wrote:
> Right. But they don't have the depth/breadth of modules like we do.

... yet. Keep going, but if there are great things like lighttpd and
nginx (and even more) http daemons out there, then that means more than
one thing is wrong with current Apache.

Great thread.

--
Arturo "Buanzo" Busleiman
Independent Linux and Security Consultant - OWASP - SANS - OISSG
http://www.buanzo.com.ar/pro/eng.html
Re: Httpd 3.0 or something else
On Wed, Nov 11, 2009 at 14:14, Akins, Brian wrote:
> On 11/10/09 6:20 PM, "Greg Stein" wrote:
>
>> I'd like to see a few "network" threads multiplexing all the writing
>> to clients.
>
> That's what I meant. I just didn't state it properly.
>
>> Then take all of *that*, and spread it across several processes for
>> solid uptime, with a master monitor process.
>
> And then you have nginx ;)

Right. But they don't have the depth/breadth of modules like we do.

As long as we can keep that ecosystem, then Apache will always be a
leader.

Cheers,
-g
Re: Httpd 3.0 or something else
On 11/10/09 6:20 PM, "Greg Stein" wrote:

> I'd like to see a few "network" threads multiplexing all the writing
> to clients.

That's what I meant. I just didn't state it properly.

> Then take all of *that*, and spread it across several processes for
> solid uptime, with a master monitor process.

And then you have nginx ;)

--
Brian Akins
Re: Httpd 3.0 or something else
On Nov 11, 2009, at 6:09 AM, Graham Leggett wrote:
> William A. Rowe Jr. wrote:
>
>>> - Supporting prefork as httpd does now; and
>>
>> I'm very happy to see prefork die its timely death.
>>
>> Let's go about working out where out-of-process magic happens.
>> Gated, single-threaded handlers may be sensible in some cases.
>> But for the core server it makes async worthless, and supporting
>> both digs us deeper into the bad-old-days of the 1.3 codebase.
>
> I disagree strongly, for a number of reasons.
>
> The first is that in our experience of a very high traffic collection
> of websites, the more "hops" you have, the more performance starts to
> suck, with the added complication that you run the risk of bumping
> your head into the ceiling of filehandle limits, and other issues.
>
> If you move from "httpd-prefork" to "httpd-something proxied to
> random-appserver-X-doing-prefork-for-you" you aren't removing prefork
> - you're just moving it somewhere else and adding an extra hop.
>
> You're also making it more complicated, and more complicated means
> less reliable.
>
> People like to harp on about how they want "speed speed speed". Right
> up to the point where it first starts becoming unreliable. At that
> point they suddenly start crying "reliable reliable reliable".
>
> Apache httpd does lots of things right.
>
> We must resist the temptation to throw out what we do right, while we
> try to move forward fixing what we do wrong.

I must say I agree. Having a method to avoid the 1:1 mapping of
request/response to a specific "entity" (worker or thread) is nice, but
that solves a different problem than the one solved by prefork. I'd like
for us to solve the one while also being able to continue to solve the
other.

When, for example, nginx works, it works well. When it doesn't, it is
simply completely unsuitable. I'd like for us to continue to avoid that
being the case for httpd.
Re: Httpd 3.0 or something else
On Wed, 2009-11-11 at 13:09 +0200, Graham Leggett wrote:
> Apache httpd does lots of things right.
>
> We must resist the temptation to throw out what we do right, while we
> try to move forward fixing what we do wrong.

And there is also a reason why Google's Chrome is essentially (pre)fork.
This model is simply unrivalled when it comes to reliability.

--
Bojan
Re: Httpd 3.0 or something else
William A. Rowe Jr. wrote:
>> - Supporting prefork as httpd does now; and
>
> I'm very happy to see prefork die its timely death.
>
> Let's go about working out where out-of-process magic happens.
> Gated, single-threaded handlers may be sensible in some cases.
> But for the core server it makes async worthless, and supporting
> both digs us deeper into the bad-old-days of the 1.3 codebase.

I disagree strongly, for a number of reasons.

The first is that in our experience of a very high traffic collection of
websites, the more "hops" you have, the more performance starts to suck,
with the added complication that you run the risk of bumping your head
into the ceiling of filehandle limits, and other issues.

If you move from "httpd-prefork" to "httpd-something proxied to
random-appserver-X-doing-prefork-for-you" you aren't removing prefork -
you're just moving it somewhere else and adding an extra hop.

You're also making it more complicated, and more complicated means less
reliable.

People like to harp on about how they want "speed speed speed". Right up
to the point where it first starts becoming unreliable. At that point
they suddenly start crying "reliable reliable reliable".

Apache httpd does lots of things right.

We must resist the temptation to throw out what we do right, while we
try to move forward fixing what we do wrong.

Regards,
Graham
--
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 05:30:34PM -0500, Akins, Brian wrote:
> On 11/10/09 1:56 PM, "Greg Stein" wrote:
>
>> But some buckets might be performing gzip or SSL encryption. That
>> consumes CPU within the network thread.
>
> You could just run x times the number of CPU cores of "network"
> threads. You can't use more than 100% of a CPU anyway.
>
> The model that some of us discussed -- Greg, you may have invented it
> ;) -- was to have a small pool of acceptor threads (maybe just one)
> and a pool of "worker" threads. The acceptor threads accept
> connections and move them into worker threads - that's it. A single fd
> is then entirely owned by that worker thread until it (the fd) goes
> away - network/disk io, gzip, ssl, etc.

Sun Web Server (originated from Netscape; also Open Web Server)
currently handles it this way. It has a pool of acceptor threads which
accept connections; the acceptor threads push each connection onto a
connection queue, and worker threads pull connections from the queue and
serve the requests. The keep-alive daemon is also multi-threaded, so
multiple keep-alive threads poll the various sets of connections for
future HTTP requests.

The above architecture is highly scalable. Recently Sun published a
SPECweb record using this web server on a 128-CMT-thread (32-core)
system:

http://www.spec.org/web2005/results/res2009q4/web2005-20091013-00143.txt

You can see the sources in the Open Web Server code if you are
interested:

http://wikis.sun.com/display/wsFOSS/Open+Web+Server

Regards,
Basant.
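[The acceptor/worker hand-off described above can be sketched as a
bounded, mutex-and-condvar connection queue. All names are invented for
illustration; this is not the actual Sun/Open Web Server code.]

```c
#include <assert.h>
#include <pthread.h>

#define QCAP 64  /* illustrative queue capacity */

/* Bounded connection queue: acceptor threads push accepted fds,
 * worker threads pop them and serve the request. */
typedef struct {
    int fds[QCAP];
    int head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} conn_queue_t;

static void cq_init(conn_queue_t *q)
{
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by acceptor threads right after accept(); blocks if full. */
static void cq_push(conn_queue_t *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QCAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->fds[q->tail] = fd;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by worker threads; blocks until a connection is available. */
static int cq_pop(conn_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

The queue decouples the accept rate from the service rate, which is
exactly what lets the keep-alive poller hand a connection back into the
queue when a new request arrives on it.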
Re: Httpd 3.0 or something else
Greg Stein wrote:
> On Mon, Nov 9, 2009 at 14:21, Paul Querna wrote:
>> ...
>> I agree in general, a serf-based core does give us a good start.
>>
>> But serf buckets and the event loop definitely do need some more work
>> -- simple things, like if the backend bucket is a socket, how do you
>> tell the event loop that a would-block rvalue maps to a file
>> descriptor talking to an origin server. You don't want to just keep
>> looping over it until it returns data; you want to poll on the origin
>> socket, and only try to read when data is available.
>
> The goal would be that the handler's (aka content generator, aka serf
> bucket) socket would be processed in the same select() as the client
> connections. When the bucket has no more data from the backend, then
> it returns "done for now". Eventually, all network reads/writes
> finalize and control returns to the core loop. If data comes in on the
> backend, then the core opens and that bucket can read/return data.
>
> There are two caveats that I can think of, right off hand:
>
> 1) Each client connection is associated with one bucket generating the
> response. Ideally, you would not bother to read that bucket
> unless/until the client connection is ready for reading. But that
> could create a deadlock internal to the bucket -- *some* data may need
> to be consumed from the backend, processed, and returned to the
> backend to "unstick" the entire flow (think SSL). Even though nothing
> pops out the top of the bucket, internal processing may need to
> happen.
>
> 2) If you have 10,000 client connections, and some number of sockets
> in the system ready for read/write... how do you quickly determine
> *which* buckets to poll to get those sockets processed? You don't want
> to poll idle connections/buckets if only one is ready for read/write.
> (note: there are optimizations around this; if the bucket wants to
> return data, but wasn't asked to, then next-time-around it has the
> same data; no need to drill way down to the source bucket to attempt
> to read network data; tho this kinda sets up a busy loop until that
> bucket's client is ready for writing)
>
> Are either of these the considerations you were thinking of?
>
> I can certainly see some kind of system to associate buckets and the
> sockets that affect their behavior. Though that could get pretty crazy
> since it doesn't have to be a 1:1 mapping. One backend socket might
> actually service multiple buckets, and vice versa.
>
>> I am also concerned about the patterns of sendfile() in the current
>> serf bucket architecture, and making a whole pipeline do sendfile
>> correctly seems quite difficult.
>
> Well... it generally *is* quite difficult in the presence of SSL,
> gzip, and chunking. Invariably, content is mangled before hitting the
> network, so sendfile() rarely gets a chance to play ball.

This brings us straight back to our discussions from the 2000-01
timeframe, when we discussed poll buckets. Pass it up as metadata that
we are stalled on an event (at the socket, ssl, etc.) - sometimes
multiple events (ext_filter blocked and either needs to read more from
the socket, or was blocked on its read, or now has something to write).
Re: Httpd 3.0 or something else
Graham Leggett wrote:
> - Supporting prefork as httpd does now; and

I'm very happy to see prefork die its timely death.

Let's go about working out where out-of-process magic happens. Gated,
single-threaded handlers may be sensible in some cases. But for the core
server it makes async worthless, and supporting both digs us deeper into
the bad-old-days of the 1.3 codebase.
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 17:30, Akins, Brian wrote:
> On 11/10/09 1:56 PM, "Greg Stein" wrote:
>
>> But some buckets might be performing gzip or SSL encryption. That
>> consumes CPU within the network thread.
>
> You could just run x times the number of CPU cores of "network"
> threads. You can't use more than 100% of a CPU anyway.

One of those buckets might (ahem) block on a file read. While it is
doing that, you want to pass control to another bucket.

The buckets should be avoiding indeterminate blocking, like a socket or
a pipe; we've basically stated that a file is okay. If we had async I/O,
then we'd want to disallow that, too. Mutexes/semaphores can be used in
a bucket, as long as they attempt to lock with "nowait" semantics.

> The model that some of us discussed -- Greg, you may have invented it
> ;) -- was to have a small pool of acceptor threads (maybe just one)
> and a pool of "worker" threads. The acceptor threads accept
> connections and move them into worker threads - that's it. A single fd
> is then entirely owned by that worker thread until it (the fd) goes
> away - network/disk io, gzip, ssl, etc.

Those worker threads are what we have today. It means that you have a
1:1 mapping of client connections to threads. That places serious bounds
on your scaling.

I'd like to see a few "network" threads multiplexing all the writing to
clients. Then you have "worker" threads parsing the request and
assembling the response buckets. The resulting buckets might
generate-as-they-go, so the worker thread will complete very quickly. Or
the worker thread could build a response bucket that already has all of
its data, taking a while to do so. It all depends upon the
implementation of the buckets and their construction.

Then take all of *that*, and spread it across several processes for
solid uptime, with a master monitor process.

Cheers,
-g
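[The "lock with nowait semantics" rule above can be sketched like this:
a bucket that needs a shared resource must never park the network
thread, so it uses trylock and reports "nothing for you right now" when
the lock is held. The function and the constants are invented for
illustration; only pthread_mutex_trylock is real.]

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

enum { READ_OK = 0, READ_WOULDBLOCK = 1 };

/* Illustrative bucket read that touches a shared resource.  If the
 * lock is busy, it does NOT block: it returns would-block so the event
 * loop can go service other buckets and retry this one later. */
static int bucket_read_shared(pthread_mutex_t *lk, int *out)
{
    if (pthread_mutex_trylock(lk) == EBUSY)
        return READ_WOULDBLOCK;   /* event loop will come back later */
    *out = 42;                    /* stand-in for real shared data */
    pthread_mutex_unlock(lk);
    return READ_OK;
}
```

Compare this with a plain pthread_mutex_lock, which would stall the
network thread and with it every other connection it is multiplexing.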
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 16:33, Lieven Govaerts wrote:
> On Tue, Nov 10, 2009 at 6:10 PM, Greg Stein wrote:
> ...
>> You have 10k buckets representing the responses for 10k clients. The
>> core loop reads the response from the bucket, and writes that to the
>> network.
>>
>> Now. A client socket wakes up as writable. I think it is pretty easy
>> to say "read THAT bucket" to get data for writing.
>>
>> Consider the scenario where one of those responses is proxied -- it
>> is arriving from a backend origin server. That underlying read-socket
>> is stuffed into the core loop. When that read-socket becomes
>> available for reading, *which* client response bucket do you start
>> reading from? And what happens if the client socket is not writable?
>>
>> You could just zip thru the 10k response buckets and poll each one
>> for data to read, and the serf design states that the underlying
>> read-socket *will* get read. But you've gotta do a lot of polling to
>> get there.
>>
>> I think that will be an interesting problem to solve. I believe it
>> would be something like this:
>>
>> Consider when a request arrives. The core looks at the Request-URI
>> and the Headers. From these inputs, it determines the appropriate
>> response. In this case, that response is identified by a bucket,
>> configured with those inputs. (and somewhere in here, any
>> Request-Body is managed; but ignore that for now) As that response
>> bucket is constructed, along with all interior/nested buckets, that
>> construction can say "I've got an FD here. Please add this to the
>> core loop." The FD would be added, and would then be associated with
>> the response bucket, so we know which to read when the FD wakes up.
>
> Suppose this is the diagram of the proxy scenario, where A and B are
> buckets wrapping the socket bucket:
>
> browser --> (client fd) [core loop] [A [B [socket bucket (server fd)]]] <-- server
>
> If there's an event on the client fd, the core loop can read bytes
> from bucket A - as much as the client socket can handle.

Right, and right.

> But if only the server fd wakes up, the core loop can't really read
> anything, as it has nowhere to forward the data to. The best thing it
> can do is tell bucket A: somewhere deep down there's data to read,
> and considering I (the core loop) was alerted of that fact, there
> must be one of the other buckets B, C... interested in
> buffering/proactively transforming that data, so please forward this
> trigger.

Buckets have a peek() function. Hmm. Theoretically, the bucket is
*empty* of contents, or you would not have returned to the event loop.
Thus, when the peek() rolls around, the bucket is going to figure out
what it can provide without blocking.

But... the buckets were designed for client-side operation. Buckets are
supposed to be emptied completely. That isn't true on the server: the
client socket might not be available for writing, so we don't empty a
response bucket to completion.

It does sound like something more may be needed, in order to propagate
some reading down the stack of buckets. But there is also a worry of:
if we read, then where do we put that, if the network isn't ready for
writing?

These read/status/nesting/etc concepts are done in order to prevent
deadlocks. Ideally, *everything* is read and written to completion. An
appserver might not be able to provide you with more content until you
give it something first. So the trick is to flush all writes, and to
flush all reads (because the latter might signal another write in order
to continue generating content... ad nauseam).

> I don't think the buckets interface already has a function for that,
> but something similar to 'read 0 bytes' would do.
>
> So, did I understand your proposal correctly?

Yes. But we may have some refining to do, as you've raised, and looking
more closely at the flows.

Cheers,
-g
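[Lieven's "read 0 bytes" trigger could look roughly like the sketch
below: wrapper buckets forward a poke() down the chain, and only the
innermost source bucket acts on it, pulling pending bytes off its fd
into a side buffer without returning anything upward. Every name here
is hypothetical; serf's actual bucket API has no poke().]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy bucket: either a wrapper around an interior bucket, or a source
 * bucket (interior == NULL) that buffers data read from a backend fd. */
typedef struct bucket {
    struct bucket *interior;  /* NULL for the source bucket */
    char buf[64];
    size_t buffered;
} bucket_t;

/* The forwarded trigger.  `pending` stands in for whatever bytes the
 * backend fd has ready; a real source would read() the fd instead. */
static void bucket_poke(bucket_t *b, const char *pending)
{
    if (b->interior) {              /* wrapper: just forward the poke */
        bucket_poke(b->interior, pending);
        return;
    }
    /* source: drain the backend into our buffer, capped at capacity */
    size_t n = strlen(pending);
    if (n > sizeof(b->buf) - b->buffered)
        n = sizeof(b->buf) - b->buffered;
    memcpy(b->buf + b->buffered, pending, n);
    b->buffered += n;
}
```

The buffered bytes then satisfy the next regular read() from the top of
the stack, which is the deadlock-avoidance property Greg describes:
backend progress happens even while the client socket is unwritable.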
Re: Httpd 3.0 or something else
On 11/10/09 1:56 PM, "Greg Stein" wrote:

> But some buckets might be performing gzip or SSL encryption. That
> consumes CPU within the network thread.

You could just run x times the number of CPU cores of "network" threads.
You can't use more than 100% of a CPU anyway.

The model that some of us discussed -- Greg, you may have invented it ;)
-- was to have a small pool of acceptor threads (maybe just one) and a
pool of "worker" threads. The acceptor threads accept connections and
move them into worker threads - that's it. A single fd is then entirely
owned by that worker thread until it (the fd) goes away - network/disk
io, gzip, ssl, etc.

--
Brian Akins
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 6:10 PM, Greg Stein wrote:
> On Tue, Nov 10, 2009 at 11:14, Akins, Brian wrote:
>> On 11/9/09 3:08 PM, "Greg Stein" wrote:
>>
>>> 2) If you have 10,000 client connections, and some number of sockets
>>> in the system ready for read/write... how do you quickly determine
>>> *which* buckets to poll to get those sockets processed? You don't
>>> want to poll idle connections/buckets if only one is ready for
>>> read/write.
>>
>> Epoll/kqueue/etc. takes care of that for you.
>
> Sorry. I wasn't clear.
>
> You have 10k buckets representing the responses for 10k clients. The
> core loop reads the response from the bucket, and writes that to the
> network.
>
> Now. A client socket wakes up as writable. I think it is pretty easy
> to say "read THAT bucket" to get data for writing.
>
> Consider the scenario where one of those responses is proxied -- it is
> arriving from a backend origin server. That underlying read-socket is
> stuffed into the core loop. When that read-socket becomes available
> for reading, *which* client response bucket do you start reading from?
> And what happens if the client socket is not writable?
>
> You could just zip thru the 10k response buckets and poll each one for
> data to read, and the serf design states that the underlying
> read-socket *will* get read. But you've gotta do a lot of polling to
> get there.
>
> I think that will be an interesting problem to solve. I believe it
> would be something like this:
>
> Consider when a request arrives. The core looks at the Request-URI and
> the Headers. From these inputs, it determines the appropriate
> response. In this case, that response is identified by a bucket,
> configured with those inputs. (and somewhere in here, any Request-Body
> is managed; but ignore that for now) As that response bucket is
> constructed, along with all interior/nested buckets, that construction
> can say "I've got an FD here. Please add this to the core loop." The
> FD would be added, and would then be associated with the response
> bucket, so we know which to read when the FD wakes up.

Suppose this is the diagram of the proxy scenario, where A and B are
buckets wrapping the socket bucket:

browser --> (client fd) [core loop] [A [B [socket bucket (server fd)]]] <-- server

If there's an event on the client fd, the core loop can read bytes from
bucket A - as much as the client socket can handle.

But if only the server fd wakes up, the core loop can't really read
anything, as it has nowhere to forward the data to. The best thing it
can do is tell bucket A: somewhere deep down there's data to read, and
considering I (the core loop) was alerted of that fact, there must be
one of the other buckets B, C... interested in buffering/proactively
transforming that data, so please forward this trigger.

I don't think the buckets interface already has a function for that,
but something similar to 'read 0 bytes' would do.

So, did I understand your proposal correctly?

Lieven
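[The fd-to-response-bucket association Greg describes is essentially a
lookup table keyed by fd, filled in at bucket-construction time and
consulted when poll reports the fd ready. A minimal sketch, with all
names invented for illustration:]

```c
#include <assert.h>
#include <stddef.h>

#define MAXFD 1024  /* illustrative fd ceiling */

/* fd -> owning response bucket; NULL means the slot is free.
 * (A real core would likely hang this off its pollset instead.) */
static void *fd_owner[MAXFD];

/* Called by a bucket's constructor: "I've got an FD here. Please add
 * this to the core loop" -- and remember which response it feeds. */
static int core_register_fd(int fd, void *response_bucket)
{
    if (fd < 0 || fd >= MAXFD || fd_owner[fd] != NULL)
        return -1;
    fd_owner[fd] = response_bucket;
    return 0;
}

/* Called by the core loop when poll/epoll says fd is readable:
 * returns the response bucket to read, instead of scanning 10k. */
static void *core_lookup_fd(int fd)
{
    return (fd >= 0 && fd < MAXFD) ? fd_owner[fd] : NULL;
}
```

The lookup makes the backend wakeup O(1); without it, the core is back
to polling all 10k response buckets to find the one that can now make
progress.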
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 12:54, Graham Leggett wrote: > Greg Stein wrote: > >>> Who is "you"? >> >> Anybody who reads from a bucket. In this case, the core network loop >> when a client connection is ready for writing. > > So would it be correct to say that in this theoretical httpd, the httpd > core, and nobody else, would read from the serf bucket? Correct. That bucket represents the response to the client, and only the core reads that. >... >> No module *anywhere* ever writes to the network. >> >> The core loop reads/pulls from a bucket when it needs more data (for >> writing to the network). >> >> When your cache bucket reads from its interior bucket, it can also >> drop the content into a file, off to the side. Think of this bucket as >> a filter. All content that is read through it will be dumped into a >> file, too. > > Makes sense, but what happens when the cache has finished reading the > interior bucket after the first pass through the code? If the interior has returned EOF, then the caching bucket can destroy it, if it likes. > At this point, my cache needs to make a decision, and before it can make > that decision it wants to know whether upstream is capable of swallowing > the data right now without blocking. No no... the core only asked for as much as it can handle. You return *no more* than that. It isn't your problem to make blocking decisions for the reader of your bucket. If you read more from the interior than the caller wants from you, then that's your problem :-) You need to hold that in memory, dump it to disk, or ... dunno. > If the answer is yes, I cache the data and pass the data upstream and > wait to be called again immediately, because I know upstream won't block. > > If the answer is no, I *don't* pass data upstream (because it would > block from my perspective), and I read from the interior bucket again, > cache some more, and then ask again whether to pass the two data chunks > upstream. Again: you don't make that decision. 
You just return what the caller asked you for. It may decide to call you again, but that isn't up to you. If you return "this is all I have for you right now", then it won't call you again until some (network) event occurs which may provide more data for reading. If you return EOF, then it shouldn't call you again, tho I believe our rules state that if it *does*, then just return EOF again. > How does my cache get the answer to its question? > > And how does my cache code know when it is safe to read from the > interior bucket without blocking? Buckets *never* block. The interior bucket will give you data saying "I have more", give you data saying "I have no more right now", or say "no more" (EOF). But in no case should it ever block. (note: we do "block" on reading a file, but if we had portable async I/O file operations, then we'd switch to those) >... > I figure there are no better people to explain how serf works than they > who wrote serf ;) Happy to. Unfortunately, we have a dearth of documentation :-( Hopefully, this thread will help to educate several (httpd) developers on the serf model. >... > Imagine big bloated expensive application server, the kind that's > typically built by the lowest bidder. > > Imagine this server is fronted by an httpd reverse proxy. > > Image at the end of the chain, there is a glacially slow (in computing > terms) browser waiting to consume the response. > > A request is processed, and the httpd proxy receives an EOS from the big > bloated application server. Ideally it wants to drop the backend > connection ASAP, no point handing around, but it can't, because the > cleanup for the backend connection is tied to the pool from the request. > And the request pool is only complete when the last byte of the request > has been finally acknowledged by the glacially slow browser. 
> > So httpd, and the big bloated expensive application server, sit around waiting, waiting and waiting with memory allocated, database connections left open, for the browser to finally say "got it, gimme some more" before httpd's event loop goes "that was it, apr_pool_destroy(serf_bucket->pool), next!". Okay. The bucket system is different. We have a somewhat-confusing blend between explicit and region-based freeing. If you're done with a bucket, then kill it. Don't wait for the pool to be cleared. In your above scenario, the reverse-proxy-bucket can kill the socket-bucket once the latter returns EOF, and that will drop the connection. Now... all that said, the above scenario is a bit problematic. If the appserver returns 2G of content to the frontend server, then where does it go? Any type of bucket that reads-to-EOF is going to have to spool its results somewhere (memory or disk). Otherwise, you keep a small-ish read buffer in memory and you stream through the buffer at whatever read-rate your caller is providing (potentially the client browser's speed). >... > I can see us solve this problem simply by making the filter
Re: Httpd 3.0 or something else
Greg Stein wrote: >> Who is "you"? > > Anybody who reads from a bucket. In this case, the core network loop > when a client connection is ready for writing. So would it be correct to say that in this theoretical httpd, the httpd core, and nobody else, would read from the serf bucket? >> Up till now, my understanding is that "you" is the core, and therefore >> not under control of a module writer. >> >> Let me put it another way. Imagine I am a cache module. I want to read >> as much as possible as fast as possible from a backend, and I want to >> write this data to two places simultaneously: the cache, and the >> downstream network. I know the cache is always writable, but the >> downstream network I am not sure of, I only want to write to the >> downstream network when the downstream network is ready for me. >> >> How would I do this in a serf model? > > No module *anywhere* ever writes to the network. > > The core loop reads/pulls from a bucket when it needs more data (for > writing to the network). > > When your cache bucket reads from its interior bucket, it can also > drop the content into a file, off to the side. Think of this bucket as > a filter. All content that is read through it will be dumped into a > file, too. Makes sense, but what happens when the cache has finished reading the interior bucket after the first pass through the code? At this point, my cache needs to make a decision, and before it can make that decision it wants to know whether upstream is capable of swallowing the data right now without blocking. If the answer is yes, I cache the data and pass the data upstream and wait to be called again immediately, because I know upstream won't block. If the answer is no, I *don't* pass data upstream (because it would block from my perspective), and I read from the interior bucket again, cache some more, and then ask again whether to pass the two data chunks upstream. How does my cache get the answer to its question? 
And how does my cache code know when it is safe to read from the interior bucket without blocking? >> That I understand, but it makes no difference as I see it - your loop >> only reads from the bucket and jams it into the client socket if the >> client socket is good and ready to accept data. >> >> If the client socket isn't good and ready, the bucket doesn't get pulled >> from, and resources used by the bucket are left in limbo until the >> client is done. If the bucket wants to do something clever, like cache, >> or release resources early, it can't - because as soon as it returns the >> data it has to wait for the client socket to be good and ready all over >> again. The server runs as slow as the browser, which in computing terms >> is glacially slow. > > I'm not sure that I understand you, and that you're familiar with the > serf bucket model. You are 100% right, I am not completely familiar with the serf bucket model, which is why I'm asking these questions. I figure there are no better people to explain how serf works than they who wrote serf ;) > The bucket can certainly cache data as it flows through. No problem > there. Once the bucket has returned all of its data, it can close its > file handle or socket or whatever resources it may have. > > Buckets are one-time use, so once it has returned all of its data, it > can throw out any resources. > > And no... the server does NOT run as slow as the browser. There are N > browsers connected, and the server is processing ALL of them. One > single response bucket is running as fast as its client, sure, but the > server certainly is not idle. That isn't what I meant. Imagine a big bloated expensive application server, the kind that's typically built by the lowest bidder. Imagine this server is fronted by an httpd reverse proxy. Imagine at the end of the chain, there is a glacially slow (in computing terms) browser waiting to consume the response. 
A request is processed, and the httpd proxy receives an EOS from the big bloated application server. Ideally it wants to drop the backend connection ASAP, no point hanging around, but it can't, because the cleanup for the backend connection is tied to the pool from the request. And the request pool is only complete when the last byte of the request has been finally acknowledged by the glacially slow browser. So httpd, and the big bloated expensive application server, sit around waiting, waiting and waiting with memory allocated, database connections left open, for the browser to finally say "got it, gimme some more" before httpd's event loop goes "that was it, apr_pool_destroy(serf_bucket->pool), next!". And the reason why this happened was that all of this was driven by the core's event loop, timed against the speed of the glacially slow browser. Obviously a second browser next door is being serviced at the same time as you pointed out, but it too waits, waits, waits for that browser to eventually acknowledge the end of the request. This is the reason why peop
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 12:01, Jim Jagielski wrote: > On Nov 9, 2009, at 2:19 PM, Akins, Brian wrote: >> On 11/9/09 2:06 PM, "Greg Stein" wrote: >> >>> These issues are already solved by moving to a Serf core. It is fully >>> asynchronous. >> >> Okay that's one convert, any others? ;) Convert? Bah. Justin and myself *started* serf. I'm rather biased, and have never been a simple convert. Messiah, maybe. ;-) >> That's what Paul and I discussed a lot last week. >> >> My ideal httpd 3.0 is: >> >> Libev + serf + lua > > +1 > > For 3.0, I see us breaking the mold and the API in a pretty > substantial way. +1 and ditto. (tho I think we can provide for old handlers thru the pipe mechanism I described earlier on this thread) Cheers, -g
Re: Httpd 3.0 or something else
On Tue, Nov 10, 2009 at 11:14, Akins, Brian wrote: > On 11/9/09 3:08 PM, "Greg Stein" wrote: > >> 2) If you have 10,000 client connections, and some number of sockets >> in the system ready for read/write... how do you quickly determine >> *which* buckets to poll to get those sockets processed? You don't want >> to poll idle connections/buckets if only one is ready for >> read/write. > > Epoll/kqueue/etc. Takes care of that for you. Sorry. I wasn't clear. You have 10k buckets representing the response for 10k clients. The core loop reads the response from the bucket, and writes that to the network. Now. A client socket wakes up as writable. I think it is pretty easy to say "read THAT bucket" to get data for writing. Consider the scenario where one of those responses is proxied -- it is arriving from a backend origin server. That underlying read-socket is stuffed into the core loop. When that read-socket becomes available for reading, *which* client response bucket do you start reading from? And what happens if the client socket is not writable? You could just zip thru the 10k response buckets and poll each one for data to read, and the serf design states that the underlying read-socket *will* get read. But you've gotta do a lot of polling to get there. I think that will be an interesting problem to solve. I believe it would be something like this: Consider when a request arrives. The core looks at the Request-URI and the Headers. From these inputs, it determines the appropriate response. In this case, that response is identified by a bucket, configured with those inputs. (and somewhere in here, any Request-Body is managed; but ignore that for now) As that response bucket is constructed, along with all interior/nested buckets, that construction can say "I've got an FD here. Please add this to the core loop." The FD would be added, and would then be associated with the response bucket, so we know which to read when the FD wakes up. Cheers, -g
Re: Httpd 3.0 or something else
On Mon, Nov 9, 2009 at 18:47, Graham Leggett wrote: >... >> When you read from a serf bucket, it will return however much you ask >> for, or as much as it has without blocking. When it gives you that >> data, it can say "I have more", "I'm done", or "This is what I had >> without blocking". > > Who is "you"? Anybody who reads from a bucket. In this case, the core network loop when a client connection is ready for writing. > Up till now, my understanding is that "you" is the core, and therefore > not under control of a module writer. > > Let me put it another way. Imagine I am a cache module. I want to read > as much as possible as fast as possible from a backend, and I want to > write this data to two places simultaneously: the cache, and the > downstream network. I know the cache is always writable, but the > downstream network I am not sure of, I only want to write to the > downstream network when the downstream network is ready for me. > > How would I do this in a serf model? No module *anywhere* ever writes to the network. The core loop reads/pulls from a bucket when it needs more data (for writing to the network). When your cache bucket reads from its interior bucket, it can also drop the content into a file, off to the side. Think of this bucket as a filter. All content that is read through it will be dumped into a file, too. >... > That I understand, but it makes no difference as I see it - your loop > only reads from the bucket and jams it into the client socket if the > client socket is good and ready to accept data. > > If the client socket isn't good and ready, the bucket doesn't get pulled > from, and resources used by the bucket are left in limbo until the > client is done. If the bucket wants to do something clever, like cache, > or release resources early, it can't - because as soon as it returns the > data it has to wait for the client socket to be good and ready all over > again. 
The server runs as slow as the browser, which in computing terms > is glacially slow. I'm not sure that I understand you, and that you're familiar with the serf bucket model. The bucket can certainly cache data as it flows through. No problem there. Once the bucket has returned all of its data, it can close its file handle or socket or whatever resources it may have. Buckets are one-time use, so once it has returned all of its data, it can throw out any resources. And no... the server does NOT run as slow as the browser. There are N browsers connected, and the server is processing ALL of them. One single response bucket is running as fast as its client, sure, but the server certainly is not idle. >... > One event loop handling many requests each == event MPM (speed and > resource efficient, but we'd better be bug free). > Many event loops handling many requests each == worker MPM (compromise). > Many event loops handling one request each == prefork (reliable old > workhorse). These have no bearing. The current MPM model is based on content-generators writing/pushing data into the network. A serf-based model reads from content-generators. > In theory if we turn the content handler into a filter and bootstrap the > filter stack with a bucket of some kind, this may work. > > In fact, using both "push" and "pull" at the same time might also make > some sense - your event loop creates a bucket from which data is > "pulled" (serf model), which is in turn "pulled" by a filter stack > (existing filter stack model) and "pushed" upstream. That is NOT the design that myself, Paul, and Justin envision. The core is serf. So *everything* is read/pull-based. The old-style handlers and filters get their own thread and push into a pipe, or an in-memory data queue. The core loop uses a bucket which reads out of that pipe. >... Cheers, -g
Re: Httpd 3.0 or something else
On Nov 9, 2009, at 2:19 PM, Akins, Brian wrote: > On 11/9/09 2:06 PM, "Greg Stein" wrote: > >> These issues are already solved by moving to a Serf core. It is fully >> asynchronous. > > Okay that's one convert, any others? ;) > I said the same thing back on the 4th ;) > That's what Paul and I discussed a lot last week. > > My ideal httpd 3.0 is: > > Libev + serf + lua +1 For 3.0, I see us breaking the mold and the API in a pretty substantial way.
Re: Httpd 3.0 or something else
Greg Stein wrote: >> I am also concerned about the patterns of sendfile() in the current >> serf bucket architecture, and making a whole pipeline do sendfile >> correctly seems quite difficult. > > Well... it generally *is* quite difficult in the presence of SSL, > gzip, and chunking. Invariably, content is mangled before hitting the > network, so sendfile() rarely gets a chance to play ball. Not necessarily - a sensible cache that writes an interim response to disk should ideally replace the current in-memory response with a sendfile-capable file bucket. Having done whatever filtering magic is required, the server just goes "here kernel, give this file to the network, I'm off to serve the next request, bye". Regards, Graham --
Re: Httpd 3.0 or something else
Paul Querna wrote: > But Serf Buckets and the event loop definitely do need some more work > -- simple things, like if the backend bucket is a socket, how do you > tell the event loop that a would-block return value maps to a file > descriptor talking to an origin server. You don't want to just keep > looping over it until it returns data, you want to poll on the origin > socket, and only try to read when data is available. I think it can probably be generally stated that every request processed by the server has N descriptors associated with that request (instead of 1 descriptor, in the current code). In the case of a simple file transfer, there are two descriptors, one belonging to the file, the other belonging to the network socket. In the case of a proxy, one socket belongs to the backend connection, and the other belongs to a frontend network socket. And descriptors might need to be polled for read, or for write, or both (SSL). If a mechanism existed whereby all descriptors associated with a request could be given to the event loop, we could be completely asynchronous throughout the server, from the reading from the backend, to the writing to the frontend. Regards, Graham --
Re: Httpd 3.0 or something else
On 11/9/09 3:08 PM, "Greg Stein" wrote: > 2) If you have 10,000 client connections, and some number of sockets > in the system ready for read/write... how do you quickly determine > *which* buckets to poll to get those sockets processed? You don't want > to poll idle connections/buckets if only one is ready for > read/write. Epoll/kqueue/etc. Takes care of that for you. -- Brian Akins
Re: Httpd 3.0 or something else
On Mon, 9 Nov 2009, Graham Leggett wrote: > Akins, Brian wrote: >> FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I >> think this is what perlbal does as well. Same can be done outside apache >> using X-sendfile-like methods. Seems like we could move this "inside" >> apache fairly easily. Maybe we can do it with a filter. I tried once and got it >> to filter "most" backend stuff to a temp file, but it tended to miss and >> block. That was a while ago, but I haven't learned any more about the >> filters since then to think it would work any better. >> >> Maybe a mod_buffer that goes to a file? > mod_disk_cache can be made to do this quite trivially (it's on the list of > things to do When I Have Time(TM)). In theory, a mod_disk_buffer could do > this quite easily, on condition upstream writes didn't block. I'm guessing that this would be the good-looking implementation of my ugly-but-working making-disk-cache-work-for-large-files patchset (version for 2.2.9 at https://issues.apache.org/bugzilla/show_bug.cgi?id=39380, I'm in the process of respinning it for 2.2.14 but ENOTIME makes testing slow). The main issue I had when cobbling that together was to deal with the fact that stuff wants to block, and it really isn't obvious in the current httpd core how to do this nicely when you have a one-to-many situation. As you might remember, I "solved" it by spawning a thread to deal with caching files in the background when needed. Since our usecase is delivering static files it works, but it sure would be nice to have an infrastructure that tried to help you instead of being damn near hostile at times. /Nikke -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | ni...@acc.umu.se --- Quantum Trek: Time travel with a twist! =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Re: Httpd 3.0 or something else
Greg Stein wrote: >> How is "pull" different from "push"[1]? > > The network loop pulls data from the content-generator. > > Apache 1.x and 2.x had a handler that pushed data at the network. > There is no loop, of course, since each worker had direct control of > the socket to push data into. As I said in [1], apart from the obvious ;) >> Pull, by definition, is blocking behaviour. > > You may want to check your definitions. > > When you read from a serf bucket, it will return however much you ask > for, or as much as it has without blocking. When it gives you that > data, it can say "I have more", "I'm done", or "This is what I had > without blocking". Who is "you"? Up till now, my understanding is that "you" is the core, and therefore not under control of a module writer. Let me put it another way. Imagine I am a cache module. I want to read as much as possible as fast as possible from a backend, and I want to write this data to two places simultaneously: the cache, and the downstream network. I know the cache is always writable, but the downstream network I am not sure of, I only want to write to the downstream network when the downstream network is ready for me. How would I do this in a serf model? >> You will only run as often as you are pulled, and never more often. And >> if the pull is controlled by how quickly the client is accepting the >> data, which is typically orders of magnitude slower than the backend can >> push, you have no opportunity to try to speed up the server in any way. > > Eh? Are you kidding me? > > One single network thread can manage N client connections. As each > becomes writable, the loop reads ("pulls") from the bucket and jams it > into the client socket. If you're really fancy, then you know what the > window is, and you ask the bucket for that much data. 
That I understand, but it makes no difference as I see it - your loop only reads from the bucket and jams it into the client socket if the client socket is good and ready to accept data. If the client socket isn't good and ready, the bucket doesn't get pulled from, and resources used by the bucket are left in limbo until the client is done. If the bucket wants to do something clever, like cache, or release resources early, it can't - because as soon as it returns the data it has to wait for the client socket to be good and ready all over again. The server runs as slow as the browser, which in computing terms is glacially slow. >> Push, however, gives you a choice: the push either worked (yay! go >> browser!), or it didn't (sensible alternative behaviour, like cache it >> for later in a connection filter). Push happens as fast as the backend, >> not as slow as the frontend. > > Push means that you have a worker per connection, pushing the response > onto the network. I really would like to see us get away from a worker > per connection. Only if you write it that way (which we have done till now). There is no reason why one event loop can't handle many requests at the same time. One event loop handling many requests each == event MPM (speed and resource efficient, but we'd better be bug free). Many event loops handling many requests each == worker MPM (compromise). Many event loops handling one request each == prefork (reliable old workhorse). In theory if we turn the content handler into a filter and bootstrap the filter stack with a bucket of some kind, this may work. In fact, using both "push" and "pull" at the same time might also make some sense - your event loop creates a bucket from which data is "pulled" (serf model), which is in turn "pulled" by a filter stack (existing filter stack model) and "pushed" upstream. Functions that work better as a "pull" (proxy and friends) can be pulled, functions that work better as a "push" (like caching) can be filters. 
Regards, Graham --
Re: Httpd 3.0 or something else
On Mon, Nov 9, 2009 at 16:19, Graham Leggett wrote: > Greg Stein wrote: >> These issues are already solved by moving to a Serf core. It is fully >> asynchronous. >> >> Backend handlers will no longer "push" bits towards the network. The >> core will "pull" them from a bucket. *Which* bucket is defined by a >> {URL,Headers}->Bucket mapping system. > > How is "pull" different from "push"[1]? The network loop pulls data from the content-generator. Apache 1.x and 2.x had a handler that pushed data at the network. There is no loop, of course, since each worker had direct control of the socket to push data into. > Pull, by definition, is blocking behaviour. You may want to check your definitions. When you read from a serf bucket, it will return however much you ask for, or as much as it has without blocking. When it gives you that data, it can say "I have more", "I'm done", or "This is what I had without blocking". > You will only run as often as you are pulled, and never more often. And > if the pull is controlled by how quickly the client is accepting the > data, which is typically orders of magnitude slower than the backend can > push, you have no opportunity to try to speed up the server in any way. Eh? Are you kidding me? One single network thread can manage N client connections. As each becomes writable, the loop reads ("pulls") from the bucket and jams it into the client socket. If you're really fancy, then you know what the window is, and you ask the bucket for that much data. > Push, however, gives you a choice: the push either worked (yay! go > browser!), or it didn't (sensible alternative behaviour, like cache it > for later in a connection filter). Push happens as fast as the backend, > not as slow as the frontend. Push means that you have a worker per connection, pushing the response onto the network. I really would like to see us get away from a worker per connection. 
Once a worker thread determines which bucket to create/build, then it passes it along to the network thread, and returns for more work. The network thread can then manage N connections with their associated response buckets. If one network thread cannot read/generate the content fast enough, then you use multiple threads to keep the connections full. Then you want to add in a bit of control around reading of requests in order to manage the backlog of responses (and any potential memory buildup that entails). If the network thread is consuming 100M and 20k sockets, you may want to stop accepting connections or accept but read them slowly until the pressure eases. etc... Cheers, -g
Re: Httpd 3.0 or something else
Greg Stein wrote: > These issues are already solved by moving to a Serf core. It is fully > asynchronous. > > Backend handlers will no longer "push" bits towards the network. The > core will "pull" them from a bucket. *Which* bucket is defined by a > {URL,Headers}->Bucket mapping system. How is "pull" different from "push"[1]? Pull, by definition, is blocking behaviour. You will only run as often as you are pulled, and never more often. And if the pull is controlled by how quickly the client is accepting the data, which is typically orders of magnitude slower than the backend can push, you have no opportunity to try to speed up the server in any way. Push, however, gives you a choice: the push either worked (yay! go browser!), or it didn't (sensible alternative behaviour, like cache it for later in a connection filter). Push happens as fast as the backend, not as slow as the frontend. So far I'm not convinced it is a step forward, will have to think about it more. [1] Apart from the obvious. Regards, Graham --
Re: Httpd 3.0 or something else
Akins, Brian wrote: > What we discussed, some on list and some at ApacheCon, was having a really > good and simple process manager. mod_fcgid is too much work to configure for > mere mortals. If we just had something like: > AssociateExternal .php /path/to/my/php-cgi Sounds interesting. Any notes from ApacheCon or otherwise on that discussion? -- Nick Kew
Re: Httpd 3.0 or something else
On Mon, Nov 9, 2009 at 14:21, Paul Querna wrote: >... > I agree in general, a serf-based core does give us a good start. > > But Serf Buckets and the event loop definitely do need some more work > -- simple things, like if the backend bucket is a socket, how do you > tell the event loop that a would-block return value maps to a file > descriptor talking to an origin server. You don't want to just keep > looping over it until it returns data, you want to poll on the origin > socket, and only try to read when data is available. The goal would be that the handler's (aka content generator, aka serf bucket) socket would be processed in the same select() as the client connections. When the bucket has no more data from the backend, then it returns "done for now". Eventually, all network reads/writes finalize and control returns to the core loop. If data comes in from the backend, then the core wakes up and that bucket can read/return data. There are two caveats that I can think of, right off hand: 1) Each client connection is associated with one bucket generating the response. Ideally, you would not bother to read that bucket unless/until the client connection is ready for writing. But that could create a deadlock internal to the bucket -- *some* data may need to be consumed from the backend, processed, and returned to the backend to "unstick" the entire flow (think SSL). Even though nothing pops out the top of the bucket, internal processing may need to happen. 2) If you have 10,000 client connections, and some number of sockets in the system ready for read/write... how do you quickly determine *which* buckets to poll to get those sockets processed? You don't want to poll idle connections/buckets if only one is ready for read/write. 
(note: there are optimizations around this; if the bucket wants to return data, but wasn't asked to, then next-time-around it has the same data; no need to drill way down to the source bucket to attempt to read network data; tho this kinda sets up a busy loop until that bucket's client is ready for writing) Are either of these the considerations you were thinking of? I can certainly see some kind of system to associate buckets and the sockets that affect their behavior. Though that could get pretty crazy since it doesn't have to be a 1:1 mapping. One backend socket might actually service multiple buckets, and vice-versa. > I am also concerned about the patterns of sendfile() in the current > serf bucket architecture, and making a whole pipeline do sendfile > correctly seems quite difficult. Well... it generally *is* quite difficult in the presence of SSL, gzip, and chunking. Invariably, content is mangled before hitting the network, so sendfile() rarely gets a chance to play ball. But if you really are just dealing with plain files (maybe prezipped), then the read_for_sendfile() should be workable. Most buckets can't do squat with it, and should just use a default function. But the file bucket can return a proper handle. (and it is entirely possible/reasonable that the signature should be adjusted to simplify the process) Cheers, -g
Re: Httpd 3.0 or something else
On Mon, Nov 9, 2009 at 11:06 AM, Greg Stein wrote: > On Mon, Nov 9, 2009 at 13:59, Graham Leggett wrote: >> Akins, Brian wrote: >> > It works really well for proxy. Aka "static data" :) >>> >>> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl >>> stuff, etc (Full disclosure, I wrote the horrid perl stuff.) >> >> Doesn't matter, once httpd proxy gets hold of it, it's just shifting >> static bits. >> >> Something I want to teach httpd to do is buffer up data for output, and >> then forget about the output to focus on releasing the backend resources >> ASAP, ready for the next request when it (eventually) comes. The fact >> that network writes block makes this painful to achieve. >> >> Proxy had an optimisation that released proxied backend resources when >> it detected EOS from the backend but before attempting to pass it to the >> frontend, but someone refactored that away at some point. It would be >> good if such an optimisation was available server wide. >> >> I want to be able to write something to the filter stack, and get an >> EWOULDBLOCK (or similar) back if it isn't ready. I could then make >> intelligent decisions based on this. For example, if I were a cache, I >> would carry on reading from the backend and writing the data to the >> cache, while the frontend was saying "not now, slow browser ahead". I >> could have long since finished caching and closed the backend connection >> and freed the resources, before the frontend returned "cool, ready for >> you now", at which point I answer "no worries, have the cached content I >> prepared earlier". > > These issues are already solved by moving to a Serf core. It is fully > asynchronous. > > Backend handlers will no longer "push" bits towards the network. The > core will "pull" them from a bucket. *Which* bucket is defined by a > {URL,Headers}->Bucket mapping system. I was talking to Aaron about this at ApacheCon. I agree in general, a serf-based core does give us a good start. 
But Serf Buckets and the event loop definitely do need some more work -- simple things, like if the backend bucket is a socket, how do you tell the event loop that a would-block return value maps to a file descriptor talking to an origin server. You don't want to just keep looping over it until it returns data, you want to poll on the origin socket, and only try to read when data is available. I am also concerned about the patterns of sendfile() in the current serf bucket architecture, and making a whole pipeline do sendfile correctly seems quite difficult. -Paul
Re: Httpd 3.0 or something else
On 11/9/09 2:06 PM, "Greg Stein" wrote: > These issues are already solved by moving to a Serf core. It is fully > asynchronous. Okay that's one convert, any others? ;) That's what Paul and I discussed a lot last week. My ideal httpd 3.0 is: Libev + serf + lua -- Brian Akins
Re: Httpd 3.0 or something else
Akins, Brian wrote: > FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I > think this is what perlbal does as well. Same can be done outside apache > using X-sendfile-like methods. Seems like we could move this "inside" > apache fairly easily. Maybe we can do it with a filter. I tried once and got it > to filter "most" backend stuff to a temp file, but it tended to miss and > block. That was a while ago, but I haven't learned any more about the > filters since then to think it would work any better. > > Maybe a mod_buffer that goes to a file? mod_disk_cache can be made to do this quite trivially (it's on the list of things to do When I Have Time(TM)). In theory, a mod_disk_buffer could do this quite easily, on condition upstream writes didn't block. Regards, Graham --
Re: Httpd 3.0 or something else
On Mon, Nov 9, 2009 at 13:59, Graham Leggett wrote: > Akins, Brian wrote: >>>> It works really well for proxy. >>> Aka "static data" :) >> >> Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl >> stuff, etc (Full disclosure, I wrote the horrid perl stuff.) > > Doesn't matter, once httpd proxy gets hold of it, it's just shifting > static bits. > > Something I want to teach httpd to do is buffer up data for output, and > then forget about the output to focus on releasing the backend resources > ASAP, ready for the next request when it (eventually) comes. The fact > that network writes block makes this painful to achieve. > > Proxy had an optimisation that released proxied backend resources when > it detected EOS from the backend but before attempting to pass it to the > frontend, but someone refactored that away at some point. It would be > good if such an optimisation was available server wide. > > I want to be able to write something to the filter stack, and get an > EWOULDBLOCK (or similar) back if it isn't ready. I could then make > intelligent decisions based on this. For example, if I were a cache, I > would carry on reading from the backend and writing the data to the > cache, while the frontend was saying "not now, slow browser ahead". I > could have long since finished caching and closed the backend connection > and freed the resources, before the frontend returned "cool, ready for > you now", at which point I answer "no worries, have the cached content I > prepared earlier". These issues are already solved by moving to a Serf core. It is fully asynchronous. Backend handlers will no longer "push" bits towards the network. The core will "pull" them from a bucket. *Which* bucket is defined by a {URL,Headers}->Bucket mapping system. Cheers, -g
Re: Httpd 3.0 or something else
On 11/9/09 1:59 PM, "Graham Leggett" wrote: > Doesn't matter, once httpd proxy gets hold of it, it's just shifting > static bits. True. > Something I want to teach httpd to do is buffer up data for output, and > then forget about the output to focus on releasing the backend resources > ASAP, ready for the next request when it (eventually) comes. The fact > that network writes block makes this painful to achieve. FWIW, nginx "buffers" backend stuff to a file, then sendfiles it out - I think this is what perlbal does as well. Same can be done outside apache using X-sendfile like methods. Seems like we could move this "inside" apache fairly easily. Maybe we can do it with a filter. I tried once and got it to filter "most" backend stuff to a temp file, but it tended to miss and block. That was a while ago, but I haven't learned any more about the filters since then, so I don't think it would work any better. Maybe a mod_buffer that goes to a file? Also, all these temp files are normally in tmpfs for us. -- Brian Akins
Re: Httpd 3.0 or something else
Akins, Brian wrote: >>> It works really well for proxy. >> Aka "static data" :) > > Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl > stuff, etc (Full disclosure, I wrote the horrid perl stuff.) Doesn't matter, once httpd proxy gets hold of it, it's just shifting static bits. Something I want to teach httpd to do is buffer up data for output, and then forget about the output to focus on releasing the backend resources ASAP, ready for the next request when it (eventually) comes. The fact that network writes block makes this painful to achieve. Proxy had an optimisation that released proxied backend resources when it detected EOS from the backend but before attempting to pass it to the frontend, but someone refactored that away at some point. It would be good if such an optimisation was available server wide. I want to be able to write something to the filter stack, and get an EWOULDBLOCK (or similar) back if it isn't ready. I could then make intelligent decisions based on this. For example, if I were a cache, I would carry on reading from the backend and writing the data to the cache, while the frontend was saying "not now, slow browser ahead". I could have long since finished caching and closed the backend connection and freed the resources, before the frontend returned "cool, ready for you now", at which point I answer "no worries, have the cached content I prepared earlier". Regards, Graham --
Re: Httpd 3.0 or something else
On 11/9/09 1:40 PM, "Brian Akins" wrote: > On 11/9/09 1:36 PM, "Graham Leggett" wrote: > >>> It works really well for proxy. >> >> Aka "static data" :) > > Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl > stuff, etc (Full disclosure, I wrote the horrid perl stuff.) Replying to my own post: What we discussed, some on list and some at ApacheCon, was having a really good and simple process manager. mod_fcgid is too much work to configure for mere mortals. If we just had something like: AssociateExternal .php /path/to/my/php-cgi And it did the sensible thing (whether fcgi, http, wsgi, etc.) then all the "config" is in one place. Obviously, we could have some "advanced" process management directives. If your app needed some special config stuff, we could easily pass it across somehow. -- Brian Akins
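To make the proposal concrete, a configuration along these lines is what Brian seems to have in mind. Note that `AssociateExternal` and the `<External>` container are hypothetical directives from this thread, not anything that exists in httpd:

```apache
# Hypothetical sketch only -- AssociateExternal is the proposed
# directive, not an existing httpd one.
AssociateExternal .php /usr/bin/php-cgi
AssociateExternal .py  /usr/bin/wsgi-runner

# "Advanced" process-management knobs might then layer on top:
<External /usr/bin/php-cgi>
    MinProcesses 2
    MaxProcesses 16
    IdleTimeout  30
</External>
```

The appeal is that the server, not the admin, decides whether to speak FastCGI, HTTP, or something else to the managed process.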
Re: Httpd 3.0 or something else
On 11/9/09 1:36 PM, "Graham Leggett" wrote: >> It works really well for proxy. > > Aka "static data" :) Nah, we proxy to fastcgi php stuff, http java stuff, some horrid HTTP perl stuff, etc (Full disclosure, I wrote the horrid perl stuff.) -- Brian Akins
Re: Httpd 3.0 or something else
Akins, Brian wrote: >> and we know >> from the same period of experience from others that a pure event driven >> model is useful for shipping static data and not much further. > > It works really well for proxy. Aka "static data" :) The key advantage to doing both prefork and event behaviour in the same server is that operationally, it is one beast to feed and care for. You might deploy them differently in different environments, but it is one set of skills to manage. Regards, Graham --
Re: Httpd 3.0 or something else
On 11/9/09 1:18 PM, "Graham Leggett" wrote: > and we know > from the same period of experience from others that a pure event driven > model is useful for shipping static data and not much further. It works really well for proxy. -- Brian Akins
Re: Httpd 3.0 or something else
Akins, Brian wrote: >> This gives us the option of prefork reliability, and event driven speed, >> as required by the admin. > > I think if we try to do both, we will wind up with the worst of both worlds. > (Or is it worse??) Blocking/buggy "modules" should be run out of process > (FastCGI/HTTP/Thrift). That is exactly what prefork means - to run something out of process, so that it can leak and crash at will. I disagree we'll end up with the worst of both worlds. A lot of head banging in the cache code has been caused because we are doing blocking reads and blocking writes on the filter stacks. When I say "be asynchronous" I mean use non-blocking reads and writes everywhere, in prefork, worker and event alike. We know from 15+ years of experience that prefork works, and we know from the same period of experience from others that a pure event driven model is useful for shipping static data and not much further. But some people have a need to just ship static data, and there is no reason why httpd and an event MPM can't do that job well too. Regards, Graham --
Re: Httpd 3.0 or something else
On 11/9/09 12:52 PM, "Graham Leggett" wrote: > This gives us the option of prefork reliability, and event driven speed, > as required by the admin. I think if we try to do both, we will wind up with the worst of both worlds. (Or is it worse??) Blocking/buggy "modules" should be run out of process (FastCGI/HTTP/Thrift). -- Brian Akins
Re: Httpd 3.0 or something else
Akins, Brian wrote: > FWIW, nginx delivers on its performance promises, but is a horrible hairball > of code (my opinion). We (httpd-dev type folks) could do much better - if > we just would. (Easy for the guy with no time to say, I know...) I think it is entirely reasonable for the httpd v3.0 codebase to do this as a goal: - Be asynchronous throughout; while - Supporting prefork as httpd does now; and - Allow variable levels of event-driven-ness in between. This gives us the option of prefork reliability, and event driven speed, as required by the admin. Regards, Graham --
Re: Httpd 3.0 or something else
On 11/9/09 12:32 AM, "Brian McCallister" wrote: > A 3.0, a fundamental architectural shift, would be interesting to > discuss, I am not sure there is a ton of value in it, though, to be > honest. So I should continue to investigate nginx? ;) FWIW, nginx delivers on its performance promises, but is a horrible hairball of code (my opinion). We (httpd-dev type folks) could do much better - if we just would. (Easy for the guy with no time to say, I know...) -- Brian Akins
Re: Httpd 3.0 or something else
On Wed, Nov 4, 2009 at 10:26 AM, Akins, Brian wrote: > So, after several conversations at Apachecon and on the list, we still have > no real "vision" of how we want to move ahead with httpd "3.0." Or, if we > do, it's not communicated very well. > > Some have suggested we just create a new server project. Others want to keep > hacking away at the current code base. > > Thoughts? I see no reason to call what we have been working on anything other than 2.4. A 3.0, a fundamental architectural shift, would be interesting to discuss, I am not sure there is a ton of value in it, though, to be honest.
Re: Httpd 3.0 or something else
On 06/11/09 20:07, Jim Jagielski wrote: >> I'd like us to remove the entire forwarding proxy stuff for example. > So we have mod_forward_proxy and mod_reverse_proxy? Interesting take. Would make some sense to make mod_proxy a top-level "framework" and forward/reverse submodules. I'd like us to clear up the current dependency and API mess, e.g. common code depends on the balancer (which was a dirty hack I did so we could set up the workers). This would obviously require some decent shared memory code that would allow dynamic config instead of stealing space from the scoreboard. I think that shared memory rewrite was one of the major topics in the 'Amsterdam' discussion a few years back. Regards -- ^TM
Re: Httpd 3.0 or something else
On a phone, so pls excuse my brevity... I think a lot of your discussion can be easily passed off to Apache Thrift. Let it handle all the message passing to external processes, with its provided multi-language support. On Nov 5, 2009 4:31 PM, "Graham Dumpleton" wrote: 2009/11/5 Graham Leggett : > Jim Jagielski wrote: > >> Let's get 2.4 out. And then let's rip it to shreds and drop >> buckets/b... Sorry, long post but it was inevitable that I was going to air all this at some point. Now seems as good a time as any. I'd like to see a more radical architecture change, one that recognises that it isn't just about serving static files any more and provides much better builtin support for safe hosting of content generating web applications constructed using alternate languages. Before anyone jumps to the conclusion that I want to start seeing even more heavy weight applications being run direct in the Apache server child processes that accept initial requests, know that I don't want that and that I actually want to promote a model which is the opposite and which would encourage people not to do that. As a first step, like Jim I would like to see the current Apache server child processes (workers) being asynchronous. In addition to that though, I would like to see as part of core Apache, and running in the parent process, a means for spawning and monitoring of distinct processes outside of the set of worker processes. There is currently support in APR and in part in Apache for 'other' processes via 'apr_proc_other_child_???()' functions, but this is quite basic and you still, to a large degree, need to roll your own management routines around that for (re)spawning etc. As a result, you see modules such as mod_cgid, mod_fastcgi, mod_fcgid, mod_wsgi all having their own process management code for managing either their daemon processes and/or manager process.
Technically one could implement this as a distinct module called mod_procd which had an API which could be utilised by other modules and stop duplication of all this stuff, but perhaps it needs to go a step further than that as far as being integrated into core. This is because at present any 'other' processes are dealt with rather harshly on graceful restarts because they are still simply killed off after a few seconds if they don't shut down. Being able to extend graceful restart semantics into other processes may be worthwhile for some applications. The next thing I want to see is for the whole FASTCGI type ecosystem to be revisited and for a better version of this concept for hosting web applications in disparate languages to be developed which modernises it and brings it in as a core feature of Apache. The intent here being to simplify the task for implementers as well as those who wish to deploy applications. An important part of this would be to switch away from the interface being a socket protocol. Instead, let the web server control both halves of the communication channel between the Apache worker process and the application daemon process. What would replace the socket protocol as interface would be a C API, and instead of the application having to implement the socket protocol as a foreign process, specific language support would be provided by way of a dynamically loaded plugin. That plugin would then use embedding to access support for a particular language and just execute code in the file that the enclosing code of the web server system told it to execute. By way of example, imagine languages such as Python, Perl or Ruby which in turn now have simplified web server interfaces in the form of WSGI, PSGI and RACK, or even PHP. In the Apache configuration one would simply say that a specific file extension is implemented by a specific named language plugin.
One would also indicate that a separate manager process should be started up for managing processes for handling any requests for that language. Only after that separate manager process had been spawned, be it by just a straight fork or preferably fork/exec, would the specific language plugin be loaded. This eliminates the problems caused by complex language modules being preloaded into the Apache parent process and causing conflicts with other languages. The existing mod_php module is a good example for causing lots of problems because of it dragging in libraries which aren't multithread safe. That manager process would then spawn its own language specific worker processes as configured for handling actual requests. When the main asynchronous Apache worker processes receive a request and determine that the target resource file is related to a specific language, they then determine how to connect to those language specific worker processes and proxy the request to them for handling. On the language worker process side, the web server part of the code in that process receives the proxied request and then calls into the plugin code to have the request handled against the target file. Because most language solutions for web
Re: Httpd 3.0 or something else
On 11/5/09 4:30 PM, "Graham Dumpleton" wrote: > Thoughts? Still digesting, but generally +1 to the entire post. -- Brian Akins
Re: Httpd 3.0 or something else
2009/11/5 Graham Leggett : > Jim Jagielski wrote: > >> Let's get 2.4 out. And then let's rip it to shreds and drop >> buckets/brigades and fold in serf. > > I think we should decide on exactly what problem we're trying to solve, > before we start thinking about how it is to be solved. > > I'm keen to teach httpd v3.0 to work asynchronously throughout - still > maintaining the prefork behaviour as a sensible default[1], but being > asynchronous and non blocking throughout. > > [1] The fact that dodgy module code can leak, crash and be otherwise > unsociable, and yet the server remains functional, is one of the key > reasons why httpd still endures. Sorry, long post but it was inevitable that I was going to air all this at some point. Now seems as good a time as any. I'd like to see a more radical architecture change, one that recognises that it isn't just about serving static files any more and provides much better builtin support for safe hosting of content generating web applications constructed using alternate languages. Before anyone jumps to the conclusion that I want to start seeing even more heavy weight applications being run direct in the Apache server child processes that accept initial requests, know that I don't want that and that I actually want to promote a model which is the opposite and which would encourage people not to do that. As a first step, like Jim I would like to see the current Apache server child processes (workers) being asynchronous. In addition to that though, I would like to see as part of core Apache, and running in the parent process, a means for spawning and monitoring of distinct processes outside of the set of worker processes. There is currently support in APR and in part in Apache for 'other' processes via 'apr_proc_other_child_???()' functions, but this is quite basic and you still, to a large degree, need to roll your own management routines around that for (re)spawning etc.
As a result, you see modules such as mod_cgid, mod_fastcgi, mod_fcgid, mod_wsgi all having their own process management code for managing either their daemon processes and/or manager process. Technically one could implement this as a distinct module called mod_procd which had an API which could be utilised by other modules and stop duplication of all this stuff, but perhaps it needs to go a step further than that as far as being integrated into core. This is because at present any 'other' processes are dealt with rather harshly on graceful restarts because they are still simply killed off after a few seconds if they don't shut down. Being able to extend graceful restart semantics into other processes may be worthwhile for some applications. The next thing I want to see is for the whole FASTCGI type ecosystem to be revisited and for a better version of this concept for hosting web applications in disparate languages to be developed which modernises it and brings it in as a core feature of Apache. The intent here being to simplify the task for implementers as well as those who wish to deploy applications. An important part of this would be to switch away from the interface being a socket protocol. Instead, let the web server control both halves of the communication channel between the Apache worker process and the application daemon process. What would replace the socket protocol as interface would be a C API, and instead of the application having to implement the socket protocol as a foreign process, specific language support would be provided by way of a dynamically loaded plugin. That plugin would then use embedding to access support for a particular language and just execute code in the file that the enclosing code of the web server system told it to execute. By way of example, imagine languages such as Python, Perl or Ruby which in turn now have simplified web server interfaces in the form of WSGI, PSGI and RACK, or even PHP.
In the Apache configuration one would simply say that a specific file extension is implemented by a specific named language plugin. One would also indicate that a separate manager process should be started up for managing processes for handling any requests for that language. Only after that separate manager process had been spawned, be it by just a straight fork or preferably fork/exec, would the specific language plugin be loaded. This eliminates the problems caused by complex language modules being preloaded into the Apache parent process and causing conflicts with other languages. The existing mod_php module is a good example for causing lots of problems because of it dragging in libraries which aren't multithread safe. That manager process would then spawn its own language specific worker processes as configured for handling actual requests. When the main asynchronous Apache worker processes receive a request and determine that the target resource file is related to a specific language, they then determine how to connect to those language specific worker processes and proxy the request to them for
Re: Httpd 3.0 or something else
On 05/11/09 12:38, Graham Leggett wrote: >> Let's get 2.4 out. And then let's rip it to shreds and drop >> buckets/brigades and fold in serf. > I think we should decide on exactly what problem we're trying to solve, > before we start thinking about how it is to be solved. +1 I'd like us to remove the entire forwarding proxy stuff, for example. There are also a few other things that simply don't fit inside 'that web server thing', though. Others might simply have different ideas. So IMHO we should define what we wanna do first. Regards -- ^TM
Re: Httpd 3.0 or something else
How about support of OpenMP? Regards, Jie
Re: Httpd 3.0 or something else
On Thu, 2009-11-05 at 13:38 +0200, Graham Leggett wrote: > I'm keen to teach httpd v3.0 to work asynchronously throughout - still > maintaining the prefork behaviour as a sensible default[1], but being > asynchronous and non blocking throughout. > > [1] The fact that dodgy module code can leak, crash and be otherwise > unsociable, and yet the server remains functional, is one of the key > reasons why httpd still endures. +1 To see that the concept is not outdated, we just need to look at Google's Chrome. -- Bojan
Re: Httpd 3.0 or something else
I'm with Jim: head for 2.4 first. IIRC there was some talk about moving to a 'd' project, since httpd now does ftp (mod_ftp), echo, pop3,... and some other protocols. I don't remember much from it though. I did like the idea back then, but that's about the only thing I remember from it. Maybe we could also poll the user base? I know there was a restart of the debate about the current conf and lua/perl/whatever not so long ago, so maybe these are all things to look into again for a 3.0? Just my .2 cents /me off to studying windows 2008 server -_- ~Jorge On Wed, Nov 4, 2009 at 8:30 PM, Jim Jagielski wrote: > Let's get 2.4 out. And then let's rip it to shreds and drop > buckets/brigades and fold in serf. > > On Nov 4, 2009, at 1:26 PM, Akins, Brian wrote: > >> So, after several conversations at Apachecon and on the list, we still >> have >> no real "vision" of how we want to move ahead with httpd "3.0." Or, if we >> do, it's not communicated very well. >> >> Some have suggested we just create a new server project. Others want to >> keep >> hacking away at the current code base. >> >> Thoughts? >> >> -- >> Brian Akins >> > >
Re: Httpd 3.0 or something else
Let's get 2.4 out. And then let's rip it to shreds and drop buckets/brigades and fold in serf. On Nov 4, 2009, at 1:26 PM, Akins, Brian wrote: So, after several conversations at Apachecon and on the list, we still have no real "vision" of how we want to move ahead with httpd "3.0." Or, if we do, it's not communicated very well. Some have suggested we just create a new server project. Others want to keep hacking away at the current code base. Thoughts? -- Brian Akins
Httpd 3.0 or something else
So, after several conversations at Apachecon and on the list, we still have no real "vision" of how we want to move ahead with httpd "3.0." Or, if we do, it's not communicated very well. Some have suggested we just create a new server project. Others want to keep hacking away at the current code base. Thoughts? -- Brian Akins