On Mon, 30 Apr 2001, Bill Stoddard wrote:

> I lost the original context for this discussion... Exactly when will a
> file bucket morph to an MMAP bucket? My concern is not to break using
> apr_sendfile(), which on some OSes can provide a big performance
> boost.

I thought about that, and it is a potential issue, but I believe I have
a cure. Basically, all references to a file bucket would morph (lazily,
as with pool buckets) to MMAP type when any one of them gets run through
apr_bucket_read() (and is therefore MMAPed).

The problem is this: since the MMAP is created in the same pool as the
apr_file_t, we leak MMAPs for file descriptors that are repeatedly
passed through apr_bucket_read(). Take, for example, mod_file_cache in
Apache. Each time a cached file descriptor is served up, we leak another
MMAP. That's bad. There are only three solutions that I can think of:

(1) Let mmap_destroy() in the buckets code delete the MMAP when the
    last bucket referencing it goes away. This has been basically
    vetoed by Ryan, citing the fact that it increases the number of
    system calls that happen during request processing.

(2) Put two pools in apr_file_t: one that contains the file and one
    (possibly the same) in which any apr_mmap_t of that file should be
    stored. This would work, but it is a hack, is altogether ugly, and
    potentially wastes time repeatedly MMAPing the file. I myself am
    just about -1 on it.

(3) Make all file bucket references lazily morph to MMAP type when any
    one of them does (as I'm proposing). This is good for preventing
    resource leakage, but could potentially limit the use of
    apr_sendfile(), particularly in the case of mod_file_cache. On the
    other hand, this would actually be *good* during a regular request,
    since the work of MMAPing the file would only have to be done once,
    ever, for any given open file.

So the only real problem is how to fix the mod_file_cache case, where
the FD could potentially be sendfile'd on later requests, and where
doing so is faster than using an MMAP just because we have one
available. Here's how that's done, and why this change is good:

The beauty is in the lazy morphing. mod_file_cache will be changed to
no longer use ap_send_fd, but instead keep a cached file bucket for
each FD it caches.
This is good for another reason (it saves a malloc per request), but
that's beside the point. For each request, it makes an
apr_bucket_copy() of its cached file bucket and sends that down the
filter stack. If at some point during request processing that copy gets
changed to an MMAP bucket, then all of its siblings (including the
"master" one in the cache) will also have access to that MMAP and will
morph automatically the next time they're read.

But the master copy in the cache will never BE read! So the copy in the
cache always remains of file type, not MMAP type, and copies of the
master made to serve future requests will start out as file type as
well (enabling sendfile for those future requests). Even better, if one
of those later requests decides it does need to do a read on the file
bucket and would have MMAPed it, it discovers that an MMAP of the file
is already available, so it just changes its type and reads the MMAP
with virtually zero extra work incurred.

Sound good? =-)

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA
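
To illustrate the lazy morphing described above, here is a toy model in
C. It is only a sketch under loud assumptions: every toy_* name and
struct is made up for this example and is not part of the APR bucket
API, and the "mmap" is faked with a static string. It shows just the
bookkeeping: buckets copied from the same master share one record, the
map is created at most once, and a bucket changes type only when that
particular bucket is read.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types; these are NOT APR's real structures. */
typedef enum { TOY_FILE, TOY_MMAP } toy_type;

/* State shared by every bucket that refers to the same open file. */
typedef struct {
    int refcount;     /* number of buckets sharing this record */
    int fd;           /* stand-in for the apr_file_t */
    const char *map;  /* stand-in for the apr_mmap_t; NULL until mapped */
    int map_calls;    /* how many times the "mmap" work was done */
} toy_shared;

typedef struct {
    toy_type type;    /* per-bucket type; changes only when THIS bucket is read */
    toy_shared *shared;
} toy_bucket;

static const char toy_contents[] = "file contents";

/* Create a file-type bucket with a fresh shared record. */
toy_bucket *toy_file_bucket_create(int fd)
{
    toy_shared *s = malloc(sizeof *s);
    toy_bucket *b = malloc(sizeof *b);
    if (s == NULL || b == NULL) {
        free(s);
        free(b);
        return NULL;
    }
    s->refcount = 1;
    s->fd = fd;
    s->map = NULL;
    s->map_calls = 0;
    b->type = TOY_FILE;
    b->shared = s;
    return b;
}

/* Copy a bucket: the copy keeps the source's current type and shares its record. */
toy_bucket *toy_bucket_copy(const toy_bucket *b)
{
    toy_bucket *c;
    assert(b != NULL && b->shared != NULL);
    c = malloc(sizeof *c);
    if (c == NULL) {
        return NULL;
    }
    *c = *b;
    c->shared->refcount++;
    return c;
}

/* Read a bucket: map the file only if no sibling has done so, then morph. */
const char *toy_bucket_read(toy_bucket *b)
{
    assert(b != NULL && b->shared != NULL);
    if (b->type == TOY_FILE) {
        if (b->shared->map == NULL) {
            b->shared->map = toy_contents;  /* pretend to mmap the file */
            b->shared->map_calls++;
        }
        b->type = TOY_MMAP;                 /* lazy morph of this bucket only */
    }
    return b->shared->map;
}

/* Destroy a bucket; the shared record goes away with its last reference. */
void toy_bucket_destroy(toy_bucket *b)
{
    assert(b != NULL && b->shared != NULL);
    if (--b->shared->refcount == 0) {
        free(b->shared);
    }
    free(b);
}
```

In this model, a cache would call toy_file_bucket_create() once per FD
and toy_bucket_copy() once per request. Reading a request's copy flips
only that copy to TOY_MMAP; the master is never read, so it stays
TOY_FILE and later copies start out as TOY_FILE too, while map_calls
never goes above 1 no matter how many copies are read.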
