Why is it that we currently cannot split file buckets natively? (Is it a sendfile limitation?)
Let me just say ahead of time that I'm not trying to start another massive debate... if somebody has a good reason this can't be done, I'll shut up about it. It just seems a lot easier to do with file buckets than with pipes and sockets, since the split can be done natively with no read required (as the requirement for split functions is currently defined... speaking of which, is anybody ever going to commit the ap_bucket_split_any() patch?)

In trying to implement such a function in conversation with OtherBill, the only big problem I ran into was that a seek/read sequence on a fd is not threadsafe if there are two file buckets in existence that point to the same fd, since it breaks the assumptions that file_read() makes about the current location of the file pointer.

OtherBill suggested that maybe the second file bucket should point to a dup'ed file handle rather than the same file handle. At first glance that would be a great solution, but at least some OSes cause dup'ed file handles to share file pointers/flags/etc., so dup'ing doesn't necessarily help in this situation. I don't know whether that's standard behavior of dup() or not (somebody please enlighten me); if it is, then it makes sense for apr_dupfile() not to change that behavior. But if dup() would fix this problem on some systems, then it probably makes sense for apr_dupfile() to somehow guarantee that it's fixed on all systems. In that case, file_split() would just dup the file handle (a future file_copy() would want to do the same thing).

As it stands, though, it seems the only way to allow multiple file buckets to point to the same fd is to serialize reads, seeking to the right spot in the file before reading from it. For every read. That sucks. (It's too bad there's no version of apr_read() that takes as a parameter the offset into the file from which you want to begin reading... that would be another way around this problem. Of course, it would have to deal with the same problems I'm dealing with now, so it's probably not such a good idea.)

As I mentioned, a future file_copy operation would want to do the same thing. That assumes there is still interest in implementing a copy operation on all bucket types (which I think is a Good Thing). If there is, I'd like to do it before APR goes to beta, since it's an API change. Somebody just say the word, and I'll do the work.

So I guess I'm asking two questions here:

(1) Does anybody have any bright ideas that would help in implementing split() and a future copy() on file buckets, given the problems with reading non-sequentially from the file?

(2) Is there still interest in adding a copy operation to the buckets API?

Thanks, and Happy Thanksgiving to all...

--Cliff
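
P.S. For anyone who wants to check the dup() question on their own platform, here's a quick standalone test (plain POSIX, not APR code; the scratch filename is just made up) that shows whether dup'ed descriptors share a file offset:

/* Quick standalone check: do dup()'ed descriptors share a file offset?
 * Plain POSIX, not APR -- compile and run by hand. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    char buf[4];
    int fd1, fd2;

    /* Create a small scratch file with known contents. */
    fd1 = open("dup_test.tmp", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd1 < 0) { perror("open"); return 1; }
    if (write(fd1, "abcdefgh", 8) != 8) { perror("write"); return 1; }
    lseek(fd1, 0, SEEK_SET);

    /* dup() the descriptor, then read through the original. */
    fd2 = dup(fd1);
    if (fd2 < 0) { perror("dup"); return 1; }
    if (read(fd1, buf, 4) != 4) { perror("read fd1"); return 1; }

    /* If the offset is shared, the dup'ed descriptor now sees "efgh",
     * not "abcd". */
    if (read(fd2, buf, 4) != 4) { perror("read fd2"); return 1; }
    printf("read through dup'ed fd returned \"%.4s\" -- offset %s shared\n",
           buf, memcmp(buf, "efgh", 4) == 0 ? "IS" : "is NOT");

    close(fd1);
    close(fd2);
    unlink("dup_test.tmp");
    return 0;
}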
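P.P.S. And here's a rough sketch of what I mean by a read that takes an offset, using plain POSIX pread(2) rather than anything that exists in APR (the function name and the idea of each bucket carrying its own offset are purely hypothetical):

/* Sketch only: read at an explicit offset using POSIX pread(2).
 * bucket_read_at() is a hypothetical helper, not an APR function. */
#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

/* Read 'len' bytes for a bucket whose data starts at 'offset' in the
 * underlying file, without touching the fd's shared file pointer. */
ssize_t bucket_read_at(int fd, void *buf, size_t len, off_t offset)
{
    ssize_t nread;

    do {
        nread = pread(fd, buf, len, offset);
    } while (nread < 0 && errno == EINTR);   /* retry on interrupt */

    return nread;   /* caller advances its own private offset by nread */
}

The nice part is that pread() never moves the shared file pointer, so no seek and no serialization is needed; the obvious catch is that not every platform has it, which is the same portability problem I mentioned above.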