At 10:51 AM 3/21/2007 +0100, you wrote:
>There's a thread about fixing some issue in Subversion w.r.t. writing
>large amounts of data to a console handle.
>
>This issue is an example of a broader issue: Windows doesn't support
>arbitrary amounts of data to be written to all of its handle types.
>Smaller writes *are* supported, but it doesn't automatically fall back
>to smaller writes when the underlaying handle only supports small
>writes.
That is correct, as I understand it.

>There were a number of patches contributed until now, with the
>following properties (please correct me if I'm wrong or incomplete):
>
>1) Split all writes into 'always supported' smaller writes

My original patch. Note that while it used the buffering functionality, the handle wasn't technically buffered in the sense of combining separate calls to apr_file_write.

>2) Split writes for handles with specific numbers into 'always
>supported' smaller writes

Well, it wasn't done by numbers, but that was my second patch. It reworked how console handles specifically are opened, including some factoring of common std handle code that would be nice as a stand-alone patch even if the approach is rejected.

>3) Fall back to a buffered file once the condition has been detected
>which signals this kind of problem

This was my third patch, but it isn't entirely correct to refer to the file as buffered, for the same reason as (1). I'll explain below.

>4) Fall back to a small write (once) when detecting this condition,
>returning an incomplete write

This was proposed by Joe and implemented in the patch you posted to the list.

>(1) was rejected because the increased performance on 'other' handles
>must be an OS issue not to be worked around in a portability layer

Honestly, I think (1) should be applied. You wouldn't normally write code that way, because you would expect it to decrease performance. My testing showed, however, that -- at least for file handles -- the Win32 APIs actually perform *better* with a larger number of smaller writes. It is a very odd result, but it was consistent across more than 10 hours of continuous testing, both with preallocated files and with the file blocks being allocated on the fly. My test compared writing 55-megabyte files in 48 KB blocks against writing the same files using a mix of whole-megabyte write sizes from 1 MB to 10 MB, shuffling the order of the large writes and randomly interleaving the two strategies on each iteration. At the end of it, every single iteration showed the 48 KB writes to be around 3 times faster than the large writes for the same total amount of data written.

>(2) was rejected, because it doesn't solve the issue when other handle
>numbers are of the same class

There are only 3 console handles, and they are all opened through the methods that I modified. The only open question is whether stand-alone pipes might behave the same way as the console std handles (which would be a fair indication that console std handles are in fact pipes :-). Certainly files do not behave that way, but, as described above and in previous messages, files would actually perform better with their writes cut up.

>(3) Wasn't rejected (yet?), but feels wrong to me: an unbuffered file
>was created, then it should probably stay unbuffered (since starting
>to buffer will change the behavior of the file significantly on
>smaller writes)

This is where the confusion lies. If you look closely at my third patch, it introduces a new BOOLEAN field in the (opaque) apr_file_t structure called "autoflush". This is combined with two extra lines of code in apr_file_write so that, when this field is set, apr_file_flush is called after the last chunk of the input buffer is sent to the apr_file_t's buffer (which may be the only chunk if it is a small write). Thus, the behaviour remains the same: apr_file_write doesn't return until the data provided is guaranteed to have been passed to the OS.
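To make that concrete, here is the general shape of the change as a stand-alone sketch. This is not the actual patch or the real apr_file_t (which is opaque and has many more fields); apart from "autoflush" itself, the names are invented for illustration.

/* Simplified stand-alone illustration -- not the actual APR source.
 * The real apr_file_t is opaque and has many more fields; apart from
 * "autoflush", the names here are invented for the example. */
#include <windows.h>
#include <string.h>

typedef struct {
    HANDLE  filehand;
    char   *buffer;     /* internal buffer used to chunk the data  */
    DWORD   bufpos;     /* bytes currently sitting in the buffer   */
    DWORD   bufsize;    /* size of the internal buffer             */
    BOOLEAN autoflush;  /* new: flush before each write returns    */
} my_file_t;

static int my_flush(my_file_t *f)
{
    DWORD written;
    if (f->bufpos &&
        !WriteFile(f->filehand, f->buffer, f->bufpos, &written, NULL))
        return (int)GetLastError();
    f->bufpos = 0;
    return 0;
}

/* The buffered path already pushes the caller's data to the OS in
 * chunks that are known to be accepted; the only new behaviour is the
 * flush at the end when autoflush is set, so the call does not return
 * until everything has been handed to the OS. */
static int my_write(my_file_t *f, const char *data, DWORD len)
{
    while (len > 0) {
        DWORD chunk = f->bufsize - f->bufpos;
        if (chunk > len)
            chunk = len;
        memcpy(f->buffer + f->bufpos, data, chunk);
        f->bufpos += chunk;
        data += chunk;
        len  -= chunk;
        if (f->bufpos == f->bufsize) {
            int rv = my_flush(f);
            if (rv)
                return rv;
        }
    }
    /* the "two extra lines" from the patch, in spirit */
    if (f->autoflush)
        return my_flush(f);
    return 0;
}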
The buffering code is used because it already divides the data up in a way that is known to work (though it is actually fairly inefficient, with an unnecessary complete copy of the data from the user-supplied buffer into the apr_file_t's buffer), but the file handle isn't actually made to behave like a buffered file. I don't understand the resistance to (1), but if (1) won't be applied, I think (3) is the best option.

>(4) Was rejected because the same condition was detected over and over
>again, resulting in too many failed OS calls.
>
>Patch #4 can be easily extended to cache the result of the fallback in
>the file structure. Would that be enough to get this change committed
>/ issue resolved?

This is effectively what (3) does, except that it is more efficient than (4) (in the same way that (1) is more efficient than (3)), because it reduces the stack depth between the loop that is pumping the chunks of data and the actual call to the system API.
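For what it's worth, this is roughly what I understand "caching the result of the fallback" to mean, again as a stand-alone sketch rather than actual APR code. The field name is invented, and I'm using ERROR_NOT_ENOUGH_MEMORY as the trigger because that is the error the console handle has been reported to return for oversized writes; the real patch would test whatever condition it actually detects.

/* Simplified stand-alone illustration of "patch #4 plus caching" -- not
 * actual APR code.  "small_writes_only" is an invented field name. */
#include <windows.h>

#define SAFE_CHUNK (48 * 1024)   /* a size console handles accept */

typedef struct {
    HANDLE  filehand;
    BOOLEAN small_writes_only;   /* cached result of the failed call */
} my_file_t;

static int my_write(my_file_t *f, const char *data, DWORD len, DWORD *nbytes)
{
    DWORD written;
    *nbytes = 0;

    if (!f->small_writes_only) {
        if (WriteFile(f->filehand, data, len, &written, NULL)) {
            *nbytes = written;
            return 0;
        }
        if (GetLastError() != ERROR_NOT_ENOUGH_MEMORY)
            return (int)GetLastError();
        /* The large write was refused; remember that so the failed OS
         * call is never repeated on this handle. */
        f->small_writes_only = TRUE;
    }

    while (len > 0) {
        DWORD chunk = (len > SAFE_CHUNK) ? SAFE_CHUNK : len;
        if (!WriteFile(f->filehand, data, chunk, &written, NULL))
            return (int)GetLastError();
        *nbytes += written;
        data    += written;
        len     -= written;
    }
    return 0;
}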
>If not, what more would you expect?

I can't think of many other possible permutations with the desired effect. My recommendation would be (1) in preference to (3), (3) in preference to (4), and (4) with caching of the result if nothing else. Something must be done, though, because as it stands, APR's file handling on Windows does not live up to its documentation, at least where console handles are concerned. Though it is not APR's fault, it is certainly APR's problem, because we all know how likely it is that Microsoft will push through an urgent fix for a part of Windows they've been trying to get rid of in every version since XP SP2.

Jonathan Gilbert
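P.S. For anyone who wants to try to reproduce the file-handle result, this is the general shape of the comparison I ran. It is a heavily simplified sketch rather than the actual test harness: no preallocation, no shuffling or interleaving of write sizes, made-up file names, and only one large block size.

/* Heavily simplified sketch of the comparison, not the actual harness:
 * it times 55 MB written in 48 KB blocks against the same 55 MB
 * written in 8 MB blocks. */
#include <windows.h>
#include <stdio.h>

#define TOTAL_BYTES (55 * 1024 * 1024)
#define MAX_BLOCK   (8 * 1024 * 1024)

static char junk[MAX_BLOCK];   /* data to write; contents don't matter */

static double time_writes(const char *path, DWORD block_size)
{
    LARGE_INTEGER freq, start, end;
    DWORD written, remaining = TOTAL_BYTES;
    HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

    if (h == INVALID_HANDLE_VALUE)
        return -1.0;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&start);
    while (remaining > 0) {
        DWORD chunk = (remaining > block_size) ? block_size : remaining;
        if (!WriteFile(h, junk, chunk, &written, NULL))
            break;
        remaining -= written;
    }
    FlushFileBuffers(h);          /* push the OS cache out to disk */
    QueryPerformanceCounter(&end);
    CloseHandle(h);

    return (double)(end.QuadPart - start.QuadPart) / (double)freq.QuadPart;
}

int main(void)
{
    printf("48 KB blocks: %.3f s\n", time_writes("small.tmp", 48 * 1024));
    printf(" 8 MB blocks: %.3f s\n", time_writes("large.tmp", MAX_BLOCK));
    return 0;
}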