Anders J. Munch wrote:
> Watch out! There's an essential difference between files and
> bidirectional communications channels that you need to take into
> account. For a TCP connection, input and output can be seen as
> isolated from one another, with each their own stream position, and
> each their own contents. For read/write files, it's a whole different
> ballgame, because stream position and data are shared.
>
> That means you cannot use the same buffering code for both cases. For
> files, whenever you write something, you need to take into account
> that that may overlap your read buffer or change read position. You
> should take another look at layer.BufferingLayer with that in mind.
>
> regards, Anders

This is a better explanation of some of the comments I was raising
earlier: the choice of buffering strategy depends on a number of
factors related to how the stream is going to be used, as well as on
the internal implementation of the stream. A buffering strategy that
works well for a socket won't work very well for a DBMS.

When I stated earlier that 'the OS can do a better job of buffering
than we can', what I meant was somewhat broader than that: each layer
is, in many cases, a better judge of what *kind* of buffering it needs
than the person assembling the layers.

This doesn't mean that each layer has to implement its own buffering
algorithm. The common buffering algorithms can be factored out into
their own objects. What I'd suggest is that the choice of buffer
algorithm not *normally* be exposed to the person constructing the io
stack.

Thus, when creating a standard "line reader", rather than having the
user call:

    fh = TextReader( Buffer( File( ... ) ) )

let the TextReader choose the kind of buffer it wants and supply that
part automatically. There are several reasons why I think this would
work better:

1) You can't simply stick just any buffer object in the middle there
and expect it to work. Different buffer strategies have different
interfaces, and trying to meld them all into one uber-interface would
make for a very complex interface.

2) The TextReader knows perfectly well what kind of buffer it needs.
Depending on how TextReader is implemented, it might want a serial,
read-only buffer that allows a limited degree of look-ahead so that it
can find the line breaks. Or it might want a pair of buffers, one
decoded and one encoded. There's no way that the user can know what
kind of buffer to use without knowing the implementation details of
TextReader.

3) TextReader can be optimized even more if it is allowed to 'peek'
inside the internals of the buffer, something that would not be
allowed if it had to call the buffer through a standard interface.

More generally, the choice of buffer depends on the usage pattern for
reading and writing the file, and that usage pattern is embodied in
the definition of "TextReader". By creating a "TextReader" object, the
user is stating their intention to read the file a certain way, in a
certain order, with certain performance characteristics. The choice of
buffering derives directly from those usage patterns. So the two go
hand in hand.

Now, I'm not saying that you can't stick additional layers in between
TextReader and FileStream if you want to. An example might be the
"resync" layer that you mentioned, or a journaling layer that ensures
that all writes are recoverable. I'm merely saying that, for the
specific issue of buffering, the choice of buffer type is complicated
and requires knowledge that might not be accessible to the person
assembling the stack.

-- Talin
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
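
A minimal sketch of the suggestion above, in which the reader picks its
own buffer rather than having one handed to it by the caller. Everything
here is an assumption made for illustration: the names _LookaheadBuffer,
fill, take_through and drain are hypothetical, io.BytesIO stands in for
the File layer, and splitting on b"\n" before decoding assumes an
ASCII-compatible encoding such as UTF-8. It is one way this might look,
not a proposed API.

```python
import io


class _LookaheadBuffer:
    """Hypothetical serial, read-only buffer with limited look-ahead."""

    def __init__(self, raw, chunk_size=4096):
        self._raw = raw
        self._chunk_size = chunk_size
        self._data = b""

    def fill(self):
        """Pull one more chunk from the raw stream; return False at EOF."""
        chunk = self._raw.read(self._chunk_size)
        self._data += chunk
        return bool(chunk)

    def take_through(self, sep):
        """Return bytes up to and including sep, or None if sep is not buffered."""
        idx = self._data.find(sep)
        if idx < 0:
            return None
        end = idx + len(sep)
        out, self._data = self._data[:end], self._data[end:]
        return out

    def drain(self):
        """Return whatever is left in the buffer and empty it."""
        out, self._data = self._data, b""
        return out


class TextReader:
    """Line reader that chooses and owns its buffer (assumed design)."""

    def __init__(self, raw, encoding="utf-8"):
        # The caller never sees or selects the buffer type.
        self._buffer = _LookaheadBuffer(raw)
        self._encoding = encoding

    def readline(self):
        while True:
            line = self._buffer.take_through(b"\n")
            if line is not None:
                return line.decode(self._encoding)
            if not self._buffer.fill():
                # EOF: hand back any trailing text without a newline.
                return self._buffer.drain().decode(self._encoding)

    def __iter__(self):
        while True:
            line = self.readline()
            if not line:
                return
            yield line


# The caller states only the intent ("read lines"); no Buffer(...) layer.
lines = list(TextReader(io.BytesIO(b"alpha\nbeta\ngamma")))
```

Because the buffer is private, TextReader is free to swap in a different
strategy (for example, a decoded/encoded pair) without changing how
callers construct it.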