Now that Python 2.4 is out the door (and the problems with StreamReader.readline() are hopefully fixed), I'd like to bring up the topic of a feed style codec API again. A feed style API would make it possible to use stateful encoding/decoding where the data is not available as a stream.
Two examples:
- xml.sax.xmlreader.IncrementalParser: Here the client passes raw XML data to the parser in multiple calls to the feed() method. If the parser wants to use Python's codec machinery, it has to wrap a stream interface around the data passed to the feed() method.
- WSGI (PEP 333) specifies that the web application returns the fragments of the resulting webpage as an iterator. If this result is encoded unicode we have the same problem: This must be wrapped in a stream interface.
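To illustrate why state matters in both cases, here is a small sketch of what goes wrong when each chunk is handled on its own. The fragments and the split point are made up for the example; only standard str/bytes methods are used.

```python
# Encoding side (the WSGI case): every fragment gets its own BOM.
fragments = ["<html>", "Dörwald", "</html>"]
naive = b"".join(f.encode("utf-16") for f in fragments)
whole = "".join(fragments).encode("utf-16")
same_output = (naive == whole)  # False: naive carries extra BOMs

# Decoding side (the feed() case): a chunk may end inside a character.
raw = "Dörwald".encode("utf-8")
first_chunk = raw[:2]  # "D" plus the first byte of the two-byte "ö"
try:
    first_chunk.decode("utf-8")
    chunk_decoded = True
except UnicodeDecodeError:
    # A stateless decode cannot tell "incomplete" from "invalid".
    chunk_decoded = False
```

A stateful codec would emit the BOM once and hold back the trailing byte until the next chunk arrives.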
The simplest solution is to add a feed() method both to StreamReader and StreamWriter, that takes the state of the codec into account, but doesn't use the stream. This can be done by simply moving a few lines of code into separate methods. I've uploaded a patch to Sourceforge: #1101097.
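From the caller's side, feed style decoding could look roughly like the sketch below. This is not the code from the patch: FeedDecoder is a hypothetical name, and the sketch assumes a Python version that provides codecs.getincrementaldecoder() to do the state keeping.

```python
import codecs


class FeedDecoder:
    """Hypothetical helper (not from the patch): stateful decoding, no stream."""

    def __init__(self, encoding, errors="strict"):
        # The incremental decoder keeps undecoded trailing bytes between calls.
        self._decoder = codecs.getincrementaldecoder(encoding)(errors)

    def feed(self, data, final=False):
        """Decode one chunk; pass final=True with the last chunk."""
        return self._decoder.decode(data, final)


raw = "Dörwald".encode("utf-8")
chunks = [raw[:2], raw[2:]]  # the split falls inside the two-byte "ö"

decoder = FeedDecoder("utf-8")
pieces = [decoder.feed(chunk) for chunk in chunks]
pieces.append(decoder.feed(b"", final=True))  # flush; raises if bytes are left over
text = "".join(pieces)
```

The first call returns only the complete characters it has seen so far; the dangling byte is kept in the decoder and completed by the second call.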
There are other open issues with the codec changes: unicode-escape, UTF-7, the CJK codecs and probably a few others don't support decoding incomplete input yet (although AFAICR the functionality is mostly there in the CJK codecs).
Bye,
   Walter Dörwald