[issue20132] Many incremental codecs don’t handle fragmented data

Walter Dörwald Fri, 10 Jan 2014 03:27:06 -0800

Walter Dörwald added the comment:

The best solution IMHO would be to implement real incremental codecs for all of 
those.
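
As a rough illustration of what a "real" incremental codec means here, the 
sketch below decodes hex input that may be split at arbitrary points. The 
class name HexIncrementalDecoder and its pending attribute are hypothetical; 
this is not the stdlib hex codec, only a minimal example built on the 
codecs.IncrementalDecoder interface.

```python
import binascii
import codecs


class HexIncrementalDecoder(codecs.IncrementalDecoder):
    """Hypothetical sketch; not the stdlib hex_codec implementation."""

    def __init__(self, errors="strict"):
        super().__init__(errors)
        self.pending = b""  # at most one unpaired hex digit

    def decode(self, input, final=False):
        data = self.pending + bytes(input)
        # Only complete digit pairs can be decoded; hold back an odd digit.
        cut = len(data) - (len(data) % 2)
        self.pending = data[cut:]
        if final and self.pending:
            raise ValueError("truncated hex input")
        return binascii.unhexlify(data[:cut])

    def reset(self):
        self.pending = b""

    def getstate(self):
        # (buffered input, additional state flags) as the interface expects
        return (self.pending, 0)

    def setstate(self, state):
        self.pending = state[0]


decoder = HexIncrementalDecoder()
fragments = [b"4", b"86", b"9"]
parts = [decoder.decode(f) for f in fragments[:-1]]
parts.append(decoder.decode(fragments[-1], final=True))
result = b"".join(parts)
```

The only state carried between calls is the unpaired digit, so the fragment 
boundaries do not affect the joined result.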


Maybe iterencode() with an empty iterator should never call encode()? (But IMHO 
it would be better to document that iterencode()/iterdecode() should only be 
used with "real" codecs.)
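
For illustration only, here is one way the first idea could look. The 
function name iterencode_lazy is hypothetical and this is not how 
codecs.iterencode() is implemented; the sketch simply defers creating the 
encoder until the first chunk arrives, so an empty iterator never reaches 
encode().

```python
import codecs


def iterencode_lazy(iterator, encoding, errors="strict"):
    """Hypothetical variant: no encode() call for an empty iterator."""
    encoder = None
    for chunk in iterator:
        if encoder is None:
            # Create the encoder only once there is something to encode.
            encoder = codecs.getincrementalencoder(encoding)(errors)
        output = encoder.encode(chunk)
        if output:
            yield output
    if encoder is not None:
        # Flush with final=True only if at least one chunk was seen.
        output = encoder.encode("", True)
        if output:
            yield output


empty_result = list(iterencode_lazy([], "utf-8"))
joined = b"".join(iterencode_lazy(["a", "b"], "utf-8"))
```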

Note that the comment before PyUnicode_DecodeUTF7Stateful() in unicodeobject.c 
reads:

/* The decoder.  The only state we preserve is our read position,
 * i.e. how many characters we have consumed.  So if we end in the
 * middle of a shift sequence we have to back off the read position
 * and the output to the beginning of the sequence, otherwise we lose
 * all the shift state (seen bits, number of bits seen, high
 * surrogate). */

Changing that would require introducing a state object that the codec updates 
and from which decoding can be restarted.
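
The effect of the back-off described in the quoted comment can be checked 
from Python. This is a small sketch, assuming the stdlib "utf-7" incremental 
decoder; the input is deliberately split in the middle of a shift sequence, 
and the joined output is compared with a one-shot decode.

```python
import codecs

data = b"a+AOQ-b"  # "+AOQ-" is the UTF-7 shift sequence for U+00E4
decoder = codecs.getincrementaldecoder("utf-7")()

# Split inside the shift sequence; the decoder has to hold back the
# unfinished part until the rest arrives.
first = decoder.decode(data[:3])
second = decoder.decode(data[3:], final=True)

incremental = first + second
one_shot = data.decode("utf-7")
```

Both paths should give the same text, because the unfinished shift sequence is 
re-read from its start once the remaining bytes arrive.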

Also, the encoder does not buffer anything. To implement the suggested 
behaviour, the encoder might have to buffer an unbounded amount of data.

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue20132>
_______________________________________