Patches item #1443155, was opened at 2006-03-04 19:45 Message generated for change (Comment added) made by doerwalter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1443155&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Modules Group: Python 2.5 Status: Open Resolution: None Priority: 5 Submitted By: Hye-Shik Chang (perky) Assigned to: Hye-Shik Chang (perky) Summary: Incremental codecs for CJKCodecs Initial Comment: Here's a supplemental patch for SF #1436130 to implement CJKCodecs part of the Incremental codec specification. This patch is implemented in an interface of Walter's fourth patch on #1436130. Please test this whether it agrees the design. ---------------------------------------------------------------------- >Comment By: Walter Dörwald (doerwalter) Date: 2006-03-18 21:47 Message: Logged In: YES user_id=89016 Maybe the statement is a bit misleading in its current form. I didn't mean that error handling prevents the flushing of the buffers, just that error handling prevents raising an error. I hope the following is clearer: "If final is true the codec must encode/decode the input completely and must flush all buffers. If this isn't possible (e.g. because of incomplete byte sequences still remaining in the buffer on decoding) it must initiate error handling just like in the stateless case (which might raise an exception)." I'll take a look at the patch in the next few days. ---------------------------------------------------------------------- Comment By: Hye-Shik Chang (perky) Date: 2006-03-18 19:32 Message: Logged In: YES user_id=55188 I updated the patch for the .errors visibility. I like the statement. But the current implementation of the patch is slightly different for a corner case; the current implementation keeps buffers if an error is occurred. That is somewhat natural because error cases doesn't make side-effects usually. But in other side, I agree to your statement in a view of a direct interpretation of final=True. What do you think about this? ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2006-03-18 16:24 Message: Logged In: YES user_id=89016 What other interpretation of the final parameter can we use that doesn't make it completely useless? What about the following: "If final is true the codec must encode/decode the input completely and must flush all buffers. If this isn't possible (e.g. because of incomplete byte sequences on decoding) it must raise an exception (unless prevented by an error handler)"? ---------------------------------------------------------------------- Comment By: Hye-Shik Chang (perky) Date: 2006-03-16 13:32 Message: Logged In: YES user_id=55188 1) Because CJKCodecs had an internal stateful framework, I implemented just an interface using it for IncrementalCodec. It treats final=True as a simple `flush' message(which doesn't reset or terminate the codec). The behavior is quite useful for real-time stream processing such as sockets and tail log watchers. If we disallow that, such usages may require its own sequence detectors. For "to reset or not" issue, I think we can simply follow how iconv does. iconv doesn't reset the internal state for iconv(ic, NULL, NULL, ..). 2) Aah. I didn't notice that .errors is a part of public API. The current CJKCodecs can't support it easily yet. I'll fix it and upload a updated patch soon. Thank you for your review! ---------------------------------------------------------------------- Comment By: Walter Dörwald (doerwalter) Date: 2006-03-15 13:08 Message: Logged In: YES user_id=89016 The patch doesn't apply cleanly (conflicts in Lib/test/test_multibytecodec.py and Tools/unicode/Makefile). Could you update the patch? I haven't looked at the C code to closely yet. Two notes: 1) The tests often call incencoder.encode() or incdecoder.decode() again after the method has been called with final=True before. I'm not sure that this should be allowed. If we allow it, it should be documented in what state the codec is after calling with final=True (probably it should be back to the initial state (i.e. like calling reset())). 2) It seems to me that it isn't possible to change the error handling during the lifetime of a codec. Anyway, thanks for the quick patch. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1443155&group_id=5470 _______________________________________________ Patches mailing list [email protected] http://mail.python.org/mailman/listinfo/patches
