[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-08-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please finish this issue, Victor? -- assignee: -> haypo stage: resolved ->

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have no opinion. -- assignee: serhiy.storchaka ->

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-16 Thread STINNER Victor
STINNER Victor added the comment: > But an exception reports about CP_UTF8. Oh, that's my fault! And it is a bug: "CP_UTF8" is the Windows constant, but it is not a valid Python codec name. Attached patch cp_encoding_name.patch fixes this issue. I don't think that Py_LOWER() is needed because
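
The naming point can be checked from Python itself. The snippet below is a minimal sketch, not part of the attached patch: it only confirms that "CP_UTF8" is not a name the codec registry knows. Whether "cp65001" resolves depends on the platform and Python version, so it is deliberately left out.

```python
import codecs

# "CP_UTF8" is a Windows API constant, not a Python codec name,
# so the codec registry is expected to reject it.
try:
    codecs.lookup("CP_UTF8")
    cp_utf8_known = True
except LookupError:
    cp_utf8_known = False

# For comparison, a name the registry does know.
utf8_name = codecs.lookup("utf-8").name

print(cp_utf8_known, utf8_name)
```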

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-16 Thread Roundup Robot
Roundup Robot added the comment: New changeset 8ee2b73cda7a by Victor Stinner in branch 'default': Issue #13916: Fix surrogatepass error handler on Windows http://hg.python.org/cpython/rev/8ee2b73cda7a

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which tests the encoding name against "cp65001" instead of "CP_UTF8". I can't test on Windows and don't know which of the two patches is correct. -- Added file: http://bugs.python.org/file35262/surrogatepass_cp65001.patch

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: But an exception reports about CP_UTF8. -- title: disallow the "surrogatepass" handler for non utf-* encodings -> disallow the "surrogatepass" handler for non utf-* encodings

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment:
> Here is a patch, which adds support for cp65001
The name of the encoding is "cp65001", not something like "cp-utf8". And there is no alias like "cp_65001", there is only "cp65001".

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which adds support for cp65001 and fixes test_cp1252. Please test it on Windows Vista. Lone surrogates are not illegal in UTF-7 (see RFC 1642), so the error handler is not called and explicit support for UTF-7 is not needed. Could you please hel

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: Windows buildbots are unhappy. http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/8355/steps/test/logs/stdio == ERROR: test_surrogatepass_handler (test.test_codecs.CP65001Test)

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: It makes sense to restrict surrogatepass to UTF-* encodings. UTF-8, UTF-16 and UTF-32 encoders reject surrogate characters, but not UTF-7. Is it a bug? I'm asking because PyCodec_SurrogatePassErrors() doesn't support UTF-7. IMO your change is important enough
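
The difference between the encoders can be observed directly. This is a small sketch assuming a recent CPython 3 (3.4 or later, where the UTF-16 and UTF-32 encoders reject surrogates in strict mode); the variable names are illustrative only.

```python
# A lone high surrogate, encoded with the default "strict" error handler.
lone = "\ud801"

rejected = {}
for name in ("utf-8", "utf-16", "utf-32", "utf-7"):
    try:
        lone.encode(name)
        rejected[name] = False
    except UnicodeEncodeError:
        rejected[name] = True

# UTF-8/16/32 are expected to refuse the surrogate; UTF-7 is expected to
# accept it, so no error handler is ever invoked for UTF-7.
print(rejected)
```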

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- assignee: -> serhiy.storchaka resolution: -> fixed stage: -> resolved status: open -> closed

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Roundup Robot
Roundup Robot added the comment: New changeset 5e98a50e0f55 by Serhiy Storchaka in branch 'default': Issue #13916: Disallowed the surrogatepass error handler for non UTF-* http://hg.python.org/cpython/rev/5e98a50e0f55 -- nosy: +python-dev

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Martin v . Löwis
Martin v. Löwis added the comment: LGTM

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Oh, sorry. -- keywords: +patch Added file: http://bugs.python.org/file35257/surrogatepass_non_utf.patch

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: Serhiy Storchaka wrote:
> Here is a patch
I don't see your patch.

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which disallows the surrogatepass handler for non-UTF encodings. Please test it on Windows. -- type: behavior -> enhancement versions: +Python 3.5 -Python 3.1, Python 3.2, Python 3.3

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This issue was mainly resolved in issue12892. The surrogatepass error handler now works with UTF-16* and UTF-32* encodings. But for other encodings it behaves as for UTF-8 (preserving the old behavior). Should we change the behavior for non-UTF encodings and raise
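
A round trip through the UTF-16* and UTF-32* codecs illustrates the behavior described here. This is a sketch assuming CPython 3.4 or later; explicit byte orders are used so the result does not depend on the platform.

```python
# A string containing a lone low surrogate between two ASCII characters.
text = "a\udc80b"

results = {}
for name in ("utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    # "surrogatepass" writes the surrogate as an ordinary code unit ...
    data = text.encode(name, "surrogatepass")
    # ... and accepts it again when decoding.
    results[name] = (data, data.decode(name, "surrogatepass") == text)

for name, (data, round_trips) in results.items():
    print(name, data, round_trips)
```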

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-04-28 Thread Martin v . Löwis
Martin v. Löwis added the comment: I see. The proper reaction for a codec that can't handle a certain error then is to raise the original exception. I'm -1 on raising LookupError when trying to find the error handler - this would suggest that the error handler does not exist, which is not tru
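
The distinction drawn here (handler lookup succeeds, the codec re-raises the original error) matches what later Python versions do. The following is a sketch assuming CPython 3.5 or later, where the change from this issue has landed.

```python
import codecs

# The handler is registered, so looking it up does not raise LookupError.
handler = codecs.lookup_error("surrogatepass")

# Using it with a non-UTF codec surfaces the original encoding error.
try:
    "\udc80".encode("latin-1", "surrogatepass")
    raised = None
except Exception as exc:  # broad on purpose: we want to inspect the type
    raised = exc

print(callable(handler), type(raised).__name__)
```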

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-04-28 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The problem is that "surrogatepass" is specific to UTF-8, and there is no standard way to decode lone surrogates in UTF-16.
>>> "\udc80\udc80".encode("utf-16", "surrogatepass").decode("utf-16", "surrogatepass")
Traceback (most recent call last): File "",

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-04-28 Thread Martin v . Löwis
Martin v. Löwis added the comment: I fail to see the problem. If the error handler does not produce meaningful results in some context, then just don't use it. The whole point of error handlers is that they handle errors; using them shouldn't ever cause errors/exceptions. -- nosy: +l

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-04-27 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- nosy: +storchaka

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-01-31 Thread STINNER Victor
Changes by STINNER Victor: -- nosy: +haypo

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-01-31 Thread Kang-Hao (Kenny) Lu
New submission from Kang-Hao (Kenny) Lu : Currently the "surrogatepass" handler always encodes the surrogates in UTF-8 and hence the behavior for, say, "\udc80".encode("latin-1", "surrogatepass").decode("latin-1") might be unexpected and I don't even know what would, say, "\udc80\udc80".encode
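
For reference, the UTF-8 form that the report refers to can be reproduced as follows. This is a minimal sketch assuming any recent CPython 3; it only covers the UTF-8 case and makes no claim about the truncated examples above.

```python
lone = "\udc80"  # a lone low surrogate

# Strict UTF-8 refuses to encode a lone surrogate.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With "surrogatepass" the surrogate is written as a 3-byte UTF-8 sequence,
# and the same handler restores it when decoding.
encoded = lone.encode("utf-8", "surrogatepass")
decoded = encoded.decode("utf-8", "surrogatepass")

print(strict_ok, encoded, decoded == lone)
```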