[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-08-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you please finish this issue, Victor? -- assignee: -> haypo; stage: resolved -> ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13916 ___

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have no opinion. -- assignee: serhiy.storchaka ->

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which tests the encoding name against cp65001 instead of CP_UTF8. I can't test on Windows and don't know which of the two patches is correct. -- Added file: http://bugs.python.org/file35262/surrogatepass_cp65001.patch

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-16 Thread Roundup Robot
Roundup Robot added the comment: New changeset 8ee2b73cda7a by Victor Stinner in branch 'default': Issue #13916: Fix surrogatepass error handler on Windows http://hg.python.org/cpython/rev/8ee2b73cda7a

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-16 Thread STINNER Victor
STINNER Victor added the comment: "But an exception reports about CP_UTF8." Oh, that's my fault! And it is a bug: CP_UTF8 is a Windows constant, but it is not a valid Python codec name. The attached patch cp_encoding_name.patch fixes this issue. I don't think that Py_LOWER() is needed because

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This issue was mainly resolved in issue12892. The surrogatepass error handler now works with the UTF-16* and UTF-32* encodings. For other encodings it still behaves as for UTF-8 (preserving the old behavior). Should we change the behavior for non-UTF encodings end
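
The behavior described in this comment can be checked with a short sketch. It assumes CPython 3.4 or later, where the issue12892 change is present; the variable names are illustrative only and do not come from the tracker.

```python
# Round-trip a lone surrogate through the UTF-* codecs using surrogatepass.
# Assumes CPython 3.4+ (UTF-16/UTF-32 support was added via issue12892).
text = "\udc80"

roundtrips = {}
for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    data = text.encode(encoding, "surrogatepass")
    decoded = data.decode(encoding, "surrogatepass")
    roundtrips[encoding] = (data, decoded)

# Print only the bytes and a boolean; printing the lone surrogate itself
# could fail on terminals that use a strict UTF-8 stdout.
for encoding, (data, decoded) in roundtrips.items():
    print(encoding, data, decoded == text)
```

Each codec writes the surrogate code point in its own code-unit size, so the decoded string should compare equal to the input for all three encodings.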

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which disallows the surrogatepass handler for non-UTF encodings. Please test it on Windows. -- type: behavior -> enhancement; versions: +Python 3.5 -Python 3.1, Python 3.2, Python 3.3

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: Serhiy Storchaka wrote: "Here is a patch" I don't see your patch.

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Oh, sorry. -- keywords: +patch Added file: http://bugs.python.org/file35257/surrogatepass_non_utf.patch

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Martin v . Löwis
Martin v. Löwis added the comment: LGTM

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Roundup Robot
Roundup Robot added the comment: New changeset 5e98a50e0f55 by Serhiy Storchaka in branch 'default': Issue #13916: Disallowed the surrogatepass error handler for non UTF-* http://hg.python.org/cpython/rev/5e98a50e0f55 -- nosy: +python-dev
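
A minimal sketch of what this changeset means in practice, assuming CPython 3.5 or later: for a non-UTF codec such as latin-1, the surrogatepass handler no longer emits UTF-8 bytes and the original encoding error is raised instead. The helper `try_encode` is hypothetical and exists only for this illustration.

```python
def try_encode(text, encoding):
    """Encode with surrogatepass; return the bytes, or the exception raised."""
    try:
        return text.encode(encoding, "surrogatepass")
    except UnicodeEncodeError as exc:
        return exc

# UTF-8 accepts the lone surrogate and encodes it as three bytes.
utf8_result = try_encode("\udc80", "utf-8")

# latin-1 is not a UTF-* codec, so the original UnicodeEncodeError surfaces
# (assumption: CPython 3.5+, where this changeset is included).
latin1_result = try_encode("\udc80", "latin-1")

print(utf8_result)
print(type(latin1_result).__name__)
```

The exception type is UnicodeEncodeError rather than LookupError, which matches the preference stated in the 2012 discussion for re-raising the original exception.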

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- assignee: -> serhiy.storchaka; resolution: -> fixed; stage: -> resolved; status: open -> closed

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: It makes sense to restrict surrogatepass to UTF-* encodings. The UTF-8, UTF-16 and UTF-32 encoders reject surrogate characters, but the UTF-7 encoder does not. Is it a bug? I'm asking because PyCodec_SurrogatePassErrors() doesn't support UTF-7. IMO your change is important enough

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: Windows buildbots are unhappy. http://buildbot.python.org/all/builders/x86%20Windows7%203.x/builds/8355/steps/test/logs/stdio == ERROR: test_surrogatepass_handler

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which adds support for cp65001 and fixes test_cp1252. Please test it on Windows Vista. Lone surrogates are not illegal in UTF-7 (see RFC 1642), so the error handler is never called and explicit support for UTF-7 is not needed. Could you please
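
The difference between UTF-7 and the other UTF-* encoders can be sketched as follows. This assumes a current CPython 3 release; the helper `strict_encode_fails` is a hypothetical name used only for this example.

```python
text = "a\ud801b"  # contains a lone high surrogate

def strict_encode_fails(encoding):
    """Return True if strict encoding of `text` raises UnicodeEncodeError."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return True
    return False

# The UTF-8/16/32 encoders reject the lone surrogate under the default
# (strict) error handler; the UTF-7 encoder accepts it, so no error handler
# is ever consulted for UTF-7.
rejected = {enc: strict_encode_fails(enc)
            for enc in ("utf-8", "utf-16", "utf-32", "utf-7")}

utf7_bytes = text.encode("utf-7")
utf7_roundtrip = utf7_bytes.decode("utf-7")

print(rejected)
print(utf7_bytes, utf7_roundtrip == text)
```

Because the UTF-7 codec never raises for a lone surrogate, surrogatepass has nothing to handle there.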

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread STINNER Victor
STINNER Victor added the comment: "Here is a patch, which adds support for cp65001" The name of the encoding is cp65001, not something like cp-utf8. And there is no alias like cp_65001, there is only cp65001.

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2014-05-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: But an exception reports about CP_UTF8. -- title: disallow the surrogatepass handler for non utf-* encodings -> disallow the surrogatepass handler for non utf-* encodings

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-04-28 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: I fail to see the problem. If the error handler does not produce meaningful results in some context, then just don't use it. The whole point of error handlers is that they handle errors; using them shouldn't ever cause errors/exceptions.

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-04-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: The problem is that surrogatepass is specific to UTF-8 and there is no standard way to decode lone surrogates in UTF-16. >>> '\udc80\udc80'.encode('utf-16', 'surrogatepass').decode('utf-16', 'surrogatepass') Traceback (most recent call last): File

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-04-28 Thread Martin v . Löwis
Martin v. Löwis mar...@v.loewis.de added the comment: I see. The proper reaction for a codec that can't handle a certain error then is to raise the original exception. I'm -1 on raising LookupError when trying to find the error handler - this would suggest that the error handler does not

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-04-27 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +storchaka

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-01-31 Thread Kang-Hao (Kenny) Lu
New submission from Kang-Hao (Kenny) Lu kennyl...@csail.mit.edu: Currently the surrogatepass handler always encodes the surrogates in UTF-8 and hence the behavior for, say, '\udc80'.encode('latin-1', 'surrogatepass').decode('latin-1') might be unexpected and I don't even know what would, say,

[issue13916] disallow the surrogatepass handler for non utf-* encodings

2012-01-31 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: -- nosy: +haypo