[issue17404] ValueError: can't have unbuffered text I/O for io.open(1, 'wt', 0)

2013-03-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: - it won't work for reading: TextIOWrapper calls the read1() method, which is only defined by BufferedIO objects. Since 3.3 TextIOWrapper works with raw IO objects (issue12591). Yes. And I just noticed that the _io module (the C version) will also buffer
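The error in the issue title and the raw-stream behaviour mentioned in the comment can be sketched as follows. This is only an illustration: a temporary file stands in for file descriptor 1, and the variable names are made up for the example.

```python
import io
import os
import tempfile

# A temporary file stands in for fd 1 so the sketch has no side effects.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Text mode combined with buffering=0 is rejected.
    try:
        io.open(path, "wt", 0)
        message = None
    except ValueError as exc:
        message = str(exc)

    # Wrapping a raw (unbuffered) FileIO object by hand works for writing.
    raw = io.FileIO(path, "w")
    text = io.TextIOWrapper(raw, encoding="utf-8")
    text.write("hello\n")
    text.close()  # flushes and also closes the underlying raw stream

    with open(path, encoding="utf-8") as f:
        content = f.read()
finally:
    os.remove(path)

print(message)
print(repr(content))
```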

[issue17299] Test cPickle with real files

2013-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I polished the patch a little before committing. Thank you for the patch, Aman Shah. -- resolution: -> fixed stage: commit review -> committed/rejected status: open -> closed ___ Python tracker rep

[issue1285086] urllib.quote is too slow

2013-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Sorry, I perhaps missed your response, Senthil. Now committed and closed again. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http

[issue17016] _sre: avoid relying on pointer overflow

2013-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Of course it would be nice to have tests for as many cases as possible, but I am afraid that it will not be easy. The patch LGTM. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17016

[issue13056] test_multibytecodec.py:TestStreamWriter is skipped after PEP393

2013-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I think these tests make no sense after PEP393. They test that StreamWriter works with non-BMP characters broken inside a surrogate pair. I.e. c.write(s[:i]); c.write(s[i:]) is always the same as c.write(s), even if i breaks s inside a surrogate pair. This case

[issue1243730] Big speedup in email message parsing

2013-03-14 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Test fails with stack overflow: == ERROR: test_pushCR_LF (email.test.test_email.TestIterators) FeedParser BufferedSubFile.push() assumed it received complete

[issue17440] Some IO related problems on x86 windows

2013-03-16 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- components: +IO nosy: +benjamin.peterson, hynek, pitrou, stutzbach ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17440

[issue1159051] Handle corrupted gzip files with unexpected EOF

2013-03-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: tuned_gzip does dangerous things, it overloads private methods of GzipFile. From Bazaar 2.3 Release Notes: * Stop using ``bzrlib.tuned_gzip.GzipFile``. It is incompatible with python-2.7 and was only used for Knit format repositories, which haven't been

[issue17441] Do not cache re.compile

2013-03-16 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Ezio proposed in issue16389 to not cache re.compile. Caching of re.compile makes no sense and only pollutes the cache. -- components: Library (Lib), Regular Expressions messages: 184354 nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka

[issue17441] Do not cache re.compile

2013-03-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch. -- keywords: +patch stage: needs patch -> patch review Added file: http://bugs.python.org/file29429/re_compile_nocache.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org

[issue17415] Clarify docs of os.path.normpath()

2013-03-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: os.path.normpath() works not only with strings but with bytes objects too. -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17415
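A minimal sketch of the point made in the comment: the same normalization works for str and bytes paths. `posixpath` is used instead of `os.path` here only so that the result does not depend on the platform; the sample path is made up.

```python
import posixpath

# The same call accepts str and bytes; the result type follows the argument.
text_result = posixpath.normpath("a//b/../c")
bytes_result = posixpath.normpath(b"a//b/../c")

print(text_result)
print(bytes_result)
```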

[issue17447] str.identifier shouldn't accept Python keywords

2013-03-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Hmm. I was going to use this method for re's named group (see issue14462). There is a possibility that some third-party code uses it for checking on general Unicode-aware identifiers. The language specification says that keywords are a subset

[issue17299] Test cPickle with real files

2013-03-17 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- resolution: fixed -> status: closed -> open ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17299

[issue17299] Test cPickle with real files

2013-03-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm not sure what is wrong and can't check on Windows, but it is possible that this patch fixes the tests. Please check it if you can. -- Added file: http://bugs.python.org/file29433/test_cpickle_fileio.patch

[issue17299] Test cPickle with real files

2013-03-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Oh, yes. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17299 ___ ___ Python-bugs-list mailing list

[issue17299] Test cPickle with real files

2013-03-18 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file29433/test_cpickle_fileio.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17299

[issue17299] Test cPickle with real files

2013-03-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Benjamin has fixed this in the changeset 6aab72424063. -- resolution: -> fixed status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17299

[issue17460] Remove the strict and related params completely removing the 0.9 support

2013-03-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Maybe in 3.4 an exception should be raised? HTTPConnection('python.org', 80, False) now silently returns a wrong result. -- components: +Library (Lib) nosy: +serhiy.storchaka stage: -> patch review type: -> enhancement versions: +Python 3.4

[issue17397] ttk::themes missing from ttk.py

2013-03-18 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This looks similar to issue16809 and requires a similar solution. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17397

[issue17433] stdlib generator-like iterators don't forward send/throw

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This was proposed before (see issue16150) and was rejected after discussion on Python-ideas. -- nosy: +serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17433

[issue17433] stdlib generator-like iterators don't forward send/throw

2013-03-19 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +rhettinger type: -> enhancement ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17433

[issue17478] Tkinter's split() inconsistent for bytes and unicode strings

2013-03-19 Thread Serhiy Storchaka
New submission from Serhiy Storchaka: Tkinter's split() recursively splits bytes but not unicode strings. >>> from tkinter import * >>> t = Tcl() >>> t.tk.split((b'a 2',)) (('a', '2'),) >>> t.tk.split(('a 2',)) ('a 2',) -- components: Tkinter, Unicode messages: 184622 nosy: ezio.melotti, gpolo

[issue16809] Tk 8.6.0 introduces TypeError. (Tk 8.5.13 works)

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which adds support for Tcl_Obj to tkinter's splitlist(). This not only fixes some incompatibility with Tk 8.6, but can fix some issues with older Tk versions (see for example issue17397). -- keywords: +patch nosy: +gpolo stage

[issue17460] Remove the strict and related params completely removing the 0.9 support

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I do not understand what is bad about making the parameters after the removed 'strict' keyword-only. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue17460

[issue13477] tarfile module should have a command line

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Note that --create command should support --directory option too. Modern tar programs don't need to be told the compression method--they infer it. If they can do it in C, we can do it in Python. So we should simply omit the -bz2 stuff. An archive may

[issue14010] deeply nested filter segfaults

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm trying to solve this issue (it seemed easy), but the bug is worse than expected. Python crashed even without iteration at all. it = 'abracadabra' for _ in range(100): it = filter(bool, it) del it And fixing a recursive deallocator is more
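The snippet quoted in the comment can be laid out as a small runnable sketch. At the modest depth shown here nothing goes wrong on a current interpreter; the issue concerns the recursion that such a chain of wrapped iterators causes when the nesting is much deeper.

```python
# Each pass wraps the previous iterator in another filter object,
# so the final object is a chain of 100 nested iterators.
it = 'abracadabra'
for _ in range(100):
    it = filter(bool, it)

# Consuming the outermost iterator walks through the whole chain.
result = ''.join(it)
print(result)

# Dropping the last reference releases the nested objects one inside another.
del it
```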

[issue14010] deeply nested filter segfaults

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thank you. Now I understand why this issue did not happen with containers. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14010

[issue14010] deeply nested filter segfaults

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch which adds recursion limit checks to builtin and itertools recursive iterators. -- components: +Extension Modules keywords: +patch nosy: +rhettinger stage: needs patch -> patch review Added file: http://bugs.python.org/file29483

[issue2518] smtpd.py to handle huge email

2013-03-19 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 3.4 -Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2518

[issue1159051] Handle corrupted gzip files with unexpected EOF

2013-03-19 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I will be offline for some time. Feel free to revert these changes in 2.7-3.3 if necessary. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1159051

[issue14313] zipfile should raise an exception for unsupported compression methods

2012-05-14 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Modified patch adopted in 3.3 (changeset 596b0eaeece8), therefore the current patch only applies to 3.2 and 2.7. If this is a new feature, the issue can be closed. -- nosy: +loewis, storchaka versions: -Python 3.3

[issue14315] zipfile.ZipFile() unable to open zip File

2012-05-14 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: This is definitely *not* a padding issue. This is definitely a padding issue. All uncompressed files are located so that the data starts with a 4-byte boundary (1190+30+15+1=1236, 27486 +30+17+3=27536, etc). This is, probably, allows
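The arithmetic in the comment can be checked with a short sketch. The reading of the four terms (header offset, the 30-byte fixed part of a ZIP local file header, file name length, padding) is an assumption made for this illustration; only the numbers themselves come from the comment.

```python
# (offset, fixed header size, name length, padding) as read from the comment;
# the field interpretation is an assumption, the numbers are from the comment.
samples = [
    (1190, 30, 15, 1),
    (27486, 30, 17, 3),
]

data_starts = [sum(parts) for parts in samples]
for start in data_starts:
    # A remainder of 0 means the data begins on a 4-byte boundary.
    print(start, start % 4)
```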

[issue14624] Faster utf-16 decoder

2012-05-14 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: The patch is updated with slightly clarified code and added comments. -- Added file: http://bugs.python.org/file25590/decode_utf16_4.patch ___ Python tracker rep...@bugs.python.org http

[issue14315] zipfile.ZipFile() unable to open zip File

2012-05-14 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: That can't possibly be the reason. mmap requires 4k (4096) alignment (on x86; more than that on SPARC). This may be the reason to mmap the entire file and then read aligned binary data

[issue14674] Add link to RFC 4627 from json documentation

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: for key, value in pairs: if key in pairs: if key in obj:? -- title: Link to explain deviations from RFC 4627 in json module docs -> Add link to RFC 4627 from json documentation

[issue14674] Add link to RFC 4627 from json documentation

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: IMHO, it would be sufficient to have a simple bullet list of differences and notes or warnings in places where Python can generate non-standard JSON (top-level scalars, inf and nan, non-utf8 encoded strings

[issue14811] compile fails - UTF-8 character decoding

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I can reproduce it on Linux. Minimal example: $ ./python -c "open('longline.py', 'w').write('#' + repr('\u00A1' * 4096) + '\n')" $ ./python longline.py File "longline.py", line 1 SyntaxError: Non-UTF-8 code starting with '\xc2' in file

[issue14811] compile fails - UTF-8 character decoding

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: And for Python 2.7 too. -- versions: +Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14811

[issue14811] compile fails - UTF-8 character decoding

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: The function decoding_fgets (Parser/tokenizer.c) reads a line into a fixed-size buffer of 8192 bytes (the line is truncated to 8191 bytes) and then fails because the line is cut in the middle of a multibyte UTF-8 character
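The effect described above can be imitated in pure Python. This is a sketch of the failure mode only, not the tokenizer's C code: the line content follows the minimal example from this thread, and the 8191-byte cut mirrors the truncation mentioned in the comment.

```python
# Build the same line as the minimal example and encode it as UTF-8.
line = ('#' + repr('\u00A1' * 4096) + '\n').encode('utf-8')

# Imitate a fixed-size read that keeps at most 8191 bytes of the line.
chunk = line[:8191]

# Two ASCII bytes precede the 2-byte characters, so the cut falls
# between the two bytes of one character.
try:
    chunk.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True

print(failed, chunk[-1:])
```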

[issue14811] Syntax error on long UTF-8 lines

2012-05-15 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- title: compile fails - UTF-8 character decoding -> Syntax error on long UTF-8 lines ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14811

[issue14803] Add feature to allow code execution prior to __main__ invocation

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: For faulthandler and coverage, an -M option would be more convenient (run a module with __name__='__premain__' (or something of the sort) and continue command line processing). -- ___ Python tracker

[issue14777] Tkinter clipboard_get() decodes characters incorrectly

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: ...And mere minutes after I said I hadn't heard anything, I've got the confirmation email. :-) Congratulations! -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14777

[issue14624] Faster utf-16 decoder

2012-05-15 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Here are two new patches. Checking for out-of-range characters was moved, making the code simpler. Theoretically this slows down decoding of short UCS1 strings a bit (up to 1 and 3 chars on 32- and 64-bit), but practically there is no difference

[issue14692] json.loads parse_constant callback not working anymore

2012-05-16 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I'm afraid I have to close this one as rejected. It works as documented and it's unlikely we'll decide to change it back. I'm sorry. It does not work as documented. The proposed patch fixes the documentation

[issue14313] zipfile should raise an exception for unsupported compression methods

2012-05-16 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I still like NotImplementedError more than RuntimeError, though. Well. here are patches for Python 3.2 and 2.7 (backported changeset 596b0eaeece8 + part of changeset fccdcd83708a). -- Added file: http://bugs.python.org/file25618

[issue13031] small speed-up for tarfile.py when unzipping tarballs

2012-05-16 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Justin, interest in the patch would perhaps be greater if you provided a microbenchmark. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue13031

[issue3931] codecs.charmap_build is untested and undocumented

2012-05-17 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 3.3 -Python 2.7, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3931

[issue3931] codecs.charmap_build is untested and undocumented

2012-05-17 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 2.7, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3931

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Looks like issue14738 fixes this bug for Python 3.3. >>> print(ascii(b"\xc2\x41\x42".decode('utf8', 'replace'))) '\ufffdAB' >>> print(ascii(b"\xf1ABCD".decode('utf8', 'replace'))) '\ufffdABCD' -- nosy: +storchaka

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: The only issue left was about the number of U+FFFD generated with invalid sequences in some cases. My last patch has extensive tests for this, so you could try to apply it (or copy the tests) and see if they all pass. Tests fail

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I think that one U+FFFD is correct. The only error is a premature end of data. I expressed myself poorly. I also think that there is only one decoding error, and not two. I think the test is wrong

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: This might be just because it first checks if there are two more bytes before checking if they are valid, but 'invalid continuation byte' works too. Yes, this is an implementation detail. It is much easier and faster. Whether it is necessary

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Changing from 'unexpected end of data' to 'invalid continuation byte' for b'\xe0\x00' is fine with me, but this will be a (minor) deviation from 2.7, 3.1, 3.2, and pypy (it could still be changed on all these except 3.1 though). I
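The error reasons under discussion can be inspected directly on whatever interpreter is at hand. The helper name below is made up for this sketch, and the reason strings are deliberately printed rather than asserted, since they are exactly what the thread says may differ between versions and implementations.

```python
def decode_error_reason(data):
    """Return the UnicodeDecodeError reason for data, or None if it decodes."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError as exc:
        return exc.reason
    return None

# Samples: a lone lead byte, the b'\xe0\x00' case from the thread,
# and a lead byte followed by ASCII.
for sample in (b'\xe0', b'\xe0\x00', b'\xc2\x41\x42'):
    replaced = sample.decode('utf-8', 'replace')
    print(sample, decode_error_reason(sample), ascii(replaced))
```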

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-17 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I don't remember all the details right now, but if that test was passing with my patch there must be something wrong somewhere (either in the patch, in the test, or in our understanding of the standard). No, the test correctly expects two

[issue1767933] Badly formed XML using etree and utf-16

2012-05-18 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Can anyone review the patch? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1767933

[issue14850] The inconsistency of codecs.charmap_decode

2012-05-18 Thread Serhiy Storchaka
New submission from Serhiy Storchaka storch...@gmail.com: codecs.charmap_decode behaves differently with native and user string as decode table. import codecs print(ascii(codecs.charmap_decode(b'\x00', 'replace', '\uFFFE'))) ('\ufffd', 1) class S(str): pass ... print(ascii

[issue14624] Faster utf-16 decoder

2012-05-19 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Thank you, Antoine. Now only issue14625 waits for review. changeset: 77012:3430d7329a3b +* UTF-8 and UTF-16 decoding is now 2x to 4x faster. In fact, UTF-16 decoding is now at most 25% faster compared to Python 3.2 on my

[issue1767933] Badly formed XML using etree and utf-16

2012-05-20 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Here is an updated patch, with tests and support for objects with only a 'write' method. -- Added file: http://bugs.python.org/file25652/etree_write_utf16_2.patch ___ Python tracker rep

[issue14868] Allow log calls to return True for code optimization.

2012-05-21 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: assert logging.debug("This is a test.") or True -- nosy: +storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14868
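The one-liner from the comment can be spelled out as a runnable sketch. It relies on two facts: logging.debug() returns None, and assert statements are removed when Python runs with the -O option, so the log call disappears from optimized runs without any change to the logging API.

```python
import logging

logging.basicConfig(level=logging.DEBUG)

# logging.debug() returns None, so "or True" keeps the assertion from failing.
# Under "python -O" the whole statement, including the call, is compiled away.
return_value = logging.debug("This is a test.")
assert logging.debug("This is a test.") or True

print(return_value)
```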

[issue14469] Python 3 documentation links

2012-05-21 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: http://permalink.gmane.org/gmane.comp.python.devel/132675 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14469

[issue14874] Faster charmap decoding

2012-05-21 Thread Serhiy Storchaka
New submission from Serhiy Storchaka storch...@gmail.com: Charmap decoders are not as important as UTF decoders, but are still widely used. In Python 3.3 with PEP 393 they slowed down 4x. The proposed patch restores the performance. Optimized only the most common case, when the decoder

[issue14874] Faster charmap decoding

2012-05-21 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25665/charmapdecodebench.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14874

[issue14874] Faster charmap decoding

2012-05-21 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25666/bench-diff.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14874

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-24 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and PyAccu API in quite all cases, with a speedup between 30% and 100%. But there are some cases where the _PyUnicodeWriter API is slower: Perhaps most

[issue14897] struct.pack raises unexpected error message

2012-05-24 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Funny. struct.pack(fmt, args...) is just an alias to struct.Struct(fmt).pack(args...). The error message should be changed to explicitly state that we are talking about the data for packing, and not about the arguments of function
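The equivalence mentioned in the comment, and the kind of error being discussed, can be seen in a short sketch. The format string and values are arbitrary examples; the exact wording of the error message is printed rather than assumed, since it is what the issue proposes to change.

```python
import struct

fmt = "<hi"  # arbitrary example format: little-endian short + int

# The module-level function and the Struct method produce the same bytes.
packed_by_function = struct.pack(fmt, 1, 2)
packed_by_method = struct.Struct(fmt).pack(1, 2)
print(packed_by_function == packed_by_method)

# Passing too few items for the format raises struct.error.
try:
    struct.pack(fmt, 1)
    error_message = None
except struct.error as exc:
    error_message = str(exc)
print(error_message)
```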

[issue14897] struct.pack raises unexpected error message

2012-05-24 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: It might help if the error message also stated how many arguments were actually received, like the TypeError message already does for bad function / method calls. E.g., struct.error: pack expected 2 items for packing (got 1) Yes

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-25 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Here is a patch for 3.3. All of the tests pass successfully. Unfortunately, it is a little slow, but I tried to minimize the losses. -- Added file: http://bugs.python.org/file25709/issue8271-3.3.patch

[issue14920] help(urllib.parse) fails when LANG=C

2012-05-25 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14920 ___ ___ Python-bugs

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-26 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Here are the benchmark results (numbers are speed, MB/s). On 32-bit Linux, AMD Athlon 64 X2: vanilla patched utf-8 'A'*1 2016 (+5%) 2111 utf-8 '\x80

[issue14923] Even faster UTF-8 decoding

2012-05-26 Thread Serhiy Storchaka
New submission from Serhiy Storchaka storch...@gmail.com: Strange as it may seem, a simple trick speeds up UTF-8 decoding even more. Here are the benchmark results. On 32-bit Linux, AMD Athlon 64 X2: vanilla patched utf-8

[issue14923] Even faster UTF-8 decoding

2012-05-26 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25718/decodebench.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14923

[issue14923] Even faster UTF-8 decoding

2012-05-26 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25719/bench-diff.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14923

[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

2012-05-26 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Fortunately, issue14923 (if accepted) will compensate for the slowdown. On 32-bit Linux, AMD Athlon 64 X2: vanilla old patch fast patch utf-8 'A'*1 2016 (+3

[issue14923] Even faster UTF-8 decoding

2012-05-26 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: It seems the patch relies on a two's complement representation of integers. Mark, do you think that's ok? Yes, the patch depends on two facts -- 8-bit bytes and a two's complement representation of integers. That's why I call it a trick

[issue14923] Even faster UTF-8 decoding

2012-05-27 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Yes, this is implementation-dependent behavior (and on the supported platforms it is implemented well in a certain way). However, if the continuation byte check is done the simplest way ((ch) >= 0x80 && (ch) < 0xC0), this has the same
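The two styles of continuation-byte check can be compared in a small Python sketch. The range check is the simple form of the test for a UTF-8 continuation byte. The signed comparison is only an assumed illustration of a check that depends on 8-bit bytes and two's complement; the thread does not show the patch's actual expression, and both function names are made up here.

```python
def is_continuation_simple(byte):
    # Plain range check: continuation bytes are 0x80..0xBF.
    return 0x80 <= byte < 0xC0

def is_continuation_signed(byte):
    # Assumed form of the trick: reinterpret the byte as a two's complement
    # signed 8-bit value and use a single comparison.
    signed = byte - 256 if byte >= 0x80 else byte
    return signed < -0x40

# Both checks agree for every possible byte value.
mismatches = [b for b in range(256)
              if is_continuation_simple(b) != is_continuation_signed(b)]
print(mismatches)
```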

[issue12716] Reorganize os docs for files/dirs/fds

2012-05-28 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12716 ___ ___ Python-bugs-list

[issue1470548] Bugfix for #1470540 (XMLGenerator cannot output UTF-16)

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: See also issue1767933. It is better to use io.TextIOWrapper instead of codecs.StreamWriter, because the latter is slower and has numerous flaws. -- nosy: +storchaka versions: +Python 3.3 ___ Python
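A minimal sketch of the suggested approach, writing UTF-16 text through io.TextIOWrapper. An in-memory io.BytesIO stands in for the real output stream, and the XML fragment is made up for the example.

```python
import codecs
import io

# BytesIO stands in for a binary output stream such as a file opened in 'wb'.
binary_out = io.BytesIO()
text_out = io.TextIOWrapper(binary_out, encoding="utf-16", newline="\n")

# Several writes go through one encoder, so the byte order mark
# is emitted once, at the start of the stream.
text_out.write("<a>")
text_out.write("text")
text_out.write("</a>")
text_out.flush()

data = binary_out.getvalue()
print(data[:2] == codecs.BOM_UTF16, data.decode("utf-16"))
```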

[issue2005] posixmodule expects sizeof(pid_t/gid_t/uid_t) = sizeof(long)

2012-05-28 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2005 ___ ___ Python-bugs-list

[issue2005] posixmodule expects sizeof(pid_t/gid_t/uid_t) = sizeof(long)

2012-05-28 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- versions: +Python 3.3 -Python 3.1 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2005

[issue13518] configparser can’t read file objects from urlopen

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Mickey, you can wrap the file-like object returned by urlopen with io.TextIOWrapper. config = configparser.RawConfigParser() config.read_file(io.TextIOWrapper(urlopen(path_config), encoding='utf-8')) Because there is no bug and new
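The suggestion can be tried without network access. In this sketch an io.BytesIO object stands in for the binary response that urlopen() would return, and the section and option names are invented for the example.

```python
import configparser
import io

# BytesIO stands in for the binary file-like object returned by urlopen().
binary_response = io.BytesIO(b"[section]\nkey = value\n")

config = configparser.RawConfigParser()
# TextIOWrapper decodes the bytes so that read_file() receives text lines.
config.read_file(io.TextIOWrapper(binary_response, encoding="utf-8"))

value = config.get("section", "key")
print(value)
```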

[issue4733] Add a decode to declared encoding version of urlopen to urllib

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: If you add the encoding parameter, you should also add at least errors and newline parameters. And why not just use io.TextIOWrapper? page.decode_content() is bad in that it forces reading and decoding all of the data at once, while

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: So, do you have any comment or complain? Or can I commit the patch? I beg your pardon, I will do a review and additional benchmarks today. So far I have to say that it is better to use the stringlib approach than the massive macros, which

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I just sent you a patch which does not use any macros or stringlib. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14744

[issue1470548] Bugfix for #1470540 (XMLGenerator cannot output UTF-16)

2012-05-30 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +loewis ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue1470548 ___ ___ Python-bugs-list

[issue1470548] Bugfix for #1470540 (XMLGenerator cannot output UTF-16)

2012-05-30 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Oh, I see that XMLGenerator is completely outdated. It has not even been ported to Python 3. See function _write: def _write(self, text): if isinstance(text, str): self._out.write(text) else: self

[issue10376] ZipFile unzip is unbuffered

2012-05-31 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: The patch is updated to reflect Martin's stylistic comments. Sorry for the delay, Martin. I have not received an email with your review from 2012-05-13, and only today accidentally discovered your comments in Rietveld. It seems to have been

[issue14973] restore python2 unicode literals in ur strings

2012-05-31 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: See issue3665. -- nosy: +storchaka title: restore python2 unicode literals in ru strings -> restore python2 unicode literals in ur strings ___ Python tracker rep...@bugs.python.org http

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: I don't think it is worth targeting it for 2.7 and 3.2 (it's a new feature, not a bugfix), but for 3.3 it will be very useful. Since PEP 393, conversion to surrogate pairs is no longer relevant. -- components: +Regular Expressions

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25781/re_unicode_escapes.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25782/3665.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file25781/re_unicode_escapes.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file25782/3665.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25783/re_unicode_escapes.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue3665] Support \u and \U escapes in regexes

2012-06-01 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file25784/3665.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3665

[issue14993] GCC error when using unicodeobject.h

2012-06-04 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- nosy: +storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14993 ___ ___ Python-bugs-list

[issue14626] os module: use keyword-only arguments for dir_fd and nofollow to reduce function count

2012-06-04 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: Well, I'm going to ignore the long lines and documentation. The patch is really big and impressive. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue14626

[issue15026] Faster UTF-16 encoding

2012-06-07 Thread Serhiy Storchaka
New submission from Serhiy Storchaka storch...@gmail.com: As a counterpart to issue14624, here is a patch that speeds up UTF-16 encoding several times over. In addition, it fixes an unsafe check of an integer overflow. Here are the results of benchmarking. See benchmark tools in https://bitbucket.org

[issue15027] Faster UTF-32 encoding

2012-06-07 Thread Serhiy Storchaka
New submission from Serhiy Storchaka storch...@gmail.com: As a counterpart to issue14625, here is a patch that speeds up UTF-32 encoding several times over. In addition, it fixes an unsafe check of an integer overflow. Here are the results of benchmarking. See benchmark tools in https://bitbucket.org

[issue14850] The inconsistency of codecs.charmap_decode

2012-06-10 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: What is the use case for passing a string subclass to charmap_decode? Or in other words, how did you stumble upon the bug? I stumbled upon it while rewriting the charmap decoder (issue14874). Now the charmap decoder processes the two cases

[issue14850] The inconsistency of codecs.charmap_decode

2012-06-10 Thread Serhiy Storchaka
Serhiy Storchaka storch...@gmail.com added the comment: U+FFFE is documented as representing an undefined mapping, Yes, using U+FFFE for representing an undefined mapping in strings is normal, the question was about string subclasses. And if we correct it for string subclasses, how far we
