[issue12721] Chaotic use of helper functions in test_shutil for reading and writing files
Petri Lehtinen pe...@digip.org added the comment: I'd call os.path.join() in the test functions rather than in read_file() and write_file(). This makes it easier to understand what the test is doing without looking at the code of read_file() and write_file(). Otherwise, looks good to me, and I think this would be useful cleanup. -- keywords: +needs review nosy: +eric.araujo, petri.lehtinen, tarek stage: - patch review versions: -Python 2.6, Python 2.7, Python 3.1, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12721 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12687] Python 3.2 fails to load protocol 0 pickle
Antoine Pitrou pit...@free.fr added the comment: Ok, the patch is not correct. The core issue is that _Unpickler_Readline should always return a \0-terminated string, but sometimes it doesn't; this issue should be fixed instead of working around it in some other function. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12687 ___
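For readers unfamiliar with the format under discussion, here is a minimal sketch of what a protocol 0 pickle looks like from Python. It only illustrates that the payload is line-oriented text, which is why a readline-style helper is involved at all; it does not reproduce the bug, and the sample dictionary is an arbitrary assumption.

```python
import pickle

# Arbitrary sample data, chosen only for illustration.
sample = {"answer": 42, "items": [1, 2, 3]}

# Protocol 0 is the original text-mode pickle format.
data = pickle.dumps(sample, protocol=0)

# The payload is made of newline-terminated fields, so an unpickler
# has to read it line by line.
lines = data.split(b"\n")

# A correct unpickler restores an equal object.
restored = pickle.loads(data)
print(len(lines), restored == sample)
```

The same round trip with a higher protocol produces a binary payload in which newlines carry no structural meaning.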
[issue12721] Chaotic use of helper functions in test_shutil for reading and writing files
Hynek Schlawack h...@ox.cx added the comment: I tend to agree on public APIs; however, in this case of a helper function, the use case with a join is really common, so this extra function comes in very handy. I also kept it using lists, so it's more obvious than tuples. JFTR it wasn't my idea, so I'm not defensive about my own idea here. :) I just re-implemented it for read_file b/c it's really handy and saves a lot of typing. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12721 ___
[issue12726] explain why locale.getlocale() does not read system's locales
New submission from Alexis Metaireau ale...@notmyidea.org: The documentation about locale.getlocale() doesn't mention that the locale isn't read from the system locale. Thus, it seemed strange to have locale.getlocale() returning (None, None). As this seems to be the expected behaviour, it would be useful to state it explicitly in the documentation. I'm happy to write a patch and apply it. This issue is related to #6203, but does not supersede it (the two conversations are discussing two different things). -- assignee: alexis components: Documentation messages: 141897 nosy: alexis, feth priority: normal severity: normal status: open title: explain why locale.getlocale() does not read system's locales ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12726 ___
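The behaviour described here can be observed with a short sketch. It assumes a current CPython 3 release, where (as far as I know) only LC_CTYPE is initialised from the environment at startup, so LC_TIME is used to show the untouched default. The helper name safe_getlocale is hypothetical.

```python
import locale

def safe_getlocale(category):
    """Return locale.getlocale(category), or None if the name cannot be parsed."""
    try:
        return locale.getlocale(category)
    except ValueError:
        return None

# Before any setlocale() call this category still has the default "C"
# locale, which getlocale() typically reports as (None, None).
before = safe_getlocale(locale.LC_TIME)

try:
    # An empty string asks the C library to adopt the user's environment settings.
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    # The environment may name a locale that is not installed; keep the default.
    pass

after = safe_getlocale(locale.LC_TIME)
print("before setlocale():", before)
print("after setlocale(): ", after)
```

What the second line prints depends entirely on the environment the interpreter runs in.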
[issue12687] Python 3.2 fails to load protocol 0 pickle
Vinay Sajip vinay_sa...@yahoo.co.uk added the comment: I confess I'm not familiar enough with the pickle module internals to be sure of putting in the right fix quickly. I will take a look at _Unpickler_Readline when I get a chance, if someone doesn't beat me to it :-) -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12687 ___
[issue12661] Add a new shutil.cleartree function to shutil module
Leonid Vasilev vsleo...@gmail.com added the comment: yup, it's really too specific. -- resolution: -> invalid status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12661 ___
[issue12661] Add a new shutil.cleartree function to shutil module
Changes by Leonid Vasilev vsleo...@gmail.com: -- resolution: invalid -> rejected ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12661 ___
[issue12701] Apple's clang 2.1 (xcode 4.1, OSX 10.7) optimizer miscompiles intobject.c
Mark Dickinson dicki...@gmail.com added the comment: If there's dependence on undefined behaviour (from overflow of signed integer operations) in intobject.c, I'd call that a bug. I've been trying to remove similar overflow checks from the Python source when I've encountered them, but there are still a good few left. -- nosy: +mark.dickinson ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12701 ___
[issue12701] Apple's clang 2.1 (xcode 4.1, OSX 10.7) optimizer miscompiles intobject.c
Changes by Petri Lehtinen pe...@digip.org: Removed file: http://bugs.python.org/file22866/unnamed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12701 ___
[issue12723] tkSimpleDialog.askstring shouldn't allow empty string input
Matthew Hemke mghe...@gmail.com added the comment: What about adding a validatecommand option like on Tkinter.Entry? For what I am trying to do it was sort of a kludge to validate the entry because an empty string was invalid, but in the interface design, it would have been rude to validate after the dialog closes and then keep popping up another tkSimpleDialog.askstring until the input is correct. It almost makes askstring useless because I can't validate on close. That wouldn't break backwards compatibility, would it? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12723 ___
[issue12701] Apple's clang 2.1 (xcode 4.1, OSX 10.7) optimizer miscompiles intobject.c
Ronald Oussoren ronaldousso...@mac.com added the comment: Clang has an option -fcatch-undefined-behavior that might help in finding other places where we use undefined behavior. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12701 ___
[issue12723] Provide an API in tkSimpleDialog for defining custom validation functions
R. David Murray rdmur...@bitdance.com added the comment: Adding an option would also be a reasonable feature request, but I think exposing _QueryDialog would be a more general solution, since it would apply to more than just strings. While not backward incompatible, either of these is a new feature and so can only go into 3.3. -- stage: -> needs patch title: tkSimpleDialog.askstring shouldn't allow empty string input -> Provide an API in tkSimpleDialog for defining custom validation functions versions: +Python 3.3 -Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12723 ___
[issue12718] Logical mistake of importer method in logging.config.BaseConfigurator
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 0fbd44e3f342 by Vinay Sajip in branch '2.7': Issue #12718: Add documentation on using custom importers. http://hg.python.org/cpython/rev/0fbd44e3f342
New changeset 1e96a4406565 by Vinay Sajip in branch '3.2': Issue #12718: Add documentation on using custom importers. http://hg.python.org/cpython/rev/1e96a4406565
New changeset 76964d70c81c by Vinay Sajip in branch 'default': Closes #12718: Merge documentation fix from 3.2. http://hg.python.org/cpython/rev/76964d70c81c
-- nosy: +python-dev resolution: invalid -> fixed stage: -> committed/rejected status: pending -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12718 ___
[issue12718] Logical mistake of importer method in logging.config.BaseConfigurator
Changes by R. David Murray rdmur...@bitdance.com: -- nosy: +ncoghlan ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12718 ___
[issue3244] multipart/form-data encoding
Johannes Hoff johsh...@gmail.com added the comment: Forest Bond: Thanks for this patch - I hope it will go in soon. In the meantime, could I get permission to use it as is? (I notice there is a copyright in the file) I would of course keep the attributions in the file. -- nosy: +Johannes.Hoff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3244 ___
[issue3244] multipart/form-data encoding
Forest Bond for...@alittletooquiet.net added the comment: Hi, Johannes. You can assume the Python license for this patch. -Forest -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3244 ___
[issue12724] Add Py_RETURN_NOTIMPLEMENTED
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset e88362fb4950 by Brian Curtin in branch 'default': Add doc for Py_RETURN_NOTIMPLEMENTED, added in #12724. http://hg.python.org/cpython/rev/e88362fb4950 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12724 ___
[issue12721] Chaotic use of helper functions in test_shutil for reading and writing files
Éric Araujo mer...@netwok.org added the comment: I’ll make one change before committing:
Lib/test/test_shutil.py:69: if isinstance(path, (list, tuple)):
Using a list for path components does not make sense. I have changed a similar helper function in packaging to allow only tuples.
Petri: these helper functions are all about convenience. I would reject a patch for a function just doing open+read, but here I think that doing os.path.join+open+read is worth a function. We use such helpers all the time in packaging tests and it helps reduce boilerplate, without being very hard to understand.
-- assignee: -> eric.araujo resolution: -> accepted versions: +Python 2.7, Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12721 ___
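A minimal sketch of the kind of helper being discussed, assuming the tuple-only convention suggested above. The function names follow the thread, but the bodies are an illustration and not the committed test_shutil code.

```python
import os
import tempfile

def write_file(path, content):
    """Write text to path; path may be a string or a tuple of path components."""
    if isinstance(path, tuple):
        path = os.path.join(*path)
    with open(path, "w") as fp:
        fp.write(content)

def read_file(path):
    """Return the text in path; path may be a string or a tuple of path components."""
    if isinstance(path, tuple):
        path = os.path.join(*path)
    with open(path) as fp:
        return fp.read()

# Usage: the join happens inside the helpers, so a test only passes components.
with tempfile.TemporaryDirectory() as tmp:
    write_file((tmp, "example.txt"), "hello")
    round_trip = read_file((tmp, "example.txt"))
print(round_trip)
```

The trade-off raised earlier in the thread still applies: the call site is shorter, but a reader has to know that the helper joins the components.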
[issue12722] Link to heapq source from docs.python.org not working
Éric Araujo mer...@netwok.org added the comment: The fix committed will be superseded by a link to the Mercurial repo when I fix #11435 (probably tomorrow). -- nosy: +eric.araujo ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12722 ___
[issue12672] Some problems in documentation extending/newtypes.html
Éric Araujo mer...@netwok.org added the comment:
> I know perfectly well that [].append is valid Python, but I don't think this is the clearest way to give an example of an object method. I think spelling [].append's meaning more explicitly would be better.
Would it be clearer if we replaced the literal with a name?
 These C functions are called “type methods” to distinguish them from
-things like [].append (which we call “object methods”).
+methods bound to specific instances (things like sys.path.append),
+which we call “object methods”.
> I'm also aware that there are tab problems all over the code base. I'm not suggesting a large cleanup.
*I* was suggesting a large cleanup :), but we can do that in another commit. If you want to clean the example code in Doc/extending or even just in newtypes.rst, I think you can just go ahead.
-- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12672 ___
[issue9528] Add pure Python implementation of time module to CPython
Éric Araujo mer...@netwok.org added the comment: Alan: the Versions field is used to mark versions that will get a patch, not all versions affected. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9528 ___
[issue5639] Support TLS SNI extension in ssl module
Dolf Andringa dolfandri...@gmail.com added the comment: I see the patch has been applied to Python 3 in r85793, but is there any chance there will also be patches for Python 2.6 or 2.7? And if so, what release of Python (any version) might this patch be included in? -- nosy: +Dolf.Andringa versions: +Python 2.6, Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5639 ___
[issue5639] Support TLS SNI extension in ssl module
Antoine Pitrou pit...@free.fr added the comment:
> I see the patch has been applied python3 in r85793, but is there any chance there will also be patches for python 2.6 or 2.7
No, Python 2 only receives bug fixes. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue5639 ___
[issue12721] Chaotic use of helper functions in test_shutil for reading and writing files
Petri Lehtinen pe...@digip.org added the comment: Éric Araujo wrote:
> Petri: these helper functions are all about convenience. I would reject a patch for a function just doing open+read, but here I think that doing os.path.join+open+read is worth a function. We use such helpers all the time in packaging tests and it helps reduce boilerplate, without being very hard to understand.
Ok, sounds reasonable. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12721 ___
[issue12727] make always reruns asdl_c.py
New submission from Antoine Pitrou pit...@free.fr: It's not really an issue, but I thought I would mention it. It is a bit misleading, since it makes you think that you changed something in the grammar that's triggering the rebuild.
$ make
./Parser/asdl_c.py -h ./Include ./Parser/Python.asdl
running build
running build_ext
-- assignee: benjamin.peterson components: Build messages: 141915 nosy: benjamin.peterson, pitrou priority: low severity: normal status: open title: make always reruns asdl_c.py type: behavior versions: Python 3.3 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12727 ___
[issue12728] Python re lib fails case insensitive matches on Unicode data
New submission from Tom Christiansen tchr...@perl.com: The Python re library is broken in its approach to case-insensitive matches. It erroneously attempts to compare lowercase mappings. This is wrong. You must compare the Unicode casefolds, not the Unicode casemaps. Otherwise you get wrong answers. I include a small test case that illustrates this bug. The bug exists on both 2.7 and 3.2, and on both wide builds and narrow builds. For comparison, I also show results using Matthew Barnett's regex library, which gets all 5 tests correct where re gets all 5 tests wrong. A sample run is:
FAIL: re pattern Ι is not the same as string ͅ
PASS: regex pattern Ι is indeed the same as string ͅ
FAIL: re pattern Μ is not the same as string µ
PASS: regex pattern Μ is indeed the same as string µ
FAIL: re pattern ſ is not the same as string s
PASS: regex pattern ſ is indeed the same as string s
FAIL: re pattern ΣΤΙΓΜΑΣ is not the same as string στιγμας
PASS: regex pattern ΣΤΙΓΜΑΣ is indeed the same as string στιγμας
FAIL: re pattern POST is not the same as string poſt
PASS: regex pattern POST is indeed the same as string poſt
re lib passed 0 of 5 tests
regex lib passed 5 of 5 tests
-- components: Library (Lib) files: sigmata.python messages: 141916 nosy: tchrist priority: normal severity: normal status: open title: Python re lib fails case insensitive matches on Unicode data versions: Python 2.7 Added file: http://bugs.python.org/file22879/sigmata.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12728 ___
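The difference between comparing lowercase mappings and comparing casefolds can be sketched with plain string methods. This assumes Python 3.3 or later, where str.casefold() exists; it postdates the versions named in the report and says nothing about how re itself behaves. The helper names are hypothetical.

```python
def same_by_lower(a, b):
    """Compare using lowercase mappings, the approach the report objects to."""
    return a.lower() == b.lower()

def same_by_casefold(a, b):
    """Compare using Unicode casefolds, the approach the report asks for."""
    return a.casefold() == b.casefold()

pairs = [
    ("POST", "po\u017ft"),  # contains LATIN SMALL LETTER LONG S
    ("\u039c", "\u00b5"),   # GREEK CAPITAL LETTER MU vs MICRO SIGN
    # Greek capitals vs lowercase with a final sigma, as in the report
    ("\u03a3\u03a4\u0399\u0393\u039c\u0391\u03a3",
     "\u03c3\u03c4\u03b9\u03b3\u03bc\u03b1\u03c2"),
]

for a, b in pairs:
    print(a, b, same_by_lower(a, b), same_by_casefold(a, b))
```

The long s and the micro sign are their own lowercase forms, so a lowercase comparison keeps them distinct from s and μ, while casefolding maps each pair to the same string.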
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
New submission from Tom Christiansen tchr...@perl.com: Python is in flagrant violation of the very most basic premises of Unicode Technical Report #18 on Regular Expressions, which requires that a regex engine support Unicode characters as basic logical units independent of serialization like UTF‑*. Because sometimes you must specify .. to match a single Unicode character -- whenever those code points are above the BMP and you are on a narrow build -- Python regexes cannot be reliably used for Unicode text.
% python3.2
Python 3.2 (r32:88445, Jul 21 2011, 14:44:19)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> g = "\N{GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI}"
>>> print(g)
ᾲ
>>> print(re.search(r'\w', g))
<_sre.SRE_Match object at 0x10051f988>
>>> p = "\N{MATHEMATICAL SCRIPT CAPITAL P}"
>>> print(p)
𝒫
>>> print(re.search(r'\w', p))
None
>>> print(re.search(r'..', p)) # ← 𝙏𝙃𝙄𝙎 𝙄𝙎 𝙏𝙃𝙀 𝙑𝙄𝙊𝙇𝘼𝙏𝙄𝙊𝙉 𝙍𝙄𝙂𝙃𝙏 𝙃𝙀𝙍𝙀
<_sre.SRE_Match object at 0x10051f988>
>>> print(len(chr(0x1D4AB)))
2
That is illegal in Unicode regular expressions. -- components: Regular Expressions messages: 141917 nosy: tchrist priority: normal severity: normal status: open title: Python lib re cannot handle Unicode properly due to narrow/wide bug type: behavior versions: Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12729 ___
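The length of 2 reported above comes from UTF-16 surrogate pairs. The sketch below shows the two views of the same character, assuming Python 3.3 or later, where strings are always indexed by code point; on the narrow builds discussed in this report, len() returned the UTF-16 count instead.

```python
import sys

p = "\N{MATHEMATICAL SCRIPT CAPITAL P}"  # U+1D4AB, outside the BMP

# Number of code points, which is what len() reports on Python 3.3+.
code_points = len(p)

# Number of 16-bit code units needed in UTF-16, which is what a narrow build counted.
utf16_units = len(p.encode("utf-16-le")) // 2

# Narrow builds had sys.maxunicode == 0xFFFF; later versions always report 0x10FFFF.
print(code_points, utf16_units, hex(sys.maxunicode))
```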
[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues
New submission from Tom Christiansen tchr...@perl.com: You cannot use Python's casemapping functions on Unicode data because they fail on narrow builds. This makes it impossible to write portable code in Python that can cope with full Unicode. I've tried several times to submit this bug, but the file selection widget blows up. I believe it was an Opera bug because I had a write lock on the file. One more time. -- components: Unicode files: casemaps.python messages: 141918 nosy: tchrist priority: normal severity: normal status: open title: Python's casemapping functions are untrustworthy due to narrow/wide build issues type: behavior versions: Python 2.7 Added file: http://bugs.python.org/file22880/casemaps.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12730 ___
[issue12687] Python 3.2 fails to load protocol 0 pickle
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset c47bc1349e61 by Antoine Pitrou in branch '3.2': Issue #12687: Fix a possible buffering bug when unpickling text mode (protocol 0, mostly) pickles. http://hg.python.org/cpython/rev/c47bc1349e61
New changeset 6aa822071f4e by Antoine Pitrou in branch 'default': Issue #12687: Fix a possible buffering bug when unpickling text mode (protocol 0, mostly) pickles. http://hg.python.org/cpython/rev/6aa822071f4e
-- nosy: +python-dev ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12687 ___
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
New submission from Tom Christiansen tchr...@perl.com: You cannot use Python's lib re for handling Unicode regular expressions because it violates the standard set out for the same in UTS#18 on Unicode Regular Expressions in RL1.2a on compatibility properties. What \w is allowed to match is clearly explained there, but Python has its own idea. Because it is in clear violation of the standard, it is misleading and wrong for Python to claim that the re.UNICODE flag makes \w and friends match Unicode. Here are the failed test cases when the attached file is run under v3.2; there are further failures when run under v2.7.
FAIL lib re found non alphanumeric string café
FAIL lib re found non alphanumeric string Ⓚ
FAIL lib re found non alphanumeric string ͅ
FAIL lib re found non alphanumeric string ְ
FAIL lib re found non alphanumeric string 𝟘
FAIL lib re found non alphanumeric string ́
FAIL lib re found non alphanumeric string 𝔘𝔫𝔦𝔠𝔬𝔡𝔢
FAIL lib re found non alphanumeric string 𐐔𐐯𐑅𐐨𐑉𐐯𐐻
FAIL lib re found non alphanumeric string connector‿punctuation
FAIL lib re found non alphanumeric string Ὰͅ_Στο_Διάολο
FAIL lib re found non alphanumeric string ̰̰̈́̈́‿̰̿̽̓͂‿̸̿‿̹̽‿̷̹̼̹̰̼̽
FAIL lib re found all alphanumeric string ¹²³
FAIL lib re found all alphanumeric string ₁₂₃
FAIL lib re found all alphanumeric string ¼½¾
FAIL lib re found all alphanumeric string ⑶
Note that Matthew Barnett's regex lib for Python handles all of these cases in conformance with The Unicode Standard. -- components: Regular Expressions files: alnum.python messages: 141920 nosy: tchrist priority: normal severity: normal status: open title: python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a type: behavior versions: Python 2.7 Added file: http://bugs.python.org/file22881/alnum.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12731 ___
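As a rough illustration of the definition being appealed to, the sketch below approximates the UTS#18 sense of \w (Alphabetic, Mark, Decimal Number, Connector Punctuation) using general categories from unicodedata. It is only an approximation under stated assumptions: the stdlib does not expose the Alphabetic property, so characters that are alphabetic without being letters (such as Ⓚ) are missed, and both function names are hypothetical.

```python
import unicodedata

def is_word_char(ch):
    """Approximate the UTS#18 \\w classes using general categories only."""
    category = unicodedata.category(ch)
    # L* and Nl stand in for Alphabetic, M* for Mark,
    # Nd for Decimal Number, Pc for Connector Punctuation.
    return category[0] in "LM" or category in ("Nd", "Nl", "Pc")

def all_word_chars(text):
    """Return True if text is non-empty and every character is a word character."""
    return bool(text) and all(is_word_char(ch) for ch in text)

decomposed = unicodedata.normalize("NFD", "caf\u00e9")  # e followed by a combining acute
print(all_word_chars(decomposed))                   # combining marks count as word characters
print(all_word_chars("connector\u203fpunctuation")) # UNDERTIE is connector punctuation
print(all_word_chars("\u00b9\u00b2\u00b3"))         # superscript digits are not decimal numbers
```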
[issue12687] Python 3.2 fails to load protocol 0 pickle
Antoine Pitrou pit...@free.fr added the comment: Fixed with a test. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12687 ___
[issue12727] make always reruns asdl_c.py
Roundup Robot devn...@psf.upfronthosting.co.za added the comment: New changeset 5e005773feaa by Benjamin Peterson in branch 'default': revert code which conditionally writes Python-ast.h (closes #12727) http://hg.python.org/cpython/rev/5e005773feaa -- nosy: +python-dev resolution: -> fixed stage: -> committed/rejected status: open -> closed ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12727 ___
[issue12732] Can't portably use Unicode in Python identifiers
New submission from Tom Christiansen tchr...@perl.com: You cannot reliably use Unicode in Python identifiers because of the narrow/wide build issue. The enclosed file is fine on wide builds but gets compiler errors on narrow ones during compilation. Go, Ruby, Java, and Perl all handle this situation without any problem; only Python has the bug. -- components: Interpreter Core files: badidents.python messages: 141923 nosy: tchrist priority: normal severity: normal status: open title: Can't portably use Unicode in Python identifiers type: behavior versions: Python 3.2 Added file: http://bugs.python.org/file22882/badidents.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12732 ___
[issue12728] Python re lib fails case insensitive matches on Unicode data
Changes by Tom Christiansen tchr...@perl.com: -- components: +Regular Expressions -Library (Lib) type: -> behavior ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12728 ___
[issue12733] Request for grapheme support in Python re lib
New submission from Tom Christiansen tchr...@perl.com: Without proper grapheme support in the regular expression library, it is impossible to correctly process Unicode. At the very least, one needs the \X escape supported, which is an extended grapheme cluster per UTS#18. This escape is supported by many regex libraries, including Perl's own and of course PCRE (and thence PHP), the standard ICU library, and Matthew Barnett's replacement regex library for Python. How do you process a string by graphemes if you cannot split on \X? How can you avoid splitting a grapheme into silly pieces if you cannot match one? How do I match the letter O no matter what diacritics have been applied to it otherwise? A match of (?=O)\X against an NFD string is by far the simplest and best way. This is necessary for a wide variety of reasons. Adding \pM and \PM goes a little way, but not far enough, because that is not how grapheme clusters are defined. You need a proper \X. -- components: Regular Expressions messages: 141924 nosy: tchrist priority: normal severity: normal status: open title: Request for grapheme support in Python re lib type: feature request versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12733 ___
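To make the request concrete, here is a sketch of the simpler base-plus-marks grouping, the one the report says does not go far enough. It is not an implementation of extended grapheme clusters, and both function names are hypothetical.

```python
import unicodedata

def simple_clusters(text):
    """Group each non-mark character with the marks that follow it, in NFD."""
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch   # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

def is_letter_o(cluster):
    """True if the cluster's base character is O, whatever diacritics follow."""
    return cluster[:1] == "O"

clusters = simple_clusters("\u00d4a")  # O WITH CIRCUMFLEX followed by a
print(clusters, [is_letter_o(c) for c in clusters])
```

Cases such as Hangul jamo sequences or regional indicator pairs need the full segmentation rules, which is the gap a real \X would fill.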
[issue12734] Request for property support in Python re lib
New submission from Tom Christiansen tchr...@perl.com: Python supports no Unicode properties in its re library, making it unsuitable for work with Unicode. This is therefore a formal request for the Python re library to support Unicode properties. The eleven properties required by Unicode Technical Report #18's RL1.2 are the bare minimum which must be added to make it possible to use Python regular expressions on Unicode. The proposed RL2.7 on Full Properties is even better. That is found at http://unicode.org/reports/tr18/proposed.html#Full_Properties although by the time you read this, it will have been made an official part of tr18. Matthew Barnett's replacement library for re, called regex, supports 67 Unicode properties at last count, including the strongly recommended loose matching. The standard re library needs to be spiffed up to make it suitable for Unicode processing; it is not currently usable for that due to this missing functionality. I quote from the Level 1 conformance requirement of tr18:
"Level 1: This is a minimal level for useful Unicode support. It does not account for end-user expectations for character support, but does satisfy most low-level programmer requirements. The results of regular expression matching at this level are independent of country or language. At this level, the user of the regular expression engine would need to write more complicated regular expressions to do full Unicode processing."
pass RL1.1 Hex Notation
fail RL1.2 Properties
fail RL1.2a Compatibility Properties
fail RL1.3 Subtraction and Intersection
fail RL1.4 Simple Word Boundaries
fail RL1.5 Simple Loose Matches
fail RL1.6 Line Boundaries
fail RL1.7 Supplementary Code Points
(withdrawn) RL2.1 Canonical Equivalents
fail RL2.2 Extended Grapheme Clusters
fail RL2.3 Default Word Boundaries
fail RL2.4 Default Case Conversion
pass RL2.5 Name Properties
fail RL2.6 Wildcards in Property Values
fail RL2.7 Full Properties
I won’t even talk about Level 3.
ICU, Perl, and Java7 all meet Level One conformance requirements with several Level 2 requirements also met. It is important for Python to meet the Unicode Standard in this so that people can use Python for regex matching Unicode text. They currently cannot usefully do so per the requirements of tr18. -- components: Regular Expressions messages: 141925 nosy: tchrist priority: normal severity: normal status: open title: Request for property support in Python re lib type: feature request versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12734 ___
[issue12735] request full Unicode collation support in std python library
New submission from Tom Christiansen tchr...@perl.com: Python has no standard support for the Unicode Collation Algorithm as explained in UTS #10. This is a request that a UCA library be added to the standard Python distribution. Collation underlies virtually everything we do with text, not just sorting but any sort of comparison. Furthermore, the UCA is tailorable for locales in a portable way that does not require dodgy vendor support. It is a very important step in making Python suitable for full Unicode text processing. -- components: Library (Lib) messages: 141926 nosy: tchrist priority: normal severity: normal status: open title: request full Unicode collation support in std python library type: feature request versions: Python 3.2 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12735 ___
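To show why plain code point order is a poor substitute for collation, here is a small sketch. The key function is a deliberately crude stand-in (strip marks, casefold, then break ties on the original string). It is emphatically not the UCA and has none of its tailoring, and the name crude_sort_key and the sample words are hypothetical.

```python
import unicodedata

def crude_sort_key(text):
    """Sort key that ignores case and diacritics first; NOT the Unicode Collation Algorithm."""
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed
                   if not unicodedata.category(ch).startswith("M"))
    # The original string is the tie-breaker so the ordering stays deterministic.
    return (base.casefold(), text)

words = ["Zebra", "apple", "\u00c9clair"]

by_code_point = sorted(words)                      # uppercase Z sorts before lowercase a
by_crude_key = sorted(words, key=crude_sort_key)   # closer to dictionary order

print(by_code_point)
print(by_crude_key)
```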
[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues
Steffen Daode Nurpmeso sdao...@googlemail.com added the comment: A sign! A sign! Someone with a name-name-name!! (Not a useful comment, i'm afraid.) -- nosy: +sdaoden ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12730 ___
[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues
Changes by Steffen Daode Nurpmeso sdao...@googlemail.com: -- nosy: -sdaoden ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12730 ___
[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues
Changes by Stefan Krah stefan-use...@bytereef.org: -- Removed message: http://bugs.python.org/msg141927 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12730 ___
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
New submission from Tom Christiansen tchr...@perl.com: Python's casemapping functions only use what Unicode calls simple casemaps. These are only appropriate for functions that operate on single characters alone, not for those that operate on strings. The reason for this is that you get much better results with full casemapping. Java, Ruby, and Perl all do full casemapping for their equivalent functions that do string mapping, and Python should, too. I include a program that has a bunch of mappings and foldings, both simple and full. Yes, it was machine-generated. -- components: Library (Lib) files: mux.python messages: 141928 nosy: tchrist priority: normal severity: normal status: open title: Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation type: feature request versions: Python 3.2 Added file: http://bugs.python.org/file22883/mux.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12736 ___
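The distinction between simple and full casemaps is easiest to see where one character maps to several. The sketch below assumes Python 3.3 or later, whose string methods apply full case mappings; under the simple mappings described in this report the two inputs would have come back unchanged.

```python
# LATIN SMALL LETTER SHARP S has no single-character uppercase form,
# so only a full mapping can change it.
street = "stra\u00dfe"
street_upper = street.upper()

# LATIN SMALL LIGATURE FI likewise expands to two letters.
ligature = "\ufb01"
ligature_upper = ligature.upper()

# Full mappings can change the length of a string, which is why they
# only make sense for string-level functions.
print(street_upper, len(street), len(street_upper))
print(ligature_upper, len(ligature), len(ligature_upper))
```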
[issue3561] Windows installer should add Python and Scripts directories to the PATH environment variable
Changes by Aaron Robson shiny.mag...@googlemail.com: -- nosy: +AaronR ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue3561 ___
[issue12006] strptime should implement %V or %u directive from libc
Changes by Aaron Robson shiny.mag...@googlemail.com: -- nosy: +AaronR ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12006 ___
[issue12737] string.title() is overzealous by upcasing combining marks inappropriately
New submission from Tom Christiansen tchr...@perl.com:

Python's string.title() function claims it titlecases the first letter in each word and lowercases the rest. However, this is not true. It is not using either of the two word detection algorithms that Unicode provides. One allows you to use a legacy \w+, where \w means any Alphabetic, Mark, Decimal Number, or Connector Punctuation (see UTS#18 Annex C: Compatibility Properties), and the other uses the more sophisticated word-break provided by the Word_Break properties such as Word_Break=MidNumLet. Python is using neither of these, so it gets the wrong answer.

titlecase of déme un café should be Déme Un Café not DéMe Un Café
titlecase of i̇stanbul should be İstanbul not İStanbul
titlecase of ᾲ στο διάολο should be Ὰͅ Στο Διάολο not ᾺΙ Στο ΔιάΟλο

Because those are in NFD form, you get different answers than if they are in NFC. That is not right. You should get the same answer. The bug is you aren't using the right definition for \w, and so get screwed up. This is likely related to issue 12731.

In the enclosed tester file, which fails 4 out of its 6 tests, there is also a bug shown with this failed result:

titlecase of мЯхШщЯл should be ДЯхШщЯл not мЯхШщЯл

That one is related to issue 12730. See the attached tester, which was run under Python 3.2. As far as I can tell, these bugs exist in all python versions.

-- files: titletest.python messages: 141929 nosy: tchrist priority: normal severity: normal status: open title: string.title() is overzealous by upcasing combining marks inappropriately type: behavior versions: Python 3.2 Added file: http://bugs.python.org/file22884/titletest.python ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12737 ___
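A rough sketch of the first approach mentioned in the report (treating Marks as part of a word) might look like the following. It is an illustration, not the fix proposed for the standard library: `is_word_char` and `title_words` are hypothetical names, and the General_Category test only approximates the UTS#18 compatibility definition of \w (the real Alphabetic property covers a few more characters).

```python
import unicodedata


def is_word_char(ch):
    """Approximate the UTS#18 compatibility \\w with General_Category values.

    Letters (L*) and Marks (M*) stand in for Alphabetic and Mark; Nd is
    Decimal Number and Pc is Connector Punctuation.
    """
    category = unicodedata.category(ch)
    return category[0] in ("L", "M") or category in ("Nd", "Pc")


def title_words(text):
    """Titlecase the first character of each word and lowercase the rest."""
    pieces = []
    inside_word = False
    for ch in text:
        if is_word_char(ch):
            pieces.append(ch.lower() if inside_word else ch.title())
            inside_word = True
        else:
            pieces.append(ch)
            inside_word = False
    return "".join(pieces)


nfc = "d\u00e9me un caf\u00e9"
nfd = unicodedata.normalize("NFD", nfc)
print(title_words(nfc))
print(title_words(nfd))
```

Because a combining mark no longer ends the word, the NFC and NFD spellings produce canonically equivalent results. Apostrophes and other word-internal punctuation are still treated as word boundaries here; handling those needs the Word_Break approach.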
[issue12730] Python's casemapping functions are untrustworthy due to narrow/wide build issues
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12730 ___
[issue12737] string.title() is overzealous by upcasing combining marks inappropriately
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12737 ___
[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug
R. David Murray rdmur...@bitdance.com added the comment: This is an acknowledged problem with Python narrow builds, and applies to much more than just regex processing. -- nosy: +r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12729 ___
[issue12734] Request for property support in Python re lib
R. David Murray rdmur...@bitdance.com added the comment: I think the only way re is going to get spiffed up is by replacing it with Matthew's library. This is a goal, but I'm not sure where exactly we are in the process. The more Matthew's code gets tested (especially for compatibility with the current re API), the closer we will be to that goal. -- nosy: +r.david.murray ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12734 ___
[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +belopolsky, ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12736 ___
[issue12735] request full Unicode collation support in std python library
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +belopolsky, ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12735 ___
[issue12734] Request for property support in Python re lib
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12734 ___
[issue12733] Request for grapheme support in Python re lib
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12733 ___
[issue12732] Can't portably use Unicode in Python identifiers
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12732 ___
[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12731 ___
[issue12728] Python re lib fails case insensitive matches on Unicode data
Changes by Ezio Melotti ezio.melo...@gmail.com: -- nosy: +ezio.melotti ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12728 ___
[issue12738] Bug in multiprocessing.JoinableQueue() implementation on Ubuntu 11.04
New submission from Michael Hall michaelhal...@gmail.com: I recently switched to Ubuntu 11.04 from OpenSUSE 11.4, and when I go to run a project I coded a couple days ago under OpenSUSE using the multiprocessing library, it hangs when it did not under OpenSUSE. Specifically, I am using two queues, work_queue from which the children get jobs, and results_queue where they place their results before calling JoinableQueue.task_done() and grabbing the next result. I use the poison pill technique to terminate the children, where a None object is placed at the end of the queue for each child, and when they get one of the terminating objects they call task_done() again (to account for the None object) and exit. In the main process, after spawning all of the children (one per physical CPU), it joins with the work_queue in order to wait for all of its children to finish. This is pretty much a cookie-cutter multiprocessing implementation that I've used successfully for years under OpenSUSE, but for some odd reason the exact same code does not work under Ubuntu. I would try porting to python 3.x, but the rest of my research team is still using 2.7, so that's not really an option right now. -- components: Library (Lib) messages: 141932 nosy: Michael.Hall priority: normal severity: normal status: open title: Bug in multiprocessing.JoinableQueue() implementation on Ubuntu 11.04 versions: Python 2.7 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12738 ___
[issue12738] Bug in multiprocessing.JoinableQueue() implementation on Ubuntu 11.04
Michael Hall michaelhal...@gmail.com added the comment: Edit: Sorry, I should have been more clear. The hang occurs after the first child process exits, at which point all four children become zombies (none of the others exit, they just zombify immediately), and the main process sits there waiting forever for the rest of the children to clear out the queue, which of course never happens. -- type: - behavior ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12738 ___
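The report does not include its code, so the following is only a sketch of the poison-pill protocol it describes, not the reporter's program. To keep it small and runnable anywhere, it deliberately swaps in threads and queue.Queue, which offers the same put()/get()/task_done()/join() calls as multiprocessing.JoinableQueue. The names `worker` and `run_jobs` and the squaring "job" are made up for illustration.

```python
import queue
import threading


def worker(work_queue, results_queue):
    """Process jobs until a None poison pill arrives."""
    while True:
        job = work_queue.get()
        if job is None:
            # Account for the pill itself, then exit.
            work_queue.task_done()
            break
        results_queue.put(job * job)   # stand-in for real work
        work_queue.task_done()


def run_jobs(jobs, n_workers=4):
    work_queue = queue.Queue()
    results_queue = queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(work_queue, results_queue))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for job in jobs:
        work_queue.put(job)
    for _ in workers:
        work_queue.put(None)           # one poison pill per worker
    work_queue.join()                  # returns once every item had task_done()
    for w in workers:
        w.join()
    results = []
    while not results_queue.empty():
        results.append(results_queue.get())
    return sorted(results)


print(run_jobs([1, 2, 3, 4, 5]))
```

The invariant to check in a hanging program is that every get() is matched by exactly one task_done(), pills included; join() on the work queue blocks forever if a worker dies between the two calls.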
[issue12734] Request for property support in Python re lib
Tom Christiansen tchr...@perl.com added the comment: I've been doing a lot of testing of Matthew's regex library against UTS#18 issues, but only somewhat incidentally testing re. To use regex, one has to accept that certain things will work differently than they work in re, because he is following Unicode definitions for things like casefolding. But I doubt that is the sort of difference you are talking about. One of the things that Java, Go, and Perl all do is run regression tests against the whole Unicode Character Database to make sure nothing gets hosed, missed, or otherwise out of sync. That might be a sort of regression test you might like to add. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12734 ___
[issue12708] multiprocessing.Pool is missing a starmap[_async]() method.
Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

+def starmapstar(args):
+    return list(itertools.starmap(args[0], args[1]))

Is your new function restricted to 2 arguments? -- nosy: +amaury.forgeotdarc ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12708 ___
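For readers following the question, here is the quoted helper on its own with two calls. This is just a sketch of how itertools.starmap behaves, not a statement about the rest of the patch: `args` is a (function, iterable-of-argument-tuples) pair, and each argument tuple may have any length the function accepts.

```python
import itertools


def starmapstar(args):
    # args[0] is the function, args[1] an iterable of argument tuples.
    return list(itertools.starmap(args[0], args[1]))


# Two-argument calls: pow(2, 3), pow(3, 2)
print(starmapstar((pow, [(2, 3), (3, 2)])))

# Tuples of other lengths work as well: max(1, 5, 3), max(7, 2, 9, 4)
print(starmapstar((max, [(1, 5, 3), (7, 2, 9, 4)])))
```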
[issue12568] Add functions to get the width in columns of a character
Tom Christiansen tchr...@perl.com added the comment: I can attest that being able to get the columns of a grapheme cluster is very important for printing, because you need this to do correct linebreaking. There might be something you can steal from http://search.cpan.org/perldoc?Unicode::GCString http://search.cpan.org/perldoc?Unicode::LineBreak which implements UAX#14 on linebreaking and UAX#11 on East Asian widths. I use this in my own code to help format Unicode strings by columns or lines. The right way would be to build this sort of knowledge into string.format(), but that is much harder, so an intermediary library module seems good enough for now. -- nosy: +tchrist ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12568 ___
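As a rough idea of what such a helper could compute, here is a minimal sketch built only on the unicodedata module. It is an approximation under stated assumptions, not an implementation of UAX#11 or UAX#14: `display_width` is a hypothetical name, combining marks and format characters are counted as zero columns, Wide and Fullwidth characters as two, and everything else (including Ambiguous-width and control characters) as one.

```python
import unicodedata


def display_width(text):
    """Estimate the number of terminal columns needed to print text."""
    width = 0
    for ch in text:
        # Combining marks and format characters occupy no column of their own.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian Wide (W) and Fullwidth (F) characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


print(display_width("abc"))
print(display_width("\u65e5\u672c\u8a9e"))  # three wide CJK characters
print(display_width("cafe\u0301"))          # 'e' followed by a combining acute
```

Padding a string to a column with `" " * (target - display_width(text))` is then the kind of thing plain len() gets wrong for these inputs.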
[issue11230] Full unicode import system not in 3.2
Tom Christiansen tchr...@perl.com added the comment: How does this work for modules that have filesystem names different from the one used for import? The issue I'm thinking about is that the Mac HFS+ filesystem keeps its Unicode filenames in (close to) NFD form. That means that a module named caf\N{LATIN SMALL LETTER E WITH ACUTE} with 4 graphemes and 4 code points in its name winds up in the filesystem as cafe\N{COMBINING ACUTE ACCENT} still with 4 graphemes but now with 5 code points. I believe (well, suspect; I have empirical evidence not proof) Python stores its own identifiers in NFD, so this may not be quite as much of a problem as it might otherwise be. Nonetheless, I have had users complain about what HFS+ does with such filenames, although I am not quite sure why. I think it’s because they access a file with 4 chars but they need a 5-char fileglob to wildcard it, so touch caf\N{LATIN SMALL LETTER E WITH ACUTE} and then you need a wildcard of ? with an extra ? to find it. Kinda weird. -- nosy: +tchrist ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue11230 ___
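The code point counts in the comment can be checked with unicodedata.normalize. This sketch only demonstrates the NFC/NFD difference for the example name; it says nothing about what any particular filesystem or the import system actually does.

```python
import unicodedata

composed = "caf\N{LATIN SMALL LETTER E WITH ACUTE}"
decomposed = unicodedata.normalize("NFD", composed)

print(len(composed))      # code points in the NFC spelling
print(len(decomposed))    # code points in the NFD spelling
print(decomposed == "cafe\N{COMBINING ACUTE ACCENT}")

# The two spellings compare unequal as raw strings, so any lookup that
# compares names has to normalize both sides to the same form first.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```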
[issue2857] add codec for java modified utf-8
Tom Christiansen tchr...@perl.com added the comment: Please do not call this utf-8-java. It is called cesu-8 per UTR#26 at: http://unicode.org/reports/tr26/ CESU-8 is *not* a valid Unicode Transformation Format and should not be called UTF-8. It is a real pain in the butt, caused by people who misunderstand Unicode mis-encoding UCS-2 into UTF-8, screwing it up. I understand the need to be able to read it, but call it what it is, please. Despite the talk about Lucene, I note that the Perl port of Lucene uses real UTF-8, not CESU-8. -- nosy: +tchrist ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2857 ___
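To show where the two encodings diverge, here is a small sketch of a CESU-8 encoder. It is not the codec proposed in this issue, and `cesu8_encode` is a hypothetical name: characters outside the BMP are split into a UTF-16 surrogate pair and each surrogate is then written as a three-byte sequence, which is exactly what real UTF-8 forbids. The sketch relies on Python 3's "surrogatepass" error handler to emit those bytes.

```python
def cesu8_encode(text):
    """Encode text as CESU-8 (supplementary characters become 2 x 3 bytes)."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        if code_point > 0xFFFF:
            # Split into a UTF-16 surrogate pair.
            offset = code_point - 0x10000
            high = 0xD800 + (offset >> 10)
            low = 0xDC00 + (offset & 0x3FF)
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


deseret = "\U00010400"   # DESERET CAPITAL LETTER LONG I, outside the BMP
print(deseret.encode("utf-8"))   # four bytes in real UTF-8
print(cesu8_encode(deseret))     # six bytes in CESU-8
print(cesu8_encode("abc") == "abc".encode("utf-8"))  # identical inside the BMP
```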
[issue12739] read stuck with multithreading and simultaneous subprocess.Popen
New submission from Joe Hu sapika...@gmail.com: When multiple threads create child processes simultaneously and redirect their stdout using subprocess.Popen, at least one thread will get stuck reading the stdout after its child process has exited, until all other processes have also exited. The test case reproduces the problem. It's always reproducible on my system (Python 3.1 on Windows 7 x64 / Python 3.2 on Windows 7 x86). Here is my suspicion: When Popen is called by two threads simultaneously, the latter child process may be started before the pipe handles for the former process are closed, causing the handles to be incorrectly inherited by the latter process. So these handles can only be closed after both processes exit, and only after that can p.stdout.read* detect EOF and return. -- components: Library (Lib), Windows files: python-subprocess-bug-test-case.py messages: 141939 nosy: SAPikachu priority: normal severity: normal status: open title: read stuck with multithreading and simultaneous subprocess.Popen type: behavior versions: Python 3.1, Python 3.2 Added file: http://bugs.python.org/file22885/python-subprocess-bug-test-case.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue12739 ___
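If the suspicion in the report is right, one possible application-level workaround is to make sure no two threads are inside the Popen constructor at the same time, so a child cannot be created while another thread's pipe handles are still open and inheritable. The following is a sketch under that assumption, not the attached test case and not a fix for subprocess itself; `run_child` and `_popen_lock` are hypothetical names.

```python
import subprocess
import sys
import threading

# Every Popen call in the program must go through this lock for the
# workaround to have any effect.
_popen_lock = threading.Lock()


def run_child(code):
    """Run a short Python snippet in a child process and return its stdout."""
    with _popen_lock:
        # Pipe creation, process creation, and closing of the parent's
        # copies of the child-side handles all happen inside the constructor.
        proc = subprocess.Popen([sys.executable, "-c", code],
                                stdout=subprocess.PIPE)
    out, _ = proc.communicate()   # reading happens outside the lock
    return out


def run_many(snippets):
    results = [None] * len(snippets)

    def target(index, code):
        results[index] = run_child(code)

    threads = [threading.Thread(target=target, args=(i, code))
               for i, code in enumerate(snippets)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_many(['print("a")', 'print("b")']))
```

Only process creation is serialized; the children still run and are read concurrently.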
[issue11105] Compiling evil ast crashes interpreter
Changes by Meador Inge mead...@gmail.com: -- nosy: +meador.inge ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue11105 ___
[issue2857] add codec for java modified utf-8
Georg Brandl ge...@python.org added the comment: +1 for calling it by the correct name (the docs can of course state that this is equivalent to Java Modified UTF-8 or however they like to call it). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue2857 ___