[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-26 Thread Terry J. Reedy
Terry J. Reedy added the comment: PEP-393 will take care of iterating by code points. Where would you have other iterators go? The string module? Something else I have not thought of? Or something new? -- ___ Python tracker

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: To me, making (default) iteration deviate from indexing is anathema. However, there is nothing wrong with providing a library function that takes a string and returns an iterator that iterates over code points, joining surrogate pairs as needed. You could eve
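The library function suggested in this comment could look roughly like the sketch below. It is only an illustration: `iter_code_points` is a hypothetical name, not an existing stdlib function, and it assumes the input is a sequence of UTF-16 code units given as integers, which is how a narrow build stored strings.

```python
def iter_code_points(units):
    """Yield code points from UTF-16 code units, joining surrogate pairs.

    Hypothetical helper for illustration; not part of the standard library.
    """
    units = list(units)
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        # A high surrogate followed by a low surrogate encodes one
        # code point above the BMP.
        if (0xD800 <= u <= 0xDBFF and i + 1 < n
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP code unit, or a lone surrogate passed through unchanged.
            yield u
            i += 1


# 'a', U+1D11E (as a surrogate pair), 'b'
code_points = list(iter_code_points([0x61, 0xD834, 0xDD1E, 0x62]))
print([hex(cp) for cp in code_points])  # ['0x61', '0x1d11e', '0x62']
```

Indexing is left untouched in this sketch; only iteration goes through the helper, which matches the point that default iteration should not deviate from indexing.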

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-26 Thread Terry J. Reedy
Terry J. Reedy added the comment: My proposal is better than log(N) in 2 respects. 1) There need only be a time penalty when there are non-BMP chars and indexing currently gives the 'wrong' answer and therefore when a time-penalty should be acceptable. Lookup for normal all-BMP strings could

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Vlad Riscutia
Vlad Riscutia added the comment: I wasn't aware this is an auto-generated file. I can add a comment but looking at it, it seems we auto-generate this file just to save a call to _dosmaperr. I would refactor the whole function to call _dosmaperr first then if result is still EINVAL, tweak with

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: Here’s my casing test suite; I thought I sent it in but the mux file here isn’t the full thing. It does several things, including letting you run it with regex vs re. It also checks for the islower, etc functions. It has both simple and full (and turkic)

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Fri, 26 Aug 2011 21:11:24 -: > Guido van Rossum added the comment: > I presume this applies to builtin str methods like .lower(), right? I > think it is a good thing to do for Python 3.3. Yes, the full casemaps are for u

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Fri, 26 Aug 2011 21:55:03 -: > I know I sound like NIH, but I'm always reluctant to add a big 3rd > party lib like ICU to the permanent dependencies of all future Python > distros. If people want to use ICU they already can

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: I should probably mention the importance in the design of a UCA module of being able to specify which UCA version number you want it to behave like in case you plan to override some of the DUCET entries. That way if you run under a later UCA with different DU

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: Raymond Hettinger added the comment: > I would like to be involved in the design of the API for a UCA module > and its routines for loading Unicode Collation Element Tables (not > making the mistake of using global state like the locale module does). Is thi

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: I would like to be involved in the design of the API for a UCA module and its routines for loading Unicode Collation Element Tables (not making the mistake of using global state like the locale module does). -- nosy: +rhettinger

[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: Guido van Rossum wrote on Fri, 26 Aug 2011 21:16:57 -: > Yeah, this should be fixed in 3.3 and probably backported to 3.2 > and 2.7. (There is already no guarantee that len(s) == > len(s.title()), right?) Well, *I* don't know of any such guarantee,

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: I know I sound like NIH, but I'm always reluctant to add a big 3rd party lib like ICU to the permanent dependencies of all future Python distros. If people want to use ICU they already can. OTOH I don't have a better idea. :-(

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Tom Christiansen
Tom Christiansen added the comment: > Sounds like a fair feature request for Python 3.3, as long as the > intention is that users must import some module from the standard > library and use functions defined in that module. The operations and > methods defined for str instances (e.g. ==, <, etc

[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: +1 on the feature request. -- nosy: +gvanrossum

[issue12734] Request for property support in Python re lib

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: +1 on adding the feature to 3.3 in whichever way makes sense. -- nosy: +gvanrossum

[issue12733] Request for grapheme support in Python re lib

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Again, I would be disappointed if the re (_sre) module could not be fixed. It is a reasonable feature request. -- nosy: +gvanrossum

[issue12731] python lib re uses obsolete sense of \w in full violation of UTS#18 RL1.2a

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Really? The re module cannot be salvaged and we should add regex but keep the (buggy) re? That does not make a lot of sense to me. I think it should just be fixed in the re module. Or the re module should be *replaced* by the code from the regex module

[issue12746] normalization is affected by unicode width

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Yeah, we should fix this. At least in 3.3, but (without knowing what exactly is involved) I think backporting to 2.7 and 3.2 makes sense too. -- nosy: +gvanrossum

[issue12737] str.title() is overzealous by upcasing combining marks inappropriately

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Yeah, this should be fixed in 3.3 and probably backported to 3.2 and 2.7. (There is already no guarantee that len(s) == len(s.title()), right?) -- nosy: +gvanrossum

[issue12749] lib re cannot match non-BMP ranges (all versions, all builds)

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: We should at least get this fixed in 3.3. Then we can discuss the benefits of backporting the fixes to 2.7 and 3.2 (though it sounds to me like the backports will fix more than they will break, since it is pretty much impossible to do the right thing in th

[issue12736] Request for python casemapping functions to use full not simple casemaps per Unicode's recommendation

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: I presume this applies to builtin str methods like .lower(), right? I think it is a good thing to do for Python 3.3. We'd need to define what should happen in edge cases, e.g. when (against all odds) a string happens to contain a lone surrogate or some oth
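As a small illustration of what full case mappings mean for the builtin str methods: the sketch below assumes a Python 3 interpreter recent enough (3.3 or later) that `str.upper()` applies the full mapping, which postdates this thread. The sample word is arbitrary.

```python
s = "straße"
upper = s.upper()

# With full case mappings, U+00DF (ß) uppercases to the two letters "SS",
# so the result is one character longer than the input.
print(upper)               # STRASSE
print(len(s), len(upper))  # 6 7
```

This is also why code cannot assume that case conversion preserves string length.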

[issue12735] request full Unicode collation support in std python library

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Sounds like a fair feature request for Python 3.3, as long as the intention is that users must import some module from the standard library and use functions defined in that module. The operations and methods defined for str instances (e.g. ==, <, etc.) sh

[issue12728] Python re lib fails case insensitive matches on Unicode data

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: This bug could do with a little less attitude. That said, I think it is a bug and should be fixed, at the very least for Python 3.3. As always, it is a matter of much debate to what extent bugs can be fixed in previous Python versions (specifically, 2.7 a

[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

2011-08-26 Thread Guido van Rossum
Guido van Rossum added the comment: Wow. A very educational discussion. We will be referencing this issue for many years to come. As long as the buck stops with me, I feel strongly that *today* changing indexing from O(1) to O(log N) is a bad idea, partly for technical reasons, partly beca

[issue11913] sdist should allow for README.rst

2011-08-26 Thread resc
resc added the comment: Just wanted to note that this confuses other people too... http://stackoverflow.com/questions/4384796/readme-extension-for-python-projects Is this something that could be changed in 'distribute'? -- nosy: +Thomas.Smith

[issue9262] IDLE: Use tabbed shell and edit windows

2011-08-26 Thread Roger Serwy
Roger Serwy added the comment: Attached is an extension which provides tabbed windows for IDLE. It supports drag-and-drop reordering and separate windows. The implementation relies on monkey-patching a few subroutines and duck-typing for the toplevel window. The extension emulates each tab a

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: If you have a copy of Visual Studio, you can see the code of _dosmaperr() in VC/crt/src/dosmap.c. Otherwise the Google query "inurl:dosmap.c" returns some online copies of this file.

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Note that this file is not written by hand. It's generated by PC/generrmap.c, which uses the _dosmaperr() function provided by the msvcrt. If we want to modify it, this should be clearly marked somewhere.

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Tim Golden
Tim Golden added the comment: Obviously someone's code would break if it were relying on the Unix errno only in a Windows-only situation to determine the situation of opening a directory which isn't one. But that combination of events doesn't seem terribly likely. Speaking for myself, since

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Brian Curtin
Brian Curtin added the comment: With that PEP likely to be accepted, I say go ahead with the change for that benefit.

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Antoine Pitrou
Antoine Pitrou added the comment: > I could see how they'd use EINVAL, but to me ENOTDIR makes more sense > here. However, I'm not sure if anyone is depending on this (or what > they could depend on it for). Right now I'm not sure, but if PEP 3151 is accepted it will make much more sense to get

[issue12833] raw_input misbehaves when readline is imported

2011-08-26 Thread Idan Kamara
Idan Kamara added the comment: Reproduced on 2.7. (flushing stdin/out doesn't help) -- versions: +Python 2.7

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Brian Curtin
Brian Curtin added the comment: I could see how they'd use EINVAL, but to me ENOTDIR makes more sense here. However, I'm not sure if anyone is depending on this (or what they could depend on it for).

[issue12802] Windows error code 267 should be mapped to ENOTDIR, not EINVAL

2011-08-26 Thread Antoine Pitrou
Antoine Pitrou added the comment: Brian, Tim, I'd feel more comfortable if any of you confirmed this isn't a stupid proposal on my part :) -- components: +Interpreter Core stage: needs patch -> patch review

[issue12768] docstrings for the threading module

2011-08-26 Thread Eli Bendersky
Eli Bendersky added the comment: Éric, yeah I received an email. Hopefully Graeme did too. It's a shame a new review isn't notified in the tracker instead.

[issue12195] Little documentation of annotations

2011-08-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: > some simple examples showing the syntax would go a long way. Sorry, there are just too many ways to go and we are intentionally not stating which way is preferred. I've seen many variants a:[Integral] for a list of integers, a:(int,str) for a 2-tuple of
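For readers unfamiliar with the syntax under discussion, a minimal sketch follows. Annotations are arbitrary expressions stored in the function's `__annotations__` mapping, and the interpreter itself attaches no meaning to them. The function `scale` and its annotation choices are hypothetical and simply mirror the variants mentioned in the comment; no style is being recommended.

```python
from numbers import Integral


def scale(values: [Integral], pair: (int, str)) -> list:
    """Multiply each value by the integer in `pair`; the label is unused."""
    factor, _label = pair
    return [v * factor for v in values]


# The annotation expressions are kept as plain objects on the function.
print(scale.__annotations__["values"])  # [<class 'numbers.Integral'>]
print(scale.__annotations__["pair"])    # (<class 'int'>, <class 'str'>)
print(scale([1, 2], (3, "x")))          # [3, 6]
```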

[issue12742] Add support for CESU-8 encoding

2011-08-26 Thread Ezio Melotti
Ezio Melotti added the comment: Can you provide some example? The page you linked says "It should be used exclusively for internal processing and never for external data exchange.", so I'm not sure why these APIs would want to use it. -- nosy: +ezio.melotti

[issue12195] Little documentation of annotations

2011-08-26 Thread Éric Araujo
Changes by Éric Araujo: -- nosy: +eric.araujo

[issue12768] docstrings for the threading module

2011-08-26 Thread Éric Araujo
Éric Araujo added the comment: I have made a review on Rietveld.

[issue12806] argparse: Hybrid help text formatter

2011-08-26 Thread Éric Araujo
Éric Araujo added the comment: Steven: What do you think? GraylinKim: You can open a feature request for message preview on the metatracker (see “Report Tracker Problem” in the sidebar). -- nosy: +bethard, eric.araujo type: -> feature request versions: +Python 3.3 -Python 2.7

[issue12759] "(?P=)" input for Tools/scripts/redemo.py raises unnhandled exception

2011-08-26 Thread Éric Araujo
Éric Araujo added the comment: I can reproduce in 3.3 (the file has been moved to Tools/demo/redemo.py). The Tk application does not crash but there is a traceback. Would you like to work on a patch? If so, there are good guidelines in the devguide. -- keywords: +easy nosy: +eric.a

[issue9302] distutils API Reference: setup() and Extension parameters' description not correct.

2011-08-26 Thread Éric Araujo
Éric Araujo added the comment: Improved and committed, thanks again! -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue12842] Docs: first parameter of tp_richcompare() always has the correct type

2011-08-26 Thread Éric Araujo
Changes by Éric Araujo: -- keywords: +needs review stage: -> patch review versions: -Python 3.1

[issue12833] raw_input misbehaves when readline is imported

2011-08-26 Thread Éric Araujo
Éric Araujo added the comment: Maybe you need to call sys.stdin.flush() before raw_input? In any way, 2.6 is in security mode, so we need to reproduce this with current versions: 2.7, 3.2 or 3.3. -- components: +IO, Interpreter Core -Library (Lib) nosy: +eric.araujo, pitrou stage: ->

[issue11360] In documentation of getopt, advertise argparse instead of optparse

2011-08-26 Thread Roundup Robot
Roundup Robot added the comment: New changeset 6d3c645fa52f by Éric Araujo in branch '2.7': Remove outdated pointer to optparse (fixes #11360). http://hg.python.org/cpython/rev/6d3c645fa52f

[issue11360] In documentation of getopt, advertise argparse instead of optparse

2011-08-26 Thread Roundup Robot
Roundup Robot added the comment: New changeset 40f7a6e71930 by Éric Araujo in branch '3.2': Remove outdated pointer to optparse (fixes #11360). http://hg.python.org/cpython/rev/40f7a6e71930 -- nosy: +python-dev

[issue9302] distutils API Reference: setup() and Extension parameters' description not correct.

2011-08-26 Thread Roundup Robot
Roundup Robot added the comment: New changeset 78b26e7720c0 by Éric Araujo in branch '2.7': Fix type information in distutils API reference (#9302). http://hg.python.org/cpython/rev/78b26e7720c0

[issue12678] test_packaging and test_distutils failures under Windows

2011-08-26 Thread Roundup Robot
Roundup Robot added the comment: New changeset 8ad1670c0f1f by Éric Araujo in branch '2.7': Try to fix test_distutils on Windows (#12678) http://hg.python.org/cpython/rev/8ad1670c0f1f

[issue9302] distutils API Reference: setup() and Extension parameters' description not correct.

2011-08-26 Thread Roundup Robot
Roundup Robot added the comment: New changeset 96f0ccb9716d by Éric Araujo in branch '3.2': Fix type information in distutils API reference (#9302). http://hg.python.org/cpython/rev/96f0ccb9716d New changeset a410b857efe3 by Éric Araujo in branch 'default': Merge from 3.2 (#9302 fix and other c

[issue12846] unicodedata.normalize turkish letter problem

2011-08-26 Thread Cem YILDIZ
Cem YILDIZ added the comment: unicodedata.normalize cannot convert turkish letter "ı" into "i": import unicodedata s = u"üfürükçü ağaç ve ıslıkçı çeşme" print unicodedata.normalize('NFKD', s).encode('ascii','ignore') >> ufurukcu agac ve slkc cesme but the result should be >> ufurukcu agac v

[issue12846] unicodedata.normalize turkish letter problem

2011-08-26 Thread Cem YILDIZ
Changes by Cem YILDIZ: -- type: -> behavior

[issue12846] unicodedata.normalize turkish letter problem

2011-08-26 Thread Cem YILDIZ
New submission from Cem YILDIZ: unicodedata.normalize cannot convert turkish letter "ı" into "i": import unicodedata s = u"üfürükçü ağaç ve ıslıkçı çeşme" print unicodedata.normalize('NFKD', s).encode('ascii','ignore') >> ufurukcu agac ve slkc cesme but
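The reported output follows from U+0131 (LATIN SMALL LETTER DOTLESS I) having no decomposition in the Unicode Character Database: NFKD leaves it unchanged, and `encode('ascii', 'ignore')` then drops it. The sketch below uses Python 3 syntax rather than the Python 2 syntax of the report. The helper name `to_ascii` and the explicit translation table are assumptions made for illustration; they are one possible workaround, not something proposed in the submission.

```python
import unicodedata

s = "üfürükçü ağaç ve ıslıkçı çeşme"

# Dotless i has no decomposition mapping, so normalization cannot
# turn it into "i" plus a combining mark.
print(repr(unicodedata.decomposition("\u0131")))  # ''

plain = unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
print(plain)  # ufurukcu agac ve slkc cesme


def to_ascii(text):
    """Hypothetical helper: map dotless i by hand, then strip marks via NFKD."""
    text = text.translate({0x0131: "i"})
    return unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")


print(to_ascii(s))  # ufurukcu agac ve islikci cesme
```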

[issue12820] Tests for Lib/xml/dom/minicompat.py

2011-08-26 Thread John Chandler
John Chandler added the comment: Cool, thanks for the feedback! :-) I'll make the appropriate changes to the tests and add some coverage for defproperty as soon as I can. John

[issue12831] 2to3 and integer division

2011-08-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: Running python with the -3 command line option will warn about Python 3.x incompatibilities that 2to3 cannot trivially fix. -- nosy: +rhettinger

[issue12845] PEP-3118: C-contiguity with zero strides

2011-08-26 Thread Stefan Krah
New submission from Stefan Krah : Numpy and PyBuffer_IsContiguous() have different ideas of C-contiguity if there is a zero in strides (this is allowed, I asked Pauli Virtanen). >>> from numpy import * >>> nd = ndarray(shape=[10], strides=[0]) >>> nd.flags C_CONTIGUOUS : True F_CONTIGUOUS :

[issue12831] 2to3 and integer division

2011-08-26 Thread Alexander Rødseth
Alexander Rødseth added the comment: Even though it's hard to cover every case, it should be possible in quite a few cases: self.maxstars = 4 half = self.maxstars / 2
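The semantic difference behind this request can be sketched in a few lines of Python 3. The variable names loosely mirror the snippet in the comment; whether a tool can safely rewrite `/` to `//` depends on knowing that both operands are integers, which in general is only known at run time.

```python
maxstars = 4

half_true = maxstars / 2    # true division: always a float in Python 3
half_floor = maxstars // 2  # floor division: stays an int for int operands

print(half_true, type(half_true).__name__)    # 2.0 float
print(half_floor, type(half_floor).__name__)  # 2 int
```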

[issue12808] Coverage of codecs.py

2011-08-26 Thread Tennessee Leeuwenburg
Tennessee Leeuwenburg added the comment: Here is a stab at updated documentation. I would suggest that if further changes are recommended to the documentation, that a core committer go ahead and make them. I'm absolutely more than happy to keep taking "stabs" at it, but ultimately I probably

[issue12844] Support more than 255 arguments

2011-08-26 Thread Martin v . Löwis
Martin v. Löwis added the comment: The approach looks fine to me. Would you like to work on a patch? -- nosy: +loewis