Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-26 Thread Nir Aides
Another face of the discussion is about whether to deprecate the mixing of the threading and processing modules and what to do about the multiprocessing module which is implemented with worker threads. On Tue, Aug 23, 2011 at 11:29 PM, Antoine Pitrou solip...@pitrou.net wrote: Le mardi 23 août

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Paul Moore
On 26 August 2011 03:52, Guido van Rossum gu...@python.org wrote: I know that by now I am repeating myself, but I think it would be really good if we could get rid of this ambiguity. PEP 393 seems the best way forward, even if it doesn't directly address what to do for IronPython or Jython,

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread M.-A. Lemburg
Stefan Behnel wrote: Isaac Morland, 26.08.2011 04:28: On Thu, 25 Aug 2011, Guido van Rossum wrote: I'm not sure what should happen with UTF-8 when it (in flagrant violation of the standard, I presume) contains two separately-encoded surrogates forming a valid surrogate pair; probably whatever

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Ezio Melotti
On Fri, Aug 26, 2011 at 5:59 AM, Guido van Rossum gu...@python.org wrote: On Thu, Aug 25, 2011 at 7:28 PM, Isaac Morland ijmor...@uwaterloo.ca wrote: On Thu, 25 Aug 2011, Guido van Rossum wrote: I'm not sure what should happen with UTF-8 when it (in flagrant violation of the standard, I

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Martin v. Löwis
IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of code points, not code units, if they are to conform. That means that they won't conform, period. There
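The distinction between code points and code units drawn above can be made concrete with a minimal sketch. It assumes CPython 3.3 or later, where len() counts code points; the helper name utf16_code_units is hypothetical and only illustrates what a UTF-16-based implementation would store.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each code unit occupies two bytes; "utf-16-le" emits no byte order mark.
    return len(text.encode("utf-16-le")) // 2


def code_points(text: str) -> int:
    """Count code points (what len() reports on CPython 3.3+)."""
    return len(text)


# One character outside the Basic Multilingual Plane between two ASCII letters.
sample = "a\U0001F600b"
units = utf16_code_units(sample)
points = code_points(sample)
```

The two counts differ by one for every character above U+FFFF, which is exactly the gap a conforming UTF-16 implementation would have to hide from len and indexing.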

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Stefan Behnel
Martin v. Löwis, 26.08.2011 11:29: You seem to assume it is ok for Jython/IronPython to provide indexing in O(n). It is not. I think we can leave this discussion aside. Jython and IronPython have their own platform specific constraints to which they need to adapt their implementation. For a

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Martin v. Löwis
But strings are allocated via PyObject_Malloc(), i.e. the custom arena-based allocator -- isn't its overhead (for small objects) less than 2 pointers per block? Ah, right, I missed that. Indeed, those have no header, and the only overhead is the padding to a multiple of 8. That shifts the

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Antoine Pitrou
Why would PEP 393 apply to other implementations than CPython? Regards Antoine. On Fri, 26 Aug 2011 00:01:42 + Dino Viehland di...@microsoft.com wrote: Guido wrote: Which reminds me. The PEP does not say what other Python implementations besides CPython should do. presumably Jython

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Stefan Behnel
Antoine Pitrou, 26.08.2011 12:51: Why would PEP 393 apply to other implementations than CPython? Not the PEP itself, just the implications of the result. The question was whether the language specification in a post PEP-393 can (and if so, should) be changed into requiring unicode objects to

Re: [Python-Dev] Windows installers and %PATH%

2011-08-26 Thread Antoine Pitrou
On Fri, 26 Aug 2011 14:52:07 +1000 Nick Coghlan ncogh...@gmail.com wrote: Windows is a developer hostile platform unless you completely buy into the Microsoft toolchain, which is not an option for cross-platform projects like Python. We already buy into the MS toolchain since we require Visual

Re: [Python-Dev] Windows installers and %PATH%

2011-08-26 Thread Brian Curtin
On Thu, Aug 25, 2011 at 23:04, Andrew Pennebaker andrew.penneba...@gmail.com wrote: Please have the Windows installers add the Python installation directory to the PATH environment variable. The http://bugs.python.org bug tracker is a better place for feature requests like this, of which

Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-26 Thread Jesse Noller
On Fri, Aug 26, 2011 at 3:18 AM, Nir Aides n...@winpdb.org wrote: Another face of the discussion is about whether to deprecate the mixing of the threading and processing modules and what to do about the multiprocessing module which is implemented with worker threads. There's a bug open -

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Guido van Rossum
It would be nice if someone wrote a test to roughly verify these numbers, e.g. by allocating lots of strings of a certain size and measuring the process size before and after (being careful to adjust for the list or other data structure required to keep those objects alive). --Guido On Fri, Aug
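A rough version of such a test might look like the sketch below. It swaps in the standard tracemalloc module (Python 3.4+) for process-size measurement, so it reports the bytes requested from the allocator rather than resident memory and does not see arena-level padding. The function name bytes_per_string and its parameters are assumptions made for illustration.

```python
import tracemalloc


def bytes_per_string(length: int, count: int = 10_000) -> float:
    """Estimate the average bytes allocated per ASCII string of a given length."""
    tracemalloc.start()
    # Preallocate the container so its own memory is excluded from the delta.
    holder = [None] * count
    before, _ = tracemalloc.get_traced_memory()
    for i in range(count):
        # Zero-padded decimal digits: distinct strings, so nothing is shared.
        # Assumes length is at least the number of digits in count.
        holder[i] = str(i).zfill(length)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / count


small = bytes_per_string(16)
large = bytes_per_string(64)
```

Comparing the result for several lengths gives the per-object overhead as the intercept and the per-character cost as the slope.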

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Guido van Rossum
Also, please add the table (and the reasoning that led to it) to the PEP. On Fri, Aug 26, 2011 at 7:55 AM, Guido van Rossum gu...@python.org wrote: It would be nice if someone wrote a test to roughly verify these numbers, e.g. by allocating lots of strings of a certain size and measuring the

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Stefan Behnel
Stefan Behnel, 25.08.2011 23:30: Sadly, a quick look at a couple of recent commits in the pep-393 branch suggested that it is not even always obvious to you as the authors which macros can be called safely and which cannot. I immediately spotted a bug in one of the updated core functions

Re: [Python-Dev] issue 6721 Locks in python standard library should be sanitized on fork

2011-08-26 Thread Antoine Pitrou
Hi, I think that deprecating the use of threads w/ multiprocessing - or at least crippling it - is the wrong answer. Multiprocessing needs the helper threads it uses internally to manage queues, etc. Removing that ability would require a near-total rewrite, which is just a non-starter. I

[Python-Dev] Summary of Python tracker Issues

2011-08-26 Thread Python tracker
ACTIVITY SUMMARY (2011-08-19 - 2011-08-26) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 2963 (+26) closed 21665 (+35) total 24628 (+61) Open issues

Re: [Python-Dev] Planned PEP status changes

2011-08-26 Thread Brett Cannon
On Tue, Aug 23, 2011 at 19:42, Nick Coghlan ncogh...@gmail.com wrote: Unless I hear any objections, I plan to adjust the current PEP statuses as follows some time this weekend: Move from Accepted to Finished: 389 argparse - New Command Line Parsing Module (Bethard) 391

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 2:29 AM, Martin v. Löwis mar...@v.loewis.de wrote: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of code points, not code units,

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Martin v. Löwis
Am 26.08.2011 17:55, schrieb Stefan Behnel: Stefan Behnel, 25.08.2011 23:30: Sadly, a quick look at a couple of recent commits in the pep-393 branch suggested that it is not even always obvious to you as the authors which macros can be called safely and which cannot. I immediately spotted a

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 3:29 AM, Stefan Behnel stefan...@behnel.de wrote: Martin v. Löwis, 26.08.2011 11:29: You seem to assume it is ok for Jython/IronPython to provide indexing in O(n). It is not. I think we can leave this discussion aside. (And yet, you keep arguing. :-) Jython and

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Paul Moore
On 26 August 2011 17:51, Guido van Rossum gu...@python.org wrote: On Fri, Aug 26, 2011 at 2:29 AM, Martin v. Löwis mar...@v.loewis.de wrote: (Regarding my comments on code point semantics) You seem to assume it is ok for Jython/IronPython to provide indexing in O(n). It is not. Indeed. On

Re: [Python-Dev] Windows installers and %PATH%

2011-08-26 Thread Andrew Pennebaker
I see that the Ruby 1.9 stable Windows installer has a checkbox to add the Ruby binaries to PATH. That would be excellent for Python. Also, there's no need to buy in to the Windows toolchain just to edit PATH. Installer software includes functionality for editing environment variables, and in any

Re: [Python-Dev] Windows installers and %PATH%

2011-08-26 Thread Andrew Pennebaker
I mentioned PYTHONROOT\Script because of the distribute package, which adds PYTHONROOT\Script\easy_install.exe. My mistake if \Script is created by distribute and not Python. Then my beef is with distribute for not adding its binaries to PATH--how else would I use easy_install if not in a terminal?

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 10:13 AM, Paul Moore p.f.mo...@gmail.com wrote: On 26 August 2011 18:02, Guido van Rossum gu...@python.org wrote: Eek. No, please. Those platforms' native string types have length and slicing operations that are O(1) and work in terms of 16-bit code points. Python

Re: [Python-Dev] Windows installers and %PATH%

2011-08-26 Thread Brian Curtin
On Fri, Aug 26, 2011 at 12:18, Andrew Pennebaker andrew.penneba...@gmail.com wrote: Also, there's no need to buy in to the Windows toolchain just to edit PATH. Installer software includes functionality for editing environment variables, and in any case Python has built in environment variable

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Stefan Behnel
Guido van Rossum, 26.08.2011 19:02: On Fri, Aug 26, 2011 at 3:29 AM, Stefan Behnel wrote: Besides, what if these implementations provided indexing in, say, O(log N) instead of O(1) or O(N), e.g. by building a tree index into each string? You could have an index that simply marks runs of
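One way to picture the O(log N) indexing idea raised here is the sketch below: a UTF-16 buffer plus a sorted side index of the code point positions that need a surrogate pair, searched with bisect. This is only an illustration, not what Jython, IronPython, or the truncated message proposes; the class IndexedUTF16 and its attributes are hypothetical, and it assumes it runs on CPython 3.3+ so that iterating a str yields code points.

```python
from array import array
from bisect import bisect_left


class IndexedUTF16:
    """Hypothetical UTF-16 string with a side index of non-BMP characters."""

    def __init__(self, text: str) -> None:
        self.units = array("H")  # 16-bit code units
        self.astral = []         # ascending code point indices above U+FFFF
        self.length = len(text)  # length in code points
        for index, char in enumerate(text):
            code_point = ord(char)
            if code_point > 0xFFFF:
                self.astral.append(index)
                offset = code_point - 0x10000
                self.units.append(0xD800 + (offset >> 10))    # high surrogate
                self.units.append(0xDC00 + (offset & 0x3FF))  # low surrogate
            else:
                self.units.append(code_point)

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, index: int) -> str:
        if not 0 <= index < self.length:
            raise IndexError("string index out of range")
        # Binary search: how many surrogate pairs precede this code point?
        pairs_before = bisect_left(self.astral, index)
        position = index + pairs_before
        is_pair = (pairs_before < len(self.astral)
                   and self.astral[pairs_before] == index)
        if not is_pair:
            return chr(self.units[position])
        high, low = self.units[position], self.units[position + 1]
        return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))


text = "a\U0001F600b\U00010000c"
indexed = IndexedUTF16(text)
```

Lookup costs O(log K) for K non-BMP characters and degenerates to O(1) for pure-BMP text, at the price of extra memory for the side index.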

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Stefan Behnel
Martin v. Löwis, 26.08.2011 18:56: I agree with your observation that something should be done about error handling, and will update the PEP shortly. I propose that PyUnicode_Ready should be explicitly called on input where raising an exception is feasible. In contexts where it is not feasible

Re: [Python-Dev] PEP 393 review

2011-08-26 Thread Stefan Behnel
Stefan Behnel, 26.08.2011 20:28: Martin v. Löwis, 26.08.2011 18:56: I agree with your observation that something should be done about error handling, and will update the PEP shortly. I propose that PyUnicode_Ready should be explicitly called on input where raising an exception is feasible. In

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Guido van Rossum
I have a different question about IronPython and Jython now. Do their regular expression libraries support Unicode better than CPython's? E.g. does . match a surrogate pair? Tom C suggests that Java's regex libraries get this and many other details right despite Java's use of UTF-16 to represent
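For comparison, the same question can be put to CPython itself. The sketch below assumes CPython 3.4 or later (re.fullmatch was added in 3.4), where a character outside the Basic Multilingual Plane is a single code point; it says nothing about how Jython or IronPython behave.

```python
import re

# MUSICAL SYMBOL G CLEF, a character that needs a surrogate pair in UTF-16.
astral = "\U0001D11E"

# "." must consume the whole character for fullmatch to succeed.
match = re.fullmatch(".", astral)
matched_whole_character = match is not None and match.group(0) == astral
```

On a pre-PEP 393 narrow build the same string had length 2, and a single "." would have matched only the first surrogate.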

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread M.-A. Lemburg
Guido van Rossum wrote: I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was this is fixed in the regex module [by Matthew Barnett]. I started replying that I thought that we should fix the bugs in the re module (i.e.,

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg m...@egenix.com wrote: Guido van Rossum wrote: I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was this is fixed in the regex module [by Matthew Barnett]. I started replying

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Fri, 26 Aug 2011 15:18:35 -0700 Guido van Rossum gu...@python.org wrote: I can't say I liked how that transition was handled last time around. I really don't want to have to tell people Oh, that bug is fixed but you have to use regex instead of re and then a few years later have to tell

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 3:33 PM, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 26 Aug 2011 15:18:35 -0700 Guido van Rossum gu...@python.org wrote: I can't say I liked how that transition was handled last time around. I really don't want to have to tell people Oh, that bug is fixed but

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Dan Stromberg
On Fri, Aug 26, 2011 at 2:45 PM, Guido van Rossum gu...@python.org wrote: ...but on second thought I wonder if maybe regex is mature enough to replace re in Python 3.3. I agree that the move from regex to re was kind of painful. It seems someone should merge the unit tests for re and regex,

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Martin v. Löwis
However, I don't know much about regex The problem really is: nobody does (except for Matthew Barnett probably). This means that this contribution might be stuck forever: somebody would have to review the module, identify issues, approve it, and take the blame if something breaks. That takes

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 3:54 PM, Martin v. Löwis mar...@v.loewis.de wrote: However, I don't know much about regex The problem really is: nobody does (except for Matthew Barnett probably). This means that this contribution might be stuck forever: somebody would have to review the module,

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread M.-A. Lemburg
Guido van Rossum wrote: On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg m...@egenix.com wrote: Guido van Rossum wrote: I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was this is fixed in the regex module [by Matthew

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread MRAB
On 27/08/2011 00:08, Tom Christiansen wrote: M.-A. Lemburg m...@egenix.com wrote on Sat, 27 Aug 2011 01:00:31 +0200: The good part is that it's based on the re code, the FUD comes from the fact that the new lib is 380kB larger than the old one and that's not even counting the generated

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 4:21 PM, MRAB pyt...@mrabarnett.plus.com wrote: On 27/08/2011 00:08, Tom Christiansen wrote: M.-A. Lemburg m...@egenix.com wrote on Sat, 27 Aug 2011 01:00:31 +0200: The good part is that it's based on the re code, the FUD comes from the fact that the new lib is

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Tom Christiansen
M.-A. Lemburg m...@egenix.com wrote on Sat, 27 Aug 2011 01:00:31 +0200: The good part is that it's based on the re code, the FUD comes from the fact that the new lib is 380kB larger than the old one and that's not even counting the generated 500kB of lookup tables. Well, you have to put

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Terry Reedy
On 8/26/2011 5:29 AM, Martin v. Löwis wrote: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of code points, not code units, if they are to conform. My

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Fri, 26 Aug 2011 15:47:21 -0700 Guido van Rossum gu...@python.org wrote: The best way would be to contact the author, Matthew Barnett, I had added him to the beginning of this thread but someone took him off. or to ask on the tracker on http://bugs.python.org/issue2636. He has been

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Sat, 27 Aug 2011 01:00:31 +0200 M.-A. Lemburg m...@egenix.com wrote: I can't say I liked how that transition was handled last time around. I really don't want to have to tell people Oh, that bug is fixed but you have to use regex instead of re and then a few years later have to tell

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Fri, 26 Aug 2011 15:48:42 -0700 Dan Stromberg drsali...@gmail.com wrote: Then there probably should be a from __future__ import for a while. If you are willing to use a from __future__ import, why not simply import regex as re ? We're not Perl, we don't have built-in syntactic support
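The aliasing suggested here can be sketched as an optional import. The regex module is a third-party package, so this sketch assumes it may or may not be installed and falls back to the standard library; the pattern and sample text are illustrative only.

```python
try:
    # Third-party module by Matthew Barnett, used only if it is installed.
    import regex as re
except ImportError:
    # Standard library fallback with the same basic API.
    import re

# Code below is written against the common subset of both modules.
word = re.compile(r"\w+")
words = word.findall("spam and eggs")
```

Code that sticks to the shared API runs unchanged with either module, which is what makes the alias a low-cost way to try the newer engine.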

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Greg Ewing
Paul Moore wrote: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of code points, not code units, if they are to conform. ... They lose the O(1) guarantee,

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Dan Stromberg
On Fri, Aug 26, 2011 at 5:08 PM, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 26 Aug 2011 15:48:42 -0700 Dan Stromberg drsali...@gmail.com wrote: Then there probably should be a from __future__ import for a while. If you are willing to use a from __future__ import, why not simply

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Antoine Pitrou
On Sat, 27 Aug 2011 12:17:18 +1200 Greg Ewing greg.ew...@canterbury.ac.nz wrote: Paul Moore wrote: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work in terms of

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Greg Ewing
M.-A. Lemburg wrote: Simply going with UCS-4 does not solve the problem, since even with UCS-4 storage, you can still have surrogates in your Python Unicode string. Yes, but in that case, you presumably *intend* them to be treated as separate indexing units. If you didn't, there would be no

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Guido van Rossum
On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy tjre...@udel.edu wrote: On 8/26/2011 5:29 AM, Martin v. Löwis wrote: IronPython and Jython can retain UTF-16 as their native form if that makes interop cleaner, but in doing so they need to ensure that basic operations like indexing and len work

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Ben Finney
M.-A. Lemburg m...@egenix.com writes: Guido van Rossum wrote: I really don't want to have to tell people Oh, that bug is fixed but you have to use regex instead of re and then a few years later have to tell them Oh, we're deprecating regex, you should just use re. No, you tell them:

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Ezio Melotti
On Sat, Aug 27, 2011 at 1:57 AM, Guido van Rossum gu...@python.org wrote: On Fri, Aug 26, 2011 at 3:54 PM, Martin v. Löwis mar...@v.loewis.de wrote: [...] Among us, some are more regex gurus than others; you know who you are. I guess the PSF would pay for the review, if that is what it

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Steven D'Aprano
Ben Finney wrote: M.-A. Lemburg m...@egenix.com writes: No, you tell them: If you want Unicode 6 semantics, use regex, if you're fine with Unicode 2.0/3.0 semantics, use re. What do we say, then, to those who are unaware of the different semantics between those versions of Unicode, and want

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Sat, 27 Aug 2011 04:37:21 +0300 Ezio Melotti ezio.melo...@gmail.com wrote: I'm not sure it's worth doing an extensive review of the code, a better approach might be to require extensive test coverage (and a review of tests). If the code seems well written, commented, documented (I think

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Antoine Pitrou
On Fri, 26 Aug 2011 17:25:56 -0700 Dan Stromberg drsali...@gmail.com wrote: On Fri, Aug 26, 2011 at 5:08 PM, Antoine Pitrou solip...@pitrou.net wrote: On Fri, 26 Aug 2011 15:48:42 -0700 Dan Stromberg drsali...@gmail.com wrote: Then there probably should be a from __future__ import for

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Ben Finney
Steven D'Aprano st...@pearwood.info writes: Ben Finney wrote: M.-A. Lemburg m...@egenix.com writes: No, you tell them: If you want Unicode 6 semantics, use regex, if you're fine with Unicode 2.0/3.0 semantics, use re. What do we say, then, to those who are unaware of the different

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Steven D'Aprano
Ben Finney wrote: Steven D'Aprano st...@pearwood.info writes: Ben Finney wrote: M.-A. Lemburg m...@egenix.com writes: No, you tell them: If you want Unicode 6 semantics, use regex, if you're fine with Unicode 2.0/3.0 semantics, use re. What do we say, then, to those who are unaware of the

Re: [Python-Dev] Should we move to replace re with regex?

2011-08-26 Thread Steven D'Aprano
Antoine Pitrou wrote: On Fri, 26 Aug 2011 17:25:56 -0700 Dan Stromberg drsali...@gmail.com wrote: [...] If you add regex as import regex, and the new regex module doesn't work out, regex might be harder to get rid of. from __future__ import is an established way of trying something for a

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-26 Thread Terry Reedy
On 8/26/2011 8:42 PM, Guido van Rossum wrote: On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy tjre...@udel.edu wrote: My impression is that a UTF-16 implementation, to be properly called such, must do len and [] in terms of code points, which is why Python's narrow builds are called UCS-2 and