On 8/26/2011 9:56 PM, Antoine Pitrou wrote:
Another "interesting" question is whether it's easy to port to the PEP
393 string representation, if it gets accepted.
Will the re module need porting also?
--
Terry Jan Reedy
> I can't either, but ISTR hearing that from __future__ import was started
> with such an intent.
No, not at all. The original intention was to enable features that would
definitely be added, not just right now. Tim Peters always
objected to claims that future imports were talking about pro
On 8/26/2011 8:23 PM, Antoine Pitrou wrote:
I would only agree as long as it wasn't too much worse
than O(1). O(log n) might be all right, but O(n) would be
unacceptable, I think.
It also depends a lot on *actual* measured performance
Amen. Some regard O(n*n) sorts to be, by definition, 'wor
> I'm not sure it's worth doing an extensive review of the code, a better
> approach might be to require extensive test coverage (and a review of
> tests).
I think it's worth it. It's really bad if only one developer fully
understands the regex implementation.
Regards,
Martin
On Fri, Aug 26, 2011 at 8:47 PM, Steven D'Aprano wrote:
> Antoine Pitrou wrote:
>
>> On Fri, 26 Aug 2011 17:25:56 -0700
>> Dan Stromberg wrote:
>>
> If you add regex as "import regex", and the new regex module doesn't work
>
>> out, regex might be harder to get rid of. from __future__ import is
On Aug 26, 2011, at 8:51 PM, Terry Reedy wrote:
>
>
> On 8/26/2011 8:42 PM, Guido van Rossum wrote:
>> On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy wrote:
>
>>> My impression is that a UTF-16 implementation, to be properly called such,
>>> must do len and [] in terms of code points, which is
On 23.08.2011 01:09, Sandro Tosi wrote:
> Hi all,
>
>> Any chance the version of sphinx used to generate the docs on
>> docs.python.org could be updated?
>
> I'd like to discuss this aspect, in particular for the implication it
> has on http://bugs.python.org/issue12409 .
>
> Personally, I do
On 8/26/2011 8:42 PM, Guido van Rossum wrote:
On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy wrote:
My impression is that a UTF-16 implementation, to be properly called such,
must do len and [] in terms of code points, which is why Python's narrow
builds are called UCS-2 and not UTF-16.
I d
Antoine Pitrou wrote:
On Fri, 26 Aug 2011 17:25:56 -0700
Dan Stromberg wrote:
[...]
If you add regex as "import regex", and the new regex module doesn't work
out, regex might be harder to get rid of. from __future__ import is an
established way of trying something for a while to see if it's g
Ben Finney wrote:
Steven D'Aprano writes:
Ben Finney wrote:
"M.-A. Lemburg" writes:
No, you tell them: "If you want Unicode 6 semantics, use regex, if
you're fine with Unicode 2.0/3.0 semantics, use re".
What do we say, then, to those who are unaware of the different
semantics between thos
Steven D'Aprano writes:
> Ben Finney wrote:
> > "M.-A. Lemburg" writes:
>
> >> No, you tell them: "If you want Unicode 6 semantics, use regex, if
> >> you're fine with Unicode 2.0/3.0 semantics, use re".
> >
> > What do we say, then, to those who are unaware of the different
> > semantics betwee
On Fri, 26 Aug 2011 17:25:56 -0700
Dan Stromberg wrote:
> On Fri, Aug 26, 2011 at 5:08 PM, Antoine Pitrou wrote:
>
> > On Fri, 26 Aug 2011 15:48:42 -0700
> > Dan Stromberg wrote:
> > >
> > > Then there probably should be a from __future__ import for a while.
> >
> > If you are willing to use a
On Sat, 27 Aug 2011 04:37:21 +0300
Ezio Melotti wrote:
>
> I'm not sure it's worth doing an extensive review of the code, a better
> approach might be to require extensive test coverage (and a review of
> tests). If the code seems well written, commented, documented (I think
> proper rst docume
Ben Finney wrote:
"M.-A. Lemburg" writes:
No, you tell them: "If you want Unicode 6 semantics, use regex, if
you're fine with Unicode 2.0/3.0 semantics, use re".
What do we say, then, to those who are unaware of the different
semantics between those versions of Unicode, and want regular exp
On Sat, Aug 27, 2011 at 1:57 AM, Guido van Rossum wrote:
> On Fri, Aug 26, 2011 at 3:54 PM, "Martin v. Löwis"
> wrote:
> > [...]
> > Among us, some are more "regex gurus" than others; you know
> > who you are. I guess the PSF would pay for the review, if that
> > is what it would take.
>
> Makes
"M.-A. Lemburg" writes:
> Guido van Rossum wrote:
> > I really don't want to have to tell people "Oh, that bug is fixed
> > but you have to use regex instead of re" and then a few years later
> > have to tell them "Oh, we're deprecating regex, you should just use
> > re".
>
> No, you tell them:
On Fri, Aug 26, 2011 at 3:57 PM, Terry Reedy wrote:
>
>
> On 8/26/2011 5:29 AM, "Martin v. Löwis" wrote:
>>>
>>> IronPython and Jython can retain UTF-16 as their native form if that
>>> makes interop cleaner, but in doing so they need to ensure that basic
>>> operations like indexing and len work
M.-A. Lemburg wrote:
Simply going with UCS-4 does not solve the problem, since
even with UCS-4 storage, you can still have surrogates in your
Python Unicode string.
Yes, but in that case, you presumably *intend* them to
be treated as separate indexing units. If you didn't,
there would be no nee
On Sat, 27 Aug 2011 12:17:18 +1200
Greg Ewing wrote:
> Paul Moore wrote:
>
> > IronPython and Jython can retain UTF-16 as their native form if that
> > makes interop cleaner, but in doing so they need to ensure that basic
> > operations like indexing and len work in terms of code points, not
> >
On Fri, Aug 26, 2011 at 5:08 PM, Antoine Pitrou wrote:
> On Fri, 26 Aug 2011 15:48:42 -0700
> Dan Stromberg wrote:
> >
> > Then there probably should be a from __future__ import for a while.
>
> If you are willing to use a "from __future__ import", why not simply
>
>     import regex as re
>
> ?
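
The "import regex as re" suggestion above can be sketched as an opt-in with a fallback. This is only an illustration, assuming the third-party regex module may or may not be installed; the helper name find_words is hypothetical and does not come from the thread.

```python
# Opt in to the third-party "regex" module when it is available,
# otherwise fall back to the standard library "re" module.
try:
    import regex as re  # third-party module by Matthew Barnett, if installed
except ImportError:
    import re  # standard library fallback


def find_words(text):
    """Return the runs of word characters in text (hypothetical helper)."""
    return re.findall(r"\w+", text)


words = find_words("spam and eggs")
```

Because the alias is chosen per module, code that relies on regex-only features would still need its own guard; the sketch only covers patterns that both modules accept.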
Paul Moore wrote:
IronPython and Jython can retain UTF-16 as their native form if that
makes interop cleaner, but in doing so they need to ensure that basic
operations like indexing and len work in terms of code points, not
code units, if they are to conform. ... They lose the O(1)
guarantee, bu
On Fri, 26 Aug 2011 15:48:42 -0700
Dan Stromberg wrote:
>
> Then there probably should be a from __future__ import for a while.
If you are willing to use a "from __future__ import", why not simply
import regex as re
? We're not Perl, we don't have built-in syntactic support for regular
exp
On Sat, 27 Aug 2011 01:00:31 +0200
"M.-A. Lemburg" wrote:
> >
> > I can't say I liked how that transition was handled last time around.
> > I really don't want to have to tell people "Oh, that bug is fixed but
> > you have to use regex instead of re" and then a few years later have
> > to tell th
On Fri, 26 Aug 2011 15:47:21 -0700
Guido van Rossum wrote:
> > The best way would be to contact the author, Matthew Barnett,
>
> I had added him to the beginning of this thread but someone took him off.
>
> > or to ask
> > on the tracker on http://bugs.python.org/issue2636. He has been quite
> >
On 8/26/2011 5:29 AM, "Martin v. Löwis" wrote:
IronPython and Jython can retain UTF-16 as their native form if that
makes interop cleaner, but in doing so they need to ensure that basic
operations like indexing and len work in terms of code points, not
code units, if they are to conform.
My i
"M.-A. Lemburg" wrote
on Sat, 27 Aug 2011 01:00:31 +0200:
> The good part is that it's based on the re code, the FUD comes
> from the fact that the new lib is 380kB larger than the old one
> and that's not even counting the generated 500kB of lookup
> tables.
Well, you have to put the proper
On Fri, Aug 26, 2011 at 4:21 PM, MRAB wrote:
> On 27/08/2011 00:08, Tom Christiansen wrote:
>>
>> "M.-A. Lemburg" wrote
>> on Sat, 27 Aug 2011 01:00:31 +0200:
>>
>>> The good part is that it's based on the re code, the FUD comes
>>> from the fact that the new lib is 380kB larger than the old o
On 27/08/2011 00:08, Tom Christiansen wrote:
"M.-A. Lemburg" wrote
on Sat, 27 Aug 2011 01:00:31 +0200:
The good part is that it's based on the re code, the FUD comes
from the fact that the new lib is 380kB larger than the old one
and that's not even counting the generated 500kB of lookup
t
Guido van Rossum wrote:
> On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg wrote:
>> Guido van Rossum wrote:
>>> I just made a pass of all the Unicode-related bugs filed by Tom
>>> Christiansen, and found that in several, the response was "this is
>>> fixed in the regex module [by Matthew Barnett]".
On Fri, Aug 26, 2011 at 3:54 PM, "Martin v. Löwis" wrote:
>> However, I don't know much about regex
>
> The problem really is: nobody does (except for Matthew Barnett
> probably). This means that this contribution might be stuck
> "forever": somebody would have to review the module, identify
> iss
> However, I don't know much about regex
The problem really is: nobody does (except for Matthew Barnett
probably). This means that this contribution might be stuck
"forever": somebody would have to review the module, identify
issues, approve it, and take the blame if something breaks.
That takes c
On Fri, Aug 26, 2011 at 2:45 PM, Guido van Rossum wrote:
> ...but on second thought I wonder if maybe regex is
> mature enough to replace re in Python 3.3.
>
I agree that the move from regex to re was kind of painful.
It seems someone should merge the unit tests for re and regex, and apply the
On Fri, Aug 26, 2011 at 3:33 PM, Antoine Pitrou wrote:
> On Fri, 26 Aug 2011 15:18:35 -0700
> Guido van Rossum wrote:
>>
>> I can't say I liked how that transition was handled last time around.
>> I really don't want to have to tell people "Oh, that bug is fixed but
>> you have to use regex inste
On Fri, 26 Aug 2011 15:18:35 -0700
Guido van Rossum wrote:
>
> I can't say I liked how that transition was handled last time around.
> I really don't want to have to tell people "Oh, that bug is fixed but
> you have to use regex instead of re" and then a few years later have
> to tell them "Oh, w
On Fri, Aug 26, 2011 at 3:09 PM, M.-A. Lemburg wrote:
> Guido van Rossum wrote:
>> I just made a pass of all the Unicode-related bugs filed by Tom
>> Christiansen, and found that in several, the response was "this is
>> fixed in the regex module [by Matthew Barnett]". I started replying
>> that I
Guido van Rossum wrote:
> I just made a pass of all the Unicode-related bugs filed by Tom
> Christiansen, and found that in several, the response was "this is
> fixed in the regex module [by Matthew Barnett]". I started replying
> that I thought that we should fix the bugs in the re module (i.e.,
>
I have a different question about IronPython and Jython now. Do their
regular expression libraries support Unicode better than CPython's?
E.g. does "." match a surrogate pair? Tom C suggests that Java's regex
libraries get this and many other details right despite Java's use of
UTF-16 to represent
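
The question of whether "." matches a surrogate pair can be checked directly. The lines below are a small sketch, assuming CPython 3.3 or later, where strings index by code point; on the narrow builds discussed in this thread the same character occupied two string positions.

```python
import re

# U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane,
# so UTF-16 needs a surrogate pair (two 16-bit code units) to encode it.
astral = "\U0001D11E"

# Number of UTF-16 code units needed for this single code point.
utf16_units = len(astral.encode("utf-16-le")) // 2

# "." should consume the whole character, not half of a pair.
match = re.match(".", astral)
matched_whole = match is not None and match.group(0) == astral
```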
On Friday 26 August 2011 02:01:42, Dino Viehland wrote:
> The biggest difficulty for IronPython here would be dealing w/ .NET
> interop. We can certainly introduce either an IronPython specific string
> class which is similar to CPython's PyUnicodeObject or we could have
> multiple distinct .NET
I just made a pass of all the Unicode-related bugs filed by Tom
Christiansen, and found that in several, the response was "this is
fixed in the regex module [by Matthew Barnett]". I started replying
that I thought that we should fix the bugs in the re module (i.e.,
really in _sre.c) but on second t
Stefan Behnel, 26.08.2011 20:28:
"Martin v. Löwis", 26.08.2011 18:56:
I agree with your observation that something should be done about error
handling, and will update the PEP shortly. I propose that
PyUnicode_Ready should be explicitly called on input where raising an
exception is feasible. In c
"Martin v. Löwis", 26.08.2011 18:56:
I agree with your observation that something should be done about error
handling, and will update the PEP shortly. I propose that
PyUnicode_Ready should be explicitly called on input where raising an
exception is feasible. In contexts where it is not feasible (
Guido van Rossum, 26.08.2011 19:02:
On Fri, Aug 26, 2011 at 3:29 AM, Stefan Behnel wrote:
Besides, what if these implementations provided indexing in, say, O(log N)
instead of O(1) or O(N), e.g. by building a tree index into each string? You
could have an index that simply marks runs of surrogat
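
The idea of an index over surrogates on top of UTF-16 storage can be sketched as follows. This is a rough illustration, not anything proposed verbatim in the thread: it uses a sorted list with bisect rather than a tree, and the class name IndexedUTF16 is hypothetical. Lookup costs O(log k) for k surrogate pairs.

```python
import bisect


class IndexedUTF16:
    """UTF-16 code unit storage with code point indexing (hypothetical sketch)."""

    def __init__(self, text):
        data = text.encode("utf-16-le", "surrogatepass")
        # 16-bit code units, as a UTF-16 based platform would store them.
        self.units = [int.from_bytes(data[i:i + 2], "little")
                      for i in range(0, len(data), 2)]
        # Code point positions of characters stored as surrogate pairs.
        self.pair_positions = []
        cp = i = 0
        while i < len(self.units):
            if self._is_pair_at(i):
                self.pair_positions.append(cp)
                i += 2
            else:
                i += 1
            cp += 1
        self.length = cp

    def _is_pair_at(self, i):
        """True if a high surrogate at i is followed by a low surrogate."""
        return (0xD800 <= self.units[i] <= 0xDBFF
                and i + 1 < len(self.units)
                and 0xDC00 <= self.units[i + 1] <= 0xDFFF)

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError("code point index out of range")
        # Each pair before this code point shifts the unit offset by one.
        pairs_before = bisect.bisect_left(self.pair_positions, index)
        u = index + pairs_before
        if self._is_pair_at(u):
            high, low = self.units[u], self.units[u + 1]
            return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
        return chr(self.units[u])


sample = IndexedUTF16("a\U0001F600b\U00010000c")
```

The trade-off is extra memory per string for the position list and a construction pass over the code units, in exchange for avoiding a linear scan on every index operation.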
On Fri, Aug 26, 2011 at 12:18, Andrew Pennebaker <
andrew.penneba...@gmail.com> wrote:
> Also, there's no need to "buy in" to the Windows toolchain just to edit
> PATH. Installer software includes functionality for editing environment
> variables, and in any case Python has built in environment va
On Fri, Aug 26, 2011 at 10:13 AM, Paul Moore wrote:
> On 26 August 2011 18:02, Guido van Rossum wrote:
>
>> Eek. No, please. Those platforms' native string types have length and
>> slicing operations that are O(1) and work in terms of 16-bit code
>> points. Python should use those. It would be aw
I mentioned PYTHONROOT\Script because of the distribute package, which adds
PYTHONROOT\Script\easy_install.exe.
My mistake if \Script is created by distribute and not Python. Then my beef
is with distribute for not adding its binaries to PATH--how else would I use
easy_install if not in a terminal?
I see that the Ruby 1.9 stable Windows installer has a checkbox to add the
Ruby binaries to PATH. That would be excellent for Python.
Also, there's no need to "buy in" to the Windows toolchain just to edit
PATH. Installer software includes functionality for editing environment
variables, and in an
On 26 August 2011 17:51, Guido van Rossum wrote:
> On Fri, Aug 26, 2011 at 2:29 AM, "Martin v. Löwis" wrote:
(Regarding my comments on code point semantics)
>> You seem to assume it is ok for Jython/IronPython to provide indexing in
>> O(n). It is not.
>
> Indeed.
On 26 August 2011 18:02, Gui
On Fri, Aug 26, 2011 at 3:29 AM, Stefan Behnel wrote:
> "Martin v. Löwis", 26.08.2011 11:29:
>>
>> You seem to assume it is ok for Jython/IronPython to provide indexing in
>> O(n). It is not.
>
> I think we can leave this discussion aside.
(And yet, you keep arguing. :-)
> Jython and IronPython
On 26.08.2011 17:55, Stefan Behnel wrote:
> Stefan Behnel, 25.08.2011 23:30:
>> Sadly, a quick look at a couple of recent commits in the pep-393 branch
>> suggested that it is not even always obvious to you as the authors which
>> macros can be called safely and which cannot. I immediately spotte
On Fri, Aug 26, 2011 at 2:29 AM, "Martin v. Löwis" wrote:
>> IronPython and Jython can retain UTF-16 as their native form if that
>> makes interop cleaner, but in doing so they need to ensure that basic
>> operations like indexing and len work in terms of code points, not
>> code units, if they ar
On Tue, Aug 23, 2011 at 19:42, Nick Coghlan wrote:
> Unless I hear any objections, I plan to adjust the current PEP
> statuses as follows some time this weekend:
>
> Move from Accepted to Finished:
>
> 389 argparse - New Command Line Parsing Module Bethard
> 391 Dictionary-Bas
ACTIVITY SUMMARY (2011-08-19 - 2011-08-26)
Python tracker at http://bugs.python.org/
To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas:
open    2963 (+26)
closed 21665 (+35)
total  24628 (+61)
Open issues wit
Hi,
> I think that "deprecating" the use of threads w/ multiprocessing - or
> at least crippling it is the wrong answer. Multiprocessing needs the
> helper threads it uses internally to manage queues, etc. Removing that
> ability would require a near-total rewrite, which is just a
> non-starter.
Stefan Behnel, 25.08.2011 23:30:
Sadly, a quick look at a couple of recent commits in the pep-393 branch
suggested that it is not even always obvious to you as the authors which
macros can be called safely and which cannot. I immediately spotted a bug
in one of the updated core functions (unicode
Also, please add the table (and the reasoning that led to it) to the PEP.
On Fri, Aug 26, 2011 at 7:55 AM, Guido van Rossum wrote:
> It would be nice if someone wrote a test to roughly verify these
> numbers, e.g. by allocating lots of strings of a certain size and
> measuring the process size be
It would be nice if someone wrote a test to roughly verify these
numbers, e.g. by allocating lots of strings of a certain size and
measuring the process size before and after (being careful to adjust
for the list or other data structure required to keep those objects
alive).
--Guido
On Fri, Aug 2
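
A rough version of the test described above might look like the sketch below. It swaps in a different technique: instead of measuring the process size, it uses tracemalloc, which only exists in Python 3.4 and later and so was not available when this was written. The function name bytes_per_string is hypothetical.

```python
import sys
import tracemalloc


def bytes_per_string(length, count=10000):
    """Estimate allocated bytes per string of the given length."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Build distinct strings at run time so nothing is shared or interned.
    strings = [str(i).rjust(length, "x") for i in range(count)]
    after, _ = tracemalloc.get_traced_memory()
    # Adjust for the list that keeps the strings alive.
    list_overhead = sys.getsizeof(strings)
    tracemalloc.stop()
    return (after - before - list_overhead) / count


small = bytes_per_string(16)
large = bytes_per_string(64)
```

tracemalloc reports the sizes requested from the allocator, so padding and arena overhead are not visible here; comparing against process size would still be needed to check those parts of the estimate.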
On Fri, Aug 26, 2011 at 3:18 AM, Nir Aides wrote:
> Another face of the discussion is about whether to deprecate the mixing of
> the threading and processing modules and what to do about the
> multiprocessing module which is implemented with worker threads.
There's a bug open - http://bugs.python
On Thu, Aug 25, 2011 at 23:04, Andrew Pennebaker <
andrew.penneba...@gmail.com> wrote:
> Please have the Windows installers add the Python installation directory to
> the PATH environment variable.
The http://bugs.python.org bug tracker is a better place for feature
requests like this, of which
On Fri, 26 Aug 2011 14:52:07 +1000
Nick Coghlan wrote:
> Windows is a developer hostile platform unless you completely buy into
> the Microsoft toolchain, which is not an option for cross-platform
> projects like Python.
We already buy into the MS toolchain since we require Visual Studio (or
at l
Antoine Pitrou, 26.08.2011 12:51:
Why would PEP 393 apply to other implementations than CPython?
Not the PEP itself, just the implications of the result.
The question was whether the language specification in a post PEP-393 can
(and if so, should) be changed into requiring unicode objects to
Why would PEP 393 apply to other implementations than CPython?
Regards
Antoine.
On Fri, 26 Aug 2011 00:01:42 +
Dino Viehland wrote:
> Guido wrote:
> > Which reminds me. The PEP does not say what other Python
> > implementations besides CPython should do. Presumably Jython and
> > IronPyt
> But strings are allocated via PyObject_Malloc(), i.e. the custom
> arena-based allocator -- isn't its overhead (for small objects) less
> than 2 pointers per block?
Ah, right, I missed that. Indeed, those have no header, and the only
overhead is the padding to a multiple of 8.
That shifts the p
"Martin v. Löwis", 26.08.2011 11:29:
You seem to assume it is ok for Jython/IronPython to provide indexing in
O(n). It is not.
I think we can leave this discussion aside. Jython and IronPython have
their own platform specific constraints to which they need to adapt their
implementation. For a
> IronPython and Jython can retain UTF-16 as their native form if that
> makes interop cleaner, but in doing so they need to ensure that basic
> operations like indexing and len work in terms of code points, not
> code units, if they are to conform.
That means that they won't conform, period. Ther
On Fri, Aug 26, 2011 at 5:59 AM, Guido van Rossum wrote:
> On Thu, Aug 25, 2011 at 7:28 PM, Isaac Morland
> wrote:
> > On Thu, 25 Aug 2011, Guido van Rossum wrote:
> >
> >> I'm not sure what should happen with UTF-8 when it (in flagrant
> >> violation of the standard, I presume) contains two sep
Stefan Behnel wrote:
> Isaac Morland, 26.08.2011 04:28:
>> On Thu, 25 Aug 2011, Guido van Rossum wrote:
>>> I'm not sure what should happen with UTF-8 when it (in flagrant
>>> violation of the standard, I presume) contains two separately-encoded
>>> surrogates forming a valid surrogate pair; probab
On 26 August 2011 03:52, Guido van Rossum wrote:
> I know that by now I am repeating myself, but I think it would be
> really good if we could get rid of this ambiguity. PEP 393 seems the
> best way forward, even if it doesn't directly address what to do for
> IronPython or Jython, both of which h
Another face of the discussion is about whether to deprecate the mixing of
the threading and processing modules and what to do about the
multiprocessing module which is implemented with worker threads.
On Tue, Aug 23, 2011 at 11:29 PM, Antoine Pitrou wrote:
> On Tuesday 23 August 2011 at 22:07 +0200