Re: [Python-Dev] versioned .so files for Python 3.2

2010-07-08 Thread Georg Brandl
On 07.07.2010 23:04, Georg Brandl wrote:
> On 07.07.2010 20:40, Barry Warsaw wrote:
> 
>> Getting back to this after the US holiday.  Thanks for running these numbers
>> Scott.  I've opened a bug in the Python tracker and attached my latest patch:
>> 
>> http://bugs.python.org/issue9193
>> 
>> The one difference from previous versions of the patch is that the .so tag is
>> now settable via "./configure --with-so-abi-tag=foo".  This would generate
>> shared libs like _multiprocessing.foo.so.
>> 
>> I'd like to get consensus as to whether folks feel that a PEP is needed.  My
>> own thought is that I'd rather not do a PEP specific to this change, but I
>> would update PEP 384 with the implications on .so versioning.  Please also
>> feel free to review the patch in that issue.
> 
> I can see where this is going... writing it into PEP 384 would automatically
> get the change accepted?

I hit "Send" prematurely.  I wanted to add that I'd be okay with this change,
be it in a new PEP or an old one.

Georg
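
For illustration, the naming scheme quoted above (a tag `foo` producing `_multiprocessing.foo.so`) can be sketched in a few lines. This is only a sketch of how the tag lands in the file name; the helper `tagged_so_name` is hypothetical and is not taken from the patch in issue 9193.

```python
def tagged_so_name(module_name, abi_tag=None, suffix=".so"):
    """Return the shared-library file name for an extension module.

    Hypothetical helper: with no tag the plain name is used, otherwise
    the tag is placed between the module name and the suffix.
    """
    if not abi_tag:
        return module_name + suffix
    return "{0}.{1}{2}".format(module_name, abi_tag, suffix)


print(tagged_so_name("_multiprocessing", "foo"))  # _multiprocessing.foo.so
print(tagged_so_name("_multiprocessing"))         # _multiprocessing.so
```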

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Martin Geisler
"Stephen J. Turnbull"  writes:

> Benjamin Peterson writes:
>  > 2010/7/7 Stephen J. Turnbull :
>  > > Antoine Pitrou writes:
>  > >
>  > >  > >   http://selenic.com/hg/file/tip/mercurial/minirst.py
>  > >  >
>  > >  > Given that Mercurial is GPL, this is probably of no use to us,
>  > >  > unfortunately.

I must admit that reading this felt strange somehow... that a piece of
open source code should be useless. But I understand what you mean :)

>  > > Given that Martin apparently is the only or main author, I don't
>  > > see a problem as long as he's willing.
>  > 
>  > And he hasn't assigned the copyright away.
>
> (Or that the assignment has an automatic author-use-ok clause like the
> standard FSF assignment does, etc.)

We don't assign copyright in Mercurial, so this should be no problem.
This also meant that we had to contact about 300 guys when changing from
GPLv2 to GPLv2+.

> Just ask Martin, there are too many possibilities here to worry about.
> If maybe we want it, and he is willing to contribute the parts he
> wrote to Python under Python's license, then we can worry about
> whether we really want it and about how much any required hoop-jumping
> will cost.

I would be happy to relicense it under the Python license.

-- 
Martin Geisler

aragost Trifork: Professional Mercurial support
http://aragost.com/mercurial/



Re: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken

2010-07-08 Thread Antoine Pitrou
On Wed, 07 Jul 2010 21:04:17 -0400
Terry Reedy  wrote:
> 
> In other words, I see three options for 2.7.1+:
[...]

I don't think 2.7 should get any change at all here. Only 3.2 should be
modified. As Tim said, difflib works ok for its intended use (regular
text diffs). Making it work for other uses is a new feature, not a
bugfix.

Regards

Antoine.




Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Antoine Pitrou
On Wed, 07 Jul 2010 21:45:30 -0400
Terry Reedy  wrote:
> 
> > Except that ctypes doesn't help provide C extensions at all. It only
> > helps provide wrappers around existing C libraries, which is quite a
> > different thing.
> > Which, in the end, makes the original suggestion meaningless.
> 
> To you, so let me restate it. It would be easier for many people to only 
> rewrite, for instance,  difflib.SequenceMatcher.get_longest_matching in 
> C than to rewrite the whole SequenceMatcher class, let alone the whole 
> difflib module.

And you still haven't understood my point. ctypes doesn't allow you to
write any C code, only to interface with existing C code. So,
regardless of whether get_longest_matching() is a function or method,
it would have to be written in C manually, and that would certainly be
in an extension module.
(admittedly, you can instead make a pure C library with such a function
and then wrap it with ctypes, but I don't see the point: you still have
to write most C code yourself)

> I got the impression from the datetime issue tracker discussion that it 
> is not possible to replace a single method of a Python-coded class with 
> a C version.

And that's a wrong impression. Inheritance allows you to do that (see
Michael's answer).
Besides, you can also code that method as a helper function. It is not
difficult to graft a function from a module into another module.

Antoine.
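
The two options mentioned above (overriding one method through inheritance, or grafting a helper function from another module) might look roughly like this. It is a sketch under assumptions: `Matcher`, `FastMatcher`, and the accelerator module `_fastmatch` are hypothetical names, and the "fast" code is plain Python standing in for what would really be C.

```python
from itertools import groupby


def _longest_run(items):
    """Pure-Python helper: length of the longest run of equal items."""
    best = run = 0
    previous = object()
    for item in items:
        run = run + 1 if item == previous else 1
        best = max(best, run)
        previous = item
    return best


class Matcher:
    def longest_run(self, items):
        # The method only delegates, so swapping the helper is enough.
        return _longest_run(items)


# Option 1: inheritance. A subclass overrides just the one method.
# (Written in Python here; in practice this would be the C version.)
class FastMatcher(Matcher):
    def longest_run(self, items):
        return max((len(list(g)) for _, g in groupby(items)), default=0)


# Option 2: grafting. Rebind the module-level helper if a faster
# implementation can be imported. "_fastmatch" is a hypothetical
# accelerator module; when it is missing the Python helper stays.
try:
    from _fastmatch import longest_run as _longest_run
except ImportError:
    pass

print(Matcher().longest_run("aabbbc"))      # 3
print(FastMatcher().longest_run("aabbbc"))  # 3
```

Delegating to a module-level helper keeps the graft simple, since a function implemented in C does not necessarily bind `self` when assigned directly as a class attribute.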




Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Steve Holden
Martin Geisler wrote:
> "Stephen J. Turnbull"  writes:
> 
>> Benjamin Peterson writes:
>>  > 2010/7/7 Stephen J. Turnbull :
>>  > > Antoine Pitrou writes:
>>  > >
>>  > >  > >   http://selenic.com/hg/file/tip/mercurial/minirst.py
>>  > >  >
>>  > >  > Given that Mercurial is GPL, this is probably of no use to us,
>>  > >  > unfortunately.
> 
> I must admit that reading this felt strange somehow... that a piece of
> open source code should be useless. But I understand what you mean :)
> 
>>  > > Given that Martin apparently is the only or main author, I don't
>>  > > see a problem as long as he's willing.
>>  > 
>>  > And he hasn't assigned the copyright away.
>>
>> (Or that the assignment has an automatic author-use-ok clause like the
>> standard FSF assignment does, etc.)
> 
> We don't assign copyright in Mercurial, so this should be no problem.
> This also meant that we had to contact about 300 guys when changing from
> GPLv2 to GPLv2+.
> 
>> Just ask Martin, there are too many possibilities here to worry about.
>> If maybe we want it, and he is willing to contribute the parts he
>> wrote to Python under Python's license, then we can worry about
>> whether we really want it and about how much any required hoop-jumping
>> will cost.
> 
> I would be happy to relicense it under the Python license.
> 

I believe the ideal outcome, if it is possible, is for you to sign a
contributor agreement. This will license your material to the PSF in
such a way that we can release it under whatever license we deem necessary.

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
DjangoCon US September 7-9, 2010   http://djangocon.us/
See Python Video!   http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/


Re: [Python-Dev] Issue 2986: difflib.SequenceMatcher is partly broken

2010-07-08 Thread Tim Peters
[Antoine Pitrou]
> I don't think 2.7 should get any change at all here. Only 3.2 should be
> modified. As Tim said, difflib works ok for its intended use (regular
> text diffs).

That was the use case that drove the implementation, but it's going
too far to say that was the only "intended" case.  I believe (but
can't prove) that remains the most common use (& overwhelmingly so),
but it was indeed _intended_ to work for any sequences of hashable
elements.

And it always did, and it still does, in the sense that it computes a
diff that transforms the first sequence into the second sequence.  The
problem is that I introduced a heuristic speedup with the primary use
case in mind that turned out to vastly damage the _quality_ of the
results for some other uses (a correct diff isn't necessarily a useful
diff - for example, "delete the entire sequence you started with, then
insert the entire new sequence" is a correct diff for any pair of
input sequences, but not a useful diff for most purposes).

> Making it work for other uses is a new feature, not a bugfix.

Definitely not a new feature.  These other cases used to deliver much
better diffs, before I introduced the heuristic in question.  People
with these other cases are asking for a way to get the results they
used to get - and we know that's so because a few figured out they get
what they want just by (in effect) reverting the checkin (made about 8
years ago) that _introduced_ the heuristic.  So they're looking for a
way to restore older behavior, not to introduce new behavior.  Of
course this is obscured by the fact that the change happened so long ago that I
bet most of them don't know at first that it _was_ the old behavior.
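
A small experiment can make the quality difference described above visible. The sketch assumes Python 3.2 or later, where `difflib.SequenceMatcher` accepts an `autojunk` keyword for switching the heuristic off; that keyword is not something this message names, and the input strings are made up.

```python
from difflib import SequenceMatcher

# Two sequences that differ by one leading element. b has 200+ items
# and each of its elements occurs far more often than 1% of its
# length, so the "popular element" heuristic applies to all of them.
a = "x" + "ab" * 150
b = "ab" * 150

with_heuristic = SequenceMatcher(None, a, b).ratio()
without_heuristic = SequenceMatcher(None, a, b, autojunk=False).ratio()

# The first ratio is expected to be much lower than the second.
print("autojunk=True :", with_heuristic)
print("autojunk=False:", without_heuristic)
```

Both results are correct diffs in the sense above; only the one computed without the heuristic recognises that the sequences are nearly identical.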


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Nick Coghlan
On Thu, Jul 8, 2010 at 9:13 AM, Benjamin Peterson  wrote:
> 2010/7/7 Nick Coghlan :
>> On Thu, Jul 8, 2010 at 7:56 AM, Michael Foord wrote:
>>> Using a class decorator to duplicate each _test_ into two test_* methods
>>> sounds  like a good approach.
>>
>> Note that parameterised methods have a similar problem to
>> parameterised modules - unittest results are reported in terms of
>> "testmodule.testclass.testfunction", so proper attribution of results
>> in the test output will require additional work. The separate
>> subclasses approach doesn't share this issue, since it changes the
>> value of the second item in accordance with the module under test.
>
> A good parameterized implementation, though, gives the repr() of the
> parameters in failure output.

That would qualify as "additional work" if your tests aren't already
set up that way (and doesn't cover the case of unexpected exceptions
in a test, where the test method doesn't get to say much about the way
the error is reported).

I realised during the day that my suggested approach was more
complicated than is actually necessary - once the existing tests have
been moved to a separate module, *that test module* can itself be
imported twice, once with the python version of the module to be
tested and once with the C version. You can then do some hackery to
distinguish the test classes without having to modify the test code
itself (note, the below code should work in theory, but isn't actually
tested):

=
py_module_tests = support.import_fresh_module('moduletester',
    fresh=['modulename'], blocked=['_modulename'])
c_module_tests = support.import_fresh_module('moduletester',
    fresh=['modulename', '_modulename'])

test_modules = [py_module_tests, c_module_tests]
suffixes = ["_Py", "_C"]

for module, suffix in zip(test_modules, suffixes):
    for obj in module.itervalues():
        if isinstance(obj, unittest,TestCase):
            obj.__name__ += suffix
            setattr(module, obj.__name__, obj)

def test_main():
    for module in test_modules:
        module.test_main()
=
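
As a cross-check of the untested snippet above, here is a self-contained variant of the renaming step that does run. It is a sketch under assumptions: the two imports of `moduletester` are replaced by stand-in modules built with `types.ModuleType`, the helper names are hypothetical, and the loop uses `vars(module)` and `issubclass` because module objects have no `itervalues()` and the objects being tested are classes rather than instances.

```python
import types
import unittest


def make_test_module(name):
    """Build a stand-in for one import of the shared test module."""
    module = types.ModuleType(name)

    class ArithmeticTests(unittest.TestCase):
        def test_add(self):
            self.assertEqual(1 + 1, 2)

    module.ArithmeticTests = ArithmeticTests
    return module


def tag_test_cases(module, suffix):
    """Rename every TestCase subclass in module so reports show suffix."""
    # Copy the values first: setattr() below adds keys to the module dict.
    for obj in list(vars(module).values()):
        if isinstance(obj, type) and issubclass(obj, unittest.TestCase):
            obj.__name__ += suffix
            # Newer unittest versions may build test ids from
            # __qualname__, so keep it in step with __name__.
            obj.__qualname__ += suffix
            setattr(module, obj.__name__, obj)


py_module_tests = make_test_module("moduletester")
c_module_tests = make_test_module("moduletester")

for module, suffix in zip([py_module_tests, c_module_tests], ["_Py", "_C"]):
    tag_test_cases(module, suffix)

print(py_module_tests.ArithmeticTests_Py.__name__)  # ArithmeticTests_Py
print(c_module_tests.ArithmeticTests_C.__name__)    # ArithmeticTests_C
```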

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 12:59 AM, Nick Coghlan  wrote:
>    for obj in module.itervalues():
>        if isinstance(obj, unittest,TestCase):

Hmm, isn't there a never-quite-made-it-into-the-Zen line about "syntax
shall not look like grit on Tim's monitor"? (s/,/./ in that second
line)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Martin Geisler
Steve Holden  writes:

> Martin Geisler wrote:
>> "Stephen J. Turnbull"  writes:
>> 
>>> Just ask Martin, there are too many possibilities here to worry
>>> about. If maybe we want it, and he is willing to contribute the
>>> parts he wrote to Python under Python's license, then we can worry
>>> about whether we really want it and about how much any required
>>> hoop-jumping will cost.
>> 
>> I would be happy to relicense it under the Python license.
>
> I believe the ideal outcome, if it is possible, is for you to sign a
> contributor agreement. This will license your material to the PSF in
> such a way that we can release it under whatever license we deem
> necessary.

Sure, I'll be happy to sign a contributor agreement if you guys think it
worthwhile to use my little parser and formatter.

-- 
Martin Geisler

aragost Trifork -- Professional Mercurial support
http://aragost.com/mercurial/



[Python-Dev] Call for Applications - PSF Sponsored Sprints

2010-07-08 Thread Jesse Noller
[Sending this to Python-Dev, as it might very well be of interest as the sprints
are focused on Python "core" tasks]

The PSF is happy to open our first call for applications for sprint funding!

Have you ever had a group of people together to hack towards a common goal?
You've hosted a sprint!

Have you ever wanted to get a group of like minded Pythonistas together to hack
for a day? You're going to want to hold a sprint!

Whether you call them Sprints, Hackfests, Hack-a-thons, or any other name,
they're a great way to hang out with like-minded developers and work on common
code. Sprints are an unbeatable way to build friendships and contacts that will
last for years to come, and they're a great way to learn about something new if
you're just starting out.

The Python Software Foundation has set aside funds to be distributed to
world-wide sprint efforts. We're anticipating 2-3 events per month focused on
covering topics to help the entire community:

 - Python Core bug triage and patch submission (on-boarding new contributors)
 - Python Core documentation (including process documentation) improvements
 - Porting libraries/applications to Python 3
 - Python website/wiki content improvements
 - PyPI packaging hosting site improvements
 - Contribution to other "core" projects, such as packaging related issues.

If you are interested in holding a sprint on any of the topics above and you're
looking for some money to help out with sprint costs, we can help (up to a max
of $250 USD). Prepare an application including the following information:

 - Date and Location: Where will the event be? What day and time?
 - Organizers: Who are the event organizers and sprint coach? Is the sprint
   being run by a Python user group?
 - Attendees: How many participants do you expect?
 - Goal: What is the focus and goal of the sprint?
 - Budget: How much funding are you requesting, and what will you use it for?
 - Applications should be sent to: spri...@python.org with the subject "Sprint
   Funding Application - "

We encourage anyone - even those who have never held, or been to a sprint - to
consider holding one. We will help you as much as we can with welcome packets,
advertising, and hooking you up with required resources - anything to make it
possible.

As part of being approved, you will need to agree to deliver a report
(hopefully, with pictures!) of the sprint to the Sprint Committee, so we can
post it on the sprint blog and site:

http://www.pythonsprints.com

If you have any questions or need more information, contact us by email at
spri...@python.org.

More information is up on our blog:
http://pythonsprints.com/2010/07/8/call-applications-now-open/


Re: [Python-Dev] Include datetime.py in stdlib or not?

2010-07-08 Thread Brett Cannon
On Wed, Jul 7, 2010 at 15:17, Terry Reedy  wrote:
> On 7/7/2010 3:32 PM, Brett Cannon wrote:
>
>> That's the idea. We already have contributors from the various VMs who
>> have commit privileges, but they all work in their own repos for
>> convenience. My hope is that if we break the stdlib out into its own
>> repository that people simply pull in then other VM contributors will
>> work directly off of the stdlib repo instead of their own, magnifying
>> the usefulness of their work.
>
> I was wondering if you had more than 'hope', but thinking about it now, I
> think it premature to ask for commitments. Once a Python3 stdlib hg
> subrepository is set up and running, the logic of joining in should be
> obvious -- or not.

I can say that the VM representatives have all said they like the idea.

-Brett


>
> I am now seeing that a more complete common Python-level test suite is also
> important. Being able to move Python code, that only uses the
> stdlib, between implementations and have it just work would be good for all
> of them.
>
>>> 3. What version of Python would be allowed for use in the stdlib? I would
>>> like the stdlib for 3.x to be able to use 3.x code. This would be only a
>>> minor concern for CPython as long as 2.7 is maintained, but a major
>>> concern for the other implementations currently 'stuck' in 2.x only. A
>>> good 3to2 would be needed.
>>
>> This will only affect py3k.
>
> Good. The Python3 stdlib should gradually become modern Python3 code. (An
> example archaism -- the use in difflib of dicts with arbitrary values used
> as sets -- which I plan to fix.)
>
>>> I generally favor having Python versions of modules available. My current
>>> post on difflib.SequenceMatcher is based on experiments with an altered
>>> version. I copied difflib.py to my test directory, renamed it diff2lib.py,
>>> so I could import both versions, found and edited the appropriate method,
>>> and off I went. If difflib were in C, my post would have been based on
>>> speculation about how a fixed version would operate, rather than on data.
>>>
>>
>> The effect upon CPython would be the extension modules become just
>> performance improvements, nothing more (unless they have to be in C as
>> in the case for sqlite3).
>
> As pre- and jit compilation improve, the need for hand-coded C will go down.
> For instance, annotate (in a branch, not trunk) and compile with Cython.
>
>>> 4. Does not ctypes make it possible to replace a method of a Python-coded
>>> class with a faster C version, with something like
>>>  try:
>>>    connect to methods.dll
>>>    check that function xyx exists
>>>    replace Someclass.xyy with ctypes wrapper
>>>  except: pass
>>> For instance, the SequenceMatcher heuristic was added to speedup the
>>> matching process that I believe is encapsulated in one O(n**2) or so
>>> bottleneck method. I believe most everything else is O(n) bookkeeping.
>
>> There is no need to go that far. All one needs to do is structure the
>> extension code such that when the extension module is imported, it
>> overrides key objects in the Python version.
>
> Is it possible to replace a python-coded function in a python-coded class
> with a C-coded function? I had the impression from the issue discussion that
> one would have to recode the entire class, even if only a single method
> really needed it.
>
>> Using ctypes is just added complexity.
>
> Only to be used if easier than extra C coding.
>
> --
> Terry Jan Reedy
>


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Brett Cannon
On Thu, Jul 8, 2010 at 07:59, Nick Coghlan  wrote:
> On Thu, Jul 8, 2010 at 9:13 AM, Benjamin Peterson  wrote:
>> 2010/7/7 Nick Coghlan :
>>> On Thu, Jul 8, 2010 at 7:56 AM, Michael Foord wrote:
>>>> Using a class decorator to duplicate each _test_ into two test_* methods
>>>> sounds like a good approach.
>>>
>>> Note that parameterised methods have a similar problem to
>>> parameterised modules - unittest results are reported in terms of
>>> "testmodule.testclass.testfunction", so proper attribution of results
>>> in the test output will require additional work. The separate
>>> subclasses approach doesn't share this issue, since it changes the
>>> value of the second item in accordance with the module under test.
>>
>> A good parameterized implementation, though, gives the repr() of the
>> parameters in failure output.
>
> That would qualify as "additional work" if your tests aren't already
> set up that way (and doesn't cover the case of unexpected exceptions
> in a test, where the test method doesn't get to say much about the way
> the error is reported).
>
> I realised during the day that my suggested approach was more
> complicated than is actually necessary - once the existing tests have
> been moved to a separate module, *that test module* can itself be
> imported twice, once with the python version of the module to be
> tested and once with the C version. You can then do some hackery to
> distinguish the test classes without having to modify the test code
> itself (note, the below code should work in theory, but isn't actually
> tested):
>
> =
> py_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename'], blocked=['_modulename'])
> c_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename', '_modulename'])
>
> test_modules = [py_module_tests, c_module_tests]
> suffixes = ["_Py", "_C"]
>
> for module, suffix in zip(test_modules, suffixes):
>    for obj in module.itervalues():
>        if isinstance(obj, unittest,TestCase):
>            obj.__name__ += suffix
>            setattr(module, obj.__name__, obj)
>
> def test_main():
>    for module in test_modules:
>        module.test_main()
> =

Very cool solution (assuming it works =) !

One issue I see with this is deciding how to organize tests that are
specific to one version of a module compared to another. For instance,
test_warnings has some tests specific to _warnings because of the
hoops it has to jump through in order to get overriding showwarnings
and friends to work. I guess I could try to make them generic enough
that they don't require a specific module. Otherwise I would insert
the module-specific tests into test_warnings to have that module also
call gnostic_test_warnings to run the universal tests.


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Alexander Belopolsky
On Thu, Jul 8, 2010 at 10:59 AM, Nick Coghlan  wrote:
..
> I realised during the day that my suggested approach was more
> complicated than is actually necessary - once the existing tests have
> been moved to a separate module, *that test module* can itself be
> imported twice, once with the python version of the module to be
> tested and once with the C version. You can then do some hackery to
> distinguish the test classes without having to modify the test code
> itself (note, the below code should work in theory, but isn't actually
> tested):
>
> =
> py_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename'], blocked=['_modulename'])
> c_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename', '_modulename'])
>
> test_modules = [py_module_tests, c_module_tests]
> suffixes = ["_Py", "_C"]
>
> for module, suffix in zip(test_modules, suffixes):
>    for obj in module.itervalues():
>        if isinstance(obj, unittest,TestCase):
>            obj.__name__ += suffix
>            setattr(module, obj.__name__, obj)
>
> def test_main():
>    for module in test_modules:
>        module.test_main()
> =

Yes, this is definitely an improvement over my current datetime patch
[1]_, but it still requires a custom test_main() and does not make the
test cases discoverable by alternative unittest runners.  I think that
can be fixed by injecting imported TestCase subclasses into the main
test module globals.   I'll try to implement that for datetime.
Thanks, Nick - great idea!

.. [1] http://bugs.python.org/file17848/issue7989.diff
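
The idea of injecting the imported TestCase subclasses into the main test module's globals might look roughly like this. It is a sketch, not the datetime patch: `inject_test_cases` and the stand-in modules are hypothetical, and a real test module would pass `globals()` instead of a freshly created module namespace.

```python
import types
import unittest


def make_test_module(name):
    """Stand-in for one import of the shared test module."""
    module = types.ModuleType(name)

    class RoundTripTests(unittest.TestCase):
        def test_roundtrip(self):
            self.assertEqual(int(str(42)), 42)

    module.RoundTripTests = RoundTripTests
    return module


def inject_test_cases(namespace, module, suffix):
    """Copy TestCase subclasses from module into namespace under new names."""
    owner = namespace.get("__name__", "__main__")
    for name, obj in list(vars(module).items()):
        if isinstance(obj, type) and issubclass(obj, unittest.TestCase):
            new_name = name + suffix
            # A trivial subclass gives each variant its own reported name.
            namespace[new_name] = type(new_name, (obj,), {"__module__": owner})


main_test_module = types.ModuleType("test_example")  # hypothetical main module
inject_test_cases(vars(main_test_module), make_test_module("tester"), "_Py")
inject_test_cases(vars(main_test_module), make_test_module("tester"), "_C")

# The standard loader now finds both variants without a custom test_main().
suite = unittest.defaultTestLoader.loadTestsFromModule(main_test_module)
print(suite.countTestCases())  # 2
```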


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Antoine Pitrou
On Fri, 9 Jul 2010 00:59:02 +1000
Nick Coghlan  wrote:
> py_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename'], blocked=['_modulename'])
> c_module_tests = support.import_fresh_module('moduletester',
> fresh=['modulename', '_modulename'])

I don't really like the proliferation of module test helpers, it only
makes things confusing and forces you to switch between more files in
your editor. By contrast, the subclassing solution is simple, explicit
and obvious.

(I also wonder what problem this subthread is trying to solve at all.
Just my 2 eurocents)

Regards

Antoine.




[Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB

Hi all,

I re-implemented the re module, adding new features and speed
improvements. It's available at:

http://pypi.python.org/pypi/regex

under the name "regex" so that it can be tried alongside "re".

I'd be interested in any comments or feedback. How does it compare with
"re" in terms of speed on real-world data? The benchmarks suggest it
should be faster, or at worst comparable.

How much interest would there be in putting it in Python 3.2?
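
For anyone wanting to try the speed comparison, a minimal timing harness could look like the following. It is a sketch under assumptions: the sample text and pattern are made up, the helper `time_findall` is hypothetical, and "regex" is only timed when the package is installed.

```python
import re
import timeit

try:
    import regex  # third-party package from PyPI; optional here
except ImportError:
    regex = None


def time_findall(module, pattern, text, repeat=3, number=10):
    """Return the best time in seconds for `number` findall() calls."""
    compiled = module.compile(pattern)
    timer = timeit.Timer(lambda: compiled.findall(text))
    return min(timer.repeat(repeat=repeat, number=number))


# Made-up sample data; real-world text would be a better benchmark.
text = "alpha beta42 gamma7 delta " * 2000
pattern = r"\b[a-z]+\d+\b"

results = {"re": time_findall(re, pattern, text)}
if regex is not None:
    results["regex"] = time_findall(regex, pattern, text)

for name, seconds in sorted(results.items()):
    print("%-5s %.4f s" % (name, seconds))
```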


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Alexander Belopolsky
On Thu, Jul 8, 2010 at 3:29 PM, Antoine Pitrou  wrote:
> On Fri, 9 Jul 2010 00:59:02 +1000
> Nick Coghlan  wrote:
..
> I don't really like the proliferation of module test helpers, it only
> makes things confusing and forces you to switch between more files in
> your editor. By contrast, the subclassing solution is simple, explicit
> and obvious.
>
And would require a lot of tedious and error prone work to retrofit
existing tests.  Since we don't have meta regression tests, there is
no obvious way to assure that retrofitting does not change the tests.
Note that test_pickle uses both the subclassing solution *and* a
helper pickletester module because this neatly separates
multi-implementation machinery from the actual test definitions.

> (I also wonder what problem this subthread is trying to solve at all.

The problem is to find a simple solution that will allow running
existing unit tests written for a C extension on both the original
extension and the added pure python equivalent.  When the existing
tests were developed over many years and have 100+ test cases, this is
not as easy a task as it would be if you wrote your tests from scratch.
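
The combination described above (test definitions kept separate from the per-implementation machinery, plus one subclass per implementation) can be sketched as follows. All names are hypothetical stand-ins, and both "implementations" are plain Python so the example runs anywhere.

```python
import unittest


def py_square(x):
    """Stand-in for the pure-Python implementation."""
    return x * x


def c_square(x):
    """Stand-in for the C implementation (plain Python here)."""
    return x ** 2


class AbstractSquareTests:
    """Test definitions only; these would live in the helper module."""

    square = None  # set by each concrete subclass

    def test_square(self):
        self.assertEqual(self.square(3), 9)


class PySquareTests(AbstractSquareTests, unittest.TestCase):
    square = staticmethod(py_square)


class CSquareTests(AbstractSquareTests, unittest.TestCase):
    square = staticmethod(c_square)


loader = unittest.defaultTestLoader
suite = unittest.TestSuite(
    loader.loadTestsFromTestCase(cls) for cls in (PySquareTests, CSquareTests)
)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())  # 2 True
```

Because the mixin does not derive from `unittest.TestCase`, loaders do not collect it on its own; only the two concrete subclasses are run.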


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Georg Brandl
On 08.07.2010 17:44, Martin Geisler wrote:
> Steve Holden  writes:
> 
>> Martin Geisler wrote:
>>> "Stephen J. Turnbull"  writes:
>>> 
>>>> Just ask Martin, there are too many possibilities here to worry
>>>> about. If maybe we want it, and he is willing to contribute the
>>>> parts he wrote to Python under Python's license, then we can worry
>>>> about whether we really want it and about how much any required
>>>> hoop-jumping will cost.
>>> 
>>> I would be happy to relicense it under the Python license.
>>
>> I believe the ideal outcome, if it is possible, is for you to sign a
>> contributor agreement. This will license your material to the PSF in
>> such a way that we can release it under whatever license we deem
>> necessary.
> 
> Sure, I'll be happy to sign a contributor agreement if you guys think it
> worthwhile to use my little parser and formatter.

Problem is, in the case of help() we have no way of knowing whether the
given __doc__ string is supposed to be (mini)reST.  Of course, reverting
to showing the plain content on parsing errors is one possibility, but
I can still imagine instances where something is successfully interpreted
as reST, but intended to be read and understood verbatim by the author.

It's different for Hg, of course, there you can just decide that help
texts have to be reST.

Georg
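
The fallback mentioned above, and the case it cannot catch, can be shown with a toy example. This is only a sketch: `parse_markup` is a made-up stand-in that understands nothing but `*emphasis*`, not minirst or docutils.

```python
def parse_markup(text):
    """Toy stand-in for a reST parser: only understands *emphasis*.

    Raises ValueError when an asterisk is left unbalanced.
    """
    if text.count("*") % 2:
        raise ValueError("unbalanced emphasis marker")
    parts = text.split("*")
    # Odd-numbered parts sat between asterisks; render them upper-case.
    return "".join(p.upper() if i % 2 else p for i, p in enumerate(parts))


def render_doc(doc):
    """Render doc as markup, falling back to the verbatim text on errors."""
    try:
        return parse_markup(doc)
    except ValueError:
        return doc


print(render_doc("Return the *sum* of a and b."))  # Return the SUM of a and b.
print(render_doc("Return a * b."))                 # Return a * b.
print(render_doc("Compute a*b and c*d."))          # Compute aB AND Cd.
```

The last call parses without error, yet the author clearly meant the asterisks literally, which is the situation the fallback alone does not handle.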


-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 5:52 AM, MRAB  wrote:
> Hi all,
>
> I re-implemented the re module, adding new features and speed
> improvements. It's available at:
>
>    http://pypi.python.org/pypi/regex
>
> under the name "regex" so that it can be tried alongside "re".
>
> I'd be interested in any comments or feedback. How does it compare with
> "re" in terms of speed on real-world data? The benchmarks suggest it
> should be faster, or at worst comparable.
>
> How much interest would there be in putting it in Python 3.2?

The list of fixed bugs/new features is certainly impressive. How does
Python's test suite go if you drop it in place of the current "re"
module? (Ditto for test suites of major applications and frameworks
like Django, etc).

Off the top of my head, I would say that this won't have enough time
to bake properly for inclusion in 3.2, but if the potential benefits
and intended backwards compatibility are borne out by real world usage
and the code fares well under review then it may be a contender for
3.3. If the backwards compatibility isn't quite there (and can't be
improved), then adding it under a name other than "re" wouldn't be
impossible, but it would be a harder sell.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Fred Drake
On Thu, Jul 8, 2010 at 5:21 PM, Georg Brandl  wrote:
> Problem is, in the case of help() we have no way of knowing whether the
> given __doc__ string is supposed to be (mini)reST.  Of course, reverting
> to showing the plain content on parsing errors is one possibility, but
> I can still imagine instances where something is successfully interpreted
> as reST, but intended to be read and understood verbatim by the author.

The docstring processing PEP provides for this:

http://www.python.org/dev/peps/pep-0258/#id42


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Alexander Belopolsky
On Thu, Jul 8, 2010 at 5:21 PM, Georg Brandl  wrote:
..
> Problem is, in the case of help() we have no way of knowing whether the
> given __doc__ string is supposed to be (mini)reST.

I am against mark-up in doc-strings, but this problem can be easily
solved by placing a magic character at __doc__[0] to indicate that the
rest is (mini)reST.  The magic character should be chosen to be
inconspicuous and unlikely to appear at the start of a plain-text
docstring.  For example, any type of closing bracket, such as ), } or ],
will do, or any end-of-sentence punctuation such as . or !.
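
A minimal sketch of how such a marker could be checked, assuming "!" as the
magic character; the marker choice and the helper name split_docstring are
hypothetical and not part of any proposal in this thread:

```python
# Hypothetical marker: the thread only suggests "closing brackets or
# end-of-sentence punctuation", so "!" is an arbitrary pick.
MARKER = "!"


def split_docstring(doc):
    """Return (is_rest, text) for a docstring that may start with MARKER."""
    if doc and doc[0] == MARKER:
        # Marked docstring: strip the marker, treat the remainder as reST.
        return True, doc[1:]
    # Unmarked (or missing) docstring: show it verbatim.
    return False, doc or ""


def sample():
    """!Return *nothing* at all."""


is_rest, text = split_docstring(sample.__doc__)
```

A help()-like tool would only hand the text to a reST parser when is_rest is
true, and print it unchanged otherwise.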


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 5:24 AM, Alexander Belopolsky
 wrote:
> Yes, this is definitely an improvement over my current datetime patch
> [1]_, but it still requires a custom test_main() and does not make the
> test cases discoverable by alternative unittest runners.  I think that
> can be fixed by injecting imported TestCase subclasses into the main
> test module globals.

So include something along the lines of "globals()[obj.__name__] =
obj" in the name hacking loop to make the test classes more
discoverable? Good idea.
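
A rough sketch of that injection, using a stand-in factory instead of the
real datetime test module; make_case and the name TestExample are
hypothetical and only illustrate the globals() trick:

```python
import unittest


def make_case(tag):
    """Build a stand-in TestCase; the real loop would take classes
    from the freshly imported test module instead."""
    class Case(unittest.TestCase):
        implementation = tag

        def test_tag(self):
            self.assertIn(self.implementation, ("pure", "fast"))
    return Case


for tag, suffix in (("pure", "_Pure"), ("fast", "_Fast")):
    obj = make_case(tag)
    obj.__name__ = "TestExample" + suffix
    # Publishing the class as a module-level name lets loaders that scan
    # module attributes find it without a custom test_main().
    globals()[obj.__name__] = obj
```

After the loop the module exposes TestExample_Pure and TestExample_Fast as
ordinary module attributes.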

Including a comment in the main test module along the lines of your
reply to Antoine would be good, too (i.e. this is acknowledged as
being something of a hack to make sure we don't break the datetime
tests when updating them to be applied to both the existing C module
and the new pure Python equivalent). As Antoine says, using explicit
subclasses is a *much* cleaner way of doing this kind of thing when
the tests are being written from scratch to test multiple
implementations within a single interpreter.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Benjamin Peterson
2010/7/8 MRAB :
> Hi all,
>
> I re-implemented the re module, adding new features and speed
> improvements. It's available at:
>
>    http://pypi.python.org/pypi/regex
>
> under the name "regex" so that it can be tried alongside "re".
>
> I'd be interested in any comments or feedback. How does it compare with
> "re" in terms of speed on real-world data? The benchmarks suggest it
> should be faster, or at worst comparable.
>
> How much interest would there be in putting it in Python 3.2?

It would really be nice if you explained incrementally everything you
changed, so it could better be evaluated.


-- 
Regards,
Benjamin


Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Antoine Pitrou
On Thu, 08 Jul 2010 20:52:44 +0100
MRAB  wrote:
> 
> I'd be interested in any comments or feedback. How does it compare with
> "re" in terms of speed on real-world data? The benchmarks suggest it
> should be faster, or at worst comparable.

Can you publish these benchmarks somewhere?
(or send them here)

> How much interest would there be in putting it in Python 3.2?

I think there's certainly interest (especially given that the original
re module doesn't really have an expert and active maintainer).
Since it's a very big change (and one that is rather annoying to undo),
though, it must really not add any maintenance problems, and ideally
you should promise to maintain it at least for a couple of years.
Bonus points if the internals are sufficiently documented, too.

Thanks

Antoine.




Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB

Nick Coghlan wrote:
> On Fri, Jul 9, 2010 at 5:52 AM, MRAB  wrote:
>> Hi all,
>>
>> I re-implemented the re module, adding new features and speed
>> improvements. It's available at:
>>
>>    http://pypi.python.org/pypi/regex
>>
>> under the name "regex" so that it can be tried alongside "re".
>>
>> I'd be interested in any comments or feedback. How does it compare with
>> "re" in terms of speed on real-world data? The benchmarks suggest it
>> should be faster, or at worst comparable.
>>
>> How much interest would there be in putting it in Python 3.2?
>
> The list of fixed bugs/new features is certainly impressive. How does
> Python's test suite go if you drop it in place of the current "re"
> module? (Ditto for test suites of major applications and frameworks
> like Django, etc).
>
> Off the top of my head, I would say that this won't have enough time
> to bake properly for inclusion in 3.2, but if the potential benefits
> and intended backwards compatibility are borne out by real world usage
> and the code fares well under review then it may be a contender for
> 3.3. If the backwards compatibility isn't quite there (and can't be
> improved), then adding it under a name other than "re" wouldn't be
> impossible, but it would be a harder sell.


You should be able to replace:

import re

with:

import regex as re

and still have everything work the same, ie it's backwards compatible
with re.
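
A small sketch of trying this without hard-wiring the dependency, assuming
the PyPI package may or may not be installed; the pattern and sample string
are only illustrative:

```python
try:
    # Third-party module from PyPI, bound to the familiar name.
    import regex as re
except ImportError:
    # Fall back to the standard library module if it is not installed.
    import re

# Code written against the "re" API should not care which one it got.
match = re.search(r"(\d+)-(\d+)", "range 10-20")
low, high = match.group(1), match.group(2)
```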



Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Fred Drake
On Thu, Jul 8, 2010 at 5:42 PM, Alexander Belopolsky
 wrote:
> I am against mark-up in doc-strings, but this problem can be easily
> solved by placing a magic character at __doc__[0] to indicate that the
> rest is  (mini)reST.

Or __docformat__ can be set appropriately.  See:

http://www.python.org/dev/peps/pep-0258/#id42


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread Nick Coghlan
On Fri, Jul 9, 2010 at 7:54 AM, MRAB  wrote:
> You should be able to replace:
>
>    import re
>
> with:
>
>    import regex as re
>
> and still have everything work the same, ie it's backwards compatible
> with re.

That's not what I'm asking. I'm asking what happens if you take an
existing Python installation's re module, move it aside, and drop
regex in its place as "re.py".

Doing that and then running Python's own test suite as well as the
test suites of major Python applications and frameworks like Twisted,
Zope and Django would provide solid evidence that the new version
really *is* backwards compatible, rather than that it is *meant* to be
backwards compatible.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Steve Holden
Fred Drake wrote:
> On Thu, Jul 8, 2010 at 5:42 PM, Alexander Belopolsky
>  wrote:
>> I am against mark-up in doc-strings, but this problem can be easily
>> solved by placing a magic character at __doc__[0] to indicate that the
>> rest is  (mini)reST.
> 
> Or __docformat__ can be set appropriately.  See:
> 
> http://www.python.org/dev/peps/pep-0258/#id42
> 
That is _so_ Python 2 ;-)

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
DjangoCon US September 7-9, 2010    http://djangocon.us/
See Python Video!   http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/



[Python-Dev] Full unicode support for the import machinery

2010-07-08 Thread Victor Stinner
Hi,

I have been trying for some months to fix Python to support undecodable bytes
in the Python path. My first attempt was really huge and sometimes ugly. Where
possible, I extracted short and simple patches and applied them to py3k
(sometimes with an issue, sometimes directly in svn).

When it was no longer possible to split the big patch, I restarted the work
from scratch. The main change from my previous attempt is that I changed
import.c to use unicode strings instead of byte strings. With the surrogate
hack (PEP 383), unicode is a superset of bytes and so it is "forward compatible".
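
A short illustration of the PEP 383 round trip, assuming a byte string that
is not valid UTF-8; the sample bytes are made up for the example:

```python
# Latin-1 encoded "café": the trailing 0xE9 byte is not valid UTF-8.
raw = b"caf\xe9"

# With the "surrogateescape" error handler the undecodable byte is mapped
# to a lone surrogate (U+DCE9) instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("utf-8", "surrogateescape")
```

Because every byte string survives the round trip, a str-based import.c can
still represent paths that the locale encoding cannot decode.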

I just created a branch called "import_unicode" (based on py3k) including all
my patches. It's still a work in progress. It is now possible to start a Python
installed in an undecodable path (e.g. a directory with a non-ASCII character
under the C locale on Linux), which is huge progress, but some tests are still
failing.

The biggest remaining problem is that code object filenames are not reencoded
after the file system encoding is changed (but sys.path and sys.modules
filenames are reencoded). I think that I will register all code objects in a
list to be able to reencode their filename attribute (and then drop the list).

I created an svn branch because I think that it's easier to review short
commits than one huge patch. The branch also helps me to share the
work between different computers, and allows other people to review the
commits (and/or contribute!).

Some people will maybe understand my work better with the "whole picture" :-)

--

There are at least 4 issues related to this work:

 #3080: Full unicode import system
 #4352: imp.find_module() fails with a UnicodeDecodeError
        when called with non-ASCII search paths
 #8611: Python3 doesn't support locale different than utf8
        and an non-ASCII path (POSIX)
 #8988: import + coding = failure (3.1.2/win32)

--

Some examples of previous issues related to my secret goal (patch import 
machinery):

 #8391: os.execvpe() doesn't support surrogates in env
 #8393: subprocess: support undecodable current working directory on POSIX OS
 #8412: os.system() doesn't support surrogates nor bytes
 #8485: Don't accept bytearray as filenames, or simplify the API
 #8514: Add fsencode() functions to os module
 #8610: Python3/POSIX: errors if file system encoding is None 
 (-> create initfsencoding() in pythonrun.c)
 #8715: Create PyUnicode_EncodeFSDefault() function
 ...

-- 
Victor Stinner
http://www.haypocalc.com/


Re: [Python-Dev] New regex module for 3.2?

2010-07-08 Thread MRAB

Nick Coghlan wrote:
> On Fri, Jul 9, 2010 at 7:54 AM, MRAB  wrote:
>> You should be able to replace:
>>
>>    import re
>>
>> with:
>>
>>    import regex as re
>>
>> and still have everything work the same, ie it's backwards compatible
>> with re.
>
> That's not what I'm asking. I'm asking what happens if you take an
> existing Python installation's re module, move it aside, and drop
> regex in its place as "re.py".
>
> Doing that and then running Python's own test suite as well as the
> test suites of major Python applications and frameworks like Twisted,
> Zope and Django would provide solid evidence that the new version
> really *is* backwards compatible, rather than that it is *meant* to be
> backwards compatible.


I had to recompile the .pyd to change its internal name from "regex" to
"re", but apart from that it passed Python's own test suite except for
where I expected it to fail:

1. Some of the inline flags are scoped; for example, putting "(?i)" at
the end of a regex will now have no effect because it's no longer a
global, all-or-nothing, flag.

2. The .sub method will treat unmatched groups in an expansion as empty
strings. The re module raises an exception in such cases, which means
that users currently need a workaround.
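
One common form of that workaround, sketched here with a callable
replacement so it does not depend on how a given module treats unmatched
groups in a template string; the pattern and the helper name bracketed are
only illustrative:

```python
import re


def bracketed(match):
    """Wrap group 1 in brackets, using '' when the group did not match."""
    return "[" + (match.group(1) or "") + "]"


# The optional group matches for "12x" but not for the bare "x".
result = re.sub(r"(\d+)?x", bracketed, "12x x")
```

With the behaviour described in point 2, the plain template r"[\1]" could be
passed to .sub directly instead of the helper function.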


Re: [Python-Dev] Python equivalents in stdlib Was: Include datetime.py in stdlib or not?

2010-07-08 Thread Alexander Belopolsky
On Thu, Jul 8, 2010 at 5:46 PM, Nick Coghlan  wrote:
..
> So include something along the lines of "globals()[obj.__name__] =
> obj" in the name hacking loop to make the test classes more
> discoverable? Good idea.
>

As often happens, a good idea turns quite ugly when facing real world
realities.  I've uploaded a new patch at
http://bugs.python.org/issue7989 and here is what I had to do to make
this work for datetime:

==
import unittest
import sys; sys.modules['_pickle'] = None
from test.support import import_fresh_module, run_unittest
TESTS = 'test.datetimetester'
pure_tests = import_fresh_module(TESTS, fresh=['datetime', '_strptime', 'time'],
                                 blocked=['_datetime'])
fast_tests = import_fresh_module(TESTS, fresh=['datetime', '_datetime',
                                               '_strptime', 'time'])

test_modules = [pure_tests, fast_tests]
test_suffixes = ["_Pure", "_Fast"]

globs = globals()
for module, suffix in zip(test_modules, test_suffixes):
    for name, cls in module.__dict__.items():
        if isinstance(cls, type) and issubclass(cls, unittest.TestCase):
            name += suffix
            cls.__name__ = name
            globs[name] = cls
            def setUp(self, module=module, setup=cls.setUp):
                self._save_sys_modules = sys.modules.copy()
                sys.modules[TESTS] = module
                sys.modules['datetime'] = module.datetime_module
                sys.modules['_strptime'] = module.datetime_module._strptime
                setup(self)
            def tearDown(self, teardown=cls.tearDown):
                teardown(self)
                sys.modules = self._save_sys_modules
            cls.setUp = setUp
            cls.tearDown = tearDown

def test_main():
    run_unittest(__name__)
=

and it still requires that '_pickle' is disabled to pass pickle tests.


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Stephen J. Turnbull
Steve Holden writes:

 > That is _so_ Python 2 ;-)

High praise!


[Python-Dev] Can ftp url start with file:// ?

2010-07-08 Thread Senthil Kumaran
Strictly not a Python question, but I wanted to know from the
experience of others in this list.

Is this a valid ftp url?

# file://ftp.example.com/blah.txt (an ftp URL)

My answer is no. When the scheme is specifically given as file://,
there is no point in treating it as an ftp url (which should
start with ftp://).

If I go ahead with this assumption and fix a bug in stdlib, I am
introducing a regression because at the moment the above is treated as
an ftp url.



-- 
Senthil

A real diplomat is one who can cut his neighbor's throat without having
his neighbour notice it.
-- Trygve Lie


Re: [Python-Dev] Can ftp url start with file:// ?

2010-07-08 Thread Steven D'Aprano
On Fri, 9 Jul 2010 01:52:32 pm Senthil Kumaran wrote:
> Strictly not a Python question, but I wanted to know from the
> experience of others in this list.
>
> Is this a valid ftp url?
>
> # file://ftp.example.com/blah.txt (an ftp URL)
>
> My answer is no. When the scheme is specifically given as file://,
> there is no point in treating it as an ftp url (which should
> start with ftp://).

I agree. Just because the host is *called* ftp doesn't mean you should 
use the ftp protocol to get the file.

http://en.wikipedia.org/wiki/File_URI_scheme

> If I go ahead with this assumption and fix a bug in stdlib, I am
> introducing a regression because at the moment the above is
> considered a ftp url.

Do you have a url for the bug report?



-- 
Steven D'Aprano


Re: [Python-Dev] Can ftp url start with file:// ?

2010-07-08 Thread Senthil Kumaran
On Fri, Jul 09, 2010 at 02:23:40PM +1000, Steven D'Aprano wrote:
> > Is this a valid ftp url?
> >
> > # file://ftp.example.com/blah.txt (an ftp URL)
> >
> > My answer is no. When the scheme is specifically given as file://,
> > there is no point in treating it as an ftp url (which should
> > start with ftp://).
> 
> I agree. Just because the host is *called* ftp doesn't mean you should 
> use the ftp protocol to get the file.

It was not just because the host is called ftp.example.com.

It was about the pattern where file:/// is a local file (correct),
file://localhost/somepath is again a local file (correct again), but
file://anyhost.domain/file.txt is actually treated as ftp (pretty weird).
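
The three cases can be told apart with the standard URL parser. This is only
a sketch of the classification, not the stdlib fix discussed in the issue;
the helper name is_local_file_url is hypothetical:

```python
from urllib.parse import urlsplit


def is_local_file_url(url):
    """True for file: URLs with an empty or 'localhost' host part."""
    parts = urlsplit(url)
    return parts.scheme == "file" and parts.netloc in ("", "localhost")


examples = [
    "file:///etc/hosts",                # no host: local file
    "file://localhost/somepath",        # explicit localhost: local file
    "file://anyhost.domain/file.txt",   # remote host: the disputed case
]
results = [is_local_file_url(url) for url in examples]
```

In all three cases the parsed scheme is "file", so any decision to speak ftp
has to come from the host part, not from the scheme.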

> Do you have a url for the bug report?

http://bugs.python.org/issue8801

Don't go into the suggestion in the report, but just notice that a file
url leads to an ftp error exception.

-- 
Senthil


Re: [Python-Dev] query: docstring formatting in python distutils code

2010-07-08 Thread Georg Brandl
Am 09.07.2010 00:01, schrieb Fred Drake:
> On Thu, Jul 8, 2010 at 5:42 PM, Alexander Belopolsky
>  wrote:
>> I am against mark-up in doc-strings, but this problem can be easily
>> solved by placing a magic character at __doc__[0] to indicate that the
>> rest is  (mini)reST.

Ugh. :)

> Or __docformat__ can be set appropriately.  See:
> 
> http://www.python.org/dev/peps/pep-0258/#id42

Yes, but[tm] it is not always easy to find the correct module to look for
__docformat__ when given an object.
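
A best-effort lookup might look like the sketch below, assuming
inspect.getmodule() can locate the defining module; the helper name
find_docformat and the "plaintext" default are assumptions, and the lookup
simply falls back when no module can be found:

```python
import inspect
import sys
import types


def find_docformat(obj, default="plaintext"):
    """Return the __docformat__ of the module that defines obj, if any."""
    module = inspect.getmodule(obj)  # may be None for some objects
    if module is None:
        return default
    return getattr(module, "__docformat__", default)


# Stand-in module for the example; real code would inspect existing modules.
demo = types.ModuleType("demo_docformat_mod")
demo.__docformat__ = "restructuredtext"
sys.modules[demo.__name__] = demo


def documented():
    """Some *reST* text."""


# Pretend the function was defined in the stand-in module.
documented.__module__ = demo.__name__
fmt = find_docformat(documented)
```

Objects that do not record their defining module end up with the default,
which is the hard case described above.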

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
