Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-30 Thread Steve Holden
R. David Murray wrote: On Tue, 29 Jun 2010 17:02:14 -0400, Steve Holden st...@holdenweb.com wrote: Guido van Rossum wrote: - wrap the binary stream in a text stream wrap how? The ultimate destiny of the text is twofold: I would imagine Guido is talking about an io.TextIOWrapper...in other

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-30 Thread Antoine Pitrou
On Tue, 29 Jun 2010 20:05:29 -0400 R. David Murray rdmur...@bitdance.com wrote: I would imagine Guido is talking about an io.TextIOWrapper...in other words, take the binary file you've just finished grabbing info from, and reread it as a text file in order to grab the actual message content.

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-30 Thread Guido van Rossum
On Wed, Jun 30, 2010 at 9:42 AM, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 29 Jun 2010 20:05:29 -0400 R. David Murray rdmur...@bitdance.com wrote: I would imagine Guido is talking about an io.TextIOWrapper...in other words, take the binary file you've just finished grabbing info

[Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Steve Holden
I hope this is an appropriate dev topic. It seems to me that the unicode discussions of recent days are well highlighted by difficulties I am having using the mailbox module (hardly surprising given the difficulties of handling email generally) even though it passes its tests. I can't find

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Miki Tebeka
Hello Steve, Can someone who is set up to do easily just do a timing of test_mailbox under 2.6 and 3.2, to verify they see the same disparity as me? The test takes about twice as long under 3.1 here. On Ubuntu the timings were: Python 2.6.5: 23.8 sec, Python 2.7rc2: 32.7 sec, Python 3.1.2: 32.3 sec

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Senthil Kumaran
On Tue, Jun 29, 2010 at 09:56:11AM -0400, Steve Holden wrote: Can someone who is set up to do easily just do a timing of test_mailbox under 2.6 and 3.2, to verify they see the same disparity as me? The test Actually, No. Python 2.7b2+ (trunk:81685M, Jun 4 2010, 21:52:06) Ran 274 tests in

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Nick Coghlan
Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the underlying IO access is significantly

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Steve Holden
Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Tim Golden
On 29/06/2010 15:26, Steve Holden wrote: Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Tim Golden
On 29/06/2010 15:51, Tim Golden wrote: On 29/06/2010 15:26, Steve Holden wrote: Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Antoine Pitrou
On Tue, 29 Jun 2010 11:40:50 -0400 Steve Holden st...@holdenweb.com wrote: Sure. I attach the outputs of both files, as well as the program and the data. With profiling (python -m cProfile test3.py) the run took less than a third of a second under 2.5, and 168 seconds under 3.1. I'd say that

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread A.M. Kuchling
On Tue, Jun 29, 2010 at 07:56:22AM -0700, Guido van Rossum wrote: Since you have such a great reproducible test case, could you point the profiler at it? (Perhaps on a reduced dataset... The profiler multiplies your run time by some number between 2 and 10 IIRC.) Let me underline Guido's

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread A.M. Kuchling
On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote: I will leave the profiler output to speak for itself, since I can find nothing much to say about it except that there's a hell of a lot of decoding going on inside mailbox.iterkeys(). The problem is actually in _generate_toc(),

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread R. David Murray
On Tue, 29 Jun 2010 18:34:22 +0200, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 29 Jun 2010 11:40:50 -0400 Steve Holden st...@holdenweb.com wrote: Sure. I attach the outputs of both files, as well as the program and the data. With profiling (python -m cProfile test3.py) the run took

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Antoine Pitrou
On Tue, 29 Jun 2010 12:52:28 -0400 A.M. Kuchling a...@amk.ca wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. I don't see how you can assume UTF-8 for mailbox files, given that each message will have its

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Steve Holden
A.M. Kuchling wrote: On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote: I will leave the profiler output to speak for itself, since I can find nothing much to say about it except that there's a hell of a lot of decoding going on inside mailbox.iterkeys(). The problem is actually

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Guido van Rossum
It should probably be opened in binary mode. Binary files do have a .readline() method (returning a bytes object), and bytes objects have a .startswith() method. The tell positions computed this way are even compatible with those used by the text file. So you could do it this way: - open binary
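
The binary-mode approach described in this message could be sketched roughly as follows. This is an illustration only, not the mailbox module's actual code: the function name `scan_message_offsets` and the sample data are hypothetical, and the sketch ignores mbox details such as "From " escaping inside message bodies.

```python
import io

def scan_message_offsets(stream):
    """Return (start, stop) byte offsets of each message in a binary stream.

    Hypothetical helper: a message is assumed to begin at every line
    that starts with b"From ". No decoding happens during the scan.
    """
    starts = []
    while True:
        pos = stream.tell()          # byte offset of the line about to be read
        line = stream.readline()     # returns a bytes object in binary mode
        if not line:                 # empty bytes means end of file
            end = pos
            break
        if line.startswith(b"From "):
            starts.append(pos)
    # Each message stops where the next one starts; the last stops at EOF.
    stops = starts[1:] + [end]
    return list(zip(starts, stops))

# Made-up sample data standing in for a file opened with mode "rb".
sample = (
    b"From alice@example.com Tue Jun 29 2010\n"
    b"Subject: one\n\nbody one\n"
    b"From bob@example.com Tue Jun 29 2010\n"
    b"Subject: two\n\nbody two\n"
)
toc = scan_message_offsets(io.BytesIO(sample))
```

Because only `readline()`, `startswith()` and `tell()` on bytes are involved, the scan itself performs no character decoding.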

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Steve Holden
Guido van Rossum wrote: It should probably be opened in binary mode. Binary files do have a .readline() method (returning a bytes object), and bytes objects have a .startswith() method. The tell positions computed this way are even compatible with those used by the text file. So you could do

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread R. David Murray
On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden st...@holdenweb.com wrote: A.M. Kuchling wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. Neither! You can't open them as 7-bit text, because

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread R. David Murray
On Tue, 29 Jun 2010 17:02:14 -0400, Steve Holden st...@holdenweb.com wrote: Guido van Rossum wrote: - wrap the binary stream in a text stream wrap how? The ultimate destiny of the text is twofold: I would imagine Guido is talking about an io.TextIOWrapper...in other words, take the binary
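
A minimal sketch of the io.TextIOWrapper idea mentioned here, under stated assumptions: the helper name `reread_as_text` is hypothetical, and the single UTF-8 encoding is a simplification chosen for the example, since the thread points out that each message may carry its own encoding.

```python
import io

def reread_as_text(binary_stream, start, encoding="utf-8"):
    """Seek the binary stream to `start` and wrap it in a text stream.

    Hypothetical helper. The encoding default is an assumption for
    illustration; newline="" leaves line endings untranslated.
    """
    binary_stream.seek(start)
    return io.TextIOWrapper(binary_stream, encoding=encoding, newline="")

# Made-up sample data standing in for a file opened with mode "rb".
raw = io.BytesIO(b"From a@example.com\nSubject: caf\xc3\xa9\n\nbody\n")

raw.readline()            # consume the "From " line as bytes
start = raw.tell()        # byte offset where the message content begins

text = reread_as_text(raw, start)
first_line = text.readline()   # now a str, decoded by the wrapper
```

The same underlying file object serves both phases: byte-level scanning first, decoded reading afterwards.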

Re: [Python-Dev] Mailbox module - timings and functionality changes

2010-06-29 Thread Steve Holden
R. David Murray wrote: On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden st...@holdenweb.com wrote: A.M. Kuchling wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. Neither! You can't open them as 7-bit