[Python-Dev] Mailbox module - timings and functionality changes
I hope this is an appropriate dev topic. It seems to me that the unicode discussions of recent days are well highlighted by difficulties I am having using the mailbox module (hardly surprising given the difficulties of handling email generally) even though it passes its tests. I can't find anything related in the issue tracker (symptoms: one program that works fine under Python 2 in under twenty seconds takes forever (over ten minutes) to fail while creating the (start, stop) index to the mailbox). My code reads Thunderbird mailboxen from file store on my Windows Vista system under 3.1. The failures I am experiencing could easily be encoding issues so I won't post any detail yet, but I am concerned about the timing - even when the code is fixed, if it needs to be, the performance may still make the module of dubious value. Can someone who is set up to do easily just do a timing of test_mailbox under 2.6 and 3.2, to verify they see the same disparity as me? The test takes about twice as long under 3.1 here (and I am concerned that unexercised aspects of the code may extend real-world problem run times by an order of magnitude or more). regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 See Python Video! http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS:http://holdenweb.eventbrite.com/ All I want for my birthday is another birthday - Ian Dury, 1942-2000 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
Hello Steve,

> Can someone who is set up to do easily just do a timing of test_mailbox under 2.6 and 3.2, to verify they see the same disparity as me? The test takes about twice as long under 3.1 here

On Ubuntu timing was:

Python 2.6.5: 23.8sec
Python 2.7rc2: 32.7sec
Python 3.1.2: 32.3sec

All the best,
-- Miki
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, Jun 29, 2010 at 09:56:11AM -0400, Steve Holden wrote:
> Can someone who is set up to do easily just do a timing of test_mailbox under 2.6 and 3.2, to verify they see the same disparity as me? The test

Actually, no.

Python 2.7b2+ (trunk:81685M, Jun 4 2010, 21:52:06)
Ran 274 tests in 27.231s
OK
real 0m27.769s
user 0m1.110s
sys 0m0.440s

Python 3.2a0 (py3k:82364M, Jun 29 2010, 19:37:27)
Ran 268 tests in 24.444s
OK
real 0m25.126s
user 0m2.810s
sys 0m0.270s

This is under Ubuntu 64 Bit. Perhaps the problem you are observing is Windows-only?

-- Senthil

Banectomy, n.: The removal of bruises on a banana. -- Rich Hall, Sniglets
Re: [Python-Dev] Mailbox module - timings and functionality changes
Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the underlying IO access is significantly different between POSIX and Windows, so there could still be something pathological happening at the filesystem manipulation layer. My comparisons are also 2.7 vs 3.2 rather than 2.6 vs 3.1. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the underlying IO access is significantly different between POSIX and Windows, so there could still be something pathological happening at the filesystem manipulation layer. My comparisons are also 2.7 vs 3.2 rather than 2.6 vs 3.1. Cheers, Nick. Thanks for all the timings! If a Windows user could do the same thing that would help ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 See Python Video! http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS:http://holdenweb.eventbrite.com/ All I want for my birthday is another birthday - Ian Dury, 1942-2000 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] what environment variable should contain compiler warning suppression flags?
On Jun 28, 2010, at 05:28 PM, M.-A. Lemburg wrote: How many Python users will compile Python in debug mode ? How many Python users compile Python at all? :) The point is that the default build of Python should use the correct production settings for the C compiler out of the box and that's what AC_PROG_CC is all about. Sure. I'm pretty sure that Python developers who want to use a debug build have enough code foo to get the -O2 turned into a -O0 either by adjust OPT and/or by providing their own CFLAGS env var. Yes, but it's a PITA for several reasons, IMO: * It's pretty underdocumented * It's obscure * It's hard to remember the exact fu needed because you do it infrequently * I usually only remember my mistake when gdb acts funny I strongly suggest that --with-pydebug should be all you need to ensure the best debugging environment, which means turning off compiler optimization. Last time I tried, the -O0 was added and it worked well. (I know this has been in flux though.) Also note that in some cases you may actually want to have a debug build with optimizations turned on, e.g. to track down a compiler optimization bug. Yes, but that's *much* more rare than wanting to step through some bit of C code without going crazy. -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On 29/06/2010 15:26, Steve Holden wrote: Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the underlying IO access is significantly different between POSIX and Windows, so there could still be something pathological happening at the filesystem manipulation layer. My comparisons are also 2.7 vs 3.2 rather than 2.6 vs 3.1. Cheers, Nick. Thanks for all the timings! If a Windows user could do the same thing that would help ... WinXP SP3 2.6 Ran 272 tests in 13.172s 3.1 Ran 267 tests in 15.735s py3k A *lot* of ERROR and FAIL tests WinXP SP3 TJG ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] what environment variable should contain compiler warning suppression flags?
On Jun 28, 2010, at 06:03 PM, M.-A. Lemburg wrote: OPT already uses -O0 if --with-pydebug is used and the compiler supports -g. Since OPT gets added after CFLAGS, the override already happens... So nobody's proposing to drop that? Good! Ignore my last message then. :) -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On 29/06/2010 15:51, Tim Golden wrote: On 29/06/2010 15:26, Steve Holden wrote: Nick Coghlan wrote: Command line: ./python -m test.regrtest -v test_mailbox trunk: Ran 274 tests in 25.239s py3k: Ran 268 tests in 26.263s So I don't see any substantial difference on a Kubuntu 10.04 box (both builds are recent'ish, but not completely up to date). However, the underlying IO access is significantly different between POSIX and Windows, so there could still be something pathological happening at the filesystem manipulation layer. My comparisons are also 2.7 vs 3.2 rather than 2.6 vs 3.1. Cheers, Nick. Thanks for all the timings! If a Windows user could do the same thing that would help ... WinXP SP3 2.6 Ran 272 tests in 13.172s 3.1 Ran 267 tests in 15.735s py3k A *lot* of ERROR and FAIL tests py3k HEAD on Win7 Ran 268 tests in 34.055s TJG ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Pickle security and remote logging
anatoly techtonik techtonik at gmail.com writes:
> insecure. SocketHandler and DatagramHandler docs should at least contain a warning about danger of exposing unpickling interfaces to insecure networks.

I've updated the documentation of SocketHandler.makePickle to mention security concerns, and that the method can be overridden to use a more secure implementation (e.g. HMAC-signed pickles).

Regards,
Vinay Sajip
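As an illustration of the "HMAC-signed pickles" idea mentioned above, here is a minimal sketch, not the documented or recommended implementation. It assumes sender and receiver share a secret key; `SHARED_KEY`, `SignedSocketHandler` and `verify_and_load` are hypothetical names invented for this example, and the wire layout (4-byte length, 32-byte digest, pickle) is an assumption as well.

```python
import hashlib
import hmac
import logging
import logging.handlers
import pickle
import struct

SHARED_KEY = b"replace-with-a-real-secret"  # hypothetical shared secret


class SignedSocketHandler(logging.handlers.SocketHandler):
    """SocketHandler whose payload carries an HMAC-SHA256 digest (sketch)."""

    def makePickle(self, record):
        # The stock method returns a 4-byte big-endian length plus the pickle;
        # drop that length header to get at the raw pickle bytes.
        payload = super().makePickle(record)[4:]
        digest = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
        body = digest + payload
        # Re-add a length header that covers digest and pickle together.
        return struct.pack(">L", len(body)) + body


def verify_and_load(body, key=SHARED_KEY):
    """Receiver side: check the digest before unpickling anything."""
    digest, payload = body[:32], body[32:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("bad signature; refusing to unpickle")
    return pickle.loads(payload)
```

A receiver would read the 4-byte length, read that many bytes, and pass them to `verify_and_load`. The signature only proves the sender knew the key; anyone holding the key can still send a malicious pickle.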
Re: [Python-Dev] what environment variable should contain compiler warning suppression flags?
Barry Warsaw wrote: On Jun 28, 2010, at 05:28 PM, M.-A. Lemburg wrote: How many Python users will compile Python in debug mode ? How many Python users compile Python at all? :) The point is that the default build of Python should use the correct production settings for the C compiler out of the box and that's what AC_PROG_CC is all about. Sure. I'm pretty sure that Python developers who want to use a debug build have enough code foo to get the -O2 turned into a -O0 either by adjust OPT and/or by providing their own CFLAGS env var. Yes, but it's a PITA for several reasons, IMO: * It's pretty underdocumented * It's obscure * It's hard to remember the exact fu needed because you do it infrequently * I usually only remember my mistake when gdb acts funny I strongly suggest that --with-pydebug should be all you need to ensure the best debugging environment, which means turning off compiler optimization. Last time I tried, the -O0 was added and it worked well. (I know this has been in flux though.) Also note that in some cases you may actually want to have a debug build with optimizations turned on, e.g. to track down a compiler optimization bug. Yes, but that's *much* more rare than wanting to step through some bit of C code without going crazy. I agree - trying to step through -O2 optimized code isn't going to help debug your code, it's going to help you debug the optimizer. That's a very rare use case. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 See Python Video! http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS:http://holdenweb.eventbrite.com/ All I want for my birthday is another birthday - Ian Dury, 1942-2000 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, 29 Jun 2010 11:40:50 -0400 Steve Holden st...@holdenweb.com wrote: Sure. I attach the outputs of both files, as well as the program and the data. With profiling (python -m cProfile test3.py) the run took less than a third of a second under 2.5, and 168 seconds under 3.1. I'd say that was problematical :) I will leave the profiler output to speak for itself, since I can find nothing much to say about it except that there's a hell of a lot of decoding going on inside mailbox.iterkeys(). Ok, a lot of time is spent in cp1252 decoding. Somewhat less time, but still too much of it, is spent in TextIOWrapper.tell(). This seems to imply that mailbox files are opened in text mode, which sounds wrong to me. Perhaps Andrew can shed more light on this? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
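For readers who want to reproduce this kind of measurement from inside a script rather than with `python -m cProfile`, the following is a small sketch using the standard `cProfile` and `pstats` modules. `profile_call` is a hypothetical helper written for this example; it is not part of the thread's test program.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sorting by cumulative time makes hot spots such as repeated decoding
    # or tell() calls show up near the top of the report.
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_call(sorted, list(range(1000, 0, -1))))
```

Pointing `profile_call` at the function that builds the mailbox index, on a reduced data set, should show whether decoding or `tell()` dominates on a given platform.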
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, Jun 29, 2010 at 07:56:22AM -0700, Guido van Rossum wrote: Since you have such a great reproducible test case, could you point the profiler at it? (Perhaps on a reduced dataset... The profiler multiples your run time by some number between 2 and 10 IIRC.) Let me underline Guido's suggestion. Steve, I've done a lot of mailbox.py stuff and can look at your problem, but off the top of my head, my suspicion would be that I/O is the culprit, and a profile could confirm that. My thought is that mailbox.py is opening the file in some reading mode that ends up doing a lot more processing on Windows than on Unix because of universal newlines or something like that. --amk ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
> I will leave the profiler output to speak for itself, since I can find nothing much to say about it except that there's a hell of a lot of decoding going on inside mailbox.iterkeys().

The problem is actually in _generate_toc(), which is reading through the entire file to figure out where all the 'From' lines that start messages are located. TextIOWrapper()'s tell() method seems to be very slow, so one help is to only call tell() when necessary; patch:

svn diff Lib/
Index: Lib/mailbox.py
===================================================================
--- Lib/mailbox.py (revision 82346)
+++ Lib/mailbox.py (working copy)
@@ -775,13 +775,14 @@
         starts, stops = [], []
         self._file.seek(0)
         while True:
-            line_pos = self._file.tell()
             line = self._file.readline()
             if line.startswith('From '):
+                line_pos = self._file.tell()
                 if len(stops) < len(starts):
                     stops.append(line_pos - len(os.linesep))
                 starts.append(line_pos)
             elif not line:
+                line_pos = self._file.tell()
                 stops.append(line_pos)
                 break
         self._toc = dict(enumerate(zip(starts, stops)))

But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this.

--amk
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, 29 Jun 2010 18:34:22 +0200, Antoine Pitrou solip...@pitrou.net wrote: On Tue, 29 Jun 2010 11:40:50 -0400 Steve Holden st...@holdenweb.com wrote: Sure. I attach the outputs of both files, as well as the program and the data. With profiling (python -m cProfile test3.py) the run took less than a third of a second under 2.5, and 168 seconds under 3.1. I'd say that was problematical :) I will leave the profiler output to speak for itself, since I can find nothing much to say about it except that there's a hell of a lot of decoding going on inside mailbox.iterkeys(). Ok, a lot of time is spent in cp1252 decoding. Somewhat less time, but still too much of it, is spent in TextIOWrapper.tell(). This seems to imply that mailbox files are opened in text mode, which sounds wrong to me. Perhaps Andrew can shed more light on this? Given the current state of the email package for python3, it makes sense that it would open them in text mode. email can't currently process bytes, only text. -- R. David Murray www.bitdance.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, 29 Jun 2010 12:52:28 -0400 A.M. Kuchling a...@amk.ca wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. I don't see how you can assume UTF-8 for mailbox files, given that each message will have its particular encoding. Besides, Steve's profile results show that you are not using UTF-8, but rather the local encoding, which is cp1252 under his Windows setup. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
A.M. Kuchling wrote:
> But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this.

Neither! You can't open them as 7-bit text, because real-world email does contain bytes whose ordinal value exceeds 127. You can't open them using a text encoding because theoretically there might be ASCII headers that indicate that parts of the content are in specific character sets or encodings.

If only we had a data structure that easily allowed us to manipulate 8-bit characters ...

regards
 Steve
Re: [Python-Dev] Mailbox module - timings and functionality changes
It should probably be opened in binary mode. Binary files do have a .readline() method (returning a bytes object), and bytes objects have a .startswith() method. The tell positions computed this way are even compatible with those used by the text file. So you could do it this way:

- open binary stream
- compute TOC by reading through it using .readline() and .tell()
- rewind (don't close)
- wrap the binary stream in a text stream
- use that for the rest of the code

--Guido

On Tue, Jun 29, 2010 at 10:54 AM, Steve Holden st...@holdenweb.com wrote:
> A.M. Kuchling wrote:
>> But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this.
>
> Neither! You can't open them as 7-bit text, because real-world email does contain bytes whose ordinal value exceeds 127. You can't open them using a text encoding because theoretically there might be ASCII headers that indicate that parts of the content are in specific character sets or encodings.
>
> If only we had a data structure that easily allowed us to manipulate 8-bit characters ...

-- --Guido van Rossum (python.org/~guido)
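The first steps Guido lists (open a binary stream, compute the table of contents with .readline() and .tell()) might look roughly like the sketch below. It is not the mailbox module's code: `generate_toc` is a hypothetical stand-alone function, and for simplicity each stop offset is the start of the next 'From ' line, so the blank separator line is not trimmed.

```python
import io


def generate_toc(binary_file):
    """Return {key: (start, stop)} byte offsets for each mbox message."""
    starts, stops = [], []
    binary_file.seek(0)
    while True:
        # On a binary stream tell() is a plain byte offset; no decoder
        # state has to be reconstructed to compute it.
        line_pos = binary_file.tell()
        line = binary_file.readline()
        if line.startswith(b"From "):
            if len(stops) < len(starts):
                stops.append(line_pos)  # the previous message ends here
            starts.append(line_pos)
        elif not line:  # empty bytes object means end of file
            stops.append(line_pos)
            break
    # zip() drops the extra stop recorded for a file with no messages.
    return dict(enumerate(zip(starts, stops)))


if __name__ == "__main__":
    sample = (b"From a@example.com\nSubject: one\n\nbody \xe9\n\n"
              b"From b@example.com\nSubject: two\n\nbody\n")
    print(generate_toc(io.BytesIO(sample)))
```

Because the scan works on bytes, a stray non-ASCII byte such as `\xe9` in a message body cannot raise a decoding error while the index is being built.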
Re: [Python-Dev] Mailbox module - timings and functionality changes
Guido van Rossum wrote:
> It should probably be opened in binary mode. Binary files do have a .readline() method (returning a bytes object), and bytes objects have a .startswith() method. The tell positions computed this way are even compatible with those used by the text file. So you could do it this way:
>
> - open binary stream
> - compute TOC by reading through it using .readline() and .tell()
> - rewind (don't close)

Because closing is inefficient, or because it breaks the algorithm?

> - wrap the binary stream in a text stream

Wrap how? The ultimate destiny of the text is twofold:

1) To be stored as some kind of LOB in a database, and
2) Therefrom to be reconstituted and parsed into email.Message objects.

Is the wrapping a one-off operation or a software layer? Sorry, being a bit dense here, I know.

regards
 Steve

> - use that for the rest of the code
>
> --Guido
Re: [Python-Dev] Pickle security and remote logging
On Tue, Jun 29, 2010 at 6:15 PM, Vinay Sajip vinay_sa...@yahoo.co.uk wrote:
> I've updated the documentation of SocketHandler.makePickle to mention security concerns, and that the method can be overridden to use a more secure implementation (e.g. HMAC-signed pickles).

Thanks. But I doubt that the HMAC complication helps to protect a logging server. If the shared key is compromised, the server becomes vulnerable. I would prefer an approach where no code execution is possible: some alternative serialization for transmitting log data structures over the network. Protocol buffers come to mind first, but they seem to be overkill, and the stdlib doesn't include any implementation.

-- anatoly t.
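One stdlib-only way to get a serialization that cannot execute code on the receiving side is JSON. The sketch below is an assumption about how that could be wired in, not something proposed in the thread: `JSONSocketHandler` and `decode_record` are hypothetical names, and a matching receiver would have to replace the unpickling step with `decode_record`.

```python
import json
import logging
import logging.handlers
import struct


class JSONSocketHandler(logging.handlers.SocketHandler):
    """Send log records as length-prefixed JSON instead of pickles (sketch)."""

    def makePickle(self, record):
        d = dict(record.__dict__)
        d["msg"] = record.getMessage()  # merge args into the message text
        d["args"] = None
        d["exc_info"] = None            # simplification: traceback objects are dropped
        # default=str turns any non-JSON value added via `extra` into text.
        payload = json.dumps(d, default=str).encode("utf-8")
        return struct.pack(">L", len(payload)) + payload


def decode_record(payload):
    """Receiver side: rebuild a LogRecord without unpickling anything."""
    return logging.makeLogRecord(json.loads(payload.decode("utf-8")))
```

The trade-off is that only plain data survives the trip; arbitrary objects attached to a record arrive as strings.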
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden st...@holdenweb.com wrote: A.M. Kuchling wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. Neither! You can't open them as 7-bit text, because real-world email does contain bytes whose ordinal value exceeds 127. You can't open them using a text encoding because theoretically there might be ASCII headers that indicate that parts of the content are in specific character sets or encodings. If only we had a data structure that easily allowed us to manipulate 8-bit characters ... email6 *will* handle this use case. When it exists :) But note that it is *not* just a matter of easily handling 8 bit characters. There are a whole bunch of algorithms needed for interpreting that 7 and 8 bit data. All the info is there in the email headers, but being able to do string operations on 8 bit byte strings doesn't get you the answers you need by itself. It really is the case that the Python3 bytes/unicode split forces us to redo most of the algorithms so that they handle bytes and text *correctly*. This isn't a trivial undertaking, but the end result will be well worth it. -- R. David Murray www.bitdance.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Mailbox module - timings and functionality changes
On Tue, 29 Jun 2010 17:02:14 -0400, Steve Holden st...@holdenweb.com wrote:
> Guido van Rossum wrote:
>> - wrap the binary stream in a text stream
>
> wrap how? The ultimate destiny of the text is twofold:

I would imagine Guido is talking about an io.TextIOWrapper... in other words, take the binary file you've just finished grabbing info from, and reread it as a text file in order to grab the actual message content.

If you have messages in your files that are using an 8bit content transfer encoding, then you (currently) will have some problems unless the charset happens to be the one you use when you wrap the binary stream as a text stream.

-- R. David Murray www.bitdance.com
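A minimal sketch of the io.TextIOWrapper step described above follows. `read_as_text` is a hypothetical helper, and the default of latin-1 is purely an assumption chosen because it maps every byte to some character; it is not what the mailbox module does, and it illustrates the charset caveat rather than solving it.

```python
import io


def read_as_text(binary_file, encoding="latin-1"):
    """Rewind an already-open binary stream and re-read it as text."""
    binary_file.seek(0)
    # newline="" keeps \r\n and \n exactly as stored in the file.
    text = io.TextIOWrapper(binary_file, encoding=encoding, newline="")
    try:
        return text.read()
    finally:
        # detach() releases the binary stream without closing it, so the
        # caller can keep using it for offset-based access.
        text.detach()


if __name__ == "__main__":
    raw = io.BytesIO(b"Subject: caf\xe9\r\n\r\nbody\r\n")
    print(repr(read_as_text(raw)))           # decodes, possibly to the wrong characters
    try:
        read_as_text(raw, encoding="ascii")  # an 8-bit byte breaks a strict charset
    except UnicodeDecodeError as exc:
        print("decode failed:", exc.reason)
```

The second call shows the problem with 8bit content: if the wrapper's charset does not match the message, decoding either fails or silently yields the wrong text.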
Re: [Python-Dev] Mailbox module - timings and functionality changes
R. David Murray wrote: On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden st...@holdenweb.com wrote: A.M. Kuchling wrote: But should mailboxes really be opened in a UTF-8 encoding, or should they be treated as 7-bit text? I'll have to think about this. Neither! You can't open them as 7-bit text, because real-world email does contain bytes whose ordinal value exceeds 127. You can't open them using a text encoding because theoretically there might be ASCII headers that indicate that parts of the content are in specific character sets or encodings. If only we had a data structure that easily allowed us to manipulate 8-bit characters ... email6 *will* handle this use case. When it exists :) But note that it is *not* just a matter of easily handling 8 bit characters. There are a whole bunch of algorithms needed for interpreting that 7 and 8 bit data. All the info is there in the email headers, but being able to do string operations on 8 bit byte strings doesn't get you the answers you need by itself. It really is the case that the Python3 bytes/unicode split forces us to redo most of the algorithms so that they handle bytes and text *correctly*. This isn't a trivial undertaking, but the end result will be well worth it. I completely agree. The unusual thing here is that I of all people should find himself running into these issues, since my use of Python is normally pretty conservative. Since the course I am currently writing is already overdue I have to find answers now to problems that were present in the initial 3.0 release and have not received much attention since. You know that I support your work to revise the email package. I hope that we can eventually have it incorporate mailbox readers as well. regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 See Python Video! 
http://python.mirocommunity.org/ Holden Web LLC http://www.holdenweb.com/ UPCOMING EVENTS:http://holdenweb.eventbrite.com/ All I want for my birthday is another birthday - Ian Dury, 1942-2000 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] OS X buildbots: why am I skipping these tests?
My Leopard and Tiger PPC buildbots are momentarily green! But I'm looking into why I'm skipping some tests. My buildbots are up-to-date OS-wise and very vanilla, with the latest applicable Xcode.

4 skips unexpected on darwin:
    test_gdb test_ioctl test_readline test_ttk_guionly

Three of these (gdb, readline, ttk_guionly) are just bad predictions of which tests should skip on Darwin, I think -- gdb is only version 6, so that test won't run, readline doesn't get built, ttk doesn't work without Tcl/Tk 8.5.

But the skip of test_ioctl baffles me.

test_ioctl skipped -- Unable to open /dev/tty

But when I log in via ssh and try it with the system python:

~ wjanssen$ python
Python 2.5.1 (r251:54863, Jun 17 2009, 20:37:34)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> open("/dev/tty")
<open file '/dev/tty', mode 'r' at 0x597b8>

Seems to work fine. So this I don't understand. Any ideas, anyone?

Bill
Re: [Python-Dev] what environment variable should contain compiler warning suppression flags?
Steve Holden writes:

> I agree - trying to step through -O2 optimized code isn't going to help debug your code, it's going to help you debug the optimizer. That's a very rare use case.

Not really. I don't have a lot of practice in debugging at that level, so take it with a grain of salt, but what I've found with XEmacs code is that debugging at -O0 is less often helpful than debugging at -O2. Quite often a naive compilation strategy is used which basically turns those C statements into macros for the underlying assembler, and the code works the way the author thinks it should. But his assumptions are invalid, and when optimized it fails. So I guess you can call that debugging the optimizer if you like.
Re: [Python-Dev] OS X buildbots: why am I skipping these tests?
On Tue, Jun 29, 2010 at 7:55 PM, Bill Janssen jans...@parc.com wrote:
> My Leopard and Tiger PPC buildbots are momentarily green! But I'm looking into why I'm skipping some tests. My buildbots are up-to-date OS-wise and very vanilla, with the latest applicable Xcode.
>
>   4 skips unexpected on darwin:
>       test_gdb test_ioctl test_readline test_ttk_guionly
>
> Three of these (gdb, readline, ttk_guionly) are just bad predictions of which tests should skip on Darwin, I think -- gdb is only version 6, so that test won't run, readline doesn't get built, ttk doesn't work without Tcl/Tk 8.5.

So it looks like you could get readline and ttk to run and pass by separately downloading and installing readline (I've done this many times before) and Tcl/Tk (no idea but I suppose it should work).

> But the skip of test_ioctl baffles me.
>
>   test_ioctl skipped -- Unable to open /dev/tty
>
> But when I log in via ssh and try it with the system python:
>
>   ~ wjanssen$ python
>   Python 2.5.1 (r251:54863, Jun 17 2009, 20:37:34)
>   [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
>   Type "help", "copyright", "credits" or "license" for more information.
>   >>> open("/dev/tty")
>   <open file '/dev/tty', mode 'r' at 0x597b8>
>
> Seems to work fine. So this I don't understand. Any ideas, anyone?

Maybe the buildbot runs the tests as a tty-less daemon process. If you ask me it's pretty crazy to have a test that requires a tty. But there you have it -- and it's the same in Python 3. (But then again, who knows, I might have written that test. ;-)

-- 
--Guido van Rossum (python.org/~guido)
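The "tty-less daemon process" hypothesis above can be tried out locally with a rough sketch like the following. It assumes a POSIX system and a Python 3 interpreter with `subprocess.run`; `start_new_session=True` is used only as a stand-in for however the buildbot actually detaches from its terminal, which the thread does not specify, and the names `CHILD` and `open_tty_without_terminal` are hypothetical.

```python
import subprocess
import sys

# Child program: try to open /dev/tty and report the outcome on stdout.
CHILD = r"""
import errno
try:
    open("/dev/tty").close()
    print("ok")
except OSError as exc:
    print("failed:", errno.errorcode.get(exc.errno, exc.errno))
"""


def open_tty_without_terminal():
    """Run CHILD in a new session, i.e. without a controlling terminal."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        start_new_session=True,        # POSIX only: setsid() in the child
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(open_tty_without_terminal())
```

Run from an interactive ssh login, the child is still expected to report a failure, because the new session has no controlling terminal even though the parent shell has one. That would be consistent with the same `open` call succeeding at the prompt and failing under the buildbot.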
Re: [Python-Dev] OS X buildbots: why am I skipping these tests?
> Seems to work fine. So this I don't understand. Any ideas, anyone?

Didn't we discuss this before? The buildbot slave has no controlling terminal anymore, hence it cannot open /dev/tty.

If you are curious, just patch your checkout to output the exact errno (e.g. to stdout), and trigger a build through the web.

Regards,
Martin
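For the suggestion to output the exact errno, a minimal sketch of such a diagnostic might look like this. It is not the code in test_ioctl; the helper name `describe_open_failure` is hypothetical, and it only reports why an open attempt failed.

```python
import errno
import os


def describe_open_failure(path="/dev/tty"):
    """Return None if path can be opened, else a string naming the errno."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. ENOENT or ENXIO.
        name = errno.errorcode.get(exc.errno, "unknown")
        return "%s (errno %s: %s)" % (name, exc.errno, os.strerror(exc.errno))
    os.close(fd)
    return None


if __name__ == "__main__":
    reason = describe_open_failure()
    print("opened fine" if reason is None else "cannot open: " + reason)
```

Printing the symbolic name alongside the number makes it easier to tell a missing device node apart from a process that simply has no controlling terminal.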
[Python-Dev] Taking over the Mercurial Migration
It seems that both Dirkjan and Brett are very caught up with real life for the coming months. So I suggest that some other committer who favors the Mercurial transition steps forward and takes over this project.

If nobody volunteers, I propose that we release 3.2 from Subversion, and reconsider Mercurial migration next year.

Regards,
Martin