Re: [Python-Dev] cpython (3.3): don't run frame if it has no stack (closes #17669)
On Thu, Apr 11, 2013 at 12:28 PM, Nick Coghlan ncogh...@gmail.com wrote:
> On 11 Apr 2013 07:49, Antoine Pitrou solip...@pitrou.net wrote:
>> On Wed, 10 Apr 2013 23:01:46 +0200 (CEST) benjamin.peterson python-check...@python.org wrote:
>>> http://hg.python.org/cpython/rev/35cb75b9d653
>>> changeset: 83238:35cb75b9d653
>>> branch: 3.3
>>> parent: 83235:172f825d7fc9
>>> user: Benjamin Peterson benja...@python.org
>>> date: Wed Apr 10 17:00:56 2013 -0400
>>> summary: don't run frame if it has no stack (closes #17669)
>>
>> Wouldn't it be better with a test?
>
> Benjamin said much the same thing on the issue, but persuading the interpreter to create a frame without a stack that then gets exposed to this code path isn't straightforward :P
>
> Cheers, Nick.

Maybe it's worth understanding in which circumstances this holds? Please write a test; we'll run into a similar issue at some point, I'm sure.

Cheers, fijal
___
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cpython (3.3): don't run frame if it has no stack (closes #17669)
2013/4/11 Maciej Fijalkowski fij...@gmail.com:
> On Thu, Apr 11, 2013 at 12:28 PM, Nick Coghlan ncogh...@gmail.com wrote:
>> On 11 Apr 2013 07:49, Antoine Pitrou solip...@pitrou.net wrote:
>>> On Wed, 10 Apr 2013 23:01:46 +0200 (CEST) benjamin.peterson python-check...@python.org wrote:
>>>> http://hg.python.org/cpython/rev/35cb75b9d653
>>>> changeset: 83238:35cb75b9d653
>>>> branch: 3.3
>>>> parent: 83235:172f825d7fc9
>>>> user: Benjamin Peterson benja...@python.org
>>>> date: Wed Apr 10 17:00:56 2013 -0400
>>>> summary: don't run frame if it has no stack (closes #17669)
>>>
>>> Wouldn't it be better with a test?
>>
>> Benjamin said much the same thing on the issue, but persuading the interpreter to create a frame without a stack that then gets exposed to this code path isn't straightforward :P
>>
>> Cheers, Nick.
>
> Maybe it's worth understanding in which circumstances this holds? Please write a test; we'll run into a similar issue at some point, I'm sure.

Probably not. It's related to cyclic GC.

--
Regards, Benjamin
Re: [Python-Dev] cpython: Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
On 09.04.13 23:29, victor.stinner wrote:
> http://hg.python.org/cpython/rev/53879d380313
> changeset: 83216:53879d380313
> parent: 83214:b7f2d28260b4
> user: Victor Stinner victor.stin...@gmail.com
> date: Tue Apr 09 21:53:09 2013 +0200
> summary: Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings: cp037, cp500 and iso8859_1 codecs

I deliberately specialized only the most typical case in order to reduce maintenance cost. Further optimization for two encodings that are not the most popular is probably not worth an additional 25 lines of code.
Re: [Python-Dev] cpython: Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings:
2013/4/11 Serhiy Storchaka storch...@gmail.com:
> On 09.04.13 23:29, victor.stinner wrote:
>> http://hg.python.org/cpython/rev/53879d380313
>> changeset: 83216:53879d380313
>> parent: 83214:b7f2d28260b4
>> user: Victor Stinner victor.stin...@gmail.com
>> date: Tue Apr 09 21:53:09 2013 +0200
>> summary: Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings: cp037, cp500 and iso8859_1 codecs
>
> I deliberately specialized only the most typical case in order to reduce maintenance cost. Further optimization for two encodings that are not the most popular is probably not worth an additional 25 lines of code.

I did the commit while I was trying to avoid usage of PyUnicode_READ_CHAR() and PyUnicode_READ() in unicodeobject.c (slow macros). I was surprised that PyUnicode_DecodeCharmap() has a fast path for Py_UCS2 mapping but not for Py_UCS1 mapping. After implementing the fast-path, I realized that only very few codecs use it.

So what do you suggest? Revert the commit to restore the following hack (to only have one fast-path)?

    '\ufffe' ## Widen to UCS2 for optimization

The Py_UCS1 fast-path has a small advantage: decoding cannot fail (no need to call an error handler).

Victor
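For readers who have not looked at charmap codecs, here is a minimal sketch of what a "pure 8 bit" decoding table means at the Python level. The table is built by hand purely for illustration (it is the identity mapping, so it behaves like iso8859_1); the names decoding_table and data are assumptions of this sketch, and it says nothing about how the C fast path itself is written.

```python
import codecs

# Hypothetical table for illustration: 256 entries, each code point below
# U+0100, so every decoded character fits in one byte (the Py_UCS1 case).
decoding_table = "".join(chr(i) for i in range(256))

data = bytes([0x48, 0x69, 0xE9])

# charmap_decode() maps each input byte through the table and returns
# the decoded string together with the number of bytes consumed.
text, consumed = codecs.charmap_decode(data, "strict", decoding_table)

# Every byte value has an entry, so no byte can be "unmapped" and the
# error handler is never needed for this table.
print(repr(text), consumed)
```

A table that contains a character above U+00FF (or the '\ufffe' "undefined" marker) would not qualify as pure 8 bit in this sense.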
[Python-Dev] casefolding in pathlib (PEP 428)
Hey Antoine,

Some of my Dropbox colleagues just drew my attention to the occurrence of case folding in pathlib.py. Basically, case folding as an approach to comparing pathnames is fatally flawed. The issues include:

- most OSes these days allow the mounting of both case-sensitive and case-insensitive filesystems simultaneously
- the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted
- case folding requires domain knowledge, e.g. Turkish dotless I
- normalization is a mess, even on OSX, where it's better defined than elsewhere

One or more of them may reply-all to this message with more details.

--
--Guido van Rossum (python.org/~guido)
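Two of the issues listed above (the Turkish dotless I and normalization) can be reproduced with nothing but the str methods and the unicodedata module. This is only a small sketch of the symptoms; the variable names are made up for the example, and it does not model what any particular filesystem does.

```python
import unicodedata

# Turkish dotless i: upper() followed by lower() does not round-trip,
# because Python's default case mapping is not locale-aware.
dotless = "\u0131"                      # LATIN SMALL LETTER DOTLESS I
round_tripped = dotless.upper().lower()
print(repr(dotless.upper()), round_tripped == dotless)

# Normalization: the same visible name can be two different code point
# sequences, so a plain string comparison says they differ.
name = "caf\u00e9"
nfc = unicodedata.normalize("NFC", name)  # precomposed e-acute
nfd = unicodedata.normalize("NFD", name)  # 'e' + combining acute accent
print(nfc == nfd, len(nfc), len(nfd))
```

Any comparison built on lower() or upper() inherits both problems, which is the sense in which case folding "requires domain knowledge".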
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Thu, Apr 11, 2013 at 02:11:21PM -0700, Guido van Rossum gu...@python.org wrote:
> - the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted

Into the partition, I guess, not the physical disc?

Oleg.
--
Oleg Broytman http://phdru.name/ p...@phdru.name
Programmers don't die, they just GOSUB without RETURN.
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Thu, 11 Apr 2013 14:11:21 -0700 Guido van Rossum gu...@python.org wrote:
> Hey Antoine,
>
> Some of my Dropbox colleagues just drew my attention to the occurrence of case folding in pathlib.py. Basically, case folding as an approach to comparing pathnames is fatally flawed. The issues include:
>
> - most OSes these days allow the mounting of both case-sensitive and case-insensitive filesystems simultaneously
> - the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted

The problem is that:

- if you always make the comparison case-sensitive, you'll get false negatives
- if you make the comparison case-insensitive under Windows, you'll get false positives

My assumption was that, globally, the number of false positives in case (2) is much less than the number of false negatives in case (1).

On the other hand, one could argue that all comparisons should be case-sensitive *and* the proper way to test for identical paths is to access the filesystem. Which makes me think, perhaps concrete paths should get a samefile method as in os.path.samefile().

Hmm, I think I'm tending towards the latter right now.

Regards

Antoine.
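As a rough illustration of "access the filesystem" rather than compare strings, the sketch below uses os.path.samefile(), the function named above, on two different spellings of one path. The file names and the temporary directory are invented for the example, and both paths must exist for samefile() to work.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "README")
    with open(original, "w") as f:
        f.write("hello\n")

    # A second spelling of the same file, via a redundant "." component.
    alias = os.path.join(d, ".", "README")

    # A genuinely different file in the same directory.
    other = os.path.join(d, "other")
    with open(other, "w") as f:
        f.write("bye\n")

    strings_equal = (original == alias)           # pure string comparison
    same = os.path.samefile(original, alias)      # asks the filesystem
    different = os.path.samefile(original, other)

print(strings_equal, same, different)
```

The string comparison and the filesystem answer disagree for the alias, and no case folding rule is involved in getting the right answer.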
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On 12 April 2013 09:18, Oleg Broytman p...@phdru.name wrote:
> On Thu, Apr 11, 2013 at 02:11:21PM -0700, Guido van Rossum gu...@python.org wrote:
>> - the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted
>
> Into the partition, I guess, not the physical disc?

CDROMs - Joliet IIRC - so yes, physical disc.

-Rob
--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Cloud Services
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Fri, Apr 12, 2013 at 09:29:44AM +1200, Robert Collins robe...@robertcollins.net wrote:
> On 12 April 2013 09:18, Oleg Broytman p...@phdru.name wrote:
>> On Thu, Apr 11, 2013 at 02:11:21PM -0700, Guido van Rossum gu...@python.org wrote:
>>> - the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted
>>
>> Into the partition, I guess, not the physical disc?
>
> CDROMs - Joliet IIRC - so yes, physical disc.

Ah, I've completely forgotten about that one. I was thinking in terms of filesystems. Thank you for reminding!

Oleg.
--
Oleg Broytman http://phdru.name/ p...@phdru.name
Programmers don't die, they just GOSUB without RETURN.
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Thu, Apr 11, 2013 at 2:27 PM, Antoine Pitrou solip...@pitrou.net wrote:
> On Thu, 11 Apr 2013 14:11:21 -0700 Guido van Rossum gu...@python.org wrote:
>> Hey Antoine,
>>
>> Some of my Dropbox colleagues just drew my attention to the occurrence of case folding in pathlib.py. Basically, case folding as an approach to comparing pathnames is fatally flawed. The issues include:
>>
>> - most OSes these days allow the mounting of both case-sensitive and case-insensitive filesystems simultaneously
>> - the case-folding algorithm on some filesystems is burned into the disk when the disk is formatted
>
> The problem is that:
>
> - if you always make the comparison case-sensitive, you'll get false negatives
> - if you make the comparison case-insensitive under Windows, you'll get false positives
>
> My assumption was that, globally, the number of false positives in case (2) is much less than the number of false negatives in case (1).
>
> On the other hand, one could argue that all comparisons should be case-sensitive *and* the proper way to test for identical paths is to access the filesystem. Which makes me think, perhaps concrete paths should get a samefile method as in os.path.samefile().
>
> Hmm, I think I'm tending towards the latter right now.

Python on OSX has been using (1) for a decade now without major problems.

Perhaps it would be best if the code never called lower() or upper() (not even indirectly via os.path.normcase()). Then any case-folding and path-normalization bugs are the responsibility of the application, and we won't have to worry about how to fix the stdlib without breaking backwards compatibility if we ever figure out how to fix this (which I somehow doubt we ever will anyway :-).

Some other issues to be mindful of:

- On Linux, paths are really bytes; on Windows (at least NTFS), they are really (16-bit) Unicode; on Mac, they are UTF-8 in a specific normal form (except on some external filesystems).
- On Windows, short names are still supported, making the number of ways to spell the path for any given file even larger.

--
--Guido van Rossum (python.org/~guido)
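The indirect lower() call mentioned above is easy to see from any platform, since both flavours of os.path are importable as ntpath and posixpath. The path string below is made up for the demonstration; this is only a sketch of what normcase() does to it, not of how pathlib uses it.

```python
import ntpath
import posixpath

path = "Docs/README"

# The Windows flavour lowercases the name and converts slashes.
windows_style = ntpath.normcase(path)

# The POSIX flavour returns the path unchanged.
posix_style = posixpath.normcase(path)

print(repr(windows_style), repr(posix_style))

# After normcase(), two names differing only by case compare equal under
# the Windows rules, whether or not the underlying filesystem agrees.
print(ntpath.normcase("readme") == ntpath.normcase("README"))
```

Whether that equality is a correct answer depends on the filesystem the path actually lives on, which is exactly the information a string function does not have.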
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On 11Apr2013 14:11, Guido van Rossum gu...@python.org wrote:
| Some of my Dropbox colleagues just drew my attention to the occurrence
| of case folding in pathlib.py. Basically, case folding as an approach
| to comparing pathnames is fatally flawed. The issues include:
|
| - most OSes these days allow the mounting of both case-sensitive and
| case-insensitive filesystems simultaneously
|
| - the case-folding algorithm on some filesystems is burned into the
| disk when the disk is formatted
|
| - case folding requires domain knowledge, e.g. Turkish dotless I
|
| - normalization is a mess, even on OSX, where it's better defined than elsewhere

Yes, but what's the use case? Specifically, _why_ are you comparing pathnames?

To my mind case folding is just one mode of filename conflict. Surely there are others (forbidden characters in some domains, like colons; names significant only to a certain number of characters; and so forth).

Thus: what specific problem are you case-folding to address?

On a personal basis I would normally address this kind of thing with stat(), avoiding any app knowledge about pathname rules: does this path exist, or are these paths referencing the same file? But of course that doesn't solve the wider issue with Dropbox, where the rules differ per platform and where work can take place disparately on separate hosts.

Imagining Dropbox, I'd guess there's a file tree in the backing store. What is its policy? Does it allow multiple files differing only by case? I can imagine that would be bad when the tree is presented on a case insensitive platform (eg Windows, default MacOSX).

Taking the view that Dropbox should avoid that situation (in what are doubtless several forms), does Dropbox pre-emptively prevent making files with specific names based on what is already in the store, or resolve them to the same object (hard link locally, or simply and less confusingly and more portably, diverting opens to the existing name like a CI filesystem would)?

What about offline? That suggests that the forbidden modes should be known to the Dropbox app too. Is this the use case for comparing filenames instead of just doing a stat() to the local filesystem or to the remote backing store (via a virtual stat, as it were)?

What does Dropbox do if the local app is disabled and a user runs riot in the Dropbox directory, making conflicting names: allowed by the local FS but conflicting in the backing store or on other hosts? What does Dropbox do if a user makes conflicting files independently on different hosts, and then syncs?

I just feel you've got a name conflict issue to resolve (and how that's done is partly just policy), and pathlib, which offers some facilities related to that kind of thing, but a mismatch between what you actually need to do and what pathlib offers. Fixing your problem isn't necessarily a bugfix for pathlib.

I think we need to know the wider context.

Cheers,
--
Cameron Simpson c...@zip.com.au

I had a *bad* day. I had to subvert my principles and kowtow to an idiot. Television makes these daily sacrifices possible. It deadens the inner core of my being.
- Martin Donovan, _Trust_
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Thu, Apr 11, 2013 at 4:09 PM, Cameron Simpson c...@zip.com.au wrote:
> On 11Apr2013 14:11, Guido van Rossum gu...@python.org wrote:
> | Some of my Dropbox colleagues just drew my attention to the occurrence
> | of case folding in pathlib.py. Basically, case folding as an approach
> | to comparing pathnames is fatally flawed. The issues include:
> |
> | - most OSes these days allow the mounting of both case-sensitive and
> | case-insensitive filesystems simultaneously
> |
> | - the case-folding algorithm on some filesystems is burned into the
> | disk when the disk is formatted
> |
> | - case folding requires domain knowledge, e.g. Turkish dotless I
> |
> | - normalization is a mess, even on OSX, where it's better defined than elsewhere
>
> Yes, but what's the use case? Specifically, _why_ are you comparing pathnames?

Um, this isn't about Dropbox. This is a warning against the inclusion of any behavior depending on case folding in pathlib, based on experience with case folding at Dropbox (both in the client and in the server).

> To my mind case folding is just one mode of filename conflict. Surely there are others (forbidden characters in some domains, like colons; names significant only to a certain number of characters; and so forth).

Of course.

> Thus: what specific problem are you case-folding to address?

Why Dropbox is folding case really doesn't matter. But we have to deal with it because users expect unreasonable things, such as having two files, readme and README, on a Linux box, and then syncing both files to a box running Windows or OSX. (There are many other edge cases, most not involving Linux at all.) We can't always use os.stat() because some of this logic runs on a box where the files don't exist (e.g. the server, or the Linux box in the above example).

> On a personal basis I would normally address this kind of thing with stat(), avoiding any app knowledge about pathname rules: does this path exist, or are these paths referencing the same file? But of course that doesn't solve the wider issue with Dropbox, where the rules differ per platform and where work can take place disparately on separate hosts.

You seem to be completely misunderstanding me. I am not asking for help solving our problem. I am giving advice to avoid baking the wrong solution to this class of problems into a new stdlib module.

> Imagining Dropbox, I'd guess there's a file tree in the backing store. What is its policy? Does it allow multiple files differing only by case? I can imagine that would be bad when the tree is presented on a case insensitive platform (eg Windows, default MacOSX).

You got the basic idea, but we can't just refuse to sync files that might be problematic on some other box. Suppose someone is using Dropbox just as a backup service for their Linux box. They shouldn't have to worry about case clashes not being backed up.

> Taking the view that Dropbox should avoid that situation (in what are doubtless several forms), does Dropbox pre-emptively prevent making files with specific names based on what is already in the store, or resolve them to the same object (hard link locally, or simply and less confusingly and more portably, diverting opens to the existing name like a CI filesystem would)?

We have lots of different solutions based on the specific situations.

> What about offline? That suggests that the forbidden modes should be known to the Dropbox app too. Is this the use case for comparing filenames instead of just doing a stat() to the local filesystem or to the remote backing store (via a virtual stat, as it were)?

Again, please don't try to solve our problem for us.

> What does Dropbox do if the local app is disabled and a user runs riot in the Dropbox directory, making conflicting names: allowed by the local FS but conflicting in the backing store or on other hosts? What does Dropbox do if a user makes conflicting files independently on different hosts, and then syncs?
>
> I just feel you've got a name conflict issue to resolve (and how that's done is partly just policy), and pathlib, which offers some facilities related to that kind of thing, but a mismatch between what you actually need to do and what pathlib offers. Fixing your problem isn't necessarily a bugfix for pathlib.
>
> I think we need to know the wider context.

My reasoning is as follows. If pathlib supports functionality for checking whether two paths spelled differently point to the same file, users are going to rely on that functionality. But if the implementation is based on knowing how and when to case fold, it will definitely have bugs. So I am proposing to either remove that functionality, or to implement it by consulting the filesystem. Which of course has its own set of issues, for example if the file doesn't exist yet, but there are ways to deal with that too.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On 11Apr2013 16:23, Guido van Rossum gu...@python.org wrote:
| On Thu, Apr 11, 2013 at 4:09 PM, Cameron Simpson c...@zip.com.au wrote:
| > On 11Apr2013 14:11, Guido van Rossum gu...@python.org wrote:
| > | Some of my Dropbox colleagues just drew my attention to the occurrence
| > | of case folding in pathlib.py. Basically, case folding as an approach
| > | to comparing pathnames is fatally flawed. [...]
| >
| > Yes, but what's the use case? Specifically, _why_ are you comparing pathnames?
|
| Um, this isn't about Dropbox. This is a warning against the inclusion
| of any behavior depending on case folding in pathlib, based on
| experience with case folding at Dropbox (both in the client and in the
| server).

Ah. That wasn't so apparent to me. I took you to have tripped over a specific problem that pathlib appeared to be mis-solving.

I've always viewed path normalisation and its ilk as hazard prone and very context dependent, so I tend not to do it if I can help it.

| You seem to be completely misunderstanding me. I am not asking for
| help solving our problem. I am giving advice to avoid baking the wrong
| solution to this class of problems into a new stdlib module.

Ok, fine.

[...snip lots of stuff now not relevant...]

| My reasoning is as follows. If pathlib supports functionality for
| checking whether two paths spelled differently point to the same file,
| users are going to rely on that functionality. But if the
| implementation is based on knowing how and when to case fold, it will
| definitely have bugs. So I am proposing to either remove that
| functionality, or to implement it by consulting the filesystem. Which
| of course has its own set of issues, for example if the file doesn't
| exist yet, but there are ways to deal with that too.

Personally I'd be for removing it, or making the doco quite blunt about the many possible shortcomings of guessing whether two paths are the same thing without being able to stat() them.

I'll back out now.

Cheers,
--
Cameron Simpson c...@zip.com.au

Having been erased,
The document you're seeking
Must now be retyped.
- Haiku Error Messages http://www.salonmagazine.com/21st/chal/1998/02/10chal2.html
[Python-Dev] Destructors and Closing of File Objects
[ Note: I already asked this on http://stackoverflow.com/questions/15917502 but didn't get any satisfactory answers]

Hello,

The description of tempfile.NamedTemporaryFile() says:

,----
| If delete is true (the default), the file is deleted as soon as it is
| closed.
`----

In some circumstances, this means that the file is not deleted after the program ends. For example, when running the following test under py.test, the temporary file remains:

,----
| from __future__ import division, print_function, absolute_import
| import tempfile
| import unittest2 as unittest
| class cache_tests(unittest.TestCase):
|     def setUp(self):
|         self.dbfile = tempfile.NamedTemporaryFile()
|     def test_get(self):
|         self.assertEqual('foo', 'foo')
`----

In some way this makes sense, because this program never explicitly closes the file object. The only other way for the object to get closed would presumably be in the __del__ destructor, but here the language reference states that "It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits." So everything is consistent with the documentation so far.

However, I'm confused about the implications of this. If it is not guaranteed that file objects are closed on interpreter exit, can it possibly happen that some data that was successfully written to a (buffered) file object is lost even though the program exits gracefully, because it was still in the file object's buffer and the file object never got closed?

Somehow that seems very unlikely and un-pythonic to me, and the open() documentation doesn't contain any such warnings either. So I (tentatively) conclude that file objects are, after all, guaranteed to be closed. But how does this magic happen, and why can't NamedTemporaryFile() use the same magic to ensure that the file is deleted?

Best,
-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
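Independently of what the interpreter does at exit, the test above can avoid the question by closing the file itself. The following is a sketch only, using the stdlib unittest rather than unittest2; the class name CacheTests and the last_path attribute are inventions of this example, and it does not answer whether buffered data can be lost at shutdown.

```python
import os
import tempfile
import unittest


class CacheTests(unittest.TestCase):
    def setUp(self):
        self.dbfile = tempfile.NamedTemporaryFile()
        # Remembered on the class only so this demo can check the path later.
        type(self).last_path = self.dbfile.name
        # Close the file when the test finishes; with delete=True (the
        # default) closing is what removes it from disk.
        self.addCleanup(self.dbfile.close)

    def test_get(self):
        self.assertTrue(os.path.exists(self.dbfile.name))


# Run the test case directly so the effect can be observed in one script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CacheTests)
result = unittest.TestResult()
suite.run(result)
print(result.wasSuccessful(), os.path.exists(CacheTests.last_path))
```

With the cleanup registered, deletion no longer depends on __del__ being called for the file object.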
Re: [Python-Dev] Destructors and Closing of File Objects
On Fri, Apr 12, 2013 at 12:04 AM, Nikolaus Rath nikol...@rath.org wrote:
> [ Note: I already asked this on http://stackoverflow.com/questions/15917502 but didn't get any satisfactory answers]

Sorry, but that's not a reason to repost your question to this list. If you have to ask somewhere else, it would be python-list, aka, comp.lang.python.
Re: [Python-Dev] casefolding in pathlib (PEP 428)
On Thu, Apr 11, 2013 at 11:27 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Hmm, I think I'm tending towards the latter right now.

You might also want to look at what Mercurial does. As a cross-platform filesystem-oriented tool, it has some interesting issues in the department of casefolding.

Cheers,

Dirkjan