Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On Wed, Jan 19, 2011 at 5:32 AM, s...@pobox.com wrote: The odds that someone will remember the syntax for the diff command for the VCS are much higher than the revert command. My guess is diff is executed more often than any other version control commands except update and commit, and far more often than revert. Personally, I'm not sure I've ever used revert more than a handful of times in my entire professional lifetime. I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. Interesting. I almost *never* reverse patches - I always use the SVN revert command. Usually, this is because I will have edited the source tree since applying the patch. Reversion has the advantage of not getting confused by any additional changes. I also usually use svn diff to save a copy before I revert in case I change my mind. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
Nick> Usually, this is because I will have edited the source tree since applying the patch. Reversion has the advantage of not getting confused by any additional changes. I also usually use svn diff to save a copy before I revert in case I change my mind.

I routinely use CVS and Subversion at work, occasionally SCCS (yes, we still have a little of that other dinosaur laying about - our sysadmins, what can I say? they are luddites). Most of my interaction with these tools is mediated through the Emacs vc package, so my direct use of the command line is reduced even from what you might think normal. It's generally only when I need to operate on a group of files that I revert to using the command line. That tends to be to check in a group of files or discard one or more changes before checking in, generally by taking a diff and unapplying it with patch, perhaps first saving it to a file. If I want to revert a change after checking it in, I can just pipe the confirmation email through patch. S
[Python-Dev] Import and unicode: part two
Hi, I patched Python 3.2 to support modules with non-ASCII paths (*). It works well on all operating systems. But the task is not completely done: (a) Python 3 doesn't support non-ASCII module names (b) Python 3 doesn't support unencodable characters in the module path. I would like to know if we need to support that. Terry J. Reedy wrote (issue #10828): "I think bugs in core syntax should have high priority. I appreciate your work toward fixing it." I wrote a patch (issue #3080) fixing both points. If you agree that both issues should be fixed, I will fix them in Python 3.3. (a) is the issue #10828 reported recently (January 2011): import gui_jämföra doesn't work with a locale encoding different than UTF-8 (so it doesn't work on Windows). (b) is specific to Windows: FAT32 and NTFS filesystems store filenames in Unicode, but Python encodes paths to the ANSI code page (which covers a very small subset of Unicode). If a character cannot be encoded to the code page, you cannot load a module. E.g. add a Japanese character to a directory name on a Windows system using the cp1252 (English) code page. I don't think that (b) has already been reported by a user; it's more of a theoretical problem. My patch is huge, but it simplifies the code. We don't need to regularly convert from/to UTF-8. And for the functions using PyUnicodeObject objects (and not a Py_UNICODE* buffer): PyUnicodeObject stores the string length (it avoids calls to strlen()) and PyUnicode_FromFormat() doesn't need a buffer size (no risk of buffer overflow). I suppose that it makes Python faster, but I didn't try. (*) Python 3.2 doesn't support non-ASCII in the module *name*, only in the path (sys.path). Victor Stinner
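Point (b) above can be checked without touching the import system. The following is a minimal sketch, not the code path Python's import machinery actually takes: it only asks whether a name survives encoding to cp1252. The helper `can_encode` and the sample names are hypothetical and chosen for illustration.

```python
def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical directory names, chosen only for illustration.
print(can_encode("gui_jämföra", "cp1252"))  # Latin letters with diacritics
print(can_encode("日本語", "cp1252"))  # Japanese characters
```

The first name encodes cleanly because cp1252 contains ä and ö; the second cannot be encoded at all, which is the situation described for a Japanese directory name on a cp1252 system.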
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
On 17/01/2011 23:41, Nick Coghlan wrote: On Tue, Jan 18, 2011 at 6:54 AM, Antoine Pitrou solip...@pitrou.net wrote: [...] Also, I see no need to put the maintainers list in the dev guide, actually. Every time I see someone syncing the version-independent maintainers list across branches a little alarm bell goes off in my head to say that file should be somewhere other than the main source tree. It's also quite possible that once the maintainer list is part of the dev guide, triagers will start using the official copy on python.org and the search function in their web browser rather than running grep over a source checkout. +1 to moving maintainers.rst to the devguide, a wiki page (I’m volunteering to monitor that page for vandalism), or make it somehow part of the bug tracker. Let’s also take the opportunity to rename it to “experts”, following R. David Murray: “Any module without a listed maintainer is maintained by the community as a whole [...] I think perhaps the name chosen for the file was unfortunate. I view it more as the 'experts' file, rather than the maintainers file, though in some cases the expert is indeed the principal maintainer of the module (such as Vinay for logging).” Bonus question: if we remove maintainers.rst from py3k, what do we do in 3.1 and 2.7? I’d favor removing them over keeping outdated versions. Regards
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
s...@pobox.com wrote: I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. What he said, only bolded and underlined. -- Steven
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On Thu, 20 Jan 2011 01:23:26 +1100 Steven D'Aprano st...@pearwood.info wrote: s...@pobox.com wrote: I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. What he said, only bolded and underlined. I'm not sure what the issue is. Is there something, concretely, that needs to be fixed?
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
Antoine Pitrou wrote: On Thu, 20 Jan 2011 01:23:26 +1100 Steven D'Aprano st...@pearwood.info wrote: s...@pobox.com wrote: I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. What he said, only bolded and underlined. I'm not sure what the issue is. Is there something, concretely, that needs to be fixed? You'll have to ask Skip if he thinks there's a concrete problem. I haven't seen one, but I've only been reading this thread with one eye and it may be I've missed the mother of all problems. The (non-concrete) issue, as I understand it, is simple: be aware that not all Python developers are necessarily expert in DVCSes, and please keep it simple. -- Steven
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 2:34 PM, Victor Stinner victor.stin...@haypocalc.com wrote: (a) Python 3 doesn't support non-ASCII module names -0: I'm vaguely against this being supported because I'd rather not have to deal with what happens when the guess regarding the filesystem encoding is wrong. On the other hand, a general encouragement to stick to ASCII module names is probably functionally equivalent without imposing a hard restriction. (b) Python 3 doesn't support unencodable characters in the module path +1: It'd be nice if Python could import modules regardless of what folder names people happen to have on their module path. Schiavo Simon
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On Thu, 20 Jan 2011 01:54:37 +1100 Steven D'Aprano st...@pearwood.info wrote: You'll have to ask Skip if he thinks there's a concrete problem. I haven't seen one, but I've only been reading this thread with one eye and it may be I've missed the mother of all problems. The (non-concrete) issue, as I understand it, is simple: be aware that not all Python developers are necessarily expert in DVCSes, and please keep it simple. Well svn revert is one of the basic SVN commands (that I personally use far more often than patch -R, but YMMV). We're not talking about some advanced use of Mercurial queues. The point is a bit subtler here though: if you use patch -R after you have done some changes of your own, the checkout will not be restored to its pristine state, which may bite you later. svn revert -R . ensures everything is clean. Arguably, even patch isn't familiar to Windows developers. It doesn't come bundled and has to be installed separately, and I've seen some people use the TortoiseSVN GUI for applying patches. Regards Antoine.
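The workflow mentioned in this thread (save the output of svn diff, then run svn revert -R .) could be scripted roughly as follows. This is only a sketch: the function name `save_diff_then_revert` and its `run` parameter are hypothetical, and it assumes an `svn` executable on the PATH and a current directory inside a checkout. Nothing runs until the function is called.

```python
import subprocess


def save_diff_then_revert(backup_path, run=subprocess.run):
    """Save uncommitted changes to backup_path, then revert the checkout.

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real Subversion checkout.
    """
    # `svn diff` prints the working copy's uncommitted changes.
    diff = run(["svn", "diff"], capture_output=True, text=True, check=True)
    with open(backup_path, "w", encoding="utf-8") as backup:
        backup.write(diff.stdout)
    # `svn revert -R .` restores versioned files under the current directory.
    run(["svn", "revert", "-R", "."], check=True)
```

Saving the diff first keeps the revert recoverable: the backup file can be applied again later if the changes turn out to be wanted after all.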
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
Bonus question: if we remove maintainers.rst from py3k, what do we do in 3.1 and 2.7? I’d favor removing them over keeping outdated versions. Is there not some advantage to knowing who was the maintainer (or expert) of a given module at the time of a release? Eric.
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
What he said, only bolded and underlined.

Antoine> I'm not sure what the issue is. Is there something, concretely, that needs to be fixed?

Strictly speaking, nothing needs to be fixed because nothing is broken. Rephrasing my earlier messages:

1. Being a sophisticated Python programmer (and thus being a potential core developer) does not necessarily equate to being a sophisticated user of (especially distributed) version control systems. I have been programming in Python for about 15 years and have made contributions to the core off-and-on for about 10 years. I have never, not even once, been tempted to learn about or use svnmerge. Even considering the more mundane subcommands of the normal svn and hg commands (not to mention cvs, bzr, git, darcs, etc) there are plenty of different ways to structure the workflow, not all of which will make sense for each of those vcs's, nor will they all make sense to all potential users.

2. There is more than one way to skin many of the cats involved in version control. My preference to use vcs diff | patch -p0 -R or patch -p0 -R some-email in preference to vcs revert some flags is just one example. I'm sure I will be able to master svn revert and hg revert if necessary, but that knowledge won't transfer at all to CVS (no revert command) and won't transfer 100% to other vcs's because their revert commands will have semantic differences or use different command line flags to dictate the specifics of the action to perform.

3. Not everyone will use the command line (strange as that may seem coming from a decades-long Unix user). Many Windows users (and probably some Mac users) will have GUIs like TortoiseHg. Smart/lazy/memory-challenged Emacs and vim users will have version control commands built into their editors precisely to paper over the arcane differences which exist between vcs's even for common operations.

Skip
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
Steven D'Aprano st...@pearwood.info wrote: s...@pobox.com wrote: I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. What he said, only bolded and underlined. Indeed. I now have to deal with an unholy mix of CVS, Subversion, git, and Mercurial -- a twisty maze of little one-letter options, all so similar, all too powerful. At least with CVS and Subversion you could concentrate your mistakes on a single file :-). Bill
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On Wed, 19 Jan 2011 10:36:04 -0600 s...@pobox.com wrote: What he said, only bolded and underlined. Antoine I'm not sure what the issue is. Is there something, concretely, Antoine that needs to be fixed? Strictly speaking, nothing needs to be fixed because nothing is broken. Rephrasing my earlier messages: [...] Ok, thank you but... are you suggesting something or not?
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
On 19.01.2011 16:25, Eric Smith wrote: Bonus question: if we remove maintainers.rst from py3k, what do we do in 3.1 and 2.7? I’d favor removing them over keeping outdated versions. Is there not some advantage to knowing who was the maintainer (or expert) of a given module at the time of a release? I don't see much advantage. And if you need the version of maintainers.rst in another repo, it's not hard to find the revision that corresponds to the release date. Georg
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
Antoine> Ok, thank you but... are you suggesting something or not?

Yes. Keep the vcs command recommendations simple. At least mention idioms which are likely to apply across a wider range of version control systems. S
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On 19/01/2011 11:35, Nick Coghlan wrote: On Wed, Jan 19, 2011 at 5:32 AM,s...@pobox.com wrote: The odds that someone will remember the syntax for the diff command for the VCS are much higher than the revert command. My guess is diff is executed more often than any other version control commands except update and commit, and far more often than revert. Personally, I'm not sure I've ever used revert more than a handful of times in my entire professional lifetime. I realize the world is passing me by and that I'm rapidly turning into a dinosaur w.r.t. distributed version control, but as you write/update the developer's guide remember that proficiency in Python does not necessarily equate to proficiency in version control systems, especially with the less frequently used commands. I personally would prefer that more general commands and concepts be used where possible so that newcomers not be put off unnecessarily by the complexity of version control. Interesting. I almost *never* reverse patches - I always use the SVN revert command. Usually, this is because I will have edited the source tree since applying the patch. Reversion has the advantage of not getting confused by any additional changes. I also usually use svn diff to save a copy before I revert in case I change my mind. Ditto, same here. For me (by no stretch of the imagination an expert VCS user) the revert commands (of svn, Hg and bzr) are basically straightforward (and cross-platform). To me it is tinkering with the patch command that is arcane... All the best, Michael Cheers, Nick. -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On 19/01/2011 19:10, s...@pobox.com wrote: Antoine Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which likely to apply across a wider range of version control systems. The revert works with svn, hg and bzr. Using patch is not going to work on Windoze unless cygwin has been installed. Michael
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 07:01, Simon Cross hodgestar+python...@gmail.com wrote: On Wed, Jan 19, 2011 at 2:34 PM, Victor Stinner victor.stin...@haypocalc.com wrote: (a) Python 3 doesn't support non-ASCII module names -0: I'm vaguely against this being supported because I'd rather not have to deal with what happens when the guess regarding the filesystem encoding is wrong. On the other hand, a general encouragement to stick to ASCII module names is probably functionally equivalent without imposing a hard restriction. -0 from me (unless the Unicode variable naming PEP says otherwise). (b) Python 3 doesn't support unencodable characters in the module path +1: It'd be nice if Python could import modules regardless of what folder names people happen to have on their module path. +1 from me as well (nervously hoping importlib already supports it =) .
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On Wed, Jan 19, 2011 at 10:10, s...@pobox.com wrote: Antoine Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which likely to apply across a wider range of version control systems. I was hoping this would flame out, but two days of discussion suggests otherwise. I am of the opinion of always listing how to use the VCS to its fullest. It is the thing you will have to interact with the most when doing work on Python, so trying to avoid it is not doing anyone any favours. That being said, I am not opposed to someone (other than me as I am not going to bother) **adding** a note about `patch -R`, but it should not replace the `svn revert` explanation.
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 1:23 PM, Brett Cannon br...@python.org wrote: .. (a) Python 3 doesn't support non-ASCII module names .. -0 from me (unless the Unicode variable naming PEP says otherwise).

I am not sure what exactly is not supported. On my OSX system:

    $ ./python.exe
    Python 3.2b2+ ..
    >>> import саша
    >>> саша.foo
    42
    >>> from саша import foo
    >>> foo
    42

PEP 3131 does not distinguish between different types of identifiers, so I think it assumes that non-ascii module names should be supported. +1 on fixing any remaining bugs
Re: [Python-Dev] Tidying up the Meta-PEP and Other Informational PEP sections of PEP 0
On Jan 19, 2011, at 12:16 AM, Nick Coghlan wrote: For the release schedule PEPs it means done and dusted (similar to the meaning for ordinary PEPs). For the API standardisation PEPs (like WSGI) it instead means the spec has been locked down and any changes will require a new PEP. This caused a problem for the PEP 0 generator, since the former kind of PEP should be moved to the new historical section, while the latter kind should remain up top. Would anyone object if I switched all the API definition PEPs to the Active state? PEP 1 indicates that is the appropriate state for reference PEPs that are never truly finished (in the sense of code being implemented and committed to the source control system). Perhaps we need a new type for API PEPs instead? Type: API Type: Consensus ? If not, then I'd rather come up with a different status to describe an API PEP that has been locked down. Re-using Active doesn't seem right to me. -Barry
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 10:38, Alexander Belopolsky alexander.belopol...@gmail.com wrote: On Wed, Jan 19, 2011 at 1:23 PM, Brett Cannon br...@python.org wrote: .. (a) Python 3 doesn't support non-ASCII module names .. -0 from me (unless the Unicode variable naming PEP says otherwise). I am not sure what exactly is not supported. On my OSX system:

Victor said this is a Windows-specific issue. -Brett

    $ ./python.exe
    Python 3.2b2+ ..
    >>> import саша
    >>> саша.foo
    42
    >>> from саша import foo
    >>> foo
    42

PEP 3131 does not distinguish between different types of identifiers, so I think it assumes that non-ascii module names should be supported. +1 on fixing any remaining bugs
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 1:42 PM, Brett Cannon br...@python.org wrote: .. I am not sure what exactly is not supported. On my OSX system: Victor said this is a Windows-specific issue. I missed that part. In this case, I change my vote to +0 to reflect my lack of knowledge or exposure to Windows-only issues. However, if Victor's patch simplifies the code (as many of his changes in this area do), I will be happy to review it.
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
2011/1/19 Michael Foord fuzzy...@voidspace.org.uk: On 19/01/2011 19:10, s...@pobox.com wrote: Antoine Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which likely to apply across a wider range of version control systems. The revert works with svn, hg and bzr. Using patch is not going to work on Windoze unless cygwin has been installed. I thought you were supposed to use some variant of update on hg instead of revert, though. -- Regards, Benjamin
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
The revert works with svn, hg and bzr. Using patch is not going to work on Windoze unless cygwin has been installed. I thought you were supposed to use some variant of update on hg instead revert, though. I think what is discouraged is to hg revert to a different revision. We are talking about reverting your working copy to its pristine state. Regards Antoine.
Re: [Python-Dev] Import and unicode: part two
On Wednesday, 19 January 2011 at 10:42 -0800, Brett Cannon wrote: I am not sure what exactly is not supported. On my OSX system: Victor said this is a Windows-specific issue. Quoting myself: (a) (...) doesn't work with a locale encoding different than UTF-8. Hum, it's not exactly the locale encoding, but the Python filesystem encoding. On Mac OS X, this encoding is *hardcoded* to UTF-8, so it is possible to use non-ASCII module names on this OS. It is also possible on other BSD/UNIX systems using a UTF-8 locale encoding. But this issue also concerns any BSD/UNIX system using a locale encoding different than UTF-8. E.g. MvL's buildbot (x86 debian parallel 3.x) uses ISO-8859-15 (see #10492, issue fixed 13 days ago). Even if UTF-8 becomes a de facto standard locale encoding, many systems still use something else. And Python 2 users will complain that their script works with Python 2 but not with Python 3 :-) If we decide to reject non-ASCII module names, it should be done on all operating systems, not only on Windows. Victor
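To see why the filesystem encoding matters, it helps to look at the bytes a non-ASCII module name turns into under the two encodings named in this message. The sketch below is an illustration of the mismatch only, not a reproduction of what the import machinery does; the helper `decodes_as_utf8` is hypothetical.

```python
import sys

name = "gui_jämföra"  # module name from issue #10828

utf8_bytes = name.encode("utf-8")
latin9_bytes = name.encode("iso-8859-15")


def decodes_as_utf8(raw):
    """Return True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print("filesystem encoding here:", sys.getfilesystemencoding())
print(utf8_bytes)
print(latin9_bytes)
print(decodes_as_utf8(latin9_bytes))
```

The same name yields two different byte strings, and the ISO-8859-15 form is not even valid UTF-8, so code that assumes UTF-8 while the filesystem uses another encoding will look for a file name that does not exist.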
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 11:31 AM, Victor Stinner wrote: If we decide to reject non-ASCII module names, it should be done on any operating systems, not only on Windows. Since Python allows non-ASCII variable names, I think it should allow non-ASCII module names also, on any platform that supports the appropriate characters in the filesystem. Since some platforms already accept them, dropping them would be incompatible. If Victor already has a patch coded (i.e. the work is basically done, no waiting in line 3), I'm even more in favor of it. If it took lots of future hard work, and no one volunteered to do it, that would perhaps be justification for retaining module name restrictions. I guess that is not the case here, so... +1 on supporting full Unicode module names on all platforms.
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 7:34 AM, Victor Stinner wrote: Hi, I patched Python 3.2 to support modules with non-ASCII paths (*). It works well on all operating systems. But the task is not completly done: (a) Python 3 doesn't support non-ASCII module names (b) Python 3 doesn't support unencodable characters in the module path I would like to know if we need to support that. Terry J. Reedy wrote (issue #10828): I think bugs in core syntax should have high priority. I appreciate your work toward fixing it.

I am a little shocked at the so-far tepid response to (a), so let me defend and explain my claim that it is a bug. In the simplest case (from 6.11. The import statement and 2.3. Identifiers and keywords):

    import_stmt ::= "import" module
    module      ::= identifier
    identifier  ::= appropriate Unicode start and continue chars

There is nothing, nothing, about any restriction on identifiers. The rest of 6.11 discusses the complex import algorithm but leaves out the simple semantics that cover 99% of cases (import a ???.py file in a directory on sys.path), and never mentions .py. So let's go to Tutorial 6. Modules, which does explain the simple case: "A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended."

So, if xyz is a legal identifier and xyz.py exists on sys.path, it is reasonable from the docs to expect 'import xyz' to work. (Sys.path is mentioned in the reference.) But we now have the following possibility. Let xyz.py be

    def double(x):
        return 2*x

    if __name__ == "__main__":
        if double(2) == 4:
            print("test passed")

We run the file, get "test passed", and write zyx.py:

    import xyz
    ...

We run zyx and Python says "No module named xyz". Bad, and quite puzzling to anyone who does not understand the subtle difference between running and importing a file.
-- Terry Jan Reedy
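The claim that the grammar places no ASCII restriction on identifiers can be checked from Python itself with `str.isidentifier()`, which tests a string against the language's definition of an identifier. The list of candidate names below is illustrative; `ascii()` is used when printing only so the output does not depend on the terminal's encoding.

```python
# Candidate module names; the last one is deliberately invalid.
candidates = ["xyz", "gui_jämföra", "саша", "2xyz"]

for name in candidates:
    # isidentifier() says nothing about whether a matching file can be found.
    print(ascii(name), name.isidentifier())
```

The non-ASCII names pass just as `xyz` does, and only `2xyz` is rejected. Whether `import` then locates the file is a separate question, and that is where the filesystem encoding discussed in this thread comes in.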
Re: [Python-Dev] Import and unicode: part two
On Wednesday, 19 January 2011 at 13:38 -0500, Alexander Belopolsky wrote: PEP 3131 does not distinguish between different types of identifiers, so I think it assumes that non-ascii module names should be supported. My opinion is that we should support non-ASCII module names and unencodable paths if it doesn't introduce an overhead (make Python slower and add a lot of code). My patch adds ~400 lines of code (I think that it is small: the patch adds many functions), but I think that it makes Python as fast, or maybe faster. Victor
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 10:32 PM, Terry Reedy tjre...@udel.edu wrote: I am a little shocked at the so-far tepid response to (a), so let me defend and explain my claim that it is a bug. In the simplest case (from 6.11. The import statement and 2.3. Identifiers and keywords) import_stmt ::= import module module ::= indentifier identifier ::= appropriate Unicode start and continue chars There is nothing, nothing, about any restriction on identifiers. I have no problem with non-ASCII module identifiers being valid syntax. It's a question of whether attempting to translate a non-ASCII module name into a file name (so the file can be imported) is a good idea and whether these sorts of files can be safely transferred among diverse filesystems. For similar reasons we tend to avoid capital letters in module names. Schiavo Simon ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On 19.01.2011 at 21:32, Terry Reedy wrote: On 1/19/2011 7:34 AM, Victor Stinner wrote: Hi, I patched Python 3.2 to support modules with non-ASCII paths (*). It works well on all operating systems. But the task is not completely done: (a) Python 3 doesn't support non-ASCII module names (b) Python 3 doesn't support unencodable characters in the module path I would like to know if we need to support that. Terry J. Reedy wrote (issue #10828): I think bugs in core syntax should have high priority. I appreciate your work toward fixing it. I am a little shocked at the so-far tepid response to (a), so let me defend and explain my claim that it is a bug. In the simplest case (from 6.11. The import statement and 2.3. Identifiers and keywords) import_stmt ::= import module module ::= identifier identifier ::= appropriate Unicode start and continue chars There is nothing, nothing, about any restriction on identifiers. +1. The restriction on valid identifiers is very sensible (obviously, since m needs to be accessible after import m), but a further restriction seems just arbitrary. Georg
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On 1/19/2011 1:25 PM, Brett Cannon wrote: On Wed, Jan 19, 2011 at 10:10, s...@pobox.com wrote: Antoine Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which are likely to apply across a wider range of version control systems. I was hoping this would flame out, but two days of discussion suggests otherwise. I am of the opinion of always listing how to use the VCS to its fullest. It is the thing you will have to interact with the most when doing work on Python, so trying to avoid it is not doing anyone any favours. That being said, I am not opposed to someone (other than me as I am not going to bother) **adding** a note about `patch -R`, but it should not replace the `svn revert` explanation. As a neophyte vcs user, I like specific commands that can only do what I want, and not screw up with a wrong flag, so I agree with this. The most important thing is being clear about which data will have which effect on which other data. -- Terry Jan Reedy
Re: [Python-Dev] devguide: Cover how to (un-)apply a patch.
On 19/01/2011 19:47, Antoine Pitrou wrote: On Wed, 19 Jan 2011 19:20:01 +0100 Michael Foordfuzzy...@voidspace.org.uk wrote: On 19/01/2011 19:10, s...@pobox.com wrote: Antoine Ok, thank you but... are you suggesting something or not? Yes. Keep the vcs command recommendations simple. At least mention idioms which likely to apply across a wider range of version control systems. The revert works with svn, hg and bzr. Using patch is not going to work on Windoze unless cygwin has been installed. You don't need cygwin, just something much smaller with GNU in its name: http://gnuwin32.sourceforge.net/packages/patch.htm (yes, the suggestion is already in the dev guide) Unfortunately gnuwin32 patch doesn't play well with Windows 7. I remember giving up on it completely and installing cygwin. This page seems to explain the details: http://math.nist.gov/oommf/software-patchsets/patch_on_Windows7.html Michael ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 4:05 PM, Simon Cross wrote: I have no problem with non-ASCII module identifiers being valid syntax. It's a question of whether attempting to translate a non-ASCII If the names are the same, ie, produced with the same sequence of keystrokes in the save-as box and importing box, then there is no translation, at least from the user's view. module name into a file name (so the file can be imported) is a good idea and whether these sorts of files can be safely transferred among diverse filesystems. I believe we now have the situation that a package that works on *nix could fail on Windows, whereas I believe that patch would *improve* portability. For similar reasons we tend to avoid capital letters in module names. That is a stdlib style guide followed by many, but intentionally not enforced. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Wednesday, 19 January 2011 at 12:19 -0800, Glenn Linderman wrote: Since Python allows non-ASCII variable names, I think it should allow non-ASCII module names also, on any platform that supports the appropriate characters in the filesystem. Since some platforms already accept them, dropping them would be incompatible. ok If Victor already has a patch coded (i.e. the work is basically done, no waiting in line 3), I'm even more in favor of it. If it took lots of future hard work, and no one volunteered to do it, that would perhaps be justification for retaining module name restrictions. I guess that is not the case here, so... I volunteer to do the work, and I already have a working patch (but it is not ready to be committed yet; it requires a long review). FYI, I have rewritten the patch 4 times over the past year, for different reasons:
- the patch is huge and complex, and I was unable to write it correctly the first time
- I split the work into two big parts: support for non-ASCII paths (done in Python 3.2) and the other changes in part two
- Updating a huge patchset on the py3k tree is hard, even with git-svn (and git svn rebase)
- In my first tries, I didn't patch the import machinery to support non-ASCII module names, I only patched the support of non-ASCII paths
But I don't want to apply such a huge patch if Python core developers don't want to support non-ASCII module names and unencodable paths. Victor
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 4:40 PM, Terry Reedy tjre...@udel.edu wrote: .. For similar reasons we tend to avoid capital letters in module names. That is a stdlib style guide followed by many, but intentionally not enforced. Indeed. Last time I looked, we still had cProfile in stdlib. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 14:23, Alexander Belopolsky alexander.belopol...@gmail.com wrote: On Wed, Jan 19, 2011 at 4:40 PM, Terry Reedy tjre...@udel.edu wrote: .. For similar reasons we tend to avoid capital letters in module names. That is a stdlib style guide followed by many, but intentionally not enforced. Indeed. Last time I looked, we still had cProfile in stdlib. Yes, but that is because no one got around to hiding cProfile behind profile before we released Python 3.0. I would still like to see it (slowly) go away from being directly visible. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 5:47 PM, Brett Cannon br...@python.org wrote: .. Indeed. Last time I looked, we still had cProfile in stdlib. Yes, but that is because no one got around to hiding cProfile behind profile before we released Python 3.0. I would still like to see it (slowly) go away from being directly visible. Another big offender is the idlelib package. Most of the modules there are in mixed case. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Tidying up the Meta-PEP and Other Informational PEP sections of PEP 0
On Thu, Jan 20, 2011 at 4:40 AM, Barry Warsaw ba...@python.org wrote: On Jan 19, 2011, at 12:16 AM, Nick Coghlan wrote: For the release schedule PEPs it means done and dusted (similar to the meaning for ordinary PEPs). For the API standardisation PEPs (like WSGI) it instead means the spec has been locked down and any changes will require a new PEP. This caused a problem for the PEP 0 generator, since the former kind of PEP should be moved to the new historical section, while the latter kind should remain up top. Would anyone object if I switched all the API definition PEPs to the Active state? PEP 1 indicates that is the appropriate state for reference PEPs that are never truly finished (in the sense of code being implemented and committed to the source control system). Perhaps we need a new type for API PEPs instead? Type: API Type: Consensus ? If not, then I'd rather come up with a different status to describe an API PEP that has been locked down. Re-using Active doesn't seem right to me. Oh, I like Consensus. I was going to suggest a new state, but I couldn't think of any names I liked. Hmm, guess I'll add propose a revision to PEP 1 to the to-do list... Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] devguide: Short doc about where to get tech help related to developing Python.
Hi, On Wed, Jan 19, 2011 at 23:19, brett.cannon python-check...@python.org wrote: +Where to Get Help += +If you are working on Python it is very possible you will come across an issue +where you need some assistance in solving (this happens to core developers all +the time). You have a couple of options depending on what kind of help you need. +If the question involves process or tool usage then please check the developer's +guide first as is should answer your question. as it should +Filing a Bug + +If you come across an odd error message that seems like a bug, then file a bug +on the `issue tracker`_. In the bug you can explain that you are not sure why +the error is coming up or that the exact nature of the problem is. Someone will ...or what the exact...? +Asking a Technical Question +--- +You have two avenues of communication out of the :ref:`myriad of options +available communication`. If you are comfortable with IRC you can try asking +in #python-dev. Typically there are a couple of experienced developers, ranging +from triagers to core developers, who can ask questions about developing for who can answer questions Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
OK, here is my plan that I will implement: MOVE -- developers.txt maintainers.rst README.gdb README.coverity README.Emacs DELETE (seem way too old to still be relevant; tell me if I am wrong) --- README.OpenBSD README.AIX cheatsheet LEAVE everything else (with README properly edited and simplified to only list files with non-obvious names) On Mon, Jan 17, 2011 at 12:32, Brett Cannon br...@python.org wrote: There is a bunch of stuff in Misc that probably belongs in the devguide (under Resources) instead of in svn. Here are the files I think can be moved (in order of how strongly I think they should be moved): PURIFY.README README.coverty README.klocwork README.valgrind Porting developers.txt maintainers.rst SpecialBuilds.txt Now before anyone yells that is inconvenient, don't forget that all core developers can check out and edit the devguide, and that almost all of the files listed (SpecialBuilds.txt is the exception) are typically edited and viewed on their own. Anyway, if there is a file listed here you don't think should move out of py3k and into the devguide, speak up. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 04:40:24PM -0500, Terry Reedy wrote: On 1/19/2011 4:05 PM, Simon Cross wrote: I have no problem with non-ASCII module identifiers being valid syntax. It's a question of whether attempting to translate a non-ASCII If the names are the same, ie, produced with the same sequence of keystrokes in the save-as box and importing box, then there is no translation, at least from the user's view. module name into a file name (so the file can be imported) is a good idea and whether these sorts of files can be safely transferred among diverse filesystems. I believe we now have the situation that a package that works on *nix could fail on Windows, whereas I believe that patch would *improve* portability. I'm not so sure about this. You may have something that works on Windows and on *NIX under certain circumstances, but it seems likely to fail when moving files between them (for instance, as packages downloaded from pypi). Additionally, many unix filesystems don't specify a filesystem encoding for filenames; they deal in legal and illegal bytes, which could lead to trouble. This problem of which encoding to use can be seen on UNIX systems even now. Try this:

    echo 'print("hi")' > café.py
    convmv -f utf-8 -t latin1 café.py
    python3 -c 'import café'

ASCII seems very sensible to me when faced with these ambiguities. Other options I can brainstorm that could be explored:
* Specify an encoding per platform and stick to that. (So, for instance, all module names on posix platforms would have to be utf-8). Force translation between encodings when installing packages (But that doesn't help for people that are creating their modules using their own build scripts rather than distutils, copying the files using raw tar, etc.)
* Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky).
-Toshio
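The ambiguity described in the message above comes down to one name mapping to different bytes under different encodings. The following is a small sketch of that point only. It does not touch the filesystem, and the helper names `encoded_names` and `lookup_matches` are made up for illustration.

```python
def encoded_names(name="caf\u00e9.py"):
    """Return the bytes stored for `name` under a UTF-8 and a Latin-1 locale."""
    return name.encode("utf-8"), name.encode("latin-1")


def lookup_matches(stored, wanted_name, fs_encoding):
    """True if encoding `wanted_name` with `fs_encoding` gives the stored bytes."""
    return wanted_name.encode(fs_encoding) == stored


if __name__ == "__main__":
    utf8_bytes, latin1_bytes = encoded_names()
    print(utf8_bytes)    # b'caf\xc3\xa9.py'
    print(latin1_bytes)  # b'caf\xe9.py'

    # A file renamed to Latin-1 bytes is not found by a UTF-8 lookup.
    print(lookup_matches(latin1_bytes, "caf\u00e9.py", "utf-8"))    # False
    print(lookup_matches(latin1_bytes, "caf\u00e9.py", "latin-1"))  # True

    # The Latin-1 bytes are not even valid UTF-8.
    try:
        latin1_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        print("not valid UTF-8:", exc.reason)
```

An ASCII-only name avoids the problem because ASCII bytes are identical under both encodings.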
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
On Wed, 19 Jan 2011 15:31:24 -0800 Brett Cannon br...@python.org wrote: OK, here is my plan that I will implement: MOVE -- developers.txt maintainers.rst README.gdb README.coverity README.Emacs DELETE (seem way too old to still be relevant; tell me if I am wrong) --- README.OpenBSD README.AIX cheatsheet README.gdb is useful to more than core developers and contributors, so I think it should stay inside Misc. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 6:44 PM, Toshio Kuratomi wrote: I believe we now have the situation that a package that works on *nix could fail on Windows, whereas I believe that patch would *improve* portability. I'm not so sure about this Forget that claim if it is not true. The patch will certainly improve consistency with a box so that files that run can also be imported. -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 6:05 PM, Alexander Belopolsky wrote: On Wed, Jan 19, 2011 at 5:47 PM, Brett Cannonbr...@python.org wrote: .. Indeed. Last time I looked, we still had cProfile in stdlib. Yes, but that is because no one got around to hiding cProfile behind profile before we released Python 3.0. I would still like to see it (slowly) go away from being directly visible. Another big offender is the idlelib package. Most of the modules there are in mixed case. Given that the individual modules are not documented and that the only programs importing the individual modules are other idlelib modules (true?) then a rename should be possible. In the other hand, the same facts sort of make it unnecessary ;-). -- Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Wednesday, 19 January 2011 at 15:44 -0800, Toshio Kuratomi wrote: Additionally, many unix filesystem don't specify a filesystem encoding for filenames; they deal in legal and illegal bytes which could lead to troubles. This problem of which encoding to use is a problem that can be seen on UNIX systems even now. If the system is not correctly configured, it is not a bug in Python, but a bug in the system config. Python relies on the locale to choose the filesystem encoding (sys.getfilesystemencoding()). Python uses this encoding to decode and encode all filenames. * Specify an encoding per platform and stick to that. It doesn't work: on UNIX/BSD, users choose their own encoding and all programs will use it. Anyway, I don't see why it is a problem to have different encodings on different systems. Each system can use its own encoding. The bug that I'm trying to solve is a Python bug, not an OS bug. * Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky). This is a very bad idea. I introduced the PYTHONFSENCODING environment variable in Python 3.2, but then quickly removed it, because it introduced a lot of inconsistencies. Victor
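As a rough illustration of the mechanism the message above refers to, this sketch asks Python which filesystem encoding it picked and converts a name to bytes and back with `os.fsencode()` and `os.fsdecode()`. The function name `describe_fs_encoding` is hypothetical, and the reported encoding depends on the platform and locale. The sketch assumes that encoding can represent "é".

```python
import os
import sys


def describe_fs_encoding(name="caf\u00e9.py"):
    """Return (encoding, encoded bytes, decoded-again name) for `name`."""
    encoding = sys.getfilesystemencoding()
    raw = os.fsencode(name)   # str -> bytes using the filesystem encoding
    back = os.fsdecode(raw)   # bytes -> str using the same encoding
    return encoding, raw, back


if __name__ == "__main__":
    encoding, raw, back = describe_fs_encoding()
    print(encoding)
    print(repr(raw))
    print(back == "caf\u00e9.py")
```

Every filename passed to the operating system as `str` goes through this one encoding, so a file whose bytes were written under another encoding will not match.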
Re: [Python-Dev] Import and unicode: part two
On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote: This problem of which encoding to use is a problem that can be seen on UNIX systems even now. Try this:

    echo 'print("hi")' > café.py
    convmv -f utf-8 -t latin1 café.py
    python3 -c 'import café'

ASCII seems very sensible to me when faced with these ambiguities. Other options I can brainstorm that could be explored:
* Specify an encoding per platform and stick to that. (So, for instance, all module names on posix platforms would have to be utf-8). Force translation between encoding when installing packages (But that doesn't help for people that are creating their modules using their own build scripts rather than distutils, copying the files using raw tar, etc.)
* Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky).
None of this is unique to import -- the same exact issue occurs with open(u'café'). I don't see any reason why import café should be thought of as more of a problem, or treated any differently. It's reasonable to recommend that people use ASCII in their module names if they want wide portability, but using non-ASCII should still be supported. James
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 07:11:52PM -0500, James Y Knight wrote: On Jan 19, 2011, at 6:44 PM, Toshio Kuratomi wrote: This problem of which encoding to use is a problem that can be seen on UNIX systems even now. Try this:

    echo 'print("hi")' > café.py
    convmv -f utf-8 -t latin1 café.py
    python3 -c 'import café'

ASCII seems very sensible to me when faced with these ambiguities. Other options I can brainstorm that could be explored:
* Specify an encoding per platform and stick to that. (So, for instance, all module names on posix platforms would have to be utf-8). Force translation between encoding when installing packages (But that doesn't help for people that are creating their modules using their own build scripts rather than distutils, copying the files using raw tar, etc.)
* Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky).
None of this is unique to import -- the same exact issue occurs with open(u'café'). I don't see any reason why import café should be thought of as more of a problem, or treated any differently. It's unique in several ways:
1) With open, you can specify a byte string::

    open(b'caf\xe9.py').read()

I don't know of any way to do that with import. This is needed when the filename is not compatible with your current locale.
2) import assigns a name to the module that it imports whereas open lets the programmer assign the name. So even if you can specify what to use as a byte string for this filename on this particular filesystem you'd still end up with some ugly pseudo-representation of bytes when attempting to access it in code::

    import caf\xe9
    caf\xe9.do_something()

-Toshio
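For comparison with point 1 above, here is a hedged sketch of loading a module from an explicit path and choosing its name yourself. It relies on `importlib.util` helpers from later Python 3 releases, so it was not an option for the participants in this thread, and it does not take a bytes path. The file name `cafe_latin1.py` and the helper `load_module_from_path` are invented for the example.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_path(path, name):
    """Execute the source file at `path` and return it as module `name`."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # On-disk name (ASCII here) is independent of the module name below.
        source = Path(tmp) / "cafe_latin1.py"
        source.write_text("def do_something():\n    return 'hi'\n",
                          encoding="utf-8")
        cafe = load_module_from_path(source, "caf\u00e9")
        print(cafe.do_something())
```

Here the programmer assigns both the path and the name, which is the property of `open()` that point 2 contrasts with a plain `import` statement.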
Re: [Python-Dev] Import and unicode: part two
On Thu, Jan 20, 2011 at 01:26:01AM +0100, Victor Stinner wrote: On Wednesday, 19 January 2011 at 15:44 -0800, Toshio Kuratomi wrote: Additionally, many unix filesystem don't specify a filesystem encoding for filenames; they deal in legal and illegal bytes which could lead to troubles. This problem of which encoding to use is a problem that can be seen on UNIX systems even now. If the system is not correctly configured, it is not a bug in Python, but a bug in the system config. Python relies on the locale to choose the filesystem encoding (sys.getfilesystemencoding()). Python uses this encoding to decode and encode all filenames. Saying that multiple encodings on a single system is a misconfiguration every time it comes up does not make it true. There have been multiple examples in past threads of how you can end up with multiple encodings of filenames on a single system: multiple users with different encodings for their locales, mounting remote filesystems, downloading a file. To the existing list I'd add getting a package from pypi -- neither tar nor zip files contain encoding information about the filenames. Therefore if I create an sdist of a python module with non-ascii filenames using a locale of latin1 and then upload it to pypi, people downloading it under a utf-8 locale will end up not being able to use the module. * Specify an encoding per platform and stick to that. It doesn't work: on UNIX/BSD, the user chooses its own encoding and all programs will use it. The proposal is that you ignore that when talking about loading and creating python modules. (I mentioned distutils because my thought was that distutils could grow the ability to translate from the system locale to a chosen neutral encoding when running any of the setup.py dist commands, but that doesn't address the issue when testing a module that you've just written, so perhaps that's not necessary.) Python modules would have a set of defined filesystem encodings per system.
This prevents getting a mixture of encodings of modules and having things work in one location but fail when used somewhere else. Instead, you get an upfront failure until you correct the encoding. Anyway, I don't see why it is a problem to have different encodings on different systems. Each system can use its own encoding. The bug that I'm trying to solve is a Python bug, not an OS bug. There is no OS bug here. There is perhaps an OS design flaw but it's not a flaw that will be going away soon (in part, because the present OS designers do not see it as an OS flaw... to them it's a bug in code that attempts to build a simpler interface on top of it.) * Change import semantics to allow specifying the encoding of the module on the filesystem (seems really icky). This is a very bad idea. I introduced PYTHONFSENCODING environment variable in Python 3.2, but then quickly removed it, because it introduced a lot of inconsistencies. Thanks for getting rid of that, PYTHONFSENCODING is a bad idea because it doesn't solve the underlying issues. However, when I say specifying the encoding of the module on the filesystem, I don't mean something global like PYTHONFSENCODING -- I mean something at the python code level:: import café encoded_as('latin1') After thinking about this one, though, I don't think it will work either. This takes care of importing modules where the fs encoding of the module is known but it doesn't where the fs encoding may be translated between platforms. I believe that this could arise when untarring a module on windows using winzip or similar that gives you the option of translating from utf-8 bytes into bytes that have meaning as characters on that platform, for instance. Do you have a solution to the problem? I haven't looked at your patch so perhaps you have an ingenous method of translating from the unicode representation of the module in the import statement to the bytes in arbitrary encodings on the filesystem that I haven't thought of. 
If you don't, however, then really - ASCII-only seems like the sanest of the three solutions I can think of. -Toshio
Re: [Python-Dev] Import and unicode: part two
On Wednesday, 19 January 2011 at 18:07 -0800, Toshio Kuratomi wrote: Saying that multiple encodings on a single system is a misconfiguration every time it comes up does not make it true. Yes, each filesystem can have its own encoding. For example, this is supported by Linux. Python doesn't support such a configuration, but this limitation is wider than the import machinery. If you consider it important enough, please open an issue. To the existing list I'd add getting a package from pypi -- neither tar nor zip files contain encoding information about the filenames. ZIP archives contain a flag to indicate the encoding: cp437 or UTF-8. TAR has an extension called PAX which stores filenames as UTF-8. But yes, most tarballs store filenames as raw byte strings. Anyway, if you would like to share your code on PyPI, you should not use non-ASCII module names (or any other non-ASCII name/identifier :-)). Python 3 supports non-ASCII identifiers (PEP 3131), but developers are responsible for deciding whether to use them, depending on their audience. For a lesson at school, it is nice to write examples in the mother language, instead of using raw English with ASCII identifiers and filenames. In a school, you can use the same configuration (encoding) on all computers. * Specify an encoding per platform and stick to that. It doesn't work: on UNIX/BSD, the user chooses its own encoding and all programs will use it. (...) This prevents getting a mixture of encodings of modules (...) If you have an issue with encodings, you have to fix it when you create a module (on disk), not when you load a module (it is too late). (...) I mean something at the python code level:: import café encoded_as('latin1') Import a module using its byte name? You mean that the café filename was not encoded to the Python filesystem encoding, but to another (wrong) encoding, at the creation of the module. As written before, you should fix your filename, instead of using an (ugly) workaround in Python.
I haven't looked at your patch so perhaps you have an ingenious method of translating from the unicode representation of the module in the import statement to the bytes in arbitrary encodings on the filesystem that I haven't thought of. On Windows, my patch tries to avoid any conversion: it uses unicode everywhere. On other OSes, it uses the Python filesystem encoding to encode a module name (as is done for any other operation on the filesystem with a unicode filename). -- Python 3 supports bytes filenames in order to read/copy/delete undecodable filenames, filenames stored in an encoding different from the system encoding, and broken filenames. It is also possible to access these files using PEP 383 (with surrogate characters). This is useful for running Python on an old system. If you don't, however, then really - ASCII-only seems like the sanest of the three solutions I can think of. But a (Python 3) module is not supposed to have a broken filename. If it does, you had better fix its name, instead of trying to fix the problem later (in Python). With a UTF-8 filesystem encoding (e.g. on Mac OS X, and most Linux setups), it is already possible to use non-ASCII module names. Victor
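Two points from the message above can be checked with the standard library. This is a sketch, not part of the patch under discussion, and the function names are hypothetical. `zip_name_flags` writes members to an in-memory ZIP and reports whether the UTF-8 file name flag (general purpose bit 11, `0x800`) is set for each, as `zipfile` reads it back. `roundtrip_undecodable` shows the PEP 383 `surrogateescape` error handler carrying a non-UTF-8 byte through `str` and back.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11 in the ZIP format


def zip_name_flags(names):
    """Map each member name to True if its UTF-8 file name flag is set."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "")
    # Re-open the archive so the flags are read from the stored directory.
    with zipfile.ZipFile(buffer) as archive:
        return {info.filename: bool(info.flag_bits & UTF8_NAME_FLAG)
                for info in archive.infolist()}


def roundtrip_undecodable(raw=b"caf\xe9.py"):
    """Decode Latin-1 bytes as UTF-8 using surrogateescape, then re-encode."""
    text = raw.decode("utf-8", "surrogateescape")
    return text, text.encode("utf-8", "surrogateescape")


if __name__ == "__main__":
    print(ascii(zip_name_flags(["plain.py", "caf\u00e9.py"])))
    text, raw = roundtrip_undecodable()
    print(ascii(text), raw)
```

The undecodable byte `0xE9` becomes the lone surrogate `U+DCE9` in the `str` form, which is why the original bytes can be recovered exactly.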
Re: [Python-Dev] Moving stuff out of Misc and over to the devguide
On Wed, Jan 19, 2011 at 15:49, Antoine Pitrou solip...@pitrou.net wrote: On Wed, 19 Jan 2011 15:31:24 -0800 Brett Cannon br...@python.org wrote: OK, here is my plan that I will implement: MOVE -- developers.txt maintainers.rst README.gdb README.coverity README.Emacs DELETE (seem way too old to still be relevant; tell me if I am wrong) --- README.OpenBSD README.AIX cheatsheet README.gdb is useful to more than core developers and contributors, so I think it should stay inside Misc. That's true of README.Emacs as well. But I'm willing to bet more people will find out about the gdb and Emacs details if we put them online through search engines, blogs, and reading the devguide than anyone ever did by digging through the Misc directory. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Import and unicode: part two
On Thu, Jan 20, 2011 at 03:51:05AM +0100, Victor Stinner wrote:
> For a lesson at school, it is nice to write examples in the mother language, instead of using raw english with ASCII identifiers and filenames.

Then use this::

    import cafe as café

When you do things this way you do not have to translate from unknown encodings into unicode. Everything is within python source where you have a defined encoding.

>> Teaching students to write non-portable code (relying on filesystem encoding where your solution is, don't upload to pypi anything that has non-ascii filenames) seems like the exact opposite of how you'd want to shape a young student's understanding of good programming practices.
>
> In a school, you can use the same configuration (encoding) on all computers.

In a school computer lab perhaps. But not on all the students' and professors' machines. How many professors will be cursing python when they discover that the example code that they wrote on their Linux workstation doesn't work when the students try to use it in their windows computer lab? How many students will be upset when the code they turn in runs on their professor's test machine if the lab computers were booted into the Linux partition but not if they were booted into Windows?

>> * Specify an encoding per platform and stick to that.
>
> It doesn't work: on UNIX/BSD, the user chooses its own encoding and all programs will use it.
>
>> (...) This prevents getting a mixture of encodings of modules (...)
>
> If you have an issue with encodings, we have to fix it when you create a module (on disk), not when you load a module (it is too late).

It's not too late to throw a clear error of what's wrong.

>> I haven't looked at your patch so perhaps you have an ingenious method of translating from the unicode representation of the module in the import statement to the bytes in arbitrary encodings on the filesystem that I haven't thought of.
>
> On Windows, my patch tries to avoid any conversion: it uses unicode everywhere.
> On other OSes, it uses the Python filesystem encoding to encode a module name (as is done for any other operation on the filesystem with a unicode filename).

The other interfaces are somewhat of a red herring here. As I wrote in another email, importing modules has ramifications that open(), for instance, does not. Additionally, those other filesystem operations have been growing the ability to take byte values and encoding parameters because unicode translation via a single filesystem encoding is a good default but not a complete solution. I think that this problem demands a complete solution, however, and it seems to me that limiting the scope of the problem is the most pleasant method to accomplish this.

Your solution creates modules which aren't portable. One of my proposals creates python code which isn't portable. The other one suffers some of the same disadvantages as your solution in portability but allows for tools that could automatically correct modules.

> --
> Python 3 supports bytes filenames in order to read/copy/delete undecodable filenames, filenames stored in an encoding different from the system encoding, and broken filenames. It is also possible to access these files using PEP 383 (with surrogate characters). This is useful for running Python on an old system.
>
>> If you don't, however, then really - ASCII-only seems like the sanest of the three solutions I can think of.
>
> But a (Python 3) module is not supposed to have a broken filename. If it does, you had better fix its name, instead of trying to fix the problem later (in Python).

We agree that there should not be broken module names. However it seems we very hotly disagree about the definition of that. You think that if a module is named appropriately on one system but is not portable to another system, that's fine. I think that portability between systems is very important and sacrificing that so that someone can locally use a module with non-ASCII characters doesn't have a justifiable reward.
> With UTF-8 filesystem encoding (e.g. on Mac OS X, and most Linux setups), it is already possible to use non-ASCII module names.

Tangent: This is not true about Linux. UTF-8 is a matter of the interpretation of the filesystem bytes that the user specifies by setting their system locale. Setting the system locale to ASCII for use in system-wide scripts is quite common, as is changing locale settings in other parts of the world (as I can tell you from the bug reports colleagues CC me on to fix for the problems with unicode support in their python2 programs). Allowing module names incompatible with ascii without specifying an encoding will just lead to bug reports down the line.

Relatively few programmers understand the difference between the python unicode abstraction and the byte representations possible for those strings. Allowing non-ascii characters in module filenames without specifying an
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 9:07 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
> ..
> Do you have a solution to the problem? I haven't looked at your patch so perhaps you have an ingenious method of translating from the unicode representation of the module in the import statement to the bytes in arbitrary encodings on the filesystem that I haven't thought of.

If I understand what Victor's patch does correctly, it allows Python on Windows to bypass translation from Unicode to bytes by using Windows wide character APIs.
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
> use this::
>
>     import cafe as café
>
> When you do things this way you do not have to translate between unknown encodings into unicode. Everything is within python source where you have a defined encoding.

This is a great way of converting non-portable module names, if the module ever leaves the bounds of its computer, and runs into problems there.

It may be that the best practices for writing platform portable modules should include

* ASCII module filenames
* Code that can handle 16 or 32 bit Unicode
* and likely some other things.

But for local code, having to think up an ASCII name for a module rather than use the obvious native-language name, is just brain-burden when creating the code.

Your demonstration of such an easy solution to the concerns you raise convinces me more than ever that it is acceptable to allow non-ASCII module names. For those programmers in a single locale environment, it'll just work. And for those not in a single locale environment, there is your above simple solution to achieve portability without changing large numbers of lines of code.

Glenn
Re: [Python-Dev] Import and unicode: part two
On Jan 20, 2011, at 12:02 AM, Glenn Linderman wrote:
> But for local code, having to think up an ASCII name for a module rather than use the obvious native-language name, is just brain-burden when creating the code.

Is it really? You already had to type 'import', presumably if you can think in Python you can think in ASCII.

(After my experiences with namespace crowding in Twisted, I'm inclined to suggest something more like

    import m_07117FE4A1EBD544965DC19573183DA2 as café

- then I never need to worry about café2 looking ugly or cafe being incompatible :).)
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 11:39 PM, Toshio Kuratomi a.bad...@gmail.com wrote:
> ..
> Teaching students to write non-portable code (relying on filesystem encoding where your solution is, don't upload to pypi anything that has non-ascii filenames) seems like the exact opposite of how you'd want to shape a young student's understanding of good programming practices.

Let's not confuse language definition with the quality of implementation. A Python implementation that runs on a system without even a traditional filesystem, where "import foo" looks up the code of module foo in an in-memory database, would be perfectly valid. Should Python be redefined so that module names are case insensitive simply because case insensitive filesystems are still popular? I don't think so.
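The filesystem-free implementation described above can be approximated inside CPython with an import hook. This is only a sketch under stated assumptions: the dictionary MODULE_SOURCES stands in for the in-memory database, and the class name DictImporter and the module contents are invented for illustration. It relies on the standard importlib finder/loader protocol.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "database": module name -> source text.
MODULE_SOURCES = {
    "café": "GREETING = 'bonjour'\n",
}


class DictImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader that serves modules from a dict, not from disk."""

    def __init__(self, sources):
        self._sources = sources

    def find_spec(self, fullname, path=None, target=None):
        # Claim only the names we hold; return None so other finders run.
        if fullname in self._sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # let the import machinery create a default module

    def exec_module(self, module):
        source = self._sources[module.__name__]
        code = compile(source, "<memory:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


sys.meta_path.insert(0, DictImporter(MODULE_SOURCES))

import café  # no file named café.py exists anywhere

print(café.GREETING)
```

No filename is ever encoded or decoded here, so the module name is purely a Unicode string matched against dictionary keys.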
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 9:11 PM, Glyph Lefkowitz wrote:
> On Jan 20, 2011, at 12:02 AM, Glenn Linderman wrote:
>> But for local code, having to think up an ASCII name for a module rather than use the obvious native-language name, is just brain-burden when creating the code.
>
> Is it really? You already had to type 'import', presumably if you can think in Python you can think in ASCII.

There is a difference between memorizing and typing keywords, and inventing new names in non-native scripts. It is hard to even invent all the names in one's native language; if restricted to inventing them, even some of them, in some non-native script such as ASCII, it is just brain-burden indeed.

> (After my experiences with namespace crowding in Twisted, I'm inclined to suggest something more like
>
>     import m_07117FE4A1EBD544965DC19573183DA2 as café
>
> - then I never need to worry about café2 looking ugly or cafe being incompatible :).)

Now if the stuff after m_ was the hex UTF-8 of café, that could get interesting :)

But now you are talking about automating the creation of ASCII file names from the actual non-ASCII names of the modules, or something. Sadly, the module is not required to contain its name, so if it differs from the filename, some global view or non-Python annotation would be required to create/maintain the mapping. [This paragraph is only semi-serious, like yours.]
Re: [Python-Dev] Import and unicode: part two
On Jan 20, 2011, at 12:19 AM, Glenn Linderman wrote:
> Now if the stuff after m_ was the hex UTF-8 of café, that could get interesting :)

(As it happens, it's the hex digest of the MD5 of the UTF-8 of café... ;-))
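For the curious, the recipe described above (MD5 of the UTF-8 bytes, rendered as hex, prefixed with m_) can be sketched as follows. The function name hashed_module_name is made up here, and whether the output matches the digest quoted in the thread has not been checked; it would also depend on which Unicode normalization form of é was used.

```python
import hashlib


def hashed_module_name(name):
    """Derive an ASCII-only module name from an arbitrary Unicode name."""
    digest = hashlib.md5(name.encode("utf-8")).hexdigest().upper()
    # The "m_" prefix keeps the result a valid identifier even when the
    # digest happens to start with a digit.
    return "m_" + digest


print(hashed_module_name("café"))
```

The mapping is one-way: nothing in the resulting filename lets a tool recover the original name, which is the mapping problem raised earlier in the thread.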
Re: [Python-Dev] Import and unicode: part two
On Thu, Jan 20, 2011 at 12:11 AM, Glyph Lefkowitz gl...@twistedmatrix.com wrote:
> ..
>> But for local code, having to think up an ASCII name for a module rather than use the obvious native-language name, is just brain-burden when creating the code.
>
> Is it really? You already had to type 'import', presumably if you can think in Python you can think in ASCII.

Yes, it is a burden. For example, the Russian word щи can be transliterated into ASCII as schi, shchi, stchi, or even wji. There are many incompatible standards and none of them is well-known or natural. Reading transliterated Cyrillic text is not hard, but guessing the correct spelling is nearly impossible. Good programming style guides recommend avoiding arbitrary contractions in variable names for the same reason.
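To make the ambiguity concrete, here is a small sketch. The three tables are illustrative only: they are reverse-engineered from the spellings listed above and are not taken from any particular transliteration standard, and the scheme names are placeholders.

```python
# Illustrative per-letter tables, inferred from the spellings in the message.
SCHEMES = {
    "scheme_a": {"щ": "sch", "и": "i"},
    "scheme_b": {"щ": "shch", "и": "i"},
    "scheme_c": {"щ": "stch", "и": "i"},
}


def transliterate(word, table):
    """Replace each character using the table; leave unknown ones as-is."""
    return "".join(table.get(ch, ch) for ch in word)


spellings = {name: transliterate("щи", table) for name, table in SCHEMES.items()}
print(spellings)
```

Each scheme yields a different ASCII module name for the same word, so a reader who knows only the Russian name cannot tell which file to import.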
Re: [Python-Dev] Import and unicode: part two
On Wed, Jan 19, 2011 at 09:02:17PM -0800, Glenn Linderman wrote:
> On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
>> use this::
>>
>>     import cafe as café
>>
>> When you do things this way you do not have to translate between unknown encodings into unicode. Everything is within python source where you have a defined encoding.
>
> This is a great way of converting non-portable module names, if the module ever leaves the bounds of its computer, and runs into problems there.

You're missing a piece here. If you mandate ascii, you can convert to a unicode name using "import ... as", because python knows that the text it gets from the filesystem is ascii when it converts it to the abstract unicode string that you've specified in the program text. You cannot go the other way because python lacks the information (the encoding of the filename on the filesystem) to do the transformation.

> Your demonstration of such an easy solution to the concerns you raise convinces me more than ever that it is acceptable to allow non-ASCII module names. For those programmers in a single locale environment, it'll just work. And for those not in a single locale environment, there is your above simple solution to achieve portability without changing large numbers of lines of code.

Does my demonstration that you can't do that mean that it's no longer acceptable? :-)

/me guesses that the relative merits of being forced to write portable code vs convenience of writing a module name in your native script still have a different balance in your mind than in mine, thus the smiley :-)

-Toshio
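The asymmetry described above can be sketched with plain byte strings. This assumes UTF-8 and Latin-1 as two example filesystem encodings; guess_name is a hypothetical helper, not part of the import machinery.

```python
name = "café"

# The same abstract name has different on-disk bytes per encoding.
utf8_bytes = name.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = name.encode("latin-1")  # b'caf\xe9'

# An ASCII filename decodes to the same text under either encoding,
# so no knowledge of the filesystem encoding is needed.
ascii_bytes = b"cafe"
same_everywhere = ascii_bytes.decode("utf-8") == ascii_bytes.decode("latin-1")


def guess_name(raw, assumed_encoding):
    """Hypothetical helper: decode filename bytes, or None on failure."""
    try:
        return raw.decode(assumed_encoding)
    except UnicodeDecodeError:
        return None


# Guessing the wrong encoding either fails outright or silently
# produces a different name.
print(same_everywhere)                    # True
print(guess_name(latin1_bytes, "utf-8"))  # None
print(guess_name(utf8_bytes, "latin-1"))  # cafÃ©
```

The last case is the awkward one: the decode succeeds, but the resulting name does not match the one written in the import statement.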
Re: [Python-Dev] Import and unicode: part two
On 1/19/2011 11:20 PM, Toshio Kuratomi wrote:
> On Wed, Jan 19, 2011 at 09:02:17PM -0800, Glenn Linderman wrote:
>> On 1/19/2011 8:39 PM, Toshio Kuratomi wrote:
>>> use this::
>>>
>>>     import cafe as café
>>>
>>> When you do things this way you do not have to translate between unknown encodings into unicode. Everything is within python source where you have a defined encoding.
>>
>> This is a great way of converting non-portable module names, if the module ever leaves the bounds of its computer, and runs into problems there.
>
> You're missing a piece here. If you mandate ascii you can convert to a unicode name using import as because python knows that it has ascii text from the filesystem when it converts it to an abstract unicode string that you've specified in the program text. You cannot go the other way because python lacks the information (the encoding of the filename on the filesystem) to do the transformation.
>
>> Your demonstration of such an easy solution to the concerns you raise convinces me more than ever that it is acceptable to allow non-ASCII module names. For those programmers in a single locale environment, it'll just work. And for those not in a single locale environment, there is your above simple solution to achieve portability without changing large numbers of lines of code.
>
> Does my demonstration that you can't do that mean that it's no longer acceptable? :-)
>
> /me guesses that the relative merits of being forced to write portable code vs convenience of writing a module name in your native script still have a different balance in your mind than in mine, thus the smiley :-)
>
> -Toshio

Sadly, you didn't demonstrate it; you seem to have misunderstood my statement, which was probably not all that clear. Let me try again.

User codes module café.py, tests, debugs, completes, is happy.

User moves code to a different computer, different locale, no é character, module can't be found, is sad.

User renames file to cafefromuser.py, changes the import statement from

    import café

to

    import cafefromuser as café

Module now imports successfully, no other code changes needed. User is happy again, thanks Toshio for great solution to file system encoding problem.
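The last step of that scenario can be tried out in a throwaway directory. A minimal sketch, assuming Python 3 (where non-ASCII identifiers are allowed); the module contents and the MENU attribute are invented for illustration.

```python
import os
import sys
import tempfile

# Create the renamed, ASCII-named module file in a scratch directory.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "cafefromuser.py"), "w", encoding="ascii") as f:
    f.write("MENU = ['espresso', 'latte']\n")

sys.path.insert(0, workdir)

# Only the import line changes; the rest of the program keeps using café.
import cafefromuser as café

print(café.MENU)
```

The filename on disk is pure ASCII, so it is found regardless of the locale, while the name used in the rest of the source stays café.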