[Python-Dev] Fwd: Accepting PEP 440: Version Identification and Dependency Specification
Antoine pointed out that it would still be a good idea to forward packaging PEP acceptance announcements to python-dev, even when the actual acceptance happens on distutils-sig.

That makes sense to me, so here's last week's notice of the acceptance of PEP 440, the implementation independent versioning standard derived from pkg_resources, PEP 386, and ideas from both Linux distributions and other open source language communities.

Regards,
Nick.

-- Forwarded message --
From: Nick Coghlan ncogh...@gmail.com
Date: 22 August 2014 22:34
Subject: Accepting PEP 440: Version Identification and Dependency Specification
To: DistUtils mailing list distutils-...@python.org

I just pushed Donald's final round of edits in response to the feedback on the last PEP 440 thread, and as such I'm happy to announce that I am accepting PEP 440 as the recommended approach to identifying versions and specifying dependencies when distributing Python software.

The PEP is available in the usual place at http://www.python.org/dev/peps/pep-0440/

It's been a long road to get to an implementation independent versioning standard that has a feasible migration path from the current pkg_resources defined de facto standard, and I'd like to thank a few folks:

* Donald Stufft for his extensive work on PEP 440 itself, especially the proof of concept integration into pip
* Vinay Sajip for his efforts in validating earlier versions of the PEP
* Tarek Ziadé for starting us down the road to an implementation independent versioning standard with the initial creation of PEP 386 back in June 2009, more than five years ago!

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bytes path support
On 24.08.14 03:11, Greg Ewing wrote:
> Isaac Morland wrote:
>> In HTML 5 it allows non-ASCII-compatible encodings as long as U+FEFF (byte order mark) is used: http://www.w3.org/TR/html-markup/syntax.html#encoding-declaration
>> Not sure about XML.
>
> According to Appendix F here: http://www.w3.org/TR/xml/#sec-guessing an XML parser needs to be prepared to try all the encodings it supports until it finds one that works well enough to decode the XML declaration, then it can find out the exact encoding used.

That's not what this section says. Instead, it says that you need to auto-detect UCS-4, UTF-16, UTF-8 from the BOM, or guess them or EBCDIC from the encoding of '<?'. This should be enough to actually parse the encoding declaration. Other non-ASCII-compatible encodings can only be used if declared in an upper-level protocol (such as HTTP). The parser is not expected to try out all encodings it supports.

Regards,
Martin
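The autodetection Martin describes can be sketched in a few lines of Python. This is only an illustration of the byte patterns listed in Appendix F of the XML 1.0 specification, not a conforming parser: the function name `guess_xml_encoding_family` is made up for this example, the choice of `cp037` to stand in for "some EBCDIC flavour" is an assumption, and the unusual UCS-4 byte orders (2143, 3412) are left out.

```python
import codecs

# Byte order marks, checked first. The UCS-4 (UTF-32) entries come before
# UTF-16 because the UTF-32-LE BOM starts with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32-le"),  # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]

# Without a BOM, the way the leading '<?' of the XML declaration is encoded
# identifies the encoding family.
_DECLARATION_PREFIXES = [
    (b"\x00\x00\x00<", "utf-32-be"),
    (b"<\x00\x00\x00", "utf-32-le"),
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
    (b"<?xm", "utf-8"),              # or any other ASCII-compatible encoding
    (b"\x4c\x6f\xa7\x94", "cp037"),  # '<?xm' in EBCDIC; code page is assumed
]


def guess_xml_encoding_family(data: bytes) -> str:
    """Return an encoding good enough to read the XML declaration."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _DECLARATION_PREFIXES:
        if data.startswith(prefix):
            return name
    # No BOM and no recognisable declaration: the spec's default is UTF-8.
    return "utf-8"


document = '<?xml version="1.0" encoding="ebcdic-cp-us"?><a/>'
print(guess_xml_encoding_family(document.encode("cp037")))
print(guess_xml_encoding_family(document.encode("utf-16-be")))
```

The result is only a starting point: once the declaration can be read, the parser switches to whatever encoding it names.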
Re: [Python-Dev] Bytes path related questions for Guido
On 2014-08-26 03:11, Stephen J. Turnbull wrote:
> Nick Coghlan writes:
>> purge_surrogate_escapes was the other term that occurred to me.
>
> "purge" suggests removal, not replacement. That may be useful too.
>
> neutralize_surrogate_escapes(s, remove=False, replacement='\uFFFD')

How about:

replace_surrogate_escapes(s, replacement='\uFFFD')

If you want them removed, just pass an empty string as the replacement.

> maybe? (Of course the remove argument is feature creep, so I'm only about +0.5 myself. And the name is long, but I can't think of any better synonyms for "make safe" in English right now).
>
>> Either way, my use case is to filter them out when I *don't* want to pass them along to other software, but would prefer the Unicode replacement character to the ASCII question mark created by using the "replace" filter when encoding.
>
> I think it would be preferable to be unicodely correct here by default, since this is a str -> str function.
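For readers who want to see what the proposed helper would do, here is a minimal sketch. It is not an existing standard library function; the name and signature are simply taken from the proposal in this message, and the body is one possible implementation that assumes the escapes to be handled are exactly the lone surrogates U+DC80 to U+DCFF produced by the surrogateescape error handler.

```python
def replace_surrogate_escapes(s: str, replacement: str = "\uFFFD") -> str:
    """Replace surrogate-escaped bytes in s (hypothetical helper).

    The surrogateescape error handler maps each undecodable byte 0x80-0xFF
    to a lone surrogate in the range U+DC80..U+DCFF; everything else in the
    string is left untouched.
    """
    return "".join(
        replacement if "\udc80" <= ch <= "\udcff" else ch
        for ch in s
    )


# A Latin-1 encoded file name decoded as UTF-8 with surrogateescape.
name = b"caf\xe9.txt".decode("utf-8", "surrogateescape")

print(ascii(replace_surrogate_escapes(name)))      # replacement character
print(ascii(replace_surrogate_escapes(name, "")))  # escapes removed
print(name.encode("ascii", "replace"))             # the '?' alternative
```

Passing an empty string as the replacement covers the removal case, so no separate remove flag is needed in this version.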
Re: [Python-Dev] Bytes path support
On Sun, 24 Aug 2014 13:27:55 +1000, Nick Coghlan ncogh...@gmail.com wrote:
> As some examples of where bilingual computing breaks down:
>
> * My NFS client and server may have different locale settings
> * My FTP client and server may have different locale settings
> * My SSH client and server may have different locale settings
> * I save a file locally and send it to someone with a different locale setting
> * I attempt to access a Windows share from a Linux client (or vice-versa)
> * I clone my POSIX hosted git or Mercurial repository on a Windows client
> * I have to connect my Linux client to a Windows Active Directory domain (or vice-versa)
> * I have to interoperate between native code and JVM code
>
> The entire computing industry is currently struggling with this monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale encoding/code pages) -> multilingual (Unicode) transition. It's been going on for decades, and it's still going to be quite some time before we're done.
>
> The POSIX world is slowly clawing its way towards a multilingual model that actually works: UTF-8
>
> Windows (including the CLR) and the JVM adopted a different multilingual model, but still one that actually works: UTF-16-LE

This kind of puts the length of the python2-python3 transition period in perspective, doesn't it?

--David
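The locale mismatches in Nick's list are easy to reproduce. A minimal sketch, assuming one host that writes UTF-8 and another that reads Latin-1; the two encodings and the file name are illustrative choices and do not come from the thread.

```python
# Illustrative file name; the two encodings stand in for two hosts whose
# locale settings differ.
name = "Ziadé.txt"

# Host A (UTF-8 locale) writes the name as bytes.
wire = name.encode("utf-8")

# Host B (Latin-1 locale) decodes the same bytes: no error, but wrong text.
misread = wire.decode("latin-1")

# The other direction fails outright: these Latin-1 bytes are not valid UTF-8.
legacy = name.encode("latin-1")
try:
    legacy.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape smuggles the undecodable byte through str, so the original
# bytes can be recovered unchanged.
escaped = legacy.decode("utf-8", "surrogateescape")
round_trip = escaped.encode("utf-8", "surrogateescape")

print(ascii(misread), strict_ok, round_trip == legacy)
```

The silent misreading in the first case is the more dangerous failure, since nothing signals that the text is wrong until a person looks at it.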
[Python-Dev] Windows Unicode console support [Was: Bytes path support]
On 24 August 2014 04:27, Nick Coghlan ncogh...@gmail.com wrote:
> One of those areas is the fact that we still use the old 8-bit APIs to interact with the Windows console. Those are just as broken in a multilingual world as the other Windows 8-bit APIs, so Drekin came up with a project to expose the Windows console as a UTF-16-LE stream that uses the 16-bit APIs instead: https://pypi.python.org/pypi/win_unicode_console
>
> I personally hope we'll be able to get the issues Drekin references there resolved for Python 3.5 - if other folks hope for the same thing, then one of the best ways to help that happen is to try out the win_unicode_console module and provide feedback on what does and doesn't work.

This looks very cool, and I plan on giving it a try. But I don't see any issues mentioned there (unless you mean the fact that it's not possible to hook into Python's interactive interpreter directly, but I don't see how that could be fixed in an external module). There are no open issues on the project's GitHub tracker.

I'd love to see this go into 3.5, so any more specific suggestions as to what would be needed to move it forwards would be great.

Paul
Re: [Python-Dev] Bytes path support
On 8/26/2014 9:11 AM, R. David Murray wrote:
> On Sun, 24 Aug 2014 13:27:55 +1000, Nick Coghlan ncogh...@gmail.com wrote:
>> As some examples of where bilingual computing breaks down:
>>
>> * My NFS client and server may have different locale settings
>> * My FTP client and server may have different locale settings
>> * My SSH client and server may have different locale settings
>> * I save a file locally and send it to someone with a different locale setting
>> * I attempt to access a Windows share from a Linux client (or vice-versa)
>> * I clone my POSIX hosted git or Mercurial repository on a Windows client
>> * I have to connect my Linux client to a Windows Active Directory domain (or vice-versa)
>> * I have to interoperate between native code and JVM code
>>
>> The entire computing industry is currently struggling with this monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale encoding/code pages) -> multilingual (Unicode) transition. It's been going on for decades, and it's still going to be quite some time before we're done.
>>
>> The POSIX world is slowly clawing its way towards a multilingual model that actually works: UTF-8
>>
>> Windows (including the CLR) and the JVM adopted a different multilingual model, but still one that actually works: UTF-16-LE

Nick, I think the first half of your post is one of the clearest expositions yet of 'why Python 3' (in particular, the str to unicode change). It is worthy of wider distribution and without much change, it would be a great blog post.

> This kind of puts the length of the python2-python3 transition period in perspective, doesn't it?

--
Terry Jan Reedy
Re: [Python-Dev] Bytes path support
On 27 Aug 2014 02:52, Terry Reedy tjre...@udel.edu wrote:
> On 8/26/2014 9:11 AM, R. David Murray wrote:
>> On Sun, 24 Aug 2014 13:27:55 +1000, Nick Coghlan ncogh...@gmail.com wrote:
>>> As some examples of where bilingual computing breaks down:
>>>
>>> * My NFS client and server may have different locale settings
>>> * My FTP client and server may have different locale settings
>>> * My SSH client and server may have different locale settings
>>> * I save a file locally and send it to someone with a different locale setting
>>> * I attempt to access a Windows share from a Linux client (or vice-versa)
>>> * I clone my POSIX hosted git or Mercurial repository on a Windows client
>>> * I have to connect my Linux client to a Windows Active Directory domain (or vice-versa)
>>> * I have to interoperate between native code and JVM code
>>>
>>> The entire computing industry is currently struggling with this monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale encoding/code pages) -> multilingual (Unicode) transition. It's been going on for decades, and it's still going to be quite some time before we're done.
>>>
>>> The POSIX world is slowly clawing its way towards a multilingual model that actually works: UTF-8
>>>
>>> Windows (including the CLR) and the JVM adopted a different multilingual model, but still one that actually works: UTF-16-LE
>
> Nick, I think the first half of your post is one of the clearest expositions yet of 'why Python 3' (in particular, the str to unicode change). It is worthy of wider distribution and without much change, it would be a great blog post.

Indeed, I had the same idea - I had been assuming users already understood this context, which is almost certainly an invalid assumption. The blog post version is already mostly written, but I ran out of weekend. Will hopefully finish it up and post it some time in the next few days :)

>> This kind of puts the length of the python2-python3 transition period in perspective, doesn't it?

I realised in writing the post that ASCII is over 50 years old at this point, while Unicode as an official standard is more than 20. By the time this is done, we'll likely be talking 30+ years for Unicode to displace the confusing mess that is code pages and locale encodings :)

Cheers,
Nick.

> -- Terry Jan Reedy
Re: [Python-Dev] Bytes path support
Nick Coghlan ncogh...@gmail.com writes:
>>> As some examples of where bilingual computing breaks down:
>>>
>>> * My NFS client and server may have different locale settings
>>> * My FTP client and server may have different locale settings
>>> * My SSH client and server may have different locale settings
>>> * I save a file locally and send it to someone with a different locale setting
>>> * I attempt to access a Windows share from a Linux client (or vice-versa)
>>> * I clone my POSIX hosted git or Mercurial repository on a Windows client
>>> * I have to connect my Linux client to a Windows Active Directory domain (or vice-versa)
>>> * I have to interoperate between native code and JVM code
>>>
>>> The entire computing industry is currently struggling with this monolingual (ASCII/Extended ASCII/EBCDIC/etc) -> bilingual (locale encoding/code pages) -> multilingual (Unicode) transition. It's been going on for decades, and it's still going to be quite some time before we're done.
>>>
>>> The POSIX world is slowly clawing its way towards a multilingual model that actually works: UTF-8
>>>
>>> Windows (including the CLR) and the JVM adopted a different multilingual model, but still one that actually works: UTF-16-LE
>>
>> Nick, I think the first half of your post is one of the clearest expositions yet of 'why Python 3' (in particular, the str to unicode change). It is worthy of wider distribution and without much change, it would be a great blog post.
>
> Indeed, I had the same idea - I had been assuming users already understood this context, which is almost certainly an invalid assumption. The blog post version is already mostly written, but I ran out of weekend. Will hopefully finish it up and post it some time in the next few days :)

In that case, maybe it'd be nice to also explain why you use the term bilingual for codepage based encoding. At least to me, a codepage/locale is pretty monolingual, or alternatively covering a whole region (e.g. western europe). I figure with bilingual you mean ascii + something, but that's mostly a guess from my side.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] Bytes path support
Nikolaus Rath writes:
> In that case, maybe it'd be nice to also explain why you use the term bilingual for codepage based encoding.

Modern computing systems are written in languages which are invariably based on syntax expressed using ASCII, and provide by default functionality for expressing dates etc suitable for rendering American English. Thus ASCII (i.e., American English) is always an available language. Code pages provide facilities for rendering one or more languages sharing a common coded character set, but are unsuitable for rendering most of the rest of the world's dozens of language groups (grouping languages by common character set).

"Multilingual" has come to mean "able to express (almost) any set of languages in a single text" (see, for example, Emacs's HELLO file), not just "more than two". So code pages are closer in spirit to "bilingual" (two of many) than to "multilingual" (all of many).

It's messy, analogical terminology. But then, natural language is messy and analogical. <wink/>
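The "ASCII plus one language group" behaviour of code pages can be checked directly. A small sketch; the sample strings and the two Windows code pages are arbitrary choices for illustration, and `can_encode` is a helper defined only for this example.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


samples = {
    "english": "hello",
    "french": "déjà vu",
    "greek": "κόσμος",
}

# cp1252 = ASCII + Western European, cp1253 = ASCII + Greek, UTF-8 = Unicode.
for encoding in ("cp1252", "cp1253", "utf-8"):
    supported = [lang for lang, text in samples.items()
                 if can_encode(text, encoding)]
    print(encoding, supported)
```

Each code page handles English plus its own group, and only the Unicode encoding can hold all three samples in a single text.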