[Python-Dev] Bytes path related questions for Guido

2014-08-23 Thread Nick Coghlan
At Guido's request, splitting out two specific questions from Serhiy's thread where I believe we could do with an explicit "yes or no" from him. 1. Should we accept patches adding support for the direct use of bytes paths in lower level filesystem manipulation APIs? (i.e. everything that isn't pat

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Guido van Rossum
I declare this thread irreparably broken. Do not make any decisions in this thread. Tell me (in another thread) when it's time to decide and I will. On Sat, Aug 23, 2014 at 8:27 PM, Nick Coghlan wrote: > On 24 August 2014 04:37, Oleg Broytman wrote: > > On Sat, Aug 23, 2014 at 06:40:37PM +0100

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Nick Coghlan
On 24 August 2014 04:37, Oleg Broytman wrote: > On Sat, Aug 23, 2014 at 06:40:37PM +0100, Paul Moore > wrote: >> Generally, it seems to be mostly a reaction to the repeated claims >> that Python, or Windows, or whatever, is "broken". > >Ah, if that's the only problem I certainly can live wit

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Greg Ewing
Isaac Morland wrote: In HTML 5 it allows non-ASCII-compatible encodings as long as U+FEFF (byte order mark) is used: http://www.w3.org/TR/html-markup/syntax.html#encoding-declaration Not sure about XML. According to Appendix F here: http://www.w3.org/TR/xml/#sec-guessing an XML parser need

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Paul Moore
On 23 August 2014 19:37, Oleg Broytman wrote: > Unix takes the idea that everything is text and a stream of bytes to > its extreme. I don't really understand the idea of "text and a stream of bytes". The two are fundamentally different in my view. But I guess that's why we have to agree to differ

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Oleg Broytman
Hi! On Sat, Aug 23, 2014 at 06:40:37PM +0100, Paul Moore wrote: > On 23 August 2014 16:15, Oleg Broytman wrote: > > On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull" > > wrote: > >> And that's the big problem with Oleg's complaint, too. It's not at > >> all clear what he wants

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Paul Moore
On 23 August 2014 16:15, Oleg Broytman wrote: > On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull" > wrote: >> And that's the big problem with Oleg's complaint, too. It's not at >> all clear what he wants > >The first thing is I want to understand why people continue to refer >

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Marko Rauhamaa
"R. David Murray" : > The same problem existed in python2 if your goal was to produce a stream > with a consistent encoding, but now python3 treats that as an error. I have a different interpretation of the situation: as a rule, use byte strings in Python3. Text strings are a special corner case

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Isaac Morland
On Sat, 23 Aug 2014, Marko Rauhamaa wrote: "Stephen J. Turnbull" : Just read as bytes and decode piecewise in one way or another. For Oleg's HTML case, there's a well-understood structure that can be used to determine retry points HTML and XML are interesting examples since their encoding is
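
A rough sketch of the "read as bytes and decode piecewise" idea quoted above. The helper name `decode_piecewise` and the choice of Latin-1 as the fallback codec are assumptions made for illustration; nothing in the thread prescribes them. It decodes with the primary codec and, on each failure, decodes only the offending bytes with the fallback before resuming from that point.

```python
def decode_piecewise(data, primary="utf-8", fallback="latin-1"):
    """Decode bytes with `primary`, falling back per bad run (hypothetical helper)."""
    parts = []
    pos = 0
    while pos < len(data):
        try:
            # Try to decode everything that is left in one go.
            parts.append(data[pos:].decode(primary))
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are offsets into the slice we passed in.
            good_end = pos + exc.start
            bad_end = pos + exc.end
            parts.append(data[pos:good_end].decode(primary))
            # Assumption: the undecodable run is in the fallback encoding.
            parts.append(data[good_end:bad_end].decode(fallback))
            pos = bad_end  # retry from just past the bad bytes
    return "".join(parts)


# A Latin-1 byte (0xE9) embedded in otherwise valid UTF-8.
mixed = b"caf\xe9 \xe2\x82\xac"
decoded = decode_piecewise(mixed)
```

A real HTML consumer would more likely pick retry points from the document structure, as the quoted message suggests, rather than from individual decode errors.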

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Oleg Broytman
On Sat, Aug 23, 2014 at 07:14:47PM +0900, "Stephen J. Turnbull" wrote: > I cannot believe you are going to find a better environment for > dealing with these issues than Python 3. Well, that may be. Oleg. -- Oleg Broytman http://phdru.name/ p...@phdru.name

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Oleg Broytman
On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull" wrote: > And that's the big problem with Oleg's complaint, too. It's not at > all clear what he wants The first thing is I want to understand why people continue to refer to Unix was as "broken". Better yet, to persuade them it'

Re: [Python-Dev] Bytes path support

2014-08-23 Thread R. David Murray
On Sat, 23 Aug 2014 21:08:29 +1000, Steven D'Aprano wrote: > When I started this email, I originally began to say that the actual > problem was with byte file names that cannot be decoded into Unicode > using the system encoding (typically UTF-8 on Linux systems). But I've > actually had difficu

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Steven D'Aprano
On Fri, Aug 22, 2014 at 11:53:01AM -0700, Chris Barker wrote: > The point is that if you are reading a file name from the system, and then > passing it back to the system, then you can treat it as just bytes -- who > cares? And if you add the byte value of 47 thing, then you can even do > basic pa
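
A minimal sketch of the "treat it as just bytes" approach described in the quoted text, assuming a POSIX-style separator (byte value 47, i.e. `b"/"`). The helper `split_components` and the file name `report.txt` are hypothetical; `os.fsencode`, `os.listdir`, and `os.path.join` are standard-library calls that accept bytes paths.

```python
import os
import tempfile

SEP = bytes([47])  # byte value 47 is b"/" in ASCII


def split_components(raw_path):
    """Split a bytes path on the separator byte without decoding it."""
    return [part for part in raw_path.split(SEP) if part]


# The path never needs to be valid in any text encoding to be parsed.
components = split_components(b"/home/\xff\xfe/file.txt")

# Bytes in, bytes out: a bytes directory argument yields bytes names.
with tempfile.TemporaryDirectory() as tmp:
    tmp_bytes = os.fsencode(tmp)
    target = os.path.join(tmp_bytes, b"report.txt")
    with open(target, "wb") as fh:
        fh.write(b"data")
    names = os.listdir(tmp_bytes)
```

The names returned by `os.listdir` can be passed straight back to `open` or `os.stat` without ever being decoded.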

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Oleg Broytman writes: > This is the core of the problem. Python2 favors the Unix model but > Windows people pay the price. Python3 reverses that This is certainly not true. What is true is that Python 3 makes no attempt to make it easy to write crappy software in the old Unix style, that break

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Marko Rauhamaa
Isaac Morland : >> HTTP/1.1 200 OK >> Content-Type: text/html; charset=ISO-8859-1 >> >> >> >> >> > > For HTML it's not quite so bad. According to the HTML 4 standard: > [...] > > The Content-Type header takes precedence over a <meta> element. I > thought I read once that the reason was to all

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Chris Angelico
On Sat, Aug 23, 2014 at 7:02 PM, Stephen J. Turnbull wrote: > Chris Barker writes: > > > So I write bytes that are encoded one way into a text file that's encoded > > another way, and expect to be able to read that later? > > No, not you. Crap software does that. Your MUD server. Oleg's > fav

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Marko Rauhamaa
"Stephen J. Turnbull" : > Just read as bytes and decode piecewise in one way or another. For > Oleg's HTML case, there's a well-understood structure that can be used > to determine retry points HTML and XML are interesting examples since their encoding is initially unknown:

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Barker writes: > So I write bytes that are encoded one way into a text file that's encoded > another way, and expect to be able to read that later? No, not you. Crap software does that. Your MUD server. Oleg's favorite web pages with ads, or more likely the ad servers. > Not for me (

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Angelico writes: > Not sure why 1251, All of those codes have repertoires that are Cyrillic supersets, presumably Russian-language content, based on Oleg's top domain. > But it's important to note that this is a method of handling junk. > It's not a design intention; this is for a situa

Re: [Python-Dev] Bytes path support

2014-08-23 Thread Stephen J. Turnbull
Chris Barker writes: > > The third is to specify the UTF-8 with the surrogate escape error > > handler. This allows non-UTF-8 codes to be loaded into > > memory. Read as bytes and incrementally decode. If you hit an Exception, retry from that point. > Just so I'm clear here -- if you write
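
For reference, a small sketch of what the surrogate escape error handler mentioned above does. The sample bytes are made up for illustration: a lone Latin-1 byte (0xE9) followed by valid UTF-8 for the euro sign.

```python
# Not valid UTF-8 as a whole: 0xE9 starts a multi-byte sequence that never completes.
raw = b"caf\xe9 \xe2\x82\xac"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN,
# so the data can be loaded into a str without raising.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogate, so the junk cannot leak out silently.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

The round trip is lossless only when the same codec and error handler are used in both directions.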