At Guido's request, splitting out two specific questions from Serhiy's
thread where I believe we could do with an explicit "yes or no" from
him.
1. Should we accept patches adding support for the direct use of bytes
paths in lower level filesystem manipulation APIs? (i.e. everything
that isn't pat
I declare this thread irreparably broken. Do not make any decisions in this
thread. Tell me (in another thread) when it's time to decide and I will.
On Sat, Aug 23, 2014 at 8:27 PM, Nick Coghlan wrote:
> On 24 August 2014 04:37, Oleg Broytman wrote:
> > On Sat, Aug 23, 2014 at 06:40:37PM +0100
On 24 August 2014 04:37, Oleg Broytman wrote:
> On Sat, Aug 23, 2014 at 06:40:37PM +0100, Paul Moore
> wrote:
>> Generally, it seems to be mostly a reaction to the repeated claims
>> that Python, or Windows, or whatever, is "broken".
>
> Ah, if that's the only problem I certainly can live wit
Isaac Morland wrote:
In HTML 5 it allows non-ASCII-compatible encodings as long as U+FEFF
(byte order mark) is used:
http://www.w3.org/TR/html-markup/syntax.html#encoding-declaration
Not sure about XML.
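As a rough illustration of the byte order mark point above, here is a minimal sketch of BOM sniffing in Python. The function name `sniff_bom` and its return shape are hypothetical, chosen only for this example; it is a generic check built on the standard `codecs` constants, not the detection algorithm from either specification.

```python
import codecs

# Order matters: the UTF-32 LE mark begins with the same two bytes
# as the UTF-16 LE mark, so the longer marks are checked first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0


# Usage: strip the mark, then decode the rest with the detected encoding.
payload = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
encoding, skip = sniff_bom(payload)
text = payload[skip:].decode(encoding)
```

When no mark is found the caller still has to fall back on something else, such as a transport header or an in-document declaration.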
According to Appendix F here:
http://www.w3.org/TR/xml/#sec-guessing
an XML parser need
On 23 August 2014 19:37, Oleg Broytman wrote:
> Unix takes the idea that everything is text and a stream of bytes to
> its extreme.
I don't really understand the idea of "text and a stream of bytes".
The two are fundamentally different in my view. But I guess that's why
we have to agree to differ
Hi!
On Sat, Aug 23, 2014 at 06:40:37PM +0100, Paul Moore
wrote:
> On 23 August 2014 16:15, Oleg Broytman wrote:
> > On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull"
> > wrote:
> >> And that's the big problem with Oleg's complaint, too. It's not at
> >> all clear what he wants
On 23 August 2014 16:15, Oleg Broytman wrote:
> On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull"
> wrote:
>> And that's the big problem with Oleg's complaint, too. It's not at
>> all clear what he wants
>
> The first thing is I want to understand why people continue to refer
>
"R. David Murray" :
> The same problem existed in python2 if your goal was to produce a stream
> with a consistent encoding, but now python3 treats that as an error.
I have a different interpretation of the situation: as a rule, use byte
strings in Python3. Text strings are a special corner case
On Sat, 23 Aug 2014, Marko Rauhamaa wrote:
"Stephen J. Turnbull" :
Just read as bytes and decode piecewise in one way or another. For
Oleg's HTML case, there's a well-understood structure that can be used
to determine retry points
HTML and XML are interesting examples since their encoding is
On Sat, Aug 23, 2014 at 07:14:47PM +0900, "Stephen J. Turnbull"
wrote:
> I cannot believe you are going to find a better environment for
> dealing with these issues than Python 3.
Well, that may be.
Oleg.
--
Oleg Broytman    http://phdru.name/    p...@phdru.name
On Sat, Aug 23, 2014 at 06:02:06PM +0900, "Stephen J. Turnbull"
wrote:
> And that's the big problem with Oleg's complaint, too. It's not at
> all clear what he wants
The first thing is I want to understand why people continue to refer
to the Unix way as "broken". Better yet, to persuade them it'
On Sat, 23 Aug 2014 21:08:29 +1000, Steven D'Aprano wrote:
> When I started this email, I originally began to say that the actual
> problem was with byte file names that cannot be decoded into Unicode
> using the system encoding (typically UTF-8 on Linux systems). But I've
> actually had difficu
On Fri, Aug 22, 2014 at 11:53:01AM -0700, Chris Barker wrote:
> The point is that if you are reading a file name from the system, and then
> passing it back to the system, then you can treat it as just bytes -- who
> cares? And if you add the byte value of 47 thing, then you can even do
> basic pa
Oleg Broytman writes:
> This is the core of the problem. Python2 favors the Unix model but
> Windows people pay the price. Python3 reverses that
This is certainly not true. What is true is that Python 3 makes no
attempt to make it easy to write crappy software in the old Unix
style, that break
Isaac Morland :
>> HTTP/1.1 200 OK
>> Content-Type: text/html; charset=ISO-8859-1
>>
>>
>>
>>
>>
>
> For HTML it's not quite so bad. According to the HTML 4 standard:
> [...]
>
> The Content-Type header takes precedence over a <meta> element. I
> thought I read once that the reason was to all
On Sat, Aug 23, 2014 at 7:02 PM, Stephen J. Turnbull wrote:
> Chris Barker writes:
>
> > So I write bytes that are encoded one way into a text file that's encoded
> > another way, and expect to be able to read that later?
>
> No, not you. Crap software does that. Your MUD server. Oleg's
> fav
"Stephen J. Turnbull" :
> Just read as bytes and decode piecewise in one way or another. For
> Oleg's HTML case, there's a well-understood structure that can be used
> to determine retry points
HTML and XML are interesting examples since their encoding is initially
unknown:
Chris Barker writes:
> So I write bytes that are encoded one way into a text file that's encoded
> another way, and expect to be able to read that later?
No, not you. Crap software does that. Your MUD server. Oleg's
favorite web pages with ads, or more likely the ad servers.
> Not for me (
Chris Angelico writes:
> Not sure why 1251,
All of those codes have repertoires that are Cyrillic supersets,
presumably Russian-language content, based on Oleg's top domain.
> But it's important to note that this is a method of handling junk.
> It's not a design intention; this is for a situa
Chris Barker writes:
> > The third is to specify the UTF-8 with the surrogate escape error
> > handler. This allows non-UTF-8 codes to be loaded into
> > memory.
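For readers unfamiliar with the handler mentioned in the quoted text, a minimal sketch of how `surrogateescape` behaves in Python 3 follows. The sample bytes are made up for illustration: a Latin-1 encoded "é" that is not valid UTF-8.

```python
# Hypothetical sample: 0xE9 is "é" in Latin-1 but an incomplete
# sequence in UTF-8, so a strict decode would raise UnicodeDecodeError.
raw = b"caf\xe9.txt"

# Each undecodable byte 0xNN is mapped to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("utf-8", errors="surrogateescape")
```

The resulting string can be held in memory and written back unchanged, but it is not valid Unicode text, so a strict encode of it fails.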
Read as bytes and incrementally decode. If you hit an Exception,
retry from that point.
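One way to read the "decode, and retry from the point of failure" idea is sketched below. The function name `decode_piecewise` and the choice of Latin-1 as the fallback codec are assumptions made for this example only; real input would need a fallback chosen from what is known about the data.

```python
def decode_piecewise(data: bytes, primary: str = "utf-8",
                     fallback: str = "latin-1") -> str:
    """Decode with `primary`; on each failure, decode only the offending
    bytes with `fallback`, then resume with `primary` after them."""
    pieces = []
    pos = 0
    while pos < len(data):
        try:
            pieces.append(data[pos:].decode(primary))
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are offsets into the slice just tried.
            good_end = pos + exc.start
            bad_end = pos + exc.end
            pieces.append(data[pos:good_end].decode(primary))
            pieces.append(data[good_end:bad_end].decode(fallback))
            pos = bad_end  # retry from just past the bad bytes
    return "".join(pieces)


# Usage: valid UTF-8 followed by stray Latin-1 bytes.
mixed = "caf\u00e9".encode("utf-8") + b" \xe9t\xe9"
result = decode_piecewise(mixed)
```

This re-slices the input on every failure, which is fine for a sketch but would be slow on large inputs with many errors.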
> Just so I'm clear here -- if you write