date:20140109

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Stephen J. Turnbull

Chris Angelico writes: > I'm not saying that chardet is bad, but I *am* saying, and I stand > by this, that an auto-detect option on file open is a bad idea. I have used it by default in Emacs and XEmacs since 1990, and I certainly haven't experienced it as a bad idea at *any* time in more than

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Stephen J. Turnbull

INADA Naoki writes: > latin1 is OK but is it Pythonic? Yes. EIBTI, including being explicit that you're doing something that has semantics that you are ignoring but may come back to bite you or somebody who naively uses your module. There's nothing un-Pythonic about using potentially dangerous

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Ben Finney

Steven D'Aprano writes: > I think that heuristics to guess the encoding have their role to play, > if the caller understands the risks. I think, for a language whose developers espouse a principle “In the face of ambiguity, refuse the temptation to guess”, heuristics have no role to play in the

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Angelico

On Fri, Jan 10, 2014 at 1:39 PM, Steven D'Aprano wrote: > On Fri, Jan 10, 2014 at 12:22:02PM +1100, Chris Angelico wrote: >> On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik >> wrote: >> > 2. introduce autodetect mode to open functions >> > 1. read and transform on the fly, maintaining

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Lennart Regebro

On Fri, Jan 10, 2014 at 2:03 AM, Joao S. O. Bueno wrote: > On 9 January 2014 04:50, Lennart Regebro wrote: >> To be honest, you can define text as "A stream of bytes that are split >> up in lines separated by a linefeed", and do some basic text >> processing like that. Just very *basic*, but stil

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Lennart Regebro

On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson wrote: > Do I speak Chinese to my grocer because China is a growing force in the > world? Or start every discussion with my children with a negotiation on what > language to use? No, because your environment has a default language. And P

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Steven D'Aprano

On Fri, Jan 10, 2014 at 12:22:02PM +1100, Chris Angelico wrote: > On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik > wrote: > > 2. introduce autodetect mode to open functions > > 1. read and transform on the fly, maintaining a buffer that > > stores original bytes > > and their

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Terry Reedy

On 1/9/2014 6:25 PM, Chris Barker wrote: as so -- I want to replace a bit of ascii text surrounded by arbitrary binary: (apologies for the py2...) In [24]: b Out[24]: '\x01\x00\xd1\x80\xd1a name\xd0\x80' In [25]: u = b.decode('latin-1') In [26]: u2 = u.replace('a name', 'a different name') In [
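
The quoted session is Python 2. A minimal Python 3 sketch of the same idea follows, assuming the goal is only to swap one ASCII substring inside otherwise opaque bytes. The variable names are illustrative and not taken from the thread.

```python
# The byte string from the quoted example: ASCII text inside arbitrary binary.
data = b'\x01\x00\xd1\x80\xd1a name\xd0\x80'

# latin-1 maps each byte 0x00..0xff to the code point with the same value,
# so decoding never fails and encoding restores the original bytes.
text = data.decode('latin-1')
edited = text.replace('a name', 'a different name')
result = edited.encode('latin-1')

print(result)
```

The non-ASCII bytes come back unchanged only because nothing in the edit touched them; the str in between is a carrier for bytes, not meaningful text.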

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Steven D'Aprano

On Thu, Jan 09, 2014 at 02:08:57PM -0800, Ethan Furman wrote: > If latin1 is used to convert binary to text, how convoluted is it to then > take chunks of that text and convert to int, or some other variety of > unicode? > > For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80' > > If that were dec

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Nick Coghlan

On 10 Jan 2014 03:32, "Antoine Pitrou" wrote: > > On Fri, 10 Jan 2014 05:26:04 +1000 > Nick Coghlan wrote: > > > > We should probably include format_map for consistency with the str API. > > Yes, you're right. > > > >However, I > > > also added bytearray into the mix, as bytearray objects should

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Angelico

On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik wrote: > 2. introduce autodetect mode to open functions > 1. read and transform on the fly, maintaining a buffer that > stores original bytes > and their mapping to letters. The mapping is updated as bytes > frequency >

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Joao S. O. Bueno

On 9 January 2014 04:50, Lennart Regebro wrote: > To be honest, you can define text as "A stream of bytes that are split > up in lines separated by a linefeed", and do some basic text > processing like that. Just very *basic*, but still. Replacing > characters. Extracting certain lines etc. That

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread anatoly techtonik

On Thu, Jan 9, 2014 at 10:00 AM, Mark Lawrence wrote: > On 09/01/2014 06:50, Lennart Regebro wrote: >> >> On Thu, Jan 9, 2014 at 1:07 AM, Ben Finney >> wrote: >>> >>> Kristján Valur Jónsson writes: >>> Believe it or not, sometimes you really don't care about encodings. Sometimes you ju

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread INADA Naoki

latin1 is OK but is it Pythonic? I've posted a suggestion about adding 'bytes' as an alias for 'latin1'. http://comments.gmane.org/gmane.comp.python.ideas/10315 I want one Pythonic way to handle "binary containing ascii (or latin1 or utf-8 or other ascii compatible)". On Fri, Jan 10, 2014 at 8:53 AM

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Barker

On Thu, Jan 9, 2014 at 3:14 PM, Ethan Furman wrote: > Sorry, I was too short with my example. My use case is binary files, with > ASCII metadata and binary metadata, as well as ASCII-encoded numeric > values, binary-coded numeric values, ASCII-encoded boolean values, and > who-knows-what-(before

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Ethan Furman

On 01/09/2014 02:54 PM, Paul Moore wrote: On 9 January 2014 22:08, Ethan Furman wrote: For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80' If that were decoded using latin1 how would I then get the first two bytes to the integer 256 and the last six bytes to their Cyrillic meaning? (Apologies for

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Barker

On Thu, Jan 9, 2014 at 2:54 PM, Paul Moore > For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80' > > > > If that were decoded using latin1 how would I then get the first two > bytes > > to the integer 256 and the last six bytes to their Cyrillic meaning? > > (Apologies for not testing myself, short

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Ethan Furman

On 01/09/2014 02:54 PM, Paul Moore wrote: On 9 January 2014 22:08, Ethan Furman wrote: For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80' If that were decoded using latin1 how would I then get the first two bytes to the integer 256 and the last six bytes to their Cyrillic meaning? (Apologies for

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Paul Moore

On 9 January 2014 22:08, Ethan Furman wrote: > For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80' > > If that were decoded using latin1 how would I then get the first two bytes > to the integer 256 and the last six bytes to their Cyrillic meaning? > (Apologies for not testing myself, short on time.)
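
One way to answer the quoted question is sketched below. It assumes the `\83` in the example was meant to be `\x83`, that the integer is big-endian, and that the last six bytes are UTF-8; none of this is confirmed by the truncated snippet.

```python
# Assumption: the quoted b'...\83...' was intended as \x83.
data = b'\x01\x00\xd1\x80\xd1\x83\xd0\x80'

# Working on the bytes directly: slice first, then interpret each piece.
number = int.from_bytes(data[:2], 'big')   # two bytes as a big-endian integer
word = data[2:].decode('utf-8')            # three two-byte UTF-8 sequences

# Going through latin-1 first: each slice of the str has to be encoded
# back to bytes before it can be given its real meaning.
as_text = data.decode('latin-1')
number_via_text = int.from_bytes(as_text[:2].encode('latin-1'), 'big')
word_via_text = as_text[2:].encode('latin-1').decode('utf-8')

print(number, word)
```

Both routes give the same values; the latin-1 detour only adds an encode step for every field that is not really Latin-1 text.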

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Ethan Furman

On 01/09/2014 02:00 PM, Chris Barker wrote: On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote: Chris Barker wrote: latin-1 guaranteed to work with any binary data, and round-trip accurately? Yes, it is. and will surrogateescape work for arbitrary binary data? Yes, it will. Then ma

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Paul Moore

On 9 January 2014 22:00, Chris Barker wrote: > On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote: >> >> > latin-1 guaranteed to work with any binary data, and round-trip >> > accurately? >> >> Yes, it is. >> >> > and will surrogateescape work for arbitrary binary data? >> >> Yes, it will. > >

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Brett Cannon

On Thu, Jan 9, 2014 at 5:00 PM, Chris Barker wrote: > On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote: > >> > latin-1 guaranteed to work with any binary data, and round-trip >> accurately? >> >> Yes, it is. >> >> > and will surrogateescape work for arbitrary binary data? >> >> Yes, it will.

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Barker

On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote: > > latin-1 guaranteed to work with any binary data, and round-trip > accurately? > > Yes, it is. > > > and will surrogateescape work for arbitrary binary data? > > Yes, it will. > Then maybe this is really a documentation issue, after all.

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Antoine Pitrou

On Thu, 9 Jan 2014 13:36:05 -0800 Chris Barker wrote: > > Some folks have suggested using latin-1 (or other 8-bit encoding) -- is > that guaranteed to work with any binary data, and round-trip accurately? Yes, it is. > and will surrogateescape work for arbitrary binary data? Yes, it will. Reg
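
Both answers can be checked with a short sketch. The choice of `ascii` as the codec paired with `surrogateescape` is an assumption made for illustration; the message does not name one.

```python
# Every possible byte value, used as a stand-in for arbitrary binary data.
every_byte = bytes(range(256))

# latin-1: each byte becomes the code point with the same number.
as_latin1 = every_byte.decode('latin-1')
latin1_roundtrip = as_latin1.encode('latin-1')

# surrogateescape: bytes the codec cannot decode become lone surrogates
# (U+DC80..U+DCFF) and are turned back into the same bytes on encode.
escaped = every_byte.decode('ascii', errors='surrogateescape')
escape_roundtrip = escaped.encode('ascii', errors='surrogateescape')

print(latin1_roundtrip == every_byte, escape_roundtrip == every_byte)
```

The difference shows up when the str escapes into other code: the latin-1 result encodes to UTF-8 without complaint (as mojibake), while the surrogate-escaped one raises unless the same error handler is used.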

Re: [Python-Dev] [Python-checkins] peps: PEP 460: add .format_map()

2014-01-09 Thread Eric V. Smith

I'm not sure how format_map helps in porting from 2 to 3, since it doesn't exist in any version of 2. Although that said, it's no doubt a useful feature, just not useful in code that supports both 2 and 3 with a single code base or when porting to 3. Eric. On 1/9/2014 4:02 PM, antoine.pitrou wro

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Barker

This has all gotten a bit complicated because everyone has been thinking in terms of actual encodings and actual text files. But I think the use-case here is something different: A file with a bunch of bytes in it, _some_of which are ascii, and the rest are other bytes (maybe binary data, maybe no

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

Thanks Nick. This does seem to cover it all. Perhaps it is worth mentioning cp1252 as the windows version of latin1, which _does_not_ cover all code points and hence requires surrogateescapes for best effort solution. K From: Nick Coghlan [[email protected]
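
The cp1252 gap can be observed directly. This is a small sketch using only the standard codecs; the variable names are illustrative.

```python
# Collect the byte values that Python's cp1252 codec refuses to decode.
undefined = []
for value in range(256):
    try:
        bytes([value]).decode('cp1252')
    except UnicodeDecodeError:
        undefined.append(value)

print([hex(v) for v in undefined])

# With surrogateescape the undefined bytes survive a round trip anyway.
every_byte = bytes(range(256))
text = every_byte.decode('cp1252', errors='surrogateescape')
restored = text.encode('cp1252', errors='surrogateescape')
```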

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Antoine Pitrou

On Fri, 10 Jan 2014 05:26:04 +1000 Nick Coghlan wrote: > > We should probably include format_map for consistency with the str API. Yes, you're right. > >However, I > > also added bytearray into the mix, as bytearray objects should > > generally support the same operations as bytes (and they can

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Nick Coghlan

On 9 Jan 2014 06:43, "Antoine Pitrou" wrote: > > > Hi, > > With Victor's consent, I overhauled PEP 460 and made the feature set > more restricted and consistent with the bytes/str separation. +1 I was initially dubious about the idea, but the proposed semantics look good to me. We should probab
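
For orientation, percent-formatting on bytes and bytearray did land in Python 3.5, through the later PEP 461 rather than this draft. The sketch below shows that shipped behaviour, not the semantics proposed in this thread; the header strings are made-up examples.

```python
# Requires Python 3.5 or later.
header = b'Content-Length: %d\r\n' % 42
request = b'%s %s HTTP/1.1\r\n' % (b'GET', b'/index.html')

# bytearray supports the same operator and returns a bytearray.
pair = bytearray(b'%b=%b') % (b'key', b'value')

print(header, request, pair)
```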

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Nick Coghlan

On 9 Jan 2014 22:25, "Kristján Valur Jónsson" wrote: > > > > > -Original Message- > > From: Victor Stinner [mailto:[email protected]] > > Sent: 9. janúar 2014 13:51 > > To: Kristján Valur Jónsson > > Cc: Antoine Pitrou; [email protected] > > Subject: Re: [Python-Dev] Python3 "co

Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

2014-01-09 Thread Nick Coghlan

On 9 Jan 2014 22:08, "Antoine Pitrou" wrote: > > On Thu, 9 Jan 2014 09:03:40 -0500 > Daniel Holth wrote: > > They emphatically do not want the Python 2 > > model especially not implicit coercion. They only want additional > > tools for text or string processing in Python 3. > > That's a good poin

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Barry Warsaw

On Jan 08, 2014, at 01:51 PM, Stephen J. Turnbull wrote: >Benjamin Peterson writes: > > > I agree. This is a very important, much-requested feature for low-level > > networking code. > >I hear it's much-requested, but is there any description of typical >use cases? The two unported libraries that

[Python-Dev] A test case for what's missing in Python 3 (Re: RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5)

2014-01-09 Thread Barry Warsaw

(Resending with an adjusted Subject and not through Gmane. Apologies for duplicates.) On Jan 08, 2014, at 01:51 PM, Stephen J. Turnbull wrote: >Benjamin Peterson writes: > > > I agree. This is a very important, much-requested feature for low-level > > networking code. > >I hear it's much-request

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Stephen J. Turnbull

Steven D'Aprano writes: > If it were, we wouldn't need text strings :-) Speak for yourself, Kemosabe. Red man need Unicode, full meal not just a few bytes. ___ Python-Dev mailing list [email protected] https://mail.python.org/mailman/listinfo/pyt

Re: [Python-Dev] Changing Clinic's output

2014-01-09 Thread Ethan Furman

On 01/09/2014 03:39 AM, Serhiy Storchaka wrote: 07.01.14 22:51, Ethan Furman wrote: AFAIK you don't write much C code. So perhaps C sources maintainability is not too valuable for you. I don't write much C code yet, no, but C source maintainability is even more important to me because o

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Victor Stinner [mailto:[email protected]] > Sent: 9. janúar 2014 13:51 > To: Kristján Valur Jónsson > Cc: Antoine Pitrou; [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > 2014/1/9 Kristján Valur Jónsson : > > This definition i

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Steven D'Aprano

On Thu, Jan 09, 2014 at 01:00:59PM +, Kristján Valur Jónsson wrote: > Which reminds me, can Python3 read text files with BOM automatically yet? I'm not sure what you mean by that. If you mean, can Python3 distinguish between UTF-16BE and UTF-16LE on the basis of a BOM, then it's been able t
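
The BOM handling mentioned here can be sketched with the standard codecs. The `utf-8-sig` part is an addition for completeness and may not be what the truncated reply went on to say.

```python
import codecs

payload = 'hi'

# The generic 'utf-16' decoder reads the BOM to pick the byte order,
# then drops it from the result.
little = codecs.BOM_UTF16_LE + payload.encode('utf-16-le')
big = codecs.BOM_UTF16_BE + payload.encode('utf-16-be')
from_little = little.decode('utf-16')
from_big = big.decode('utf-16')

# For UTF-8, 'utf-8-sig' strips a leading BOM; plain 'utf-8' keeps it
# as the character U+FEFF.
utf8 = codecs.BOM_UTF8 + payload.encode('utf-8')
stripped = utf8.decode('utf-8-sig')
kept = utf8.decode('utf-8')

print(from_little, from_big, stripped, repr(kept))
```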

Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

2014-01-09 Thread Antoine Pitrou

On Thu, 9 Jan 2014 09:03:40 -0500 Daniel Holth wrote: > They emphatically do not want the Python 2 > model especially not implicit coercion. They only want additional > tools for text or string processing in Python 3. That's a good point. Now it's up to people who need those additional tools to p

Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

2014-01-09 Thread Daniel Holth

So the customer you're looking for is the person who cares a lot about encodings, knows how to do Unicode correctly, and has noticed that certain valid cases not limited to imperialist simpletons (dealing with specific common things invented before 1996, dealing with mixed encodings, doing what Nic

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Python-Dev [mailto:python-dev- > [email protected]] On Behalf Of Kristján Valur > Jónsson > Sent: 9. janúar 2014 13:37 > To: Antoine Pitrou; [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > This definition is f

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Victor Stinner

2014/1/9 Kristján Valur Jónsson : > This definition is funny, because according to Wikipedia, it is a "superset" > of 8859-1 (latin1) Bytes 0x80..0x9f are unassigned in ISO/IEC 8859-1... but are assigned in (IANA's) ISO-8859-1. Python implements the latter, ISO-8859-1. Wikipedia says "This enc

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Python-Dev [mailto:python-dev- > [email protected]] On Behalf Of Antoine Pitrou > Sent: 9. janúar 2014 13:18 > To: [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > On Thu, 9 Jan 2014 12:55:35 + > Kristján V

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Paul Moore

On 9 January 2014 13:00, Kristján Valur Jónsson wrote: >> You don't say what problems, but I assume encoding/decoding errors. So the >> files apparently weren't in the system encoding. OK, at that point I'd >> probably say to heck with it and use latin-1. Assuming I was sure that (a) >> I'd >> ne

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Antoine Pitrou

On Thu, 9 Jan 2014 12:55:35 + Kristján Valur Jónsson wrote: > > If you don't "care" about the encoding, why don't you use latin1? > > Things will roundtrip fine and work as well as under Python 2. > > Because latin1 does not define all code points, giving you errors there. >>> b = bytes(rang

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Python-Dev [mailto:python-dev- > [email protected]] On Behalf Of Antoine Pitrou > Sent: 9. janúar 2014 12:42 > To: [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > On Thu, 9 Jan 2014 10:15:08 + > Kristján V

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Martin v. Löwis

> Right. But even latin-1, or better, cp1252 (on windows) does not solve it > because these have undefined > code points. That's not true. latin-1 does not have undefined code points. Regards, Martin

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Paul Moore [mailto:[email protected]] > Sent: 9. janúar 2014 10:53 > To: Kristján Valur Jónsson > Cc: Stefan Ring; [email protected] > > Moving to python 3, I found that this quickly caused problems. > > You don't say what problems, but I assume encodin

Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

2014-01-09 Thread Antoine Pitrou

On Thu, 9 Jan 2014 17:09:10 +1000 Nick Coghlan wrote: > > There's also the fact that POSIX folks are used to "r" and "rb" being > the same thing. Which fails immediately under Windows :-) Regards Antoine.

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Antoine Pitrou

On Thu, 09 Jan 2014 03:54:13 + MRAB wrote: > I'm thinking that the "i" format could be used for signed integers and > the "u" for unsigned integers. The width would be the number of bytes. > You would also need to have a way of specifying the endianness. > > For example: > > >>> b'{:<2i}'.f
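
The `i`/`u` format codes above are a proposal and are not part of any released Python. As a point of comparison, the sketch below shows two existing ways to get fixed-width integers with an explicit endianness; the value 256 is an arbitrary example.

```python
import struct

value = 256

# int.to_bytes: width in bytes, byte order, and signedness are explicit.
little = value.to_bytes(2, 'little', signed=True)
big = value.to_bytes(2, 'big', signed=False)

# struct.pack: '<' is little-endian, '>' is big-endian,
# 'h' is a signed 16-bit integer, 'H' an unsigned one.
packed_little = struct.pack('<h', value)
packed_big = struct.pack('>H', value)

print(little, big)
```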

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Antoine Pitrou

On Thu, 9 Jan 2014 10:15:08 + Kristján Valur Jónsson wrote: > > Moving to python 3, I found that this quickly caused problems. So, I > explicitly added an encoding. Better guess an encoding, something that is > likely, e.g. cp1252 > with open(fn1, encoding='cp1252') as f1: > with open

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Steven D'Aprano

On Thu, Jan 09, 2014 at 05:11:06PM +1000, Nick Coghlan wrote: > On 9 January 2014 10:07, Ben Finney wrote: > > So, if what you want is to parse text and not get gibberish, you need to > > *tell* Python what the encoding is. That's a brute fact of the world of > > text in computing. > > Set the m

Re: [Python-Dev] Changing Clinic's output

2014-01-09 Thread Serhiy Storchaka

07.01.14 22:51, Ethan Furman wrote: On 01/07/2014 12:39 PM, Serhiy Storchaka wrote: * It clutters up hg log and hg blame results. Every time when you change clinic.py to generate different output, it touches multiple lines in all files which use Argument Clinic and clutters up their histor

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Paul Moore

On 9 January 2014 10:15, Kristján Valur Jónsson wrote: > Also, the problem I'm describing has to do with real world stuff. > This is the python 2 program: > with open(fn1) as f1: > with open(fn2, 'w') as f2: > f2.write(process_text(f1.read())) > > Moving to python 3, I found that this q
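
A Python 3 rendering of the quoted program might look like the sketch below, assuming the latin-1 approach suggested elsewhere in the thread. `process_text` is a hypothetical stand-in, `convert` is an illustrative wrapper, and the temporary files exist only to make the example self-contained.

```python
import os
import tempfile


def process_text(text):
    # Hypothetical stand-in for the real processing; it touches ASCII only.
    return text.replace('foo', 'bar')


def convert(fn1, fn2, encoding='latin-1'):
    # newline='' disables newline translation so the bytes pass through as-is.
    with open(fn1, encoding=encoding, newline='') as f1:
        with open(fn2, 'w', encoding=encoding, newline='') as f2:
            f2.write(process_text(f1.read()))


with tempfile.TemporaryDirectory() as tmp:
    fn1 = os.path.join(tmp, 'in.txt')
    fn2 = os.path.join(tmp, 'out.txt')
    with open(fn1, 'wb') as f:
        f.write(b'caf\xe9 foo\n\xff\x00')
    convert(fn1, fn2)
    with open(fn2, 'rb') as f:
        output = f.read()

print(output)
```

This stays safe only while `process_text` produces characters in the range U+0000..U+00FF; something like `str.upper()` can produce characters outside latin-1 and make the write fail.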

Re: [Python-Dev] [RELEASED] Python 3.4.0b2

2014-01-09 Thread Martin v. Löwis

On 06.01.14 17:26, Michael Urman wrote: > Here's some more guesswork. Does it seem possible that msiexec is > trying to verify the revocation status of the certificate used to sign > the python .msi file? Per > http://blogs.technet.com/b/pki/archive/2006/11/30/basic-crl-checking-with-certutil.asp

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Nick Coghlan

On 9 Jan 2014 11:29, "INADA Naoki" wrote: > > >> And I think everyone was well intentioned - and python3 covers most of the >> bases, but working with binary data is not only a "wire-protocol programmer's" >> problem. If you're working with binary data, use the binary API offered by bytes, bytear

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Python-Dev [mailto:python-dev- > [email protected]] On Behalf Of Stefan Ring > Sent: 9. janúar 2014 09:32 > To: [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > > just became harder to use for that purpose. > >

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Stephen J. Turnbull

Paul Moore writes: > So I think that if this discussion is to be of any real benefit, a > specific example is needed. I honestly don't think I've ever > encountered a case where "Sometimes [I] just want to parse text > files" and code that uses the default encoding (i.e., looks pretty > much

Re: [Python-Dev] [RELEASED] Python 3.4.0b2

2014-01-09 Thread Martin v. Löwis

On 08.01.14 16:03, Nick Coghlan wrote: > On 9 January 2014 00:43, Bob Hanson wrote: >> When I read this comment of yours, Guido, I immediately started >> wondering about this. You may well be right -- indeed, I have a >> very old install (c.2007) which has not been updated (other than >> one or

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Stefan Ring

> just became harder to use for that purpose. The entire discussion reminds me very much of the situation with file names in OS X. Whenever I want to look at an old zip file or tarball which happens to have been lying around on my hard drive for a decade or more, I can't because OS X insists that f

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Paul Moore

On 9 January 2014 09:01, Mark Shannon wrote: > On 09/01/14 00:07, Ben Finney wrote: >> >> Kristján Valur Jónsson writes: >> >>> Believe it or not, sometimes you really don't care about encodings. >>> Sometimes you just want to parse text files. >> >> >> Files don't contain text, they contain byte

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Kristján Valur Jónsson

> -Original Message- > From: Python-Dev [mailto:python-dev- > [email protected]] On Behalf Of Ben Finney > Sent: 9. janúar 2014 00:50 > To: [email protected] > Subject: Re: [Python-Dev] Python3 "complexity" > > Kristján Valur Jónsson writes: > > > I didn't us

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Mark Shannon

On 09/01/14 00:07, Ben Finney wrote: Kristján Valur Jónsson writes: Believe it or not, sometimes you really don't care about encodings. Sometimes you just want to parse text files. Files don't contain text, they contain bytes. Bytes only become text when filtered through the correct encoding

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Lennart Regebro

On Thu, Jan 9, 2014 at 8:16 AM, Ben Finney wrote: > Nick Coghlan writes: >> Set the mode to "rb", process it as binary. Done. > > Which entails abandoning the stated goal of “just want to parse text > files” :-) Only if your definition of "text files" means it's unicode.

Re: [Python-Dev] Python3 "complexity"

2014-01-09 Thread Chris Angelico

On Thu, Jan 9, 2014 at 5:50 PM, Lennart Regebro wrote: > To be honest, you can define text as "A stream of bytes that are split > up in lines separated by a linefeed", and do some basic text > processing like that. Just very *basic*, but still. Replacing > characters. Extracting certain lines etc.
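
The "very basic" processing described in the quote can be done on bytes without choosing an encoding. A small sketch with made-up sample data:

```python
# Sample data: ASCII keys mixed with bytes that are not valid UTF-8.
data = b'name=\xff\xfe\nkeep this line\nname=other\n'

# Split into lines on the linefeed byte.
lines = data.split(b'\n')

# Extract certain lines.
named = [line for line in lines if line.startswith(b'name=')]

# Replace a byte sequence everywhere.
renamed = data.replace(b'name=', b'NAME=')

print(named)
print(renamed)
```

Anything beyond this, such as case folding or counting characters, needs to know the encoding, which is where the rest of the thread picks up.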

