On 2014-01-10, 17:34 GMT, you wrote:
> From my experience, the concept of a default locale is deeply
> flawed. What if I log into a (Linux) machine using an old
> latin-1 putty from the Windows XP era, have most file names
> and contents in UTF-8 e
"Jim J. Jewett" writes:
>
> > Steven D'Aprano wrote:
> >> I think that heuristics to guess the encoding have their role to play,
> >> if the caller understands the risks.
>
> Ben Finney wrote:
> > In my opinion, content-type guessing heuristics certainly don't belong
> > in the standard library
On 01/10/2014 03:22 PM, Mark Lawrence wrote:
On 10/01/2014 22:06, Chris Barker wrote:
I'm not so sure -- it could be used (abused?) for that, but I'm
suggesting it be used for mixed ascii-binary data. I don't know that
there IS a "right" way to do that -- at least not an efficient or easy
to re
On Fri, Jan 10, 2014 at 3:22 PM, Mark Lawrence wrote:
> The correct way is to read the interface specification which tells you
> what should be in the data. Or do people not use interface specifications
> these days, preferring to guess what they've got instead?
>
No one is suggesting guessing (
On 10/01/2014 22:06, Chris Barker wrote:
On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore wrote:
> Using the 'latin-1' to mean unknown encoding can easily result
> in Mojibake (unreadable text) entering your application with
> dangerous effects on your othe
On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore wrote:
> > Using the 'latin-1' to mean unknown encoding can easily result
> > in Mojibake (unreadable text) entering your application with
> > dangerous effects on your other text data.
>
> Agreed. The latin-1 suggestion is purely for people who object
On 10.01.14 18:27, Baptiste Carvello wrote:
would it make sense to be more general, and allow a "lenient mode",
where all files implicitly opened with the default encoding would also
use the surrogateescape error handler ?
The surrogateescape error handler is compatible only with
ASCII-comp
> Steven D'Aprano wrote:
>> I think that heuristics to guess the encoding have their role to play,
>> if the caller understands the risks.
Ben Finney wrote:
> In my opinion, content-type guessing heuristics certainly don't belong
> in the standard library.
It would be great if there were never
INADA Naoki wrote:
latin1 is OK but is it Pythonic?
Latin is most certainly a Pythonic subject:
http://www.youtube.com/watch?v=IIAdHEwiAy8
--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
On Jan 10, 2014, at 7:35 AM, Nick Coghlan wrote:
> Putting this here because I found out today it's not in any of the
> PEPs and folks have to go digging in mailing list archives to find it.
> I'll add it to my Python 3 Q&A at some point.
>
> The reason Python 3 currently tries to rely on the PO
On 10.01.14 14:19, M.-A. Lemburg wrote:
BTW: Perhaps it would be a good idea to backport the
surrogateescape error handler to Python 2.7 to simplify
writing code which works in both Python 2 and 3.
You also should change the UTF-8 codec so that it will reject surrogates
(i.e. u'\ud880'.enco
On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan wrote:
> On 10 January 2014 13:32, Lennart Regebro wrote:
>> No, because your environment has a default language. And Python has a
>> default encoding. You only get problems when some file doesn't use the
>> default encoding.
>
> The reason Python 3
On 10/01/2014 16:35, Nick Coghlan wrote:
> One idea we're considering for Python 3.5 is to have a report of
> "ascii" on a POSIX OS imply the surrogateescape error handler (at
> least for the standard streams, and perhaps in other contexts), since
> the OS reporting the POSIX/C locale almost ce
Now I feel that encouraging the use of unicode for binary data, via the
latin-1 encoding or the surrogateescape error handler, is a bad thing.
Handling binary data in str type using latin-1 is just a hack.
Surrogateescape is just a workaround to keep undecodable bytes in text.
Encouraging binary data in str type wi
Nick Coghlan wrote:
> One idea we're considering for Python 3.5 is to have a report of
> "ascii" on a POSIX OS imply the surrogateescape error handler (at
> least for the standard streams, and perhaps in other contexts), since
> the OS reporting the POSIX/C locale almost certainly indicates a
> co
On 10 January 2014 13:32, Lennart Regebro wrote:
> On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson
> wrote:
>> Do I speak Chinese to my grocer because China is a growing force in the
>> world? Or start every discussion with my children with a negotiation on
>> what language to use?
>
>
On 2014-01-10, 12:19 GMT, you wrote:
> Using the 'latin-1' to mean unknown encoding can easily result
> in Mojibake (unreadable text) entering your application with
> dangerous effects on your other text data.
>
> E.g. "Marc-André" read using 'latin-1'
On 10 January 2014 12:19, M.-A. Lemburg wrote:
> Just a word of caution:
>
> Using the 'latin-1' to mean unknown encoding can easily result
> in Mojibake (unreadable text) entering your application with
> dangerous effects on your other text data.
Agreed. The latin-1 suggestion is purely for peop
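The Mojibake risk described above can be reproduced in a few lines. This is only a minimal sketch, assuming the name was stored as UTF-8 and then read back as 'latin-1'; the variable names are illustrative and not taken from the thread.

```python
# Assumption: the original text was written to disk as UTF-8.
name = "Marc-André"
utf8_bytes = name.encode("utf-8")

# Decoding with latin-1 never fails, which is exactly the problem:
# the two UTF-8 bytes of "é" silently become two separate characters.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# The damage is reversible only while the text is left untouched and the
# real encoding is known.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered)
```

Once the garbled string is mixed with correctly decoded text, there is no longer a reliable way to tell which parts need the reverse step.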
On 09.01.2014 22:45, Antoine Pitrou wrote:
> On Thu, 9 Jan 2014 13:36:05 -0800
> Chris Barker wrote:
>>
>> Some folks have suggested using latin-1 (or other 8-bit encoding) -- is
>> that guaranteed to work with any binary data, and round-trip accurately?
>
> Yes, it is.
Just a word of caution:
Chris Angelico writes:
> I'm not saying that chardet is bad, but I *am* saying, and I stand
> by this, that an auto-detect option on file open is a bad idea.
I have used it by default in Emacs and XEmacs since 1990, and I
certainly haven't experienced it as a bad idea at *any* time in more
than
INADA Naoki writes:
> latin1 is OK but is it Pythonic?
Yes. EIBTI, including being explicit that you're doing something that
has semantics that you are ignoring but may come back to bite you or
somebody who naively uses your module.
There's nothing un-Pythonic about using potentially dangerous
Steven D'Aprano writes:
> I think that heuristics to guess the encoding have their role to play,
> if the caller understands the risks.
I think, for a language whose developers espouse a principle “In the
face of ambiguity, refuse the temptation to guess”, heuristics have no
role to play in the
On Fri, Jan 10, 2014 at 1:39 PM, Steven D'Aprano wrote:
> On Fri, Jan 10, 2014 at 12:22:02PM +1100, Chris Angelico wrote:
>> On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik
>> wrote:
>> > 2. introduce autodetect mode to open functions
>> > 1. read and transform on the fly, maintaining
On Fri, Jan 10, 2014 at 2:03 AM, Joao S. O. Bueno wrote:
> On 9 January 2014 04:50, Lennart Regebro wrote:
>> To be honest, you can define text as "A stream of bytes that are split
>> up in lines separated by a linefeed", and do some basic text
>> processing like that. Just very *basic*, but stil
On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson
wrote:
> Do I speak Chinese to my grocer because China is a growing force in the
> world? Or start every discussion with my children with a negotiation on what
> language to use?
No, because your environment has a default language. And P
On Fri, Jan 10, 2014 at 12:22:02PM +1100, Chris Angelico wrote:
> On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik
> wrote:
> > 2. introduce autodetect mode to open functions
> > 1. read and transform on the fly, maintaining a buffer that
> > stores original bytes
> > and their
On 1/9/2014 6:25 PM, Chris Barker wrote:
as so -- I want to replace a bit of ascii text surrounded by arbitrary
binary:
(apologies for the py2...)
In [24]: b
Out[24]: '\x01\x00\xd1\x80\xd1a name\xd0\x80'
In [25]: u = b.decode('latin-1')
In [26]: u2 = u.replace('a name', 'a different name')
In [
On Thu, Jan 09, 2014 at 02:08:57PM -0800, Ethan Furman wrote:
> If latin1 is used to convert binary to text, how convoluted is it to then
> take chunks of that text and convert to int, or some other variety of
> unicode?
>
> For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80'
>
> If that were dec
On Fri, Jan 10, 2014 at 11:53 AM, anatoly techtonik wrote:
> 2. introduce autodetect mode to open functions
> 1. read and transform on the fly, maintaining a buffer that
> stores original bytes
> and their mapping to letters. The mapping is updated as bytes
> frequency
>
On 9 January 2014 04:50, Lennart Regebro wrote:
> To be honest, you can define text as "A stream of bytes that are split
> up in lines separated by a linefeed", and do some basic text
> processing like that. Just very *basic*, but still. Replacing
> characters. Extracting certain lines etc.
That
On Thu, Jan 9, 2014 at 10:00 AM, Mark Lawrence wrote:
> On 09/01/2014 06:50, Lennart Regebro wrote:
>>
>> On Thu, Jan 9, 2014 at 1:07 AM, Ben Finney
>> wrote:
>>>
>>> Kristján Valur Jónsson writes:
>>>
Believe it or not, sometimes you really don't care about encodings.
Sometimes you ju
latin1 is OK but is it Pythonic?
I've posted a suggestion about adding 'bytes' as an alias for 'latin1'.
http://comments.gmane.org/gmane.comp.python.ideas/10315
I want one Pythonic way to handle "binary containing ascii (or latin1 or
utf-8 or other ascii compatible)".
On Fri, Jan 10, 2014 at 8:53 AM
On Thu, Jan 9, 2014 at 3:14 PM, Ethan Furman wrote:
> Sorry, I was too short with my example. My use case is binary files, with
> ASCII metadata and binary metadata, as well as ASCII-encoded numeric
> values, binary-coded numeric values, ASCII-encoded boolean values, and
> who-knows-what-(before
On 01/09/2014 02:54 PM, Paul Moore wrote:
On 9 January 2014 22:08, Ethan Furman wrote:
For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80'
If that were decoded using latin1 how would I then get the first two bytes
to the integer 256 and the last six bytes to their Cyrillic meaning?
(Apologies for
On Thu, Jan 9, 2014 at 2:54 PM, Paul Moore
> For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80'
> >
> > If that were decoded using latin1 how would I then get the first two
> bytes
> > to the integer 256 and the last six bytes to their Cyrillic meaning?
> > (Apologies for not testing myself, short
On 01/09/2014 02:54 PM, Paul Moore wrote:
On 9 January 2014 22:08, Ethan Furman wrote:
For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80'
If that were decoded using latin1 how would I then get the first two bytes
to the integer 256 and the last six bytes to their Cyrillic meaning?
(Apologies for
On 9 January 2014 22:08, Ethan Furman wrote:
> For example: b'\x01\x00\xd1\x80\xd1\83\xd0\x80'
>
> If that were decoded using latin1 how would I then get the first two bytes
> to the integer 256 and the last six bytes to their Cyrillic meaning?
> (Apologies for not testing myself, short on time.)
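One possible answer to the question above, as a sketch only. It assumes the `\83` in the example was meant to be `\x83` (otherwise the last six bytes are not valid UTF-8), and it assumes the leading integer is big-endian; neither is stated in the message.

```python
# Assumption: "\83" in the quoted example is a typo for "\x83".
data = b'\x01\x00\xd1\x80\xd1\x83\xd0\x80'

# Route 1: go through a latin-1 str, then re-encode each slice back to
# bytes before interpreting it.
as_text = data.decode('latin-1')
number = int.from_bytes(as_text[:2].encode('latin-1'), 'big')  # assumed big-endian
cyrillic = as_text[2:].encode('latin-1').decode('utf-8')

# Route 2: slice the bytes object directly, with no latin-1 detour.
number_direct = int.from_bytes(data[:2], 'big')
cyrillic_direct = data[2:].decode('utf-8')

print(number, cyrillic)
```

Both routes give the same result; the latin-1 detour only adds an encode step before each field can be interpreted.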
On 01/09/2014 02:00 PM, Chris Barker wrote:
On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote:
Chris Barker wrote:
latin-1 guaranteed to work with any binary data, and round-trip accurately?
Yes, it is.
and will surrogateescape work for arbitrary binary data?
Yes, it will.
Then ma
On 9 January 2014 22:00, Chris Barker wrote:
> On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote:
>>
>> > latin-1 guaranteed to work with any binary data, and round-trip
>> > accurately?
>>
>> Yes, it is.
>>
>> > and will surrogateescape work for arbitrary binary data?
>>
>> Yes, it will.
>
>
On Thu, Jan 9, 2014 at 5:00 PM, Chris Barker wrote:
> On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote:
>
>> > latin-1 guaranteed to work with any binary data, and round-trip
>> accurately?
>>
>> Yes, it is.
>>
>> > and will surrogateescape work for arbitrary binary data?
>>
>> Yes, it will.
On Thu, Jan 9, 2014 at 1:45 PM, Antoine Pitrou wrote:
> > latin-1 guaranteed to work with any binary data, and round-trip
> accurately?
>
> Yes, it is.
>
> > and will surrogateescape work for arbitrary binary data?
>
> Yes, it will.
>
Then maybe this is really a documentation issue, after all.
On Thu, 9 Jan 2014 13:36:05 -0800
Chris Barker wrote:
>
> Some folks have suggested using latin-1 (or other 8-bit encoding) -- is
> that guaranteed to work with any binary data, and round-trip accurately?
Yes, it is.
> and will surrogateescape work for arbitrary binary data?
Yes, it will.
Reg
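Both answers can be checked directly. The sketch below round-trips all 256 byte values through each approach; the variable names are illustrative only.

```python
every_byte = bytes(range(256))

# latin-1 maps byte N to code point U+00NN, so every byte decodes and
# encodes back unchanged.
as_latin1 = every_byte.decode("latin-1")
latin1_ok = as_latin1.encode("latin-1") == every_byte

# surrogateescape keeps ASCII as-is and maps each undecodable byte 0xNN
# to the lone surrogate U+DCNN, which the same handler turns back into
# the original byte on encoding.
escaped = every_byte.decode("ascii", errors="surrogateescape")
escape_ok = escaped.encode("ascii", errors="surrogateescape") == every_byte

# The difference shows up when the text leaks elsewhere: escaped text
# refuses to be encoded strictly, latin-1 text does not.
try:
    escaped.encode("utf-8")
    leaked_silently = True
except UnicodeEncodeError:
    leaked_silently = False

print(latin1_ok, escape_ok, leaked_silently)
```

The strict-encoding failure is arguably the useful property of surrogateescape: undecodable bytes cannot silently end up in properly encoded output.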
This has all gotten a bit complicated because everyone has been thinking in
terms of actual encodings and actual text files. But I think the use-case
here is something different:
A file with a bunch of bytes in it, _some_ of which are ascii, and the rest
are other bytes (maybe binary data, maybe no
...@gmail.com]
Sent: Thursday, January 09, 2014 18:08
To: Kristján Valur Jónsson
Cc: Victor Stinner; Antoine Pitrou; python-dev@python.org
Subject: Re: [Python-Dev] Python3 "complexity"
http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
is currently linke
On 9 Jan 2014 22:25, "Kristján Valur Jónsson" wrote:
>
>
>
> > -Original Message-
> > From: Victor Stinner [mailto:victor.stin...@gmail.com]
> > Sent: 9. janúar 2014 13:51
> > To: Kristján Valur Jónsson
> > Cc: Antoine Pitrou; python-dev
On 9 Jan 2014 22:08, "Antoine Pitrou" wrote:
>
> On Thu, 9 Jan 2014 09:03:40 -0500
> Daniel Holth wrote:
> > They emphatically do not want the Python 2
> > model especially not implicit coercion. They only want additional
> > tools for text or string processing in Python 3.
>
> That's a good poin
Steven D'Aprano writes:
> If it were, we wouldn't need text strings :-)
Speak for yourself, Kemosabe. Red man need Unicode, full meal not
just a few bytes.
> -Original Message-
> From: Victor Stinner [mailto:victor.stin...@gmail.com]
> Sent: 9. janúar 2014 13:51
> To: Kristján Valur Jónsson
> Cc: Antoine Pitrou; python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> 2014/1/9 Kristján V
On Thu, Jan 09, 2014 at 01:00:59PM +, Kristján Valur Jónsson wrote:
> Which reminds me, can Python3 read text files with BOM automatically yet?
I'm not sure what you mean by that. If you mean, can Python3 distinguish
between UTF-16BE and UTF-16LE on the basis of a BOM, then it's been able
t
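For reference, a short sketch of BOM handling with the standard codecs. It assumes the question is about the 'utf-16' and 'utf-8-sig' codecs; the sample text is made up.

```python
import codecs

text = "naïve"

# The generic 'utf-16' codec reads the BOM to pick the byte order.
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
from_little = little.decode("utf-16")
from_big = big.decode("utf-16")

# For UTF-8, the BOM is only stripped when 'utf-8-sig' is requested;
# plain 'utf-8' keeps it as a leading U+FEFF character.
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
stripped = with_bom.decode("utf-8-sig")
kept = with_bom.decode("utf-8")

print(from_little, from_big, stripped, repr(kept))
```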
On Thu, 9 Jan 2014 09:03:40 -0500
Daniel Holth wrote:
> They emphatically do not want the Python 2
> model especially not implicit coercion. They only want additional
> tools for text or string processing in Python 3.
That's a good point. Now it's up to people who need those additional
tools to p
So the customer you're looking for is the person who cares a lot about
encodings, knows how to do Unicode correctly, and has noticed that
certain valid cases not limited to imperialist simpletons (dealing
with specific common things invented before 1996, dealing with mixed
encodings, doing what Nic
> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Kristján Valur
> Jónsson
> Sent: 9. janúar 2014 13:37
> To: Antoine Pitrou; python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexit
2014/1/9 Kristján Valur Jónsson :
> This definition is funny, because according to Wikipedia, it is a "superset"
> of 8859-1 (latin1)
Bytes 0x80..0x9f are unassigned in ISO/IEC 8859-1... but are assigned
in (IANA's) ISO-8859-1.
Python implements the latter, ISO-8859-1.
Wikipedia says "This enc
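The practical difference between the two codecs can be seen with a short check. This is a sketch, not part of the original message; it relies on Python's 'cp1252' codec leaving byte 0x81 undefined.

```python
# latin-1 (ISO-8859-1 as Python implements it) accepts all 256 bytes,
# including the C1 control range 0x80..0x9f.
all_bytes = bytes(range(256))
latin1_text = all_bytes.decode("latin-1")

# cp1252 reuses most of 0x80..0x9f for printable characters ...
euro = b"\x80".decode("cp1252")

# ... but leaves a few bytes (0x81 among them) undefined, so decoding
# arbitrary data with it can fail.
try:
    b"\x81".decode("cp1252")
    cp1252_accepts_0x81 = True
except UnicodeDecodeError:
    cp1252_accepts_0x81 = False

print(len(latin1_text), euro, cp1252_accepts_0x81)
```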
> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
> Sent: 9. janúar 2014 13:18
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> On Thu, 9
On 9 January 2014 13:00, Kristján Valur Jónsson wrote:
>> You don't say what problems, but I assume encoding/decoding errors. So the
>> files apparently weren't in the system encoding. OK, at that point I'd
>> probably say to heck with it and use latin-1. Assuming I was sure that (a)
>> I'd
>> ne
On Thu, 9 Jan 2014 12:55:35 +
Kristján Valur Jónsson wrote:
> > If you don't "care" about the encoding, why don't you use latin1?
> > Things will roundtrip fine and work as well as under Python 2.
>
> Because latin1 does not define all code points, giving you errors there.
>>> b = bytes(rang
> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Antoine Pitrou
> Sent: 9. janúar 2014 12:42
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> On Thu, 9
> Right. But even latin-1, or better, cp1252 (on windows) does not solve it
> because these have undefined
> code points.
That's not true. latin-1 does not have undefined code points.
Regards,
Martin
> -Original Message-
> From: Paul Moore [mailto:p.f.mo...@gmail.com]
> Sent: 9. janúar 2014 10:53
> To: Kristján Valur Jónsson
> Cc: Stefan Ring; python-dev@python.org
> > Moving to python 3, I found that this quickly caused problems.
>
> You don't say what problems, but I assume encodin
On Thu, 9 Jan 2014 17:09:10 +1000
Nick Coghlan wrote:
>
> There's also the fact that POSIX folks are used to "r" and "rb" being
> the same thing.
Which fails immediately under Windows :-)
Regards
Antoine.
On Thu, 9 Jan 2014 10:15:08 +
Kristján Valur Jónsson wrote:
>
> Moving to python 3, I found that this quickly caused problems. So, I
> explicitly added an encoding. Better guess an encoding, something that is
> likely, e.g. cp1252
> with open(fn1, encoding='cp1252') as f1:
> with open
On Thu, Jan 09, 2014 at 05:11:06PM +1000, Nick Coghlan wrote:
> On 9 January 2014 10:07, Ben Finney wrote:
> > So, if what you want is to parse text and not get gibberish, you need to
> > *tell* Python what the encoding is. That's a brute fact of the world of
> > text in computing.
>
> Set the m
On 9 January 2014 10:15, Kristján Valur Jónsson wrote:
> Also, the problem I'm describing has to do with real world stuff.
> This is the python 2 program:
> with open(fn1) as f1:
>     with open(fn2, 'w') as f2:
>         f2.write(process_text(f1.read()))
>
> Moving to python 3, I found that this q
> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Stefan Ring
> Sent: 9. janúar 2014 09:32
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> > just
Paul Moore writes:
> So I think that if this discussion is to be of any real benefit, a
> specific example is needed. I honestly don't think I've ever
> encountered a case where "Sometimes [I] just want to parse text
> files" and code that uses the default encoding (i.e., looks pretty
> much
> just became harder to use for that purpose.
The entire discussion reminds me very much of the situation with file
names in OS X. Whenever I want to look at an old zip file or tarball
which happens to have been lying around on my hard drive for a decade
or more, I can't because OS X insist that f
On 9 January 2014 09:01, Mark Shannon wrote:
> On 09/01/14 00:07, Ben Finney wrote:
>>
>> Kristján Valur Jónsson writes:
>>
>>> Believe it or not, sometimes you really don't care about encodings.
>>> Sometimes you just want to parse text files.
>>
>>
>> Files don't contain text, they contain byte
> -Original Message-
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames@python.org] On Behalf Of Ben Finney
> Sent: 9. janúar 2014 00:50
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> Kristján Valu
On 09/01/14 00:07, Ben Finney wrote:
Kristján Valur Jónsson writes:
Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files.
Files don't contain text, they contain bytes. Bytes only become text
when filtered through the correct encoding
On Thu, Jan 9, 2014 at 8:16 AM, Ben Finney wrote:
> Nick Coghlan writes:
>> Set the mode to "rb", process it as binary. Done.
>
> Which entails abandoning the stated goal of “just want to parse text
> files” :-)
Only if your definition of "text files" means it's unicode.
On Thu, Jan 9, 2014 at 5:50 PM, Lennart Regebro wrote:
> To be honest, you can define text as "A stream of bytes that are split
> up in lines separated by a linefeed", and do some basic text
> processing like that. Just very *basic*, but still. Replacing
> characters. Extracting certain lines etc.
On 9 January 2014 15:22, Greg Ewing wrote:
> Kristján Valur Jónsson wrote:
>>
>> all you want is to open that .txt
>> file on the drive and extract some phone numbers and merge in some email
>> addresses. What encoding does the file have? Do I care? Must I care?
>
>
> To some extent, yes. If the e
Nick Coghlan writes:
> On 9 January 2014 10:07, Ben Finney wrote:
> > Kristján Valur Jónsson writes:
> >
> >> Believe it or not, sometimes you really don't care about encodings.
> >> Sometimes you just want to parse text files.
> >
> > Files don't contain text, they contain bytes. Bytes only be
On 9 January 2014 10:07, Ben Finney wrote:
> Kristján Valur Jónsson writes:
>
>> Believe it or not, sometimes you really don't care about encodings.
>> Sometimes you just want to parse text files.
>
> Files don't contain text, they contain bytes. Bytes only become text
> when filtered through the
-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on
> behalf of Ben Finney [ben+pyt...@benfinney.id.au]
> Sent: Thursday, January 09, 2014 00:07
> To: python-dev@python.org
> Subject: Re: [Python-Dev] Python3 "complexity"
>
> Kristján Valur Jónsson writes:
>
>>
On 09/01/2014 06:50, Lennart Regebro wrote:
On Thu, Jan 9, 2014 at 1:07 AM, Ben Finney wrote:
Kristján Valur Jónsson writes:
Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files.
Files don't contain text, they contain bytes. Bytes
On Thu, Jan 9, 2014 at 1:07 AM, Ben Finney wrote:
> Kristján Valur Jónsson writes:
>
>> Believe it or not, sometimes you really don't care about encodings.
>> Sometimes you just want to parse text files.
>
> Files don't contain text, they contain bytes. Bytes only become text
> when filtered thro
Ben Finney writes:
> That's a much better analogy. The customer may not care, but the
> question is essential and must be answered; if the supplier guesses what
> the customer wants, they are doing the customer a disservice.
It is a much better analogy for me on my desktop, and for programmers
Kristján Valur Jónsson writes:
> Still playing the devil's advocate:
> I didn't used to must. Why must I must now? Did the universe just
> shift when I fired up python3?
No. Go look at the Economist's tag cloud and notice how big "China"
and "India" are most days. The universe has been shi
Kristján Valur Jónsson wrote:
all you want is to open that .txt
file on the drive and extract some phone numbers and merge in some email
addresses. What encoding does the file have? Do I care? Must I care?
To some extent, yes. If the encoding happens to be an
ascii-compatible one, such as latin
On Wed, Jan 8, 2014 at 2:04 PM, Kristján Valur Jónsson
wrote:
>
> Believe it or not, sometimes you really don't care about encodings.
> Sometimes you just want to parse text files. Python 3 forces you to think
> about abstract concepts like encodings when all you want is to open that .txt
> fil
On Thu, Jan 9, 2014 at 11:21 AM, MRAB wrote:
> On the other hand:
>
> "I need a new battery."
>
> "What kind of battery?"
>
> "I don't care!"
Or, bringing it back to Python: How do you write a set out to a file?
foo = {1, 2, 4, 8, 16, 32}
open("foo.txt","w").write(foo) # Uh... nope!
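To spell the analogy out: just as bytes need an encoding to become text, a set needs a serialization to become file content. The sketch below uses JSON as one arbitrary choice and an in-memory `io.StringIO` in place of a real file; both are assumptions for illustration.

```python
import io
import json

foo = {1, 2, 4, 8, 16, 32}
out = io.StringIO()  # stands in for open("foo.txt", "w")

# Writing the set directly fails: a text stream only accepts str.
try:
    out.write(foo)
    direct_write_worked = True
except TypeError:
    direct_write_worked = False

# The caller has to pick a representation. JSON has no set type, so a
# sorted list is used here.
serialized = json.dumps(sorted(foo))
out.write(serialized)

restored = set(json.loads(out.getvalue()))
print(direct_write_worked, serialized, restored == foo)
```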
On 1/8/2014 5:04 PM, Kristján Valur Jónsson wrote:
Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files. Python 3 forces you to
think about abstract concepts like encodings when all you want is to
open that .txt file on the drive and ex
On Thu, 09 Jan 2014 00:12:57 +, wrote:
> I think there might be a different analogy: Having to specify an
> encoding is like having strong typing. In Python 2.7, we _can_ forego
> that and just duck-type our strings :)
Python is a strongly typed language.
Saying that python2 let you duck t
On 09/01/2014 00:12, Kristján Valur Jónsson wrote:
Just to avoid confusion, let me state up front that I am very well aware of
encodings and all that, having internationalized one largish app in python 2.x.
I know the problems that 2.x had with tracking down the source of errors and
understan
Kristján Valur Jónsson writes:
> I didn't used to must. Why must I must now? Did the universe just
> shift when I fired up python3?
In a sense, yes. The world of software has been shifting for decades, as
a result of broader changes in how different segments of humanity have
changed their int
, 2014 23:40
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add
bytes...)
Why *do* you care? Isn't your system configured for utf-8, and all your
.txt files encoded with utf-8 by default? Or at least configured
with a single consist
MRAB writes:
> On 2014-01-09 00:07, Ben Finney wrote:
> > Kristján Valur Jónsson writes:
> >> Python 3 forces you to think about abstract concepts like encodings
> >> when all you want is to open that .txt file on the drive and
> >> extract some phone numbers and merge in some email addresses. W
s+kristjan=ccpgames@python.org] on
behalf of Ben Finney [ben+pyt...@benfinney.id.au]
Sent: Thursday, January 09, 2014 00:07
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 "complexity"
Kristján Valur Jónsson writes:
> Python 3 forces you to think about abstract concept
On 09/01/2014 00:21, MRAB wrote:
"I need a new battery."
"What kind of battery?"
"I don't care!"
A neat summary of the draft requirements specification for Python 2.8.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language
On Wed, 8 Jan 2014, Kristján Valur Jónsson wrote:
Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files. Python 3 forces you to
think about abstract concepts like encodings when all you want is to
open that .txt file on the drive and
On 2014-01-09 00:07, Ben Finney wrote:
Kristján Valur Jónsson writes:
Believe it or not, sometimes you really don't care about encodings.
Sometimes you just want to parse text files.
Files don't contain text, they contain bytes. Bytes only become text
when filtered through the correct encodi
Kristján Valur Jónsson writes:
> Believe it or not, sometimes you really don't care about encodings.
> Sometimes you just want to parse text files.
Files don't contain text, they contain bytes. Bytes only become text
when filtered through the correct encoding.
Python should not guess the encodi
On Wed, 08 Jan 2014 22:04:56 +, wrote:
> Believe it or not, sometimes you really don't care about encodings.
> Sometimes you just want to parse text files. Python 3 forces you to
> think about abstract concepts like encodings when all you want is to
> open that .txt file on the drive and extr
Hi,
> Python 3 forces you to think about abstract concepts like encodings when all
> you want is to open that .txt file on the drive and extract some phone
> numbers and merge in some email addresses.
You can open a text file using ascii + surrogateescape, or just open
the file in binary.
Vic
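A minimal sketch of the first option, ascii plus surrogateescape. The file names, the sample content, and the phone-number replacement are all made up for illustration; the file deliberately contains a latin-1 byte that plain ascii could not decode.

```python
import os
import tempfile

# Assumption: a file in an unknown ASCII-compatible encoding (here one
# latin-1 byte, 0xE9, sits among the ASCII).
raw = b"name: Andr\xe9\nphone: 555-0100\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    with open(src, "wb") as f:
        f.write(raw)

    # Undecodable bytes become lone surrogates instead of raising.
    # newline="" turns off newline translation so bytes round-trip exactly.
    with open(src, encoding="ascii", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Ordinary str operations work on the ASCII parts.
    edited = text.replace("555-0100", "555-0199")

    # Writing with the same handler restores the original bytes.
    with open(dst, "w", encoding="ascii", errors="surrogateescape", newline="") as f:
        f.write(edited)

    with open(dst, "rb") as f:
        result = f.read()

print(result)
```

The second option, opening in binary, would do the same replacement on bytes objects with `open(src, "rb")` and `bytes.replace`.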
On 8 January 2014 20:04, Kristján Valur Jónsson wrote:
> Believe it or not, sometimes you really don't care about encodings.
> Sometimes you just want to parse text files. Python 3 forces you to think
> about abstract concepts like encodings when all you want is to open that .txt
> file on the
From: Python-Dev [python-dev-bounces+kristjan=ccpgames@python.org] on
behalf of R. David Murray [rdmur...@bitdance.com]
Sent: Wednesday, January 08, 2014 21:29
To: python-dev@python.org
Subject: Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add
bytes...)
...
It
On Wed, 08 Jan 2014 19:22:08 +, "Matt Billenstein" wrote:
> I started in Python blissfully unaware of unicode - it was a different time
> for
> sure, but what I knew from C worked pretty much the same in Python - I could
> read some binary data out of a file, twiddle some bits, and write it b