On Sat, Jan 23, 2010 at 10:09:14PM +0100, Cesare Di Mauro wrote:
Introducing C++ is a big step, also. Aside from the problems it can bring on some
platforms, it means that C++ can now be used by CPython developers. It
doesn't make sense to force people to use C for everything but the JIT part. In
the
On 23 Jan 2010, at 07:53, Martin v. Löwis mar...@v.loewis.de wrote:
[snip...]
Yes, definitely. It is this very reasoning that caused Python 2.x to
use ASCII as the default encoding (when mixing strings and unicode),
and, for the entire lifetime of 2.x, has caused endless pain for
developers,
Michael Foord writes:
This is why I'm keen that by *default* Python should honour the UTF8
signature when reading files;
Unfortunately, your caveat about a lot of the time it will *seem* to
work applies to this as well. The only way that honoring
signatures really works is if Python
On 24/01/2010 14:23, Stephen J. Turnbull wrote:
Michael Foord writes:
This is why I'm keen that by *default* Python should honour the UTF8
signature when reading files;
Unfortunately, your caveat about a lot of the time it will *seem* to
work applies to this as well. The only way that
Michael Foord writes:
When reading text files the presence of the UTF-8 signature *almost
invariably* means a UTF-8 encoding. Honouring this will almost always be
better than using the wrong encoding. Of course there are caveats, but
it will be a substantial improvement.
Sure, that
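As a rough illustration of what "honouring the UTF-8 signature" could look like in user code, here is a minimal sketch. The function name read_text_honouring_signature and the fallback_encoding parameter are hypothetical, chosen only for this example; this is not what the thread proposed for open() itself. It checks for the signature bytes (codecs.BOM_UTF8) and falls back to the locale's preferred encoding otherwise.

```python
import codecs
import locale


def read_text_honouring_signature(path, fallback_encoding=None):
    """Read a text file, decoding as UTF-8 when it starts with the UTF-8 signature.

    Hypothetical helper for illustration; falls back to the locale's
    preferred encoding when no signature is present.
    """
    if fallback_encoding is None:
        fallback_encoding = locale.getpreferredencoding(False)
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(codecs.BOM_UTF8):
        # Strip the three signature bytes and decode the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No signature: this is the guess the thread is arguing about.
    return raw.decode(fallback_encoding)
```

When the encoding is already known to be UTF-8, the standard "utf-8-sig" codec does the signature stripping on its own, e.g. open(path, encoding="utf-8-sig").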
Stephen J. Turnbull stephen at xemacs.org writes:
That's throwing the baby out with the bathwater. Very few practical
applications that care about the input encoding are going to be
willing to accept an output encoding that doesn't correspond to the
input encoding in an appropriate way.
Antoine Pitrou writes:
Perhaps you are speaking with your emacs hat, where the purpose is
to output to the same file that serves as input.
No, I'm not wearing my Emacs hat. If I was, there would be no
problem. You just use binary for most such purposes. Historically
that was how even
Stephen J. Turnbull stephen at xemacs.org writes:
But it *does* determine the charset of ErrorDocuments displayed by
Apache. Users are likely to get somewhat confused if the
ErrorDocuments are in a different charset from your dynamic HTML.
Why would they? The browser picks the encoding from
2010/1/24 Floris Bruynooghe floris.bruynoo...@gmail.com
Introducing C++ is a big step, but I disagree that it means C++ should
be allowed in the other CPython code. C++ can be problematic on more
obscure platforms (certainly when static initialisers are used) and
being able to build a python
However it is likely to be often wrong, and where the user's locale
specifies an encoding like CP1252 then it will result in silent
corruption rather than an immediate exception.
Why do you say that? Why do you think it will likely be often wrong?
Most likely, encoding text files with cp1252
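The "silent corruption rather than an immediate exception" concern can be reproduced with a short sketch. The sample string and the helper name try_decode are assumptions made for this example only. UTF-8 bytes decoded as cp1252 usually succeed and produce wrong characters, while a stricter codec such as ASCII fails loudly.

```python
def try_decode(data, encoding):
    """Return the decoded text, or None if the codec rejects the bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


text = "naïve café"
utf8_bytes = text.encode("utf-8")

# cp1252 assigns a character to almost every byte value, so decoding
# with the wrong codec "works" and yields mojibake instead of an error.
silently_wrong = try_decode(utf8_bytes, "cp1252")

# ASCII rejects any byte >= 0x80, so the mistake is reported at once.
loud_failure = try_decode(utf8_bytes, "ascii")

print(repr(silently_wrong))
print(repr(loud_failure))
```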
So what is your naive programmer supposed to expect
when writing a cat program?
This may be a bit out of context - however, a simple cat program should
open files in binary, and be done.
(not sure whether the average naive programmer is able to grasp the
notion of binary IO and to oppose to
On 24/01/2010 18:41, Martin v. Löwis wrote:
However it is likely to be often wrong, and where the user's locale
specifies an encoding like CP1252 then it will result in silent
corruption rather than an immediate exception.
Why do you say that? Why do you think it will likely be often
On Sun, Jan 24, 2010 at 07:45:20PM +0100, Martin v. Löwis wrote:
This may be a bit out of context - however, a simple cat program should
open files in binary, and be done.
(not sure whether the average naive programmer is able to grasp the
notion of binary IO and to oppose to text IO, and
I concede that I have no better statistics on the matter than you do,
but I think that's wishful thinking. It is quite common for pure
output to be mixed with echoed input, for example. Even if a file
is converted to another format (eg, restructured text to LaTeX), it's
very common for the
Oleg Broytman phd at phd.pp.ru writes:
Depends on the kind of cat and especially on the ways of using it. If
you ask cat to number lines (see manual for GNU cat) - what do lines mean
for binary IO?
b"\n"-separated chunks of data. See the docs:
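A minimal sketch of the binary cat discussed here, including line numbering, might look like the following. The function name cat_binary and its number_lines flag are hypothetical; the sketch relies only on the fact that iterating over a binary file object yields newline-terminated chunks of bytes, and it needs Python 3.5 or later for bytes %-formatting.

```python
import io


def cat_binary(src, dst, number_lines=False):
    """Copy bytes from src to dst without decoding them.

    With number_lines=True, each newline-terminated chunk is prefixed
    with its number, loosely in the style of GNU cat -n.
    """
    for lineno, chunk in enumerate(src, start=1):
        if number_lines:
            dst.write(b"%6d\t" % lineno)
        dst.write(chunk)


# Demo on in-memory streams; the 0xff byte is not valid UTF-8,
# yet it passes through untouched because nothing is decoded.
source = io.BytesIO(b"first\xff\nsecond\n")
sink = io.BytesIO()
cat_binary(source, sink, number_lines=True)
print(sink.getvalue())
```

For real files, src would be a file opened with mode "rb" and dst would be sys.stdout.buffer.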
On Sun, Jan 24, 2010 at 1:54 PM, Oleg Broytman p...@phd.pp.ru wrote:
..
Depends on the kind of cat and especially on the ways of using it. If
you ask cat to number lines (see manual for GNU cat) - what do lines mean
for binary IO?
Maybe this is yet another reason why some kinds of cat are a
Subject: [ANN] Python 2.5.5 Release Candidate 2.
On behalf of the Python development team and the Python community, I'm
happy to announce the release candidate 2 of Python 2.5.5.
This is a source-only release that only includes security fixes. The
last full bug-fix release of Python 2.5 was
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Stephen J. Turnbull wrote:
You just can't get away from the need for explicit management of
codecs if you want a robust internationalized application. I don't
object to giving users an easy way to get the behavior Michael
proposes; it just
Antoine Pitrou writes:
Stephen J. Turnbull stephen at xemacs.org writes:
But it *does* determine the charset of ErrorDocuments displayed by
Apache. Users are likely to get somewhat confused if the
ErrorDocuments are in a different charset from your dynamic HTML.
Why would
Martin v. Löwis writes:
My bet is that the majority of Python applications written today do
web stuff. In the web, input encoding and output encoding are
fairly decorrelated - in particular for databases and files read
from disk.
Sure. Which means that programmers have to do a lot of
Using any guessing based on the locale (which describes the codec used
by the user's console, but is completely uncorrelated to any particular
file on the user's filesystem)
No, it's not just the encoding of the console. It is also the encoding
that text editors will use, in absence of a more