Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
Committed revision 86893 that makes untabify.py respect encoding cookie in the
files it processes. I don't think there is anything else that needs to be done
here.
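
For readers following along, here is a minimal sketch of what "respecting the encoding cookie" can look like. It is not a copy of revision 86893; the function name `untabify_file` is hypothetical, and the sketch assumes Python 3.2 or later, where `tokenize.open()` is available.

```python
import tokenize


def untabify_file(path, tabsize=8):
    """Expand tabs in `path`, honoring a PEP 263 encoding cookie if present.

    Hypothetical helper for illustration only. Returns True if the file
    was rewritten, False if it contained no tabs to expand.
    """
    # tokenize.open() detects the encoding from a BOM or a coding cookie
    # on the first two lines, and falls back to UTF-8 otherwise.
    with tokenize.open(path) as f:
        text = f.read()
        encoding = f.encoding
    newtext = text.expandtabs(tabsize)
    if newtext == text:
        return False  # nothing to change
    # Write back with the detected encoding so non-ASCII text round-trips.
    with open(path, "w", encoding=encoding) as f:
        f.write(newtext)
    return True
```

A file with no cookie and bytes that are not valid UTF-8 would still raise UnicodeDecodeError in this sketch, which matches the failure reported for C files in this issue.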
--
resolution: -> fixed
stage: -
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
From IRC:
Me: UTF-8 was not strictly valid in ANSI C comments, so it is a bug in untabify
to assume UTF-8 in C files.
Merwok: Works for me.
I am lowering the priority because it looks like untabify does not fail on the
Éric Araujo mer...@netwok.org added the comment:
Why would it be the job of untabify to report invalid non-ASCII characters in C
files?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9598
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Tue, Sep 7, 2010 at 8:08 PM, Éric Araujo rep...@bugs.python.org wrote:
..
> Why would it be the job of untabify to report invalid non-ASCII characters in
> C files?
Since untabify works by loading C code as text, it has
Éric Araujo mer...@netwok.org added the comment:
My real question was: Shouldn’t this be a VCS hook instead of untabify’s job?
(or in addition to untabify if you insist)
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
On Tue, Sep 7, 2010 at 8:31 PM, Éric Araujo rep...@bugs.python.org wrote:
..
> My real question was: Shouldn’t this be a VCS hook instead of untabify’s job?
> (or in addition to untabify if you insist)
Yes, VCS hook makes
Éric Araujo mer...@netwok.org added the comment:
I agree with your reply (that’s what I meant with “works for me”, the question
about untabify vs. hooks only occurred to me after our IRC exchange).
Florent Xicluna florent.xicl...@gmail.com added the comment:
Other C files converted from latin-1 to utf-8 with r84485.
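
A rough sketch of such a conversion follows. It is not what r84485 actually ran; the helper name `latin1_to_utf8` is hypothetical, and it assumes that any input that fails to decode as UTF-8 really is Latin-1.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8, leaving valid UTF-8 input untouched.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte value maps to a code point in Latin-1, so this
        # decode cannot fail. Whether the result is meaningful depends
        # on the assumption that the source really was Latin-1.
        return data.decode("latin-1").encode("utf-8")
    return data  # already valid UTF-8 (pure ASCII included)
```

The single Latin-1 byte 0xE7 for ç is not a valid UTF-8 sequence on its own, which is why a strict UTF-8 decode of the unconverted file fails.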
--
components: +Unicode
nosy: +flox
Éric Araujo mer...@netwok.org added the comment:
Fixed encoding error in r84472 through r84474.
This bug should be reassessed and retitled. If untabify fails because a file
has an incorrect encoding, is it really a problem in untabify? This is a
developer’s tool, so getting a traceback here
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
> If untabify fails because a file has an incorrect encoding, is it really
> a problem in untabify? This is a developer’s tool, so getting a
> traceback here seems okay to me.
I disagree. I think we should use this
Éric Araujo mer...@netwok.org added the comment:
I agree about the need to define the encoding for comments. My vote goes to #2,
since I wouldn’t want to see names of authors/contributors mangled in the
source. I would reconsider if a specification explicitly forbade that.
I repeat that the
Alexander Belopolsky belopol...@users.sourceforge.net added the comment:
> I wouldn’t want to see names of authors/contributors mangled
> in the source.
This is a reason to write names in ASCII. While Latin-1 is a grey area
because most of its characters look familiar to English-speaking
Éric Araujo mer...@netwok.org added the comment:
>> I wouldn’t want to see names of authors/contributors mangled
>> in the source.
> This is a reason to write names in ASCII.
Oh, sorry, by “mangled” I meant “forced into ASCII”. I was not speaking about
mojibake.
> While Latin-1 is a grey area
Éric Araujo mer...@netwok.org added the comment:
The builtin open in 3.2 is similar to codecs.open. If you read the error
message closely, you’ll see that the decoding that failed did try to use UTF-8.
The cause of the problem here is that the bytes used for the ç in François’
name are not
Popa Claudiu pcmantic...@gmail.com added the comment:
Hello.
It seems that untabify.py opens the file with the builtin open function, which
makes the call error-prone when it encounters non-ASCII characters. Proper
handling would use open from the codecs library, specifying the encoding
New submission from Alexander Belopolsky belopol...@users.sourceforge.net:
For example:
$ ./python.exe Tools/scripts/untabify.py Modules/_heapqmodule.c
Traceback (most recent call last):
...
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8'
Changes by Alexander Belopolsky belopol...@users.sourceforge.net:
--
nosy: +eric.araujo