New submission from Curtis Doty:
I first stumbled across this bug while attempting an install using pip's cool
editable mode:
$ pip install -e git+git://github.com/appliedsec/pygeoip.git#egg=pygeoip
Obtaining pygeoip from git+git://github.com/appliedsec/pygeoip.git#egg=pygeoip
Cloning git://github.com/appliedsec/pygeoip.git to ./src/pygeoip
Running setup.py egg_info for package pygeoip
Traceback (most recent call last):
  File "<string>", line 16, in <module>
  File "/home/curtis/python/3.3.3/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1098: ordinal not in range(128)
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 16, in <module>
  File "/home/curtis/python/3.3.3/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1098: ordinal not in range(128)
Cleaning up...
Command python setup.py egg_info failed with error code 1 in
/home/curtis/python/2013-11-20/src/pygeoip
Storing complete log in /home/curtis/.pip/pip.log
It turns out this is related to a local LANG=C environment. If I set
LANG=en_US.UTF-8, the problem goes away. But it seems pip/python3 open() should
be more intelligently handling this.
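
For anyone who wants to check their own environment, here is a minimal sketch that reports which encoding open() picks when none is passed. The helper name default_open_encoding is made up for illustration; the only assumption is that open() without encoding= falls back to a locale-derived default, which is what the traceback above suggests.

```python
import locale
import os
import tempfile


def default_open_encoding():
    """Return the encoding that open() picks when no encoding= is passed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path) as handle:  # deliberately no encoding= argument
            return handle.encoding
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("locale preferred encoding:", locale.getpreferredencoding(False))
    print("open() default encoding:  ", default_open_encoding())
```

Running it once under LANG=C and once under LANG=en_US.UTF-8 should show the two different defaults, though the exact names printed may vary by platform and Python version.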
Worse, the file in this case,
https://github.com/appliedsec/pygeoip/blob/master/setup.py, already has a source
code encoding declaration (coding line) *marking* it as utf-8.
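
As a point of comparison, the standard library can already read a Python source file according to the coding line at its top: tokenize.open() and tokenize.detect_encoding() both honor it regardless of the locale. The sketch below is only an illustration; SAMPLE is a stand-in for the real setup.py, and the helper names are hypothetical.

```python
import os
import tempfile
import tokenize

# Stand-in for the real setup.py: a coding line plus one non-ASCII character.
SAMPLE = "# -*- coding: utf-8 -*-\nauthor = 'Jos\u00e9'\n"


def read_python_source(path):
    """Read a .py file using the encoding its coding line (or BOM) declares."""
    with tokenize.open(path) as handle:
        return handle.read()


def declared_encoding(path):
    """Return the encoding name that tokenize detects for the file."""
    with open(path, "rb") as handle:
        encoding, _first_lines = tokenize.detect_encoding(handle.readline)
    return encoding


def write_sample():
    """Write SAMPLE to a temporary file as UTF-8 bytes and return its path."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "wb") as handle:
        handle.write(SAMPLE.encode("utf-8"))
    return path
```

Neither helper consults LANG, so the result should be the same under LANG=C and under a UTF-8 locale.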
An ugly workaround patch is to force pip to always use UTF-8 encoding on setup.py:
--- pip.orig/req.py 2013-11-19 15:53:49.0 -0800
+++ pip/req.py 2013-11-20 16:37:23.642656132 -0800
@@ -281,7 +281,7 @@ def replacement_run(self):
writer(self, ep.name, os.path.join(self.egg_info,ep.name))
self.find_sources()
egg_info.egg_info.run = replacement_run
-exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))
+exec(compile(open(__file__,encoding='utf_8').read().replace('\\r\\n', '\\n'), __file__, 'exec'))
def egg_info_data(self, filename):
@@ -687,7 +687,7 @@ exec(compile(open(__file__).read().repla
## FIXME: should we do --install-headers here too?
call_subprocess(
[sys.executable, '-c',
- "import setuptools; __file__=%r; exec(compile(open(__file__).read().replace('\\r\\n', '\\n'), __file__, 'exec'))" % self.setup_py]
+ "import setuptools; __file__=%r; exec(compile(open(__file__,encoding='utf_8').read().replace('\\r\\n', '\\n'), __file__, 'exec'))" % self.setup_py]
 + list(global_options) + ['develop', '--no-deps'] + list(install_options),
cwd=self.source_dir, filter_stdout=self._filter_install,
But that only treats the symptom. Root cause appears to be in python3 as
demonstrated by this simple script:
wrong-codec.py:
#!/usr/bin/env python3
from urllib.request import urlretrieve
urlretrieve('https://raw.github.com/appliedsec/pygeoip/master/setup.py',
            filename='setup.py')
# if LANG=C then locale.py:getpreferredencoding() -> 'ANSI_X3.4-1968'
foo = open('setup.py')
# bang! ascii_decode() cannot handle the non-ASCII bytes
bar = foo.read()
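
The same failure can be sketched without the network by writing a couple of non-ASCII bytes to a temporary file and reading them back with an explicit encoding. This is only an illustration: PAYLOAD is invented (it merely contains the 0xc3 byte named in the traceback), and the helper names are hypothetical.

```python
import os
import tempfile

# 0xc3 0xa9 is valid UTF-8 (U+00E9) but not valid ASCII.
PAYLOAD = b"name = 'caf\xc3\xa9'\n"


def try_decode(path, encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        with open(path, encoding=encoding) as handle:
            return handle.read()
    except UnicodeDecodeError as exc:
        return exc


def make_payload_file():
    """Write PAYLOAD to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(PAYLOAD)
    return path
```

Calling try_decode(path, "ascii") mimics what open() does under LANG=C and returns the decode error, while try_decode(path, "utf-8") returns the text.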
This does not occur in python2. Is this a bug in pip or python3?
--
components: Unicode
messages: 203673
nosy: GreenKey, ezio.melotti, haypo
priority: normal
severity: normal
status: open
title: open() fails to autodetect utf-8 if LANG=C
type: crash
versions: Python 3.3
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19685
___