Changes by Jakub Wilk jw...@jwilk.net:
--
nosy: +jwilk
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2562
___
___
Python-bugs-list mailing list
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
On 2008-09-08 23:45, Benjamin Peterson wrote:
> Benjamin Peterson [EMAIL PROTECTED] added the comment:
> Does this need to be merged into py3k? If so, can someone who handled
> this bug do it? I met a few test failures in my attempt...
As
Benjamin Peterson [EMAIL PROTECTED] added the comment:
Does this need to be merged into py3k? If so, can someone who handled
this bug do it? I met a few test failures in my attempt...
--
nosy: +benjamin.peterson
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Removing Python 2.5 from the version list, since the patch may in some
cases (e.g. using a different encoding than UTF-8) cause problems with
existing setup.py files out there.
The patch is not compatible with Python 3.0 for obvious
Tarek Ziadé [EMAIL PROTECTED] added the comment:
Sure, sounds fine to me, thanks for the help on this issue
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Checked in as r66181 on trunk.
--
status: open -> closed
Tarek Ziadé [EMAIL PROTECTED] added the comment:
ok I will ask for this on the ML
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Is this still an issue in 2.6?
AFAIK, there have been a few changes both to setuptools and PyPI that
make it easy to just use Unicode objects in the setup() call for
non-ASCII values.
Tarek Ziadé [EMAIL PROTECTED] added the comment:
The problem is in distutils code, not in setuptools or PyPI.
As far as I can see, the problem remains in the trunk. It is dead
simple to reproduce: put a unicode name for the author in a plain setup.py
with a non ascii character. (for example
Changes by Tarek Ziadé [EMAIL PROTECTED]:
Removed file: http://bugs.python.org/file9961/unicode.patch
Changes by Tarek Ziadé [EMAIL PROTECTED]:
Removed file:
http://bugs.python.org/file10065/distutils.unicode.simplified.patch
Changes by Tarek Ziadé [EMAIL PROTECTED]:
Removed file: http://bugs.python.org/file9967/unicode.metadata.patch
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Here's an updated patch that applies the same logic to all meta-data
fields, instead of just a few. This simplifies the code somewhat.
I've tested it with the test you provided and also with eGenix packages
using Unicode author names (ie.
Changes by Neal Norwitz [EMAIL PROTECTED]:
--
type: crash -> behavior
Tarek Ziadé [EMAIL PROTECTED] added the comment:
I think this should also be fixed in 2.5
--
versions: +Python 2.5
Tarek Ziadé [EMAIL PROTECTED] added the comment:
I suppose the simplest way to deal with the problem is to force utf-8
encoding for the concerned fields, since this problem will disappear in 3k.
Here's a simplified patch that does it, so write_pkg_file behaves as
expected.
Added file:
Tarek Ziadé [EMAIL PROTECTED] added the comment:
For writing the metadata, we don't need to make any assumptions. We
can just write the bytes as-is. This is how distutils has behaved
for many releases now, and this is how users have been using it.
But write_pkg_file will use ascii encoding
Martin v. Löwis [EMAIL PROTECTED] added the comment:
> But write_pkg_file will use ascii encoding if we don't indicate it
> here:
> pkg_info.write('Author: %s\n' % self.get_contact() )
Why do you say that it uses ascii? It uses whatever encoding the string
returned by get_contact uses. See the
Tarek Ziadé [EMAIL PROTECTED] added the comment:
pkg_info.write('Author: %s\n' % self.get_contact() )
Why do you say that it uses ascii? It uses whatever encoding the string
returned by get_contact uses. See the attached P1-1.0.tar.gz for an
example. This doesn't use ASCII, and doesn't use
Tarek Ziadé [EMAIL PROTECTED] added the comment:
ok, I'll summarize this in distutils-sig sometime today.
If we do use Unicode, I think we might need an extra meta-data,
encoding, that would default to utf8, and that could be used when
the class needs to serialize the data in a file.
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Note that
value = unicode(value).encode('utf-8')
will also work if value is already Unicode, so a backwards compatible
fix would be to allow passing in:
* ASCII encoded strings
* Unicode objects
for the meta data keyword parameters and
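A minimal sketch of the rule proposed above (accept ASCII byte strings or Unicode objects, emit UTF-8), written in Python 3 terms where `str` plays the role of Python 2's `unicode` and `bytes` that of the old `str`. The helper name `encode_metadata_value` is hypothetical and is not part of distutils.

```python
def encode_metadata_value(value):
    """Return UTF-8 bytes for a metadata value given as text or ASCII bytes."""
    if isinstance(value, bytes):
        # Byte strings are accepted only when they are pure ASCII;
        # decode() raises UnicodeDecodeError for anything else.
        value = value.decode("ascii")
    # Text (already decoded or passed in as str) is serialized as UTF-8.
    return value.encode("utf-8")


ascii_name = encode_metadata_value(b"P1")
unicode_author = encode_metadata_value("Tarek Ziad\u00e9")
```

Rejecting non-ASCII byte strings up front keeps the failure at the point where the value is passed in, rather than later when the metadata is written or uploaded.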
Martin v. Löwis [EMAIL PROTECTED] added the comment:
> If we do use Unicode, I think we might need an extra meta-data,
> encoding, that would default to utf8, and that could be used when
> the class needs to serialize the data in a file.
I don't think so. Whenever the data is written to a file,
Martin v. Löwis [EMAIL PROTECTED] added the comment:
I don't think that we should support non-ASCII encodings for meta-data
strings passed to setup().
If setuptools is broken in this respect, it needs to be fixed. Ditto for
other 3rd party tools.
We do need to support non-ASCII files, as
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
Agreed, but any change will target the package authors who can easily
upgrade their packages to use Unicode for e.g. names.
If the change were to address distutils users, we'd have to be a lot
more careful.
In any case, if UTF-8 is the
Martin v. Löwis [EMAIL PROTECTED] added the comment:
> Agreed, but any change will target the package authors who can easily
> upgrade their packages to use Unicode for e.g. names.
They can't: that would break their 2.5-and-earlier compatibility.
> If the change were to address distutils users,
Marc-Andre Lemburg [EMAIL PROTECTED] added the comment:
With distutils users I'm referring to people that are told to run
python setup.py install. Changes affecting the way this line behaves
need to be carefully considered.
OTOH, when upgrading a package to a new Python version (and distutils
Changes by Tarek Ziadé [EMAIL PROTECTED]:
Removed file: http://bugs.python.org/file9960/unicode.patch
New submission from Tarek Ziadé [EMAIL PROTECTED]:
If I try to put my name in the Author field as a string field,
it will break because distutils makes the assumption that
the fields are strings encoded in ascii, before it decodes
them into unicode, then encodes them in utf8 to send the data.
See
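The failure described in this report can be illustrated with a small Python 3 sketch. It assumes the sequence is "decode as ASCII, then encode as UTF-8", as the report states; `send_field` is a hypothetical stand-in, not the actual distutils code.

```python
def send_field(raw):
    """Decode bytes as ASCII, then re-encode as UTF-8 (the sequence described above)."""
    return raw.decode("ascii").encode("utf-8")


ascii_name = b"Tarek Ziade"
utf8_name = "Tarek Ziad\u00e9".encode("utf-8")  # non-ASCII author name as UTF-8 bytes

send_field(ascii_name)  # pure ASCII survives the round trip unchanged

try:
    send_field(utf8_name)
except UnicodeDecodeError as exc:
    # The implicit ASCII assumption is what breaks, not the UTF-8 encode step.
    print("decode failed:", exc.reason)
```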
Changes by Tarek Ziadé [EMAIL PROTECTED]:
Added file: http://bugs.python.org/file9961/unicode.patch
Martin v. Löwis [EMAIL PROTECTED] added the comment:
The official supported way for non-ASCII characters in distutils is to
use Unicode strings. If anything else fails, that's not a bug.
IIUC, in this case, it's setuptools that fails, not distutils. Assuming
I understood correctly, I'm closing
Tarek Ziadé [EMAIL PROTECTED] added the comment:
In that case, distutils should not do a unicode() call over each field
passed before .encode('utf8') is called, because it makes the assumption
that string type can be used.
Martin v. Löwis [EMAIL PROTECTED] added the comment:
I don't understand. It is *certainly* allowed to use byte strings for
these data, as long as they are ASCII. The Unicode requirement exists
only for non-ASCII characters, and distutils makes explicit, deliberate
use of the default encoding
Tarek Ziadé [EMAIL PROTECTED] added the comment:
ok I see what you mean, thanks for the explanation
Tarek Ziadé [EMAIL PROTECTED] added the comment:
oh, hold on, it is more complicated in fact :)
setuptools calls DistributionMetadata.dist.write_pkg_file()
method to write the .egg-info file.
This method makes the assertion that the metadata fields are strings,
so it is not setuptools' fault.
Martin v. Löwis [EMAIL PROTECTED] added the comment:
I agree there is a bug in distutils. Before we proceed, I think
distutils-sig needs to be consulted. My proposal would be the one I
suggested earlier: all strings should either be Unicode or ASCII-only
byte strings. This contradicts to the
Martin v. Löwis [EMAIL PROTECTED] added the comment:
As a follow-up: for compatibility, it might be possible to support
either Unicode or arbitrary plain strings in write_pkg_file. In 3k, such
support can then be dropped.
As that constitutes a new feature, it shouldn't be applied to 2.5.
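A rough sketch of the compatibility idea above, in Python 3 terms (`str` for Unicode, `bytes` for plain strings): text is serialized as UTF-8, which is an assumed default here, while byte strings are written through unchanged. `write_field` is a hypothetical helper and not the real `write_pkg_file`.

```python
import io


def write_field(stream, name, value):
    """Write one 'Name: value' header line to a binary stream."""
    if isinstance(value, str):
        # Text values are serialized as UTF-8 (assumed default encoding).
        value = value.encode("utf-8")
    # Byte strings are written as-is, in whatever encoding the author used.
    stream.write(name.encode("ascii") + b": " + value + b"\n")


buf = io.BytesIO()
write_field(buf, "Name", b"P1")
write_field(buf, "Author", "Tarek Ziad\u00e9")
```

Writing bytes through untouched preserves the existing behaviour for current setup.py files, at the cost of not knowing which encoding the resulting file is in.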