[issue17439] insufficient error message for failed unicode conversion

2013-03-17 Thread R. David Murray

R. David Murray added the comment:

No.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17439] insufficient error message for failed unicode conversion

2013-03-17 Thread anatoly techtonik

anatoly techtonik added the comment:

Ok. Do the data (string literals) have a scope? Does Python know at runtime 
whether a string literal stored in its memory was defined in the input stream 
or in a file?

--




[issue17439] insufficient error message for failed unicode conversion

2013-03-16 Thread R. David Murray

R. David Murray added the comment:

Python doesn't store the encoding information anywhere.  The coding cookie is 
used to correctly convert the bytes in the file into unicode; otherwise they 
are just treated as bytes.

For the stdin case, the encoding is associated with the input stream, and again 
you get either unicode or bytes; no encoding information is carried along with 
the data.

So, when the conversion is attempted, there is no encoding information 
available to add to the error message.
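
A minimal sketch of the point above, not taken from the tracker: the 
encoding is a property of the stream, while the bytes and the decoded text 
carry no record of it. The sample word and all variable names are 
illustrative assumptions.

```python
# -*- coding: utf-8 -*-
import io

# Hypothetical sample data: a Russian word encoded as UTF-8 bytes.
raw = u"привет".encode("utf-8")

# The encoding lives on the stream object that does the decoding...
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")
text = stream.read()
stream_encoding = stream.encoding

# ...but neither the decoded text nor the raw bytes remember it.
text_has_encoding = hasattr(text, "encoding")
bytes_has_encoding = hasattr(raw, "encoding")

# The same bytes decode without error under another codec, giving different
# text; nothing in the bytes says which interpretation was intended.
as_utf8 = raw.decode("utf-8")
as_latin1 = raw.decode("latin-1")
```

Because the data itself is unlabelled, code that performs a later conversion 
has nothing to consult when it builds an error message.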

--
nosy: +r.david.murray
resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed
type:  -> behavior




[issue17439] insufficient error message for failed unicode conversion

2013-03-16 Thread anatoly techtonik

New submission from anatoly techtonik:

When Python 2.x compares an ordinary string with a unicode string, it tries to 
convert the former, and shows an error message if the conversion fails. The 
attached example with Russian strings gives the following:

russian.py:11: UnicodeWarning: Unicode equal comparison failed to convert both 
arguments to Unicode - interpreting them as being unequal
  print(nonu2 == ustr2)

This message does not say which source encoding Python used for the 
conversion. russian.py is encoded in UTF-8, so this information would at least 
give a hint about which encoding is expected.
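
A rough sketch of the comparison described above (russian.py itself is not 
reproduced here, so the strings and names are hypothetical). It assumes the 
implicit conversion uses the default encoding, which is 'ascii' in Python 2 
unless it has been reconfigured, and emulates that step explicitly so the 
snippet behaves the same way on Python 3.

```python
# -*- coding: utf-8 -*-

def py2_style_equal(byte_str, text):
    """Emulate Python 2's `str == unicode` with an implicit ASCII decode."""
    try:
        return byte_str.decode("ascii") == text
    except UnicodeDecodeError:
        # This is the point where Python 2 issues UnicodeWarning and
        # treats the two operands as unequal.
        return False

ustr = u"привет"              # hypothetical unicode literal
nonu = ustr.encode("utf-8")   # the bytes a UTF-8 source file would contain

implicit_result = py2_style_equal(nonu, ustr)    # implicit ASCII conversion
explicit_result = nonu.decode("utf-8") == ustr   # caller names the encoding
```

Decoding explicitly with the known encoding before comparing avoids the 
failed implicit conversion altogether.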


A slightly different question. As you can see, russian.py has a coding header 
set to UTF-8. When Python parses source files, it reads and stores the string 
literals encountered in the file. Are those literals linked to this source 
file? And does Python store this coding information somewhere? If it does, the 
conversion could be done automatically without side effects, and the error 
message above could name the encoding and explain where this coding 
information was taken from (i.e. from the file header).
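
One way to probe this question is to compile a small source with a coding 
header and inspect the resulting literal. This is only a sketch: the source 
text, the file name and the choice of cp1251 are made up for illustration.

```python
# -*- coding: utf-8 -*-

# Hypothetical source file, stored as cp1251 bytes with a matching cookie.
source = u'# -*- coding: cp1251 -*-\ns = u"привет"\n'.encode("cp1251")

namespace = {}
exec(compile(source, "<hypothetical_russian.py>", "exec"), namespace)

literal = namespace["s"]

# The cookie was used to decode the literal while compiling...
decoded_correctly = (literal == u"привет")

# ...but the resulting object has no attribute recording that encoding.
remembers_encoding = hasattr(literal, "encoding")
```

The cookie influences how the literal is decoded during compilation, but the 
string object that results keeps no reference to it.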

When Python evaluates strings from the stdin file, they also have some 
encoding. Is this problem solved for that case? Where does Python store the 
encoding for stdin input?

--
components: Unicode
files: russian.py
messages: 184315
nosy: ezio.melotti, techtonik
priority: normal
severity: normal
status: open
title: insufficient error message for failed unicode conversion
versions: Python 2.7
Added file: http://bugs.python.org/file29424/russian.py
