[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-03-01 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

Fixed in Python 3.3 (r88697) and 3.2 (r88698). Thank you Ray.

--
resolution:  -> fixed
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11246
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-21 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

+text = PyUnicode_FromFormat(b'repr=%V', 'abcdef', b'abcdef')
+self.assertEqual(text, 'repr=abcdef')

How do you know which argument is used? For example, you should instead use 
'abc' and b'xyz'.

+text = PyUnicode_FromFormat(b'repr=%V', None, '人民'.encode('UTF-8'))
+self.assertEqual(text, 'repr=人民')

I prefer ASCII literals using \x or \u: '\xe4\xba\xe6\xb0\u2018'.

You should also add a test specific to the replace error handler, e.g. (None, 
b'abc\xff') should give 'abc\ufffd'.
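
The expected value in the replace-handler test can be checked from plain Python. This is only a sketch: it assumes the %V fallback byte string is decoded as UTF-8 with the "replace" error handler (the behaviour requested in this issue) and it uses bytes.decode() rather than the C API. The variable names are illustrative.

```python
# Assumption: the fallback byte string of %V is decoded as UTF-8 with
# errors="replace". This models only the decoding step, not the C function.
invalid = b'abc\xff'
decoded = invalid.decode('utf-8', 'replace')

# 0xff is never valid in UTF-8, so it becomes U+FFFD REPLACEMENT CHARACTER.
assert decoded == 'abc\ufffd'

# The non-ASCII test string, written with escapes only.
people_utf8 = b'\xe4\xba\xba\xe6\xb0\x91'
people = people_utf8.decode('utf-8')
assert people == '\u4eba\u6c11'
```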

--




[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-21 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Thanks haypo!

Here is the updated patch, following your comments.

--
type:  -> behavior
Added file: http://bugs.python.org/file20831/issue11246.diff




[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-20 Thread Ray.Allen

Ray.Allen ysj@gmail.com added the comment:

Yes. %V should be a combination of %U and %s.

Here is a patch which fixes this problem.
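
One way to read "a combination of %U and %s" is sketched below in pure Python. The function name format_v and its signature are hypothetical; this is a model of the intended semantics under the assumption that %s decodes from UTF-8 with the "replace" handler, not CPython's implementation.

```python
from typing import Optional


def format_v(obj: Optional[str], fallback: bytes) -> str:
    """Hypothetical pure-Python model of the %V format (not CPython's code)."""
    if obj is not None:
        # Like %U: the unicode object is used as-is.
        return obj
    # Like %s (assumed): the C string is decoded as UTF-8 with "replace".
    return fallback.decode('utf-8', 'replace')


# Distinct values for the two arguments show which one was used.
assert format_v('abc', b'xyz') == 'abc'
assert format_v(None, b'xyz') == 'xyz'
```

Using different values for the two arguments is what makes the first assertion meaningful: with 'abcdef' and b'abcdef' the test would pass whichever argument was picked.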

--




[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-20 Thread Ray.Allen

Changes by Ray.Allen ysj@gmail.com:


--
keywords: +patch
Added file: http://bugs.python.org/file20818/issue11246.diff




[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-18 Thread STINNER Victor

New submission from STINNER Victor victor.stin...@haypocalc.com:

While testing a patch fixing issue #7330, I found a bug in 
PyUnicode_FromFormat() in the %V format: it decodes the byte string from 
ISO-8859-1, whereas I would expect the string to be decoded from UTF-8, as 
with the %s format.
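
The difference between the two decodings can be illustrated from plain Python. This is a minimal sketch using bytes.decode(), not the C function itself; the sample string and variable names are chosen for illustration.

```python
# UTF-8 encoded bytes of a two-character non-ASCII string.
raw = '\u4eba\u6c11'.encode('utf-8')
assert raw == b'\xe4\xba\xba\xe6\xb0\x91'

# Decoding from ISO-8859-1 maps every byte to one character, so the
# six bytes become six unrelated characters.
as_latin1 = raw.decode('iso-8859-1')
assert len(as_latin1) == 6
assert as_latin1 != '\u4eba\u6c11'

# Decoding from UTF-8 gives back the original two characters.
as_utf8 = raw.decode('utf-8')
assert as_utf8 == '\u4eba\u6c11'
```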

--
messages: 128816
nosy: haypo, ysj.ray
priority: normal
severity: normal
status: open
title: PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1
versions: Python 3.3




[issue11246] PyUnicode_FromFormat(%V) decodes the byte string from ISO-8859-1

2011-02-18 Thread STINNER Victor

Changes by STINNER Victor victor.stin...@haypocalc.com:


--
components: +Library (Lib)
