[issue24025] str(bytes_obj) should raise an error

2015-04-23 Thread Guido van Rossum

Guido van Rossum added the comment:

It would be unacceptable if print(b) were to raise an exception. The reason the 
transitional period is long is just that people are still porting Python 2 code.

--
assignee:  -> gvanrossum
status: pending -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24025] str(bytes_obj) should raise an error

2015-04-23 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> I'm not sure what the "transitional period" refers to, though.

The Python 2 -> Python 3 migration.

> It's 8 years later now and doesn't look like str(bytes_object) will
> go away as a source of subtle bugs anytime soon

str(bytes_object) is perfectly reasonable when logging stuff, for example.

Recommend closing.
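
A minimal sketch of the logging case, assuming a made-up logger name and payload (neither comes from this thread). Formatting with %r asks for the repr explicitly, so the logged text is the same as what str() returns today, and str() is never called on the bytes object:

```python
import io
import logging

# Hypothetical logger and payload, for illustration only.
stream = io.StringIO()
logger = logging.getLogger("issue24025.demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

payload = "abcäöü".encode("utf-8")

# %r requests repr() explicitly, so no BytesWarning is involved.
logger.info("received %r", payload)

logged = stream.getvalue()
print(logged, end="")
```

Using %s instead would go through str() and so would trigger the BytesWarning under -b or -bb.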

--
nosy: +gvanrossum, pitrou
resolution:  -> rejected
status: open -> pending
superseder:  -> py3k-pep3137: issue warnings / errors on str(bytes()) and 
similar operations




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread R. David Murray

R. David Murray added the comment:

Yeah, that's why I run tests with -bb myself.  Except that there was a bug in 
-W/-bb handling that meant I wasn't really...and that bit me because there is 
at least one buildbot that really does, and it complained...

(Although in that case the 'bug' was really benign, since it was just optional 
debug print output for which the repr of the bytes was actually fine.)
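
One way to confirm that a test run really has -bb in effect is to inspect sys.flags.bytes_warning. This is only a sketch: the helper name is hypothetical, and it reports the command-line flag, not any -W filters layered on top of it:

```python
import sys


def bytes_warning_mode():
    """Describe how the interpreter was started with respect to -b / -bb."""
    level = sys.flags.bytes_warning  # 0: no flag, 1: -b, 2 or more: -bb
    if level == 0:
        return "off"
    if level == 1:
        return "warn"
    return "error"


print("BytesWarning mode:", bytes_warning_mode())
```

A test suite could fail early when the mode is not "error", instead of silently running without the check.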

--




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 22.04.2015 15:52, R. David Murray wrote:
> str accepting bytes and returning the repr was a conscious design choice, as 
> evidenced by the -bb option, and I'm sure there is code that is both 
> unintentionally and *intentionally* using this, despite the warning.  Unless 
> we want to discuss making the -bb behavior the default in a future version of 
> python, this issue should be closed.

I guess that would be helpful, yes.

Here's the original patch which introduced -b and -bb:

http://bugs.python.org/issue1392

This was Guido's answer back then:

"""
I'll look at the patches later, but we've gone over this before on the
list. str() of *any* object needs to return *something*. Yes, it's
unfortunate that this masks bugs in the transitional period, but it
really is the best thing in the long run. We had other exceptional
treatment for str vs. bytes (e.g. the comparison was raising TypeError
for a while) and we had to kill that too.
"""

I'm not sure what the "transitional period" refers to, though.
It's 8 years later now and doesn't look like str(bytes_object) will
go away as a source of subtle bugs anytime soon :-)

--




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread R. David Murray

R. David Murray added the comment:

str accepting bytes and returning the repr was a conscious design choice, as 
evidenced by the -bb option, and I'm sure there is code that is both 
unintentionally and *intentionally* using this, despite the warning.  Unless we 
want to discuss making the -bb behavior the default in a future version of 
python, this issue should be closed.

--
nosy: +r.david.murray




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

In Python 2, the unicode() constructor accepts a bytes argument if it is 
decodable with sys.getdefaultencoding().

>>> unicode(b'abc')
u'abc'
>>> import sys
>>> reload(sys)
<module 'sys' (built-in)>
>>> sys.setdefaultencoding("utf-8")
>>> unicode(u'abcäöü'.encode('utf-8'))
u'abc\xe4\xf6\xfc'

In Python 3, the str() constructor does not accept bytes arguments if Python is 
run with the -bb option.

--
nosy: +serhiy.storchaka




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread eryksun

eryksun added the comment:

bytes.__str__ can already raise either a warning (-b) 

>>> str('abcäöü'.encode('utf-8'))
__main__:1: BytesWarning: str() on a bytes instance
"b'abc\\xc3\\xa4\\xc3\\xb6\\xc3\\xbc'"

or error (-bb), which applies equally to implicit conversion by print():

>>> print('abcäöü'.encode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BytesWarning: str() on a bytes instance
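
The three behaviours can be compared side by side by starting child interpreters with different flags. This is a rough sketch; the helper name run_with_flags is made up for illustration, and it assumes no PYTHONWARNINGS setting in the environment changes the filters:

```python
import subprocess
import sys


def run_with_flags(*flags):
    """Run str() on a bytes object in a child interpreter started with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", "print(str(b'abc'))"],
        capture_output=True,
        text=True,
    )


default = run_with_flags()     # no flag: str() silently returns the repr
warn = run_with_flags("-b")    # -b: repr is returned, BytesWarning goes to stderr
error = run_with_flags("-bb")  # -bb: BytesWarning is raised as an exception

print(default.returncode, default.stdout.strip())
print(warn.returncode, "BytesWarning" in warn.stderr)
print(error.returncode, "BytesWarning" in error.stderr)
```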

--
nosy: +eryksun




[issue24025] str(bytes_obj) should raise an error

2015-04-22 Thread Marc-Andre Lemburg

New submission from Marc-Andre Lemburg:

In Python 2, the unicode() constructor does not accept bytes arguments, unless 
an encoding argument is given:

>>> unicode(u'abcäöü'.encode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal 
not in range(128)

In Python 3, the str() constructor masks this programming error by returning 
the repr() of the bytes object:

>>> str('abcäöü'.encode('utf-8'))
"b'abc\\xc3\\xa4\\xc3\\xb6\\xc3\\xbc'"

I think it would be more helpful to raise an error that points the programmer 
to the encoding argument, which is most probably missing.

Also note that you get a different output with encoding argument set:

>>> str('abcäöü'.encode('utf-8'), 'utf-8')
'abcäöü'

I know this is documented, but it is still not very helpful and can easily hide 
errors.
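
To make the difference concrete, here is a small sketch (assuming Python 3 started without -b or -bb) that puts the two str() calls next to the explicit bytes.decode() spelling:

```python
data = "abcäöü".encode("utf-8")

as_repr = str(data)               # no encoding: the repr of the bytes object
as_text = str(data, "utf-8")      # with encoding: the decoded text
also_text = data.decode("utf-8")  # explicit decode, same result as above

print(as_repr)
print(as_text)
```

The first call succeeds even though the result is almost never what the caller wanted, which is the masking effect described above.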

--
components: Interpreter Core, Unicode
messages: 241800
nosy: ezio.melotti, haypo, lemburg
priority: normal
severity: normal
status: open
title: str(bytes_obj) should raise an error
versions: Python 3.5, Python 3.6
