[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-06 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

basestring in Python 2 means "thing that is logically text", because in Python 
2, str can mean *either* logical text *or* binary data, and unicode is always 
logical text. str and unicode can kinda sorta interoperate on Python 2, so it 
can make sense to test for basestring if you're planning to use it as logical 
text; if you do 'foo' + u'bar', that's fine in Python 2. In Python 3, only str 
is logically text; b'foo' + 'bar' is completely illegal, so it doesn't make 
sense to convert it to recognize both bytes and str.

Your problem is that you're using basestring incorrectly in Python 2, and it 
happens to work only because Python 2 did a bad job of separating text and 
binary data. Your original example code should actually have been written in 
Python 2 as:


if isinstance(value, bytes):  # bytes is an alias of str, and only str, on 2.7
value = value.decode(encoding)
elif not isinstance(value, unicode):
some other code

which 2to3 would convert correctly (changing unicode to str, and leaving 
everything else untouched) because you actually tested what you meant to test 
to control the actions taken:

1. If it was binary data (which you interpret all Py2 strs to be), then it is 
decoded to text (Py2 unicode/Py3 str)
2. If it wasn't binary data and it wasn't text, you did something else

Point is, the converter is doing the right thing. You misunderstood the logical 
meaning of basestring, and wrote code that depended on your misinterpretation, 
that's all.

Your try/except to try to detect Python 3-ness was doomed from the start; you 
referenced basestring, and 2to3 (reasonably) converts that to str, which breaks 
your logic. You wrote cross-version code that can't be 2to3-ed because it's 
*already* Python 3 code; Python 3 code should never be subjected to 2to3, 
because it'll do dumb things (e.g. change print(1, 2) to print((1, 2))); it's 
2to3, not 2or3to3 after all.

--
nosy: +josh.r

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-06 Thread Bob Kline


Bob Kline  added the comment:

> Unless you have a specific proposal, ...

I _do_ have a specific proposal: replace `basestring` with `(str, bytes)`, 
which preserves the behavior of the original code. So, 

if isinstance(value, basestring)

becomes

if isinstance(value, (str, bytes))

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-06 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

Bob, this issue tracker is for managing patches to the cpython repository.  not 
for 'help me understand' requests,  The latter belong on, for instance, 
python-list.  Unless you have a specific proposal, other than leaving 
'basestring' alone(1), that we could reject or accept and implement, please 
close this(2) as 'not a bug' or whatever.

(1) Rejected as leaving code broken for 3.x.
(2) Discussion elsewhere *might* result in a concrete suggestion appropriate 
for this or a new issue.

--
nosy: +terry.reedy

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-01 Thread Bob Kline


Bob Kline  added the comment:

> Use str instead.

Sure. I understand the advantages of the new approach to strings. Which, by the 
way, weren't available when this project began. I don't disagree with anything 
you say in the context of writing new code. I was, however, surprised and 
dismayed to learn of the cavalier approach the upgrade tool has taken to 
silently breaking existing code which it is claiming to "fix."

Here's my favorite so far.

--- cdr.py  (original)
+++ cdr.py  (refactored)
@@ -36,15 +36,15 @@
 # ==
 from six import itervalues
 try:
-basestring
+str
 is_python3 = False
 base64encode = base64.encodestring
 base64decode = base64.decodestring
 except:
 base64encode = base64.encodebytes
 base64decode = base64.decodebytes
-basestring = (str, bytes)
-unicode = str
+str = (str, bytes)
+str = str
 is_python3 = True

We wrote this following the example of comparable techniques in 
http://python-future.org/compatible_idioms.html and similar guides to an 
upgrade path. Seems we're being punished for taking the trouble to make our 
code work with Python 2 and 3 during the transition period. :-(

It's hard to see how this conversion resulted in something better than what we 
already had.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-01 Thread Karthikeyan Singaravelan

Karthikeyan Singaravelan  added the comment:

https://docs.python.org/3.0/whatsnew/3.0.html

> The builtin basestring abstract type was removed. Use str instead. The str 
> and bytes types don’t have functionality enough in common to warrant a shared 
> base class. The 2to3 tool (see below) replaces every occurrence of basestring 
> with str.

For a longer explanation of this and other changes you might find below link 
useful. In Python 2 str is used to represent both text and bytes. Hence to 
check the type is str in python 2 you have to check it to be basestring and 
then check it to be unicode. In python 3 all strings are unicode with str and 
bytes being two different types. Hence there is no basestring and unicode 
string since they are both unified to be str itself in Python 3.

https://portingguide.readthedocs.io/en/latest/strings.html

Hope this helps.

--
nosy: +xtreak

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38003] Incorrect "fixing" of isinstance tests for basestring

2019-09-01 Thread Bob Kline


New submission from Bob Kline :

We are attempting to convert a large Python 2 code base. Following the guidance 
of the official documentation 
(https://docs.python.org/2/library/functions.html#basestring) we created tests 
in many, many places that look like this:

if isinstance(value, basestring):
if not isinstance(value, unicode):
value = value.decode(encoding)
else:
some other code

It seems that the 2to3 tool is unaware that replacing basestring with str in 
such cases will break the software.

Here's an example.

$ 2to3 repro.py
...
--- repro.py(original)
+++ repro.py(refactored)
@@ -1,8 +1,8 @@
 from frobnitz import transform

 def foo(value, encoding=None):
-if isinstance(value, basestring):
-if not isinstance(value, unicode):
+if isinstance(value, str):
+if not isinstance(value, str):
 value = value.decode(encoding or "utf-8")
 return value
 else:

Help me understand how this "fix" results in the correct behavior.

--
components: 2to3 (2.x to 3.x conversion tool)
messages: 350964
nosy: bkline
priority: normal
severity: normal
status: open
title: Incorrect "fixing" of isinstance tests for basestring
type: behavior
versions: Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com