[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread Inada Naoki


Change by Inada Naoki :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread miss-islington


miss-islington  added the comment:


New changeset d9106434f77fa84c8a59f8e60dc9c14cdd989b35 by Miss Islington (bot) 
in branch '3.9':
bpo-41497: Fix potential UnicodeDecodeError in dis CLI (GH-21757)
https://github.com/python/cpython/commit/d9106434f77fa84c8a59f8e60dc9c14cdd989b35


--




[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread miss-islington


miss-islington  added the comment:


New changeset 66c899661902edc18df96a5c3f22639310700491 by Miss Islington (bot) 
in branch '3.8':
bpo-41497: Fix potential UnicodeDecodeError in dis CLI (GH-21757)
https://github.com/python/cpython/commit/66c899661902edc18df96a5c3f22639310700491


--




[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread Inada Naoki


Inada Naoki  added the comment:


New changeset a4084b9d1e40c1c9259372263d1fe8c8a562b093 by Konge in branch 
'master':
bpo-41497: Fix potential UnicodeDecodeError in dis CLI (GH-21757)
https://github.com/python/cpython/commit/a4084b9d1e40c1c9259372263d1fe8c8a562b093


--
nosy: +inada.naoki




[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread miss-islington


Change by miss-islington :


--
nosy: +miss-islington
nosy_count: 3.0 -> 4.0
pull_requests: +20923
pull_request: https://github.com/python/cpython/pull/21782




[issue41497] Potential UnicodeDecodeError in dis

2020-08-07 Thread miss-islington


Change by miss-islington :


--
pull_requests: +20924
pull_request: https://github.com/python/cpython/pull/21783




[issue41497] Potential UnicodeDecodeError in dis

2020-08-06 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Good catch. Yes, when reading Python source files you should either open them in 
binary mode if bytes are enough for your use, or open them with 
tokenize.open() if you need string data, or use tokenize.detect_encoding() and 
pass the result to open().
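
The three options above can be sketched as follows. This is a minimal illustration, not code from the patch: the temporary file and the variable names are assumptions made for the demo, while tokenize.open() and tokenize.detect_encoding() are the standard library functions.

```python
import os
import tempfile
import tokenize

# Hypothetical demo file: a UTF-8 encoded source file with a non-ASCII literal.
source_text = "print('喵')\n"
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(source_text.encode("utf-8"))

# 1. Binary mode: sufficient when bytes are enough, e.g. for compile(),
#    which detects the source encoding itself.
with open(path, "rb") as f:
    raw = f.read()
code = compile(raw, path, "exec")

# 2. tokenize.open(): a text-mode file decoded with the detected source encoding.
with tokenize.open(path) as f:
    text_a = f.read()

# 3. tokenize.detect_encoding() on a binary readline, then pass the result to open().
with open(path, "rb") as f:
    encoding, _first_lines = tokenize.detect_encoding(f.readline)
with open(path, "r", encoding=encoding) as f:
    text_b = f.read()

os.remove(path)
```

None of the three depends on the locale's default encoding, which is the property that matters here.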

--
nosy: +serhiy.storchaka




[issue41497] Potential UnicodeDecodeError in dis

2020-08-06 Thread Inada Naoki


Change by Inada Naoki :


--
versions:  -Python 3.5, Python 3.6, Python 3.7




[issue41497] Potential UnicodeDecodeError in dis

2020-08-06 Thread JIanqiu Tao


JIanqiu Tao  added the comment:

I searched the whole Lib folder and found a lot of code that uses 
"open(filename, 'r')" without specifying an encoding, so it falls back to the 
locale's default.

Should we open another issue for these problems?
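
A search like this could be approximated with the ast module. The sketch below is only a heuristic under stated assumptions: find_open_without_encoding is a hypothetical helper name, it only recognises direct calls to the name open, and it will miss io.open, aliases, and modes that are not string literals.

```python
import ast


def find_open_without_encoding(source, filename="<string>"):
    """Return line numbers of text-mode open(...) calls that pass no encoding=.

    Hypothetical helper for illustration; a heuristic, not a complete checker.
    """
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        is_open_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
        )
        if not is_open_call:
            continue
        if any(kw.arg == "encoding" for kw in node.keywords):
            continue
        # The mode is the second positional argument or the mode= keyword.
        mode = node.args[1] if len(node.args) >= 2 else None
        for kw in node.keywords:
            if kw.arg == "mode":
                mode = kw.value
        # Binary mode takes no encoding, so it is not a problem.
        if (
            isinstance(mode, ast.Constant)
            and isinstance(mode.value, str)
            and "b" in mode.value
        ):
            continue
        hits.append(node.lineno)
    return sorted(hits)


sample = (
    'open(p, "r")\n'
    'open(p, "rb")\n'
    'open(p, encoding="utf-8")\n'
    'open(p)\n'
)
flagged = find_open_without_encoding(sample)
```

In the sample, the first and last calls are flagged; the binary-mode call and the call with an explicit encoding are not.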

--




[issue41497] Potential UnicodeDecodeError in dis

2020-08-06 Thread JIanqiu Tao


Change by JIanqiu Tao :


--
keywords: +patch
pull_requests: +20902
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/21757




[issue41497] Potential UnicodeDecodeError in dis

2020-08-06 Thread JIanqiu Tao

New submission from JIanqiu Tao :

A UnicodeDecodeError can be raised when running "python -m dis" in an 
environment whose default encoding is not UTF-8.

Assume there is a file named "a.py" that contains "print('喵')" and is saved 
with UTF-8 encoding.

Run "python -m dis ./a.py" in a non-UTF-8 environment, for example a 
Windows PC whose default language is Chinese.

A UnicodeDecodeError is raised.

Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Program Files\Python38\lib\dis.py", line 553, in <module>
    _test()
  File "C:\Program Files\Python38\lib\dis.py", line 548, in _test
    source = infile.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb5 in position 9: illegal multibyte sequence

That is because the default encoding on Windows is determined by the system 
language. Chinese systems use cp936 (GBK, an extension of GB2312) as the 
default encoding, which cannot decode this UTF-8 byte sequence.

The fix is simply to read the file in "rb" mode instead of "r".
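
The failure and the proposed fix can be reproduced on any platform with the sketch below. It is an illustration under assumptions, not the actual patch: it decodes with "gbk" explicitly instead of relying on a Chinese Windows locale, and it works on an in-memory byte string rather than a real a.py.

```python
import dis
import io

# The UTF-8 bytes of the example file from the report.
data = "print('喵')".encode("utf-8")

# Text mode on a GBK locale amounts to decoding these bytes as GBK, which fails.
try:
    data.decode("gbk")
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # byte offset of the undecodable byte

# Binary mode: hand the raw bytes to compile(), which detects the source
# encoding itself (UTF-8 unless a coding cookie or BOM says otherwise).
code = compile(data, "a.py", "exec")

# Disassemble into a buffer instead of stdout.
buf = io.StringIO()
dis.dis(code, file=buf)
listing = buf.getvalue()
```

Here failed_at is the same byte offset reported in the traceback above, and the bytes path never consults the locale encoding at all.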

--
components: Library (Lib)
messages: 374961
nosy: zkonge
priority: normal
severity: normal
status: open
title: Potential UnicodeDecodeError in dis
type: behavior
versions: Python 3.10, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9
