[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-12 Thread STINNER Victor


STINNER Victor added the comment:

Thanks for the fix, Serhiy!

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-12 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-12 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:


New changeset 5650e76f63a6f4ec55d00ec13f143d84a2efee39 by Serhiy Storchaka in branch 'master':
bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053)
https://github.com/python/cpython/commit/5650e76f63a6f4ec55d00ec13f143d84a2efee39





[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-12 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
pull_requests: +19362
pull_request: https://github.com/python/cpython/pull/20053




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-12 Thread Serhiy Storchaka


Serhiy Storchaka added the comment:

I am not sure that the changes in issue39500 were correct. It is easier to catch
a bug if the function crashes consistently when you pass a non-canonicalized
string than if it silently returns a wrong result for specific input on a
particular platform.

Alternatively, you could reimplement correct handling of surrogate pairs in
PyUnicode_IsIdentifier().
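
To illustrate what "correct handling of surrogate pairs" involves, here is a minimal Python sketch of the idea. It is not the CPython implementation, which is written in C. The helper names decode_utf16_units() and is_identifier_utf16() are hypothetical and exist only for this example.

```python
def decode_utf16_units(units):
    """Combine 16-bit code units into code points, joining surrogate pairs.

    Hypothetical helper for illustration; not part of CPython.
    """
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        # A high surrogate immediately followed by a low surrogate
        # encodes a single character outside the BMP.
        if (0xD800 <= unit <= 0xDBFF and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # BMP character or unpaired surrogate: keep the unit as is.
            code_points.append(unit)
            i += 1
    return code_points


def is_identifier_utf16(units):
    """Check identifier validity after joining surrogate pairs (illustrative)."""
    text = "".join(chr(cp) for cp in decode_utf16_units(units))
    return text.isidentifier()
```

With this approach the pair 0xD835, 0xDD80 is treated as the single character U+1D580 before the identifier check, while an unpaired surrogate is still rejected.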




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-11 Thread STINNER Victor


STINNER Victor added the comment:

My previous change on this function:

commit f3e7ea5b8c220cd63101e419d529c8563f9c6115
Author: Victor Stinner
Date:   Tue Feb 11 14:29:33 2020 +0100

    bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397)

    PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the
    string is not ready.




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-11 Thread STINNER Victor


STINNER Victor added the comment:

Maybe it's time to speed up the deprecation of the legacy C API using
Py_UNICODE...




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-11 Thread Serhiy Storchaka


Change by Serhiy Storchaka:


--
keywords: +patch
pull_requests: +19345
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/20035




[issue40596] str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows

2020-05-11 Thread Serhiy Storchaka


New submission from Serhiy Storchaka:

>>> import _testcapi
>>> u = '\U0001d580\U0001d593\U0001d58e\U0001d588\U0001d594\U0001d589\U0001d58a'
>>> u.isidentifier()
True
>>> _testcapi.unicode_legacy_string(u).isidentifier()
False
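
For context, every character in the test string above lies outside the BMP, so a buffer of 16-bit code units (as with wchar_t on Windows) would hold each one as a surrogate pair. The sketch below does not use _testcapi. It only emulates that representation by encoding to UTF-16, and its variable names are illustrative. It assumes the legacy code path examined code units one at a time, which is consistent with the False result above but is not stated in this report.

```python
u = '\U0001d580\U0001d593\U0001d58e\U0001d588\U0001d594\U0001d589\U0001d58a'

# Every character lies outside the Basic Multilingual Plane (above U+FFFF).
all_non_bmp = all(ord(ch) > 0xFFFF for ch in u)

# Split the UTF-16 encoding into 16-bit code units, as a 16-bit
# wchar_t buffer would store the string.
raw = u.encode('utf-16-le')
units = [int.from_bytes(raw[i:i + 2], 'little') for i in range(0, len(raw), 2)]

# Each unit is one half of a surrogate pair; checked on its own,
# a lone surrogate is not a valid identifier character.
all_surrogates = all(0xD800 <= unit <= 0xDFFF for unit in units)
per_unit_check = [chr(unit).isidentifier() for unit in units]

print(len(u), len(units), all_non_bmp, all_surrogates, any(per_unit_check))
```

The seven characters become fourteen code units, none of which passes isidentifier() individually, so a check that does not join the pairs would reject the whole string.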

--
components: Interpreter Core
messages: 368637
nosy: serhiy.storchaka, vstinner
priority: normal
severity: normal
status: open
title: str.isidentifier() does not work with non-BMP non-canonicalized strings on Windows
type: behavior
versions: Python 3.9
