2018-01-26 12:17 GMT+01:00 INADA Naoki <songofaca...@gmail.com>:
>> No, because you can pass in maxchar to PyUnicode_New() and
>> the implementation will take this as hint to the max code point
>> used in the string. There is no check done whether maxchar
>> is indeed the minimum upper bound to the code point ordinals.
>
> API doc says:
>
> """
> maxchar should be the true maximum code point to be placed in the string.
> As an approximation, it can be rounded up to the nearest value in the
> sequence 127, 255, 65535, 1114111.
> """
> https://docs.python.org/3/c-api/unicode.html#c.PyUnicode_New
>
> Since doc says *should*, strings created with wrong maxchar
> are considered invalid object.
PyUnicode objects must always use the most efficient storage, that is,
the narrowest character width that can hold the largest code point in
the string. This is a very strong requirement of PEP 393. As Naoki wrote,
many functions rely on this assumption to implement fast paths.

The assumption is even checked by the debug function
_PyUnicode_CheckConsistency():
https://github.com/python/cpython/blob/e76daebc0c8afa3981a4c5a8b54537f756e805de/Objects/unicodeobject.c#L453-L485

Victor
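
The storage rule can be observed from Python itself. The following is only a rough sketch, assuming CPython 3.3 or later and relying on sys.getsizeof(), which reports implementation details and is not a documented way to inspect the string kind. The helper name bytes_per_char is made up for this illustration.

```python
import sys


def bytes_per_char(ch: str, n: int = 64) -> int:
    """Estimate the per-character storage width (hypothetical helper).

    Compares the sizes of two strings that differ by exactly one
    repetition of `ch`; the difference is the width of one character.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)


# One sample character per PEP 393 storage class.
samples = {
    "ascii": "a",            # code point <= 127
    "latin1": "\xe9",        # code point <= 255
    "bmp": "\u20ac",         # code point <= 65535
    "astral": "\U0001F600",  # code point <= 1114111
}
widths = {name: bytes_per_char(ch) for name, ch in samples.items()}
print(widths)

# A slice that drops the only wide character is stored narrowly again:
# the new string is built with its own, recomputed maximum code point.
wide = "a" * 10 + "\u20ac"
narrow = wide[:10]
same_size = sys.getsizeof(narrow) == sys.getsizeof("a" * 10)
print(same_size)
```

Strings built through the normal APIs always end up in the narrowest representation, which is the invariant the fast paths depend on. A string created from C with an overstated maxchar would break it.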