New submission from william.ayd <william....@icloud.com>:

With the attached extension module, if I run the following in the REPL:

>>> import libtest
>>>
>>> libtest.error_if_not_utf8("foo")
'foo'
>>> libtest.error_if_not_utf8("\ud83d")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud83d' in position 0: surrogates not allowed
>>> libtest.error_if_not_utf8("foo")
'foo'

Things seem OK so far, but the next invocation of

>>> libtest.error_if_not_utf8("\ud83d")

then causes a segfault. Note that the order of the inputs seems important;
simply repeating the call with the invalid surrogate doesn't cause the segfault.
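
For comparison, here is a minimal pure-Python sketch of the behaviour one would expect from every call in that sequence. It is only a stand-in: the function below is a hypothetical re-implementation that uses str.encode, not the code in testmodule.c, and it does not reproduce the segfault. It just shows that a lone surrogate should raise UnicodeEncodeError each time, regardless of call order.

```python
def error_if_not_utf8(text: str) -> str:
    """Hypothetical stand-in for libtest.error_if_not_utf8.

    Assumption: the extension returns its argument when it can be
    encoded as UTF-8 and raises otherwise.
    """
    # str.encode raises UnicodeEncodeError for a lone surrogate.
    text.encode("utf-8")
    return text


# Same call order as in the report above.
for value in ("foo", "\ud83d", "foo", "\ud83d"):
    try:
        print(repr(error_if_not_utf8(value)))
    except UnicodeEncodeError as exc:
        print("UnicodeEncodeError:", exc.reason)
```

With this stand-in, the second and fourth calls both end in the same exception, which is the result the fourth call to the extension module would presumably be expected to give as well.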

----------
files: testmodule.c
messages: 358755
nosy: william.ayd
priority: normal
severity: normal
status: open
title: PyUnicode_AsUTF8AndSize Sometimes Segfaults With Incomplete Surrogate Pair
Added file: https://bugs.python.org/file48798/testmodule.c

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39113>
_______________________________________