Re: [Python-Dev] PEP 393 review

Stefan Behnel Fri, 26 Aug 2011 11:30:49 -0700

"Martin v. Löwis", 26.08.2011 18:56:

I agree with your observation that something should be done about error
handling, and will update the PEP shortly. I propose that
PyUnicode_Ready should be explicitly called on input where raising an
exception is feasible. In contexts where it is not feasible (such
as reading a character, or reading the length or the kind), failing to
ready the string should cause a fatal error.

I consider this an increase in complexity. It will then no longer be enough to access the data, the user will first have to figure out a suitable place in the code to make sure it's actually there, potentially forgetting about it because it works in all test cases, or potentially triggering a huge amount of overhead that copies and 'recodes' the string data by executing one of the macros that does it automatically.

For the specific case of Cython, I would guess that I could just add another special case that reads the data from the Py_UNICODE buffer and combines surrogates at need, but that will only work in some cases (specifically not for indexing). And outside of Cython, most normal user code won't do that.
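
To make the "combines surrogates at need" idea concrete, here is a minimal sketch in plain C. It assumes a UTF-16 buffer of 16-bit units, as Py_UNICODE is on narrow builds; the helper names read_code_point() and count_code_points() are made up for illustration and are not part of the CPython API or of Cython.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CPython function): read one code point from
   a UTF-16 buffer, combining a high/low surrogate pair when one is
   present. Advances *pos past the code units that were consumed. */
uint32_t
read_code_point(const uint16_t *buf, size_t len, size_t *pos)
{
    assert(*pos < len);
    uint32_t ch = buf[(*pos)++];
    if (ch >= 0xD800 && ch <= 0xDBFF && *pos < len) {
        uint32_t low = buf[*pos];
        if (low >= 0xDC00 && low <= 0xDFFF) {
            (*pos)++;
            ch = 0x10000 + ((ch - 0xD800) << 10) + (low - 0xDC00);
        }
    }
    /* Unpaired surrogates are returned unchanged. */
    return ch;
}

/* Hypothetical helper: count code points by walking the whole buffer. */
size_t
count_code_points(const uint16_t *buf, size_t len)
{
    size_t pos = 0, n = 0;
    while (pos < len) {
        (void)read_code_point(buf, len, &pos);
        n++;
    }
    return n;
}
```

This works for sequential iteration, but it also shows why indexing is the hard case: the n-th code point cannot be located without scanning from the start, since each earlier character may occupy one or two code units.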

My gut feeling leans towards a KISS approach. If you go the route to require an explicit point for triggering PyUnicode_Ready() calls, why not just go all the way and make it completely explicit in *all* cases? I.e. remove all implicit calls from the macros and make it part of the new API semantics that users *must* call PyUnicode_FAST_READY() before doing anything with a new string data layout. Much fewer surprises.

Note that there isn't currently an official macro way to figure out that the flexible string layout has not been initialised yet, i.e. that wstr is set but str is not. If the implicit PyUnicode_Ready() calls get removed, PyUnicode_KIND() could take that place by simply returning WSTR_KIND.
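
The fully explicit scheme can be sketched with a toy model. Everything here is hypothetical: toy_str, toy_get_kind(), toy_ready() and the TOY_* constants are invented stand-ins, not the PEP 393 structures or macros, and the model ignores surrogates and 4-byte data. It only illustrates the shape of the idea: the kind query never converts anything and reports a "WSTR" kind for an unready string, while the one explicit ready call is the only place that can fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the PEP 393 API. */
enum toy_kind { TOY_WSTR_KIND = 0, TOY_1BYTE_KIND = 1, TOY_2BYTE_KIND = 2 };

typedef struct {
    const uint16_t *wstr;   /* legacy buffer, set at creation */
    size_t length;          /* number of 16-bit units in wstr */
    void *data;             /* new layout, NULL until readied */
    enum toy_kind kind;     /* meaningful only once data is set */
} toy_str;

/* No implicit conversion: an unready string reports TOY_WSTR_KIND. */
enum toy_kind
toy_get_kind(const toy_str *s)
{
    return s->data == NULL ? TOY_WSTR_KIND : s->kind;
}

/* The single explicit point that may fail. Returns 0 on success and
   -1 on allocation failure, in which case the string stays unready. */
int
toy_ready(toy_str *s)
{
    assert(s != NULL && s->wstr != NULL);
    if (s->data != NULL)
        return 0;                       /* already readied */

    /* Pick the narrowest layout that can hold every unit. */
    uint16_t max = 0;
    for (size_t i = 0; i < s->length; i++)
        if (s->wstr[i] > max)
            max = s->wstr[i];

    size_t n = s->length ? s->length : 1;   /* avoid malloc(0) */
    if (max < 0x100) {
        uint8_t *p = malloc(n);
        if (p == NULL)
            return -1;
        for (size_t i = 0; i < s->length; i++)
            p[i] = (uint8_t)s->wstr[i];
        s->data = p;
        s->kind = TOY_1BYTE_KIND;
    } else {
        uint16_t *p = malloc(n * sizeof *p);
        if (p == NULL)
            return -1;
        for (size_t i = 0; i < s->length; i++)
            p[i] = s->wstr[i];
        s->data = p;
        s->kind = TOY_2BYTE_KIND;
    }
    return 0;
}
```

A caller in this model would check toy_get_kind() first, call toy_ready() and handle a -1 return where raising an error is feasible, and only then touch the data.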

That being said, the main problem I currently see is that basically all existing code needs to be updated in order to handle these errors. Otherwise, it would be possible to trigger crashes by properly forging a string and passing it into an unprepared C library to let it run into a NULL pointer return value of PyUnicode_AS_UNICODE().


Stefan
