Author: Armin Rigo <ar...@tunes.org>
Branch: 
Changeset: r2963:c60281bf502f
Date: 2017-06-02 09:19 +0200
http://bitbucket.org/cffi/cffi/changeset/c60281bf502f/

Log: Document char16_t and char32_t

diff --git a/doc/source/cdef.rst b/doc/source/cdef.rst
--- a/doc/source/cdef.rst
+++ b/doc/source/cdef.rst
@@ -178,7 +178,8 @@
 * intN_t, uintN_t (for N=8,16,32,64), intptr_t, uintptr_t, ptrdiff_t,
   size_t, ssize_t
 
-* wchar_t (if supported by the backend)
+* wchar_t (if supported by the backend). *New in version 1.11:*
+  char16_t and char32_t.
 
 * _Bool and bool (equivalent). If not directly supported by the C
   compiler, this is declared with the size of ``unsigned char``.
diff --git a/doc/source/ref.rst b/doc/source/ref.rst
--- a/doc/source/ref.rst
+++ b/doc/source/ref.rst
@@ -104,11 +104,13 @@
   returns a ``bytes``, not a ``str``.
 
 - If 'cdata' is a pointer or array of wchar_t, returns a unicode string
-  following the same rules.
+  following the same rules. *New in version 1.11:* can also be
+  char16_t or char32_t.
 
-- If 'cdata' is a single character or byte or a wchar_t, returns it as a
-  byte string or unicode string. (Note that in some situation a single
-  wchar_t may require a Python unicode string of length 2.)
+- If 'cdata' is a single character or byte or a wchar_t or charN_t,
+  returns it as a byte string or unicode string. (Note that in some
+  situation a single wchar_t or char32_t may require a Python unicode
+  string of length 2.)
 
 - If 'cdata' is an enum, returns the value of the enumerator as a string.
   If the value is out of range, it is simply returned as the stringified
@@ -125,7 +127,7 @@
 
 - If 'cdata' is a pointer to 'wchar_t', returns a unicode string.
   ('length' is measured in number of wchar_t; it is not the size in
-  bytes.)
+  bytes.) *New in version 1.11:* can also be char16_t or char32_t.
 
 - If 'cdata' is a pointer to anything else, returns a list, of the given
   'length'. (A slower way to do that is ``[cdata[i] for i in
@@ -626,10 +628,10 @@
 | ``char``      | a string of length 1   | a string of      | int(), bool(), |
 |               | or another <cdata char>| length 1         | ``<``          |
 +---------------+------------------------+------------------+----------------+
-| ``wchar_t``   | a unicode of length 1  | a unicode of     |                |
-|               | (or maybe 2 if         | length 1         | int(), bool(), |
-|               | surrogates) or         | (or maybe 2 if   | ``<``          |
-|               | another <cdata wchar_t>| surrogates)      |                |
+| ``wchar_t``,  | a unicode of length 1  | a unicode of     |                |
+| ``char16_t``, | (or maybe 2 if         | length 1         | int(), bool(), |
+| ``char32_t``  | surrogates) or         | (or maybe 2 if   | ``<``          |
+|               | another similar <cdata>| surrogates)      |                |
 +---------------+------------------------+------------------+----------------+
 | ``float``,    | a float or anything on | a Python float   | float(), int(),|
 | ``double``    | which float() works    |                  | bool(), ``<``  |
@@ -671,9 +673,9 @@
 | ``char[]``,   |                        |                  | ``-``          |
 | ``_Bool[]``   |                        |                  |                |
 +---------------+------------------------+                  +----------------+
-| ``wchar_t[]`` | same as arrays, or a   |                  | len(), iter(), |
-|               | Python unicode string  |                  | ``[]``,        |
-|               |                        |                  | ``+``, ``-``   |
+|``wchar_t[]``, | same as arrays, or a   |                  | len(), iter(), |
+|``char16_t[]``,| Python unicode string  |                  | ``[]``,        |
+|``char32_t[]`` |                        |                  | ``+``, ``-``   |
 |               |                        |                  |                |
 +---------------+------------------------+------------------+----------------+
 | structure     | a list or tuple or     | a <cdata>        | read/write     |
diff --git a/doc/source/using.rst b/doc/source/using.rst
--- a/doc/source/using.rst
+++ b/doc/source/using.rst
@@ -25,6 +25,11 @@
 unicode string to an integer, ``ord(x)`` does not work; use instead
 ``int(ffi.cast('wchar_t', x))``.
 
+*New in version 1.11:* in addition to ``wchar_t``, the C types
+``char16_t`` and ``char32_t`` work the same but with a known fixed size.
+In previous versions, this could be achieved using ``uint16_t`` and
+``int32_t`` but without automatic convertion to Python unicodes.
+
 Pointers, structures and arrays are more complex: they don't have an
 obvious Python equivalent. Thus, they correspond to objects of type
 ``cdata``, which are printed for example as
@@ -197,9 +202,10 @@
     >>> ffi.string(x) # interpret 'x' as a regular null-terminated string
     'Hello'
 
-Similarly, arrays of wchar_t can be initialized from a unicode string,
+Similarly, arrays of wchar_t or char16_t or char32_t can be initialized
+from a unicode string,
 and calling ``ffi.string()`` on the cdata object returns the current unicode
-string stored in the wchar_t array (adding surrogates if necessary).
+string stored in the source array (adding surrogates if necessary).
 
 Note that unlike Python lists or tuples, but like C, you *cannot* index in
 a C array from the end using negative numbers.
@@ -347,7 +353,8 @@
 
     assert lib.strlen("hello") == 5
 
-You can also pass unicode strings as ``wchar_t *`` arguments. Note that
+You can also pass unicode strings as ``wchar_t *`` or ``char16_t *`` or
+``char32_t *`` arguments. Note that
 the C language makes no difference between argument declarations that
 use ``type *`` or ``type[]``. For example, ``int *`` is fully
 equivalent to ``int[]`` (or even ``int[5]``; the 5 is ignored). For CFFI,
diff --git a/doc/source/whatsnew.rst b/doc/source/whatsnew.rst
--- a/doc/source/whatsnew.rst
+++ b/doc/source/whatsnew.rst
@@ -6,6 +6,13 @@
 v1.11
 =====
 
+* Support the modern standard types ``char16_t`` and ``char32_t``.
+  These work like ``wchar_t``: they represent one unicode character, or
+  when used as ``charN_t *`` or ``charN_t[]`` they represent a unicode
+  string. The difference with ``wchar_t`` is that they have a known,
+  fixed size. They should work at all places that used to work with
+  ``wchar_t`` (please report an issue if I missing something).
+
 * Support the C99 types ``float _Complex`` and ``double _Complex``.
   Note that libffi doesn't support them, which means that in the ABI
   mode you still cannot call C functions that take complex numbers
_______________________________________________
pypy-commit mailing list
pypy-commit@python.org
https://mail.python.org/mailman/listinfo/pypy-commit