Re: [bug-libunistring] case folding output size?

Bruno Haible Tue, 27 Apr 2010 17:46:33 -0700

Hi,

Aleksander Morgado wrote:
> Small questions regarding casefolding in UTF-8:
> 
> — Function: uint8_t * u8_casefold (const uint8_t *s, size_t n, const
> char *iso639_language, uninorm_t nf, uint8_t *resultbuf, size_t
> *lengthp)
> 
> What if the resultbuf passed doesn't have enough space for the
> case-folded and normalized string?


This is documented at the end of the doc section "Conventions":
  <http://www.gnu.org/software/libunistring/manual/html_node/Conventions.html>
  "Functions returning a string result take a (resultbuf, lengthp)
   argument pair. If resultbuf is not NULL and the result fits into *lengthp
   units, it is put in resultbuf, and resultbuf is returned. Otherwise, a
   freshly allocated string is returned. In both cases, *lengthp is set to
   the length (number of units) of the returned string. In case of error,
   NULL is returned and errno is set."
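
The caller-side consequence of this convention can be sketched as follows. This is only an illustration under stated assumptions: toy_lower and lowered_length are hypothetical stand-ins written for this example, not libunistring functions; toy_lower merely lowercases ASCII but returns its result the way the quoted paragraph describes. The point is the pattern: pass a buffer and its capacity, compare the returned pointer against the buffer, and free the result only when it is a fresh allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in, NOT a libunistring function: lowercases ASCII
   letters and follows the (resultbuf, lengthp) convention quoted above.  */
static uint8_t *
toy_lower (const uint8_t *s, size_t n, uint8_t *resultbuf, size_t *lengthp)
{
  uint8_t *result;

  if (resultbuf != NULL && n <= *lengthp)
    result = resultbuf;                 /* fits: use the caller's buffer */
  else
    {
      result = malloc (n > 0 ? n : 1);  /* otherwise: freshly allocated */
      if (result == NULL)
        {
          errno = ENOMEM;               /* error: NULL plus errno */
          return NULL;
        }
    }
  for (size_t i = 0; i < n; i++)
    result[i] = (s[i] >= 'A' && s[i] <= 'Z')
                ? (uint8_t) (s[i] + ('a' - 'A')) : s[i];
  *lengthp = n;                         /* length is set in both cases */
  return result;
}

/* Caller pattern: try a small stack buffer first, and free the result
   only if the function had to allocate.  Returns the result length, or
   (size_t) -1 on error; *was_allocated reports which path was taken.  */
static size_t
lowered_length (const uint8_t *s, size_t n, int *was_allocated)
{
  uint8_t buf[8];
  size_t length = sizeof buf;           /* in: capacity, out: result length */
  uint8_t *result = toy_lower (s, n, buf, &length);

  if (result == NULL)
    return (size_t) -1;
  *was_allocated = (result != buf);
  if (result != buf)
    free (result);
  return length;
}
```

With the 8-unit buffer above, a 3-unit input is returned in place, while a 10-unit input comes back as a fresh allocation that the caller must free. Going by the signature quoted earlier, a call to u8_casefold would presumably follow the same shape, with the extra iso639_language and nf arguments.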

> And, if NFC normalization desired in the output, would it be safe to say
> that the output length will be less or equal than the input length?

No, it is not. The file tests/test-u8-casefold.c has a couple of examples that
show a case-folded string can be longer than the original string.
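
One instance of such growth, taken from Unicode's CaseFolding.txt rather than from the test file itself: U+0149 (LATIN SMALL LETTER N PRECEDED BY APOSTROPHE) has the full case folding U+02BC U+006E. The small sketch below only counts UTF-8 bytes for those code points; it does not call libunistring, and the helper names utf8_len and utf8_total are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes needed to encode one Unicode scalar value in UTF-8.  */
static size_t
utf8_len (uint32_t cp)
{
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  return 4;
}

/* Total UTF-8 length of a sequence of code points.  */
static size_t
utf8_total (const uint32_t *cps, size_t count)
{
  size_t total = 0;
  for (size_t i = 0; i < count; i++)
    total += utf8_len (cps[i]);
  return total;
}

/* U+0149 and its full case folding according to CaseFolding.txt.  */
static const uint32_t original[] = { 0x0149 };
static const uint32_t folded[] = { 0x02BC, 0x006E };
```

Here utf8_total (original, 1) is 2 and utf8_total (folded, 2) is 3, so the folded form is one byte longer. Since the decomposition of U+0149 is a compatibility decomposition, not a canonical one, NFC should not recombine the two code points, so requesting NFC would not shrink the result back.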

In summary, these Unicode-aware string manipulations have such complex details
that the classical assumptions all fail.

Bruno
