Re: [GNUnet-developers] r30485

Christian Grothoff Thu, 31 Oct 2013 04:38:15 -0700

Well, r30485 did not change that assumption -- as you can see, the code
before _also_ simply assumed that 'output' was 'big enough'.  The few
places in the code that called this function all allocated 'output' as
strlen(input)+1, so in cases where the UTF-8 toupper returns a longer
string, the code was just as incorrect before r30485.  The new API merely
makes it more obvious that the caller is (and was) expected to allocate
'output' (and that this might be asking a bit much).
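
To make the length problem concrete, here is a minimal sketch.  It does
not call libunistring; the byte sequences are hardcoded on the assumption
that uppercasing "é" with UNINORM_NFD yields 'E' followed by the combining
acute accent U+0301.  The names (example_input, example_upper_nfd,
fits_in_caller_buffer) are made up for illustration and are not part of
GNUnet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "é" (U+00E9) encoded as UTF-8: 2 bytes. */
static const char example_input[] = "\xC3\xA9";

/* Assumed uppercase of "é" in NFD form: 'E' followed by U+0301
   (combining acute accent), 3 bytes.  Hardcoded for illustration;
   the real function would obtain this from u8_toupper(). */
static const char example_upper_nfd[] = "E\xCC\x81";

/* Returns 1 if a buffer of strlen(input)+1 bytes (what the callers
   allocated) can hold converted_len bytes plus the terminating NUL,
   and 0 otherwise. */
static int
fits_in_caller_buffer (const char *input, size_t converted_len)
{
  assert (NULL != input);
  return (converted_len + 1 <= strlen (input) + 1) ? 1 : 0;
}
```

For this input the caller's buffer is 3 bytes, while memcpy plus the
terminating NUL would write 4, so the check returns 0.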


Maybe we should just change this API to *return* an allocated string
instead of passing 'output'?  I don't quite understand why this API
was written like this to begin with -- returning the uppercase string
would seem more natural.

If you change this, please also change the tolower function in the same
way.
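
A rough sketch of what such a returning API could look like, so that
toupper and tolower can share the allocation logic.  All names here
(converter_fn, convert_alloc, ascii_upper_stub) are hypothetical, and the
stub converter is only an ASCII stand-in so the sketch compiles without
libunistring; the real functions would wrap u8_toupper / u8_tolower
instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical converter type: returns a malloc'ed buffer (NOT
   NUL-terminated) and stores its length in *lengthp, or returns NULL
   on error.  A thin wrapper around u8_toupper() or u8_tolower() could
   be given this shape. */
typedef uint8_t *(*converter_fn) (const uint8_t *s, size_t n,
                                  size_t *lengthp);

/* Stand-in converter for this sketch only: uppercases ASCII bytes and
   leaves all other bytes untouched. */
static uint8_t *
ascii_upper_stub (const uint8_t *s, size_t n, size_t *lengthp)
{
  uint8_t *out = malloc (n > 0 ? n : 1);
  size_t i;

  if (NULL == out)
    return NULL;
  for (i = 0; i < n; i++)
    out[i] = (uint8_t) toupper ((unsigned char) s[i]);
  *lengthp = n;
  return out;
}

/* Convert 'input' and return a freshly allocated, NUL-terminated
   string sized from the length the converter reports, so the caller
   no longer has to guess the output size.  Caller must free() the
   result.  Returns NULL on failure. */
static char *
convert_alloc (const char *input, converter_fn conv)
{
  size_t len = 0;
  uint8_t *tmp;
  char *result;

  assert (NULL != input);
  assert (NULL != conv);
  tmp = conv ((const uint8_t *) input, strlen (input), &len);
  if (NULL == tmp)
    return NULL;
  result = malloc (len + 1);
  if (NULL != result)
  {
    memcpy (result, tmp, len);
    result[len] = '\0';
  }
  free (tmp);
  return result;
}
```

A caller would then write something like
char *up = convert_alloc (name, ascii_upper_stub); and free (up) when
done, instead of allocating strlen(input)+1 up front.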

Happy hacking!

Christian

On 10/31/2013 12:09 AM, LRN wrote:
> On 30.10.2013 22:15, [email protected] wrote:
>> Author: grothoff
>> Date: 2013-10-30 19:15:48 +0100 (Wed, 30 Oct 2013)
>> New Revision: 30485
> 
>>  /**
>> - * Convert the utf-8 input string to uppercase
>> - * Output needs to be allocated appropriately
>> + * Convert the utf-8 input string to uppercase.
>> + * Output needs to be allocated appropriately.
>>   *
>>   * @param input input string
>>   * @param output output buffer
>>   */
>>  void
>> -GNUNET_STRINGS_utf8_toupper(const char* input, char** output)
>> +GNUNET_STRINGS_utf8_toupper(const char *input,
>> +                            char *output)
>>  {
>>    uint8_t *tmp_in;
>>    size_t len;
> 
>>    tmp_in = u8_toupper ((uint8_t*)input, strlen ((char *) input),
>>                         NULL, UNINORM_NFD, NULL, &len);
>> -  memcpy(*output, tmp_in, len);
>> -  (*output)[len] = '\0';
>> -  free(tmp_in);
>> +  memcpy (output, tmp_in, len);
>> +  output[len] = '\0';
>> +  free (tmp_in);
>>  }
> 
> u8_toupper allocates its output, then you copy it into the buffer that
> user provided, using the length that u8_toupper reported (not the actual
> length of the buffer).
> 
> I'm not sure that this conversion always produces output of the same
> length as the input (which is, AFAIU, what you're relying on), at least
> not for all languages.
> 
> The docs that I've found on UNINORM_NFD do not indicate (AFAICU) that
> this is some kind of special transform that guarantees the same (or
> smaller) number of bytes in the output.
> 
> 
> _______________________________________________
> GNUnet-developers mailing list
> [email protected]
> https://lists.gnu.org/mailman/listinfo/gnunet-developers
>
