Greetings,

On Mon 06 Sep 2010 18:28, Mike Gran <spk...@yahoo.com> writes:

> there is a failure case to consider for scm_from_utf8_string.  The C
> utf8 string could contain incorrectly encoded data.

scm_to_locale_string has the analogous failure case: the string may not
be encodable in the current locale.

> You could throw the encoding error, or you could replace the 
> bad utf8 with U+FFFD or the question mark.
>
> The bytevector's utf8->string always throws encoding-error.
> Maybe that's good enough.

Yeah, maybe so.
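
To make the two options concrete, here is a rough standalone sketch of
"throw" versus "replace with U+FFFD". It is not libguile code: the
names decode_utf8 and utf8_strategy are made up for illustration, and
the "throw" case is modelled as a -1 return rather than a Scheme
exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of this is libguile API. */
enum utf8_strategy { UTF8_ERROR, UTF8_REPLACE };

/* Decode one UTF-8 sequence from s, with n >= 1 bytes available.
   Returns the number of bytes consumed and stores the code point in
   *cp, or returns 0 if the sequence is ill-formed. */
static size_t
decode_one (const unsigned char *s, size_t n, uint32_t *cp)
{
  size_t len, i;
  uint32_t c, min;

  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  else if ((s[0] & 0xE0) == 0xC0) { len = 2; c = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
  else return 0;                /* stray continuation or invalid lead byte */

  if (n < len)
    return 0;                   /* truncated sequence */
  for (i = 1; i < len; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      c = (c << 6) | (s[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates, and values past U+10FFFF. */
  if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
    return 0;
  *cp = c;
  return len;
}

/* Decode str[0..len) into out, which needs room for len code points.
   Returns the number of code points written, or -1 if the input is
   ill-formed and the strategy is UTF8_ERROR. */
long
decode_utf8 (const char *str, size_t len, enum utf8_strategy strategy,
             uint32_t *out)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t i = 0;
  long count = 0;

  assert (str != NULL && out != NULL);
  while (i < len)
    {
      uint32_t cp;
      size_t used = decode_one (s + i, len - i, &cp);

      if (used == 0)
        {
          if (strategy == UTF8_ERROR)
            return -1;
          cp = 0xFFFD;          /* replacement character */
          used = 1;             /* skip one byte and resynchronize */
        }
      out[count++] = cp;
      i += used;
    }
  return count;
}
```

Note that this emits one U+FFFD per offending byte, which is a
simplification; a real implementation might group the bytes of one
bad sequence differently.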

> Otherwise, perhaps something like
>
> scm_from_utf8_stringn (str, len, error_or_replace_strategy)
>
> If you didn't mind the overhead of calling the somewhat 
> heavyweight scm_{to,from}_stringn, these could be macros
> or inline functions that wrap that.

Ah, I did not see scm_{to,from}_stringn. Cool! I think
scm_from_utf8_stringn et al should be proper functions, and probably
their initial implementations just call scm_{to,from}_stringn. But we
should at least do the straightforward optimization for the latin1 case.
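
The reason latin1 is the easy case: every latin1 byte is the code
point of the same value, so decoding is a plain widening copy that
cannot fail, and encoding only fails for code points above U+00FF. A
standalone sketch, with hypothetical helper names that are not part
of libguile:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not libguile API. */

/* Widen len latin1 bytes into code points.  No error path is needed:
   every byte value 0x00..0xFF is a valid code point. */
void
latin1_to_codepoints (const char *str, size_t len, uint32_t *out)
{
  size_t i;

  assert (str != NULL && out != NULL);
  for (i = 0; i < len; i++)
    out[i] = (unsigned char) str[i];
}

/* Narrow len code points into latin1 bytes.  Returns 0 on success,
   or -1 if some code point does not fit in latin1. */
int
codepoints_to_latin1 (const uint32_t *cps, size_t len, char *out)
{
  size_t i;

  assert (cps != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      if (cps[i] > 0xFF)
        return -1;              /* not encodable in latin1 */
      out[i] = (char) (unsigned char) cps[i];
    }
  return 0;
}
```

So a latin1 variant could skip the general conversion machinery
entirely and still share the same failure handling on the encoding
side.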

Cheers,

Andy
-- 
http://wingolog.org/
