Lucifers,

Regarding issue LUCY-311:

------------------------------------

             Summary: Non-ASCII error messages from strerror cause exceptions
                 Key: LUCY-311
                 URL: https://issues.apache.org/jira/browse/LUCY-311
             Project: Lucy
          Issue Type: Bug
          Components: Store
            Reporter: Nick Wellnhofer


The code in Lucy/Store creates Err objects with error messages returned from strerror. Especially under non-English locales, these messages aren't necessarily valid UTF-8. Now that CB_VCatF checks C strings for invalid UTF-8, this results in an exception.

Here's an example with a German error message in Latin1 encoding:

http://www.cpantesters.org/cpan/report/20d4902a-8673-11e6-9bc4-e52240a03099

------------------------------------

Does anyone have a good idea how to solve this? I can see the following approaches:

1. Switch to numeric error codes. Not very informative. Maybe use custom messages for a couple of error codes.

2. Replace non-ASCII chars in the error message with the Unicode replacement character (U+FFFD).

3. Use strerror_l with the "C" locale and hope that error messages are ASCII, replacing unlikely non-ASCII chars. POSIX only.

4. Call nl_langinfo(CODESET) to detect the character set and try to convert. POSIX only. Complicated.
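
To make approach 2 concrete, here is a minimal sketch. The helper names sanitize_to_ascii_utf8 and safe_strerror are hypothetical and not part of Lucy or Clownfish. It replaces every byte outside the ASCII range with U+FFFD, so it is lossy even when strerror happens to return valid UTF-8: a two-byte character becomes two replacement characters. Note also that plain strerror is not guaranteed to be thread-safe.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an existing Lucy/Clownfish function): copy `src`
 * into `dst`, replacing every non-ASCII byte with U+FFFD, which is encoded
 * as EF BF BD in UTF-8. The output is always NUL-terminated and is truncated
 * at a character boundary if `dst` is too small. Returns the number of bytes
 * written, excluding the terminator. */
static size_t
sanitize_to_ascii_utf8(const char *src, char *dst, size_t dst_size) {
    static const char replacement[] = "\xEF\xBF\xBD";
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0) { return 0; }

    for (const unsigned char *p = (const unsigned char*)src; *p; p++) {
        size_t need = (*p < 0x80) ? 1 : 3;
        if (out + need >= dst_size) { break; }  /* Keep room for the NUL. */
        if (*p < 0x80) {
            dst[out++] = (char)*p;
        }
        else {
            memcpy(dst + out, replacement, 3);
            out += 3;
        }
    }
    dst[out] = '\0';
    return out;
}

/* Hypothetical wrapper: fetch the message for `errnum` in the current
 * locale and sanitize it so it can be passed on as valid UTF-8. */
static size_t
safe_strerror(int errnum, char *dst, size_t dst_size) {
    return sanitize_to_ascii_utf8(strerror(errnum), dst, dst_size);
}
```

A caller in Lucy/Store would fill a stack buffer with safe_strerror(errno, buf, sizeof(buf)) and pass buf to the Err constructor instead of the raw strerror result. The same sanitizer could serve as the fallback step for approach 3.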

Nick
