On Mon, Feb 11, 2013 at 2:43 PM, Matthieu Moy
<matthieu....@grenoble-inp.fr> wrote:
> Erik Faye-Lund <kusmab...@gmail.com> writes:
>> --- a/parse-options.c
>> +++ b/parse-options.c
>> @@ -3,6 +3,7 @@
>>  #include "cache.h"
>>  #include "commit.h"
>>  #include "color.h"
>> +#include "utf8.h"
>>  static int parse_options_usage(struct parse_opt_ctx_t *ctx,
>>                              const char * const *usagestr,
>> @@ -462,7 +463,9 @@ int parse_options(int argc, const char **argv, const char *prefix,
>>               if (ctx.argv[0][1] == '-') {
>>                       error("unknown option `%s'", ctx.argv[0] + 2);
>>               } else {
>> -                     error("unknown switch `%c'", *ctx.opt);
>> +                     const char *next = ctx.opt;
>> +                     utf8_width(&next, NULL);
>> +                     error("unknown switch `%.*s'", (int)(next - ctx.opt), ctx.opt);
>>               }
>>               usage_with_options(usagestr, options);
>>       }
> You should be careful with the case where the user has a non-UTF8
> environment, and entered a non-ascii sequence. I can see two cases:
> 1) The non-ascii sequence is valid UTF-8, then I guess your patch would
>    show two characters instead of one. Not really correct, but not really
>    serious either.

Hm. So we would end up trading one form of corruption for another.
Not the biggest problem in the world, but perhaps there's a way of
fixing it?

I'm not entirely sure how to reliably determine what encoding stdin is
supposed to be in. On Windows, that's easy: it's UTF-16, and we re-encode
it to UTF-8 on startup in Git for Windows. But on other platforms, I have
no clue.

But isn't UTF-8 constructed to be very unlikely to clash with existing
encodings? If so, I could add a case for bytes that are non-ASCII and
not part of a valid UTF-8 sequence, which simply writes the byte as a
hex escape?
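
Something along these lines, perhaps. This is a standalone sketch of the
idea, not a patch against parse-options.c: utf8_seq_len() and
format_switch() are made-up names that do not exist in git, and a real
patch would more likely keep using utf8_width() and fall back to the hex
form when it leaves next as NULL.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of git: return the byte length of the
 * well-formed UTF-8 sequence at s, or 0 if s does not start with one.
 */
static size_t utf8_seq_len(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t len, i;

	if (p[0] < 0x80)
		return p[0] ? 1 : 0;	/* plain ASCII (0 for the terminator) */
	else if (p[0] >= 0xc2 && p[0] <= 0xdf)
		len = 2;
	else if (p[0] >= 0xe0 && p[0] <= 0xef)
		len = 3;
	else if (p[0] >= 0xf0 && p[0] <= 0xf4)
		len = 4;
	else
		return 0;	/* stray continuation byte or invalid lead byte */

	/* A NUL terminator is not a continuation byte, so we never read past it. */
	for (i = 1; i < len; i++)
		if ((p[i] & 0xc0) != 0x80)
			return 0;

	/* Reject overlong forms, surrogates and values above U+10FFFF. */
	if ((p[0] == 0xe0 && p[1] < 0xa0) || (p[0] == 0xed && p[1] > 0x9f) ||
	    (p[0] == 0xf0 && p[1] < 0x90) || (p[0] == 0xf4 && p[1] > 0x8f))
		return 0;
	return len;
}

/*
 * Hypothetical helper: format the switch at opt into buf. A valid
 * sequence (ASCII included) is copied as-is; anything else is written
 * as a single hex-escaped byte.
 */
static void format_switch(const char *opt, char *buf, size_t size)
{
	size_t len = utf8_seq_len(opt);

	if (len)
		snprintf(buf, size, "%.*s", (int)len, opt);
	else
		snprintf(buf, size, "\\x%02x", (unsigned char)*opt);
}
```

The error path would then become roughly format_switch(ctx.opt, buf,
sizeof(buf)) followed by error("unknown switch `%s'", buf). It does not
help with your case 1, though: a Latin-1 sequence that happens to be valid
UTF-8 would still be printed as if it were a single character.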

> 2) The non-ascii sequence is NOT valid UTF-8, then if I read correctly
>    (I didn't test) utf8_width would set next to NULL, and then you are
>    in big trouble.

Ouch. Yeah, you are right; this is not good at all :)

But I guess the solution above should fix this as well, no?
