Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-07 Thread Lars Schneider

> On 07 Mar 2018, at 23:52, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>> I don't think HT makes too much sense. However, isspace() is nice
>> and I will use it. Being more permissive on the inputs should hurt.
> 
> You are being incoherent in these three sentences.  If you want to
> be more strict and only allow SP, then isspace() is already too
> broad, as it does allow HT (and even CR and LF).

I meant "Being more permissive on the inputs shouldN'T hurt." :-)


> I do think HT makes just as much sense as SP, though, so if you use
> isspace(), that would be perfectly fine.

OK!

- Lars



Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-07 Thread Junio C Hamano
Lars Schneider  writes:

> Although the line is unnecessary, I felt it is safer/easier to 
> understand and maintain. Since both of you tripped over it, I will
> remove it though.

I didn't actually trip over it.  It made it look as if the coder
didn't understand what the code is doing to have that extra line.

> I don't think HT makes too much sense. However, isspace() is nice
> and I will use it. Being more permissive on the inputs should hurt.

You are being incoherent in these three sentences.  If you want to
be more strict and only allow SP, then isspace() is already too
broad, as it does allow HT (and even CR and LF).

I do think HT makes just as much sense as SP, though, so if you use
isspace(), that would be perfectly fine.


Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-07 Thread Lars Schneider

> On 07 Mar 2018, at 20:59, Junio C Hamano  wrote:
> 
> lars.schnei...@autodesk.com writes:
> 
>> +static int check_roundtrip(const char* enc_name)
> 
> The asterisk sticks to the variable, not type.

Argh. I need to put this check into Travis CI ;-)


>> +{
>> +/*
>> + * check_roundtrip_encoding contains a string of space and/or
>> + * comma separated encodings (eg. "UTF-16, ASCII, CP1125").
>> + * Search for the given encoding in that string.
>> + */
>> +const char *found = strcasestr(check_roundtrip_encoding, enc_name);
>> +const char *next;
>> +int len;
>> +if (!found)
>> +return 0;
>> +next = found + strlen(enc_name);
>> +len = strlen(check_roundtrip_encoding);
>> +return (found && (
>> +/*
>> + * check that the found encoding is at the
>> + * beginning of check_roundtrip_encoding or
>> + * that it is prefixed with a space or comma
>> + */
>> +found == check_roundtrip_encoding || (
>> +found > check_roundtrip_encoding &&
>> +(*(found-1) == ' ' || *(found-1) == ',')
>> +)
> 
> The second line is unneeded, as we know a non-NULL found either
> points at check_roundtrip_encoding or somewhere to the right, and
> the first test already checked the "points exactly at" case.

Eric objected that too [1], but noted the documentation value:

Although the 'found > check_roundtrip_encoding' expression is
effectively dead code in this case, it does document that the
programmer didn't just blindly use '*(found-1)' without taking
boundary conditions into consideration.

(It's dead code because, at this point, we know that strcasestr()
didn't return NULL and we know that 'found' doesn't equal
'check_roundtrip_encoding', therefore it _must_ be greater than
'check_roundtrip_encoding'.)

[1] 
https://public-inbox.org/git/CAPig+cR81J3fTOtrgAumAs=rc5hqyffsmeb-ru-yf_ahfub...@mail.gmail.com/

Although the line is unnecessary, I felt it is safer/easier to 
understand and maintain. Since both of you tripped over it, I will
remove it though.


> This is defined to be a comma separated list, so it is unnecessary
> to accept  == <"FOO, SHIFT-JIS, BAR", "SHIFT-JIS">; if you
> allow SP, perhaps "isspace(found[-1]) || found[-1] == ','" to also
> allow HT may also be appropriate.

I don't think HT makes too much sense. However, isspace() is nice
and I will use it. Being more permissive on the inputs should hurt.


>  I think "comma or whitespace
> separated list" is fine; in any case, the comment near the beginning
> of this function does not match new text in Documentation/config.txt
> added by this patch.

Nice catch. Will fix.


Thanks,
Lars



Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-07 Thread Junio C Hamano
lars.schnei...@autodesk.com writes:

> +static int check_roundtrip(const char* enc_name)

The asterisk sticks to the variable, not type.

> +{
> + /*
> +  * check_roundtrip_encoding contains a string of space and/or
> +  * comma separated encodings (eg. "UTF-16, ASCII, CP1125").
> +  * Search for the given encoding in that string.
> +  */
> + const char *found = strcasestr(check_roundtrip_encoding, enc_name);
> + const char *next;
> + int len;
> + if (!found)
> + return 0;
> + next = found + strlen(enc_name);
> + len = strlen(check_roundtrip_encoding);
> + return (found && (
> + /*
> +  * check that the found encoding is at the
> +  * beginning of check_roundtrip_encoding or
> +  * that it is prefixed with a space or comma
> +  */
> + found == check_roundtrip_encoding || (
> + found > check_roundtrip_encoding &&
> + (*(found-1) == ' ' || *(found-1) == ',')
> + )

The second line is unneeded, as we know a non-NULL found either
points at check_roundtrip_encoding or somewhere to the right, and
the first test already checked the "points exactly at" case.

This is defined to be a comma separated list, so it is unnecessary
to accept  == <"FOO, SHIFT-JIS, BAR", "SHIFT-JIS">; if you
allow SP, perhaps "isspace(found[-1]) || found[-1] == ','" to also
allow HT may also be appropriate.  I think "comma or whitespace
separated list" is fine; in any case, the comment near the beginning
of this function does not match new text in Documentation/config.txt
added by this patch.