Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'
> On 07 Mar 2018, at 23:52, Junio C Hamano wrote: > > Lars Schneider writes: > >> I don't think HT makes too much sense. However, isspace() is nice >> and I will use it. Being more permissive on the inputs should hurt. > > You are being incoherent in these three sentences. If you want to > be more strict and only allow SP, then isspace() is already too > broad, as it does allow HT (and even CR and LF). I meant "Being more permissive on the inputs shouldN'T hurt." :-) > I do think HT makes just as much sense as SP, though, so if you use > isspace(), that would be perfectly fine. OK! - Lars
Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'
Lars Schneider writes: > Although the line is unnecessary, I felt it is safer/easier to > understand and maintain. Since both of you tripped over it, I will > remove it though. I didn't actually trip over it. It made it look as if the coder didn't understand what the code is doing to have that extra line. > I don't think HT makes too much sense. However, isspace() is nice > and I will use it. Being more permissive on the inputs should hurt. You are being incoherent in these three sentences. If you want to be more strict and only allow SP, then isspace() is already too broad, as it does allow HT (and even CR and LF). I do think HT makes just as much sense as SP, though, so if you use isspace(), that would be perfectly fine.
Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'
> On 07 Mar 2018, at 20:59, Junio C Hamano wrote: > > lars.schnei...@autodesk.com writes: > >> +static int check_roundtrip(const char* enc_name) > > The asterisk sticks to the variable, not type. Argh. I need to put this check into Travis CI ;-) >> +{ >> +/* >> + * check_roundtrip_encoding contains a string of space and/or >> + * comma separated encodings (eg. "UTF-16, ASCII, CP1125"). >> + * Search for the given encoding in that string. >> + */ >> +const char *found = strcasestr(check_roundtrip_encoding, enc_name); >> +const char *next; >> +int len; >> +if (!found) >> +return 0; >> +next = found + strlen(enc_name); >> +len = strlen(check_roundtrip_encoding); >> +return (found && ( >> +/* >> + * check that the found encoding is at the >> + * beginning of check_roundtrip_encoding or >> + * that it is prefixed with a space or comma >> + */ >> +found == check_roundtrip_encoding || ( >> +found > check_roundtrip_encoding && >> +(*(found-1) == ' ' || *(found-1) == ',') >> +) > > The second line is unneeded, as we know a non-NULL found either > points at check_roundtrip_encoding or somewhere to the right, and > the first test already checked the "points exactly at" case. Eric objected that too [1], but noted the documentation value: Although the 'found > check_roundtrip_encoding' expression is effectively dead code in this case, it does document that the programmer didn't just blindly use '*(found-1)' without taking boundary conditions into consideration. (It's dead code because, at this point, we know that strcasestr() didn't return NULL and we know that 'found' doesn't equal 'check_roundtrip_encoding', therefore it _must_ be greater than 'check_roundtrip_encoding'.) [1] https://public-inbox.org/git/CAPig+cR81J3fTOtrgAumAs=rc5hqyffsmeb-ru-yf_ahfub...@mail.gmail.com/ Although the line is unnecessary, I felt it is safer/easier to understand and maintain. Since both of you tripped over it, I will remove it though. > This is defined to be a comma separated list, so it is unnecessary > to accept == <"FOO, SHIFT-JIS, BAR", "SHIFT-JIS">; if you > allow SP, perhaps "isspace(found[-1]) || found[-1] == ','" to also > allow HT may also be appropriate. I don't think HT makes too much sense. However, isspace() is nice and I will use it. Being more permissive on the inputs should hurt. > I think "comma or whitespace > separated list" is fine; in any case, the comment near the beginning > of this function does not match new text in Documentation/config.txt > added by this patch. Nice catch. Will fix. Thanks, Lars
Re: [PATCH v10 9/9] convert: add round trip check based on 'core.checkRoundtripEncoding'
lars.schnei...@autodesk.com writes: > +static int check_roundtrip(const char* enc_name) The asterisk sticks to the variable, not type. > +{ > + /* > + * check_roundtrip_encoding contains a string of space and/or > + * comma separated encodings (eg. "UTF-16, ASCII, CP1125"). > + * Search for the given encoding in that string. > + */ > + const char *found = strcasestr(check_roundtrip_encoding, enc_name); > + const char *next; > + int len; > + if (!found) > + return 0; > + next = found + strlen(enc_name); > + len = strlen(check_roundtrip_encoding); > + return (found && ( > + /* > + * check that the found encoding is at the > + * beginning of check_roundtrip_encoding or > + * that it is prefixed with a space or comma > + */ > + found == check_roundtrip_encoding || ( > + found > check_roundtrip_encoding && > + (*(found-1) == ' ' || *(found-1) == ',') > + ) The second line is unneeded, as we know a non-NULL found either points at check_roundtrip_encoding or somewhere to the right, and the first test already checked the "points exactly at" case. This is defined to be a comma separated list, so it is unnecessary to accept == <"FOO, SHIFT-JIS, BAR", "SHIFT-JIS">; if you allow SP, perhaps "isspace(found[-1]) || found[-1] == ','" to also allow HT may also be appropriate. I think "comma or whitespace separated list" is fine; in any case, the comment near the beginning of this function does not match new text in Documentation/config.txt added by this patch.