Re: [PATCH 2/9] add strip_suffix function
Jeff King writes: > Having had a day to mull it over, and having read your email, I think I > still prefer strip_suffix. That's OK. I was only trying to help you explore different avenues, without having strong preference myself either way. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] add strip_suffix function
On Wed, Jul 02, 2014 at 08:54:44AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > For that reason, the "mem" form puts its length parameter > > next to the buffer (since they are a pair), and the string > > form puts it at the end (since it is an out-parameter). The > > compiler can notice when you get the order wrong, which > > should help prevent writing one when you meant the other. > > Very sensible consideration. I like commits that careful thinking > behind them shows through them. I would like to take credit for advanced thinking, but I actually did what felt natural, and only noticed the "compiler will tell you when you are wrong" effect when I got it wrong while writing a later patch in the series. :) > If we want to avoid implying NUL-termination, the only way to do so > would be to use wording that does not hint shortening. At least for > the C-string variant, which is measuring the length of the basename > part (i.e. `basename $str $suffix`) without touching anything else, > e.g. basename_length("hello.c", ".c", &len), but at the same time > you want to make it a boolean to signal if the string ends with the > suffix, so perhaps has_suffix("hello.c", ".c", &len)? I think that invites some confusion with "ends_with", which is the same thing (but just does not take the "len" parameter). We could just add this feature to ends_with, and ask callers who do not care to pass NULL, but that makes those call sites uglier. Having had a day to mull it over, and having read your email, I think I still prefer strip_suffix. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] add strip_suffix function
Jeff King writes: > For that reason, the "mem" form puts its length parameter > next to the buffer (since they are a pair), and the string > form puts it at the end (since it is an out-parameter). The > compiler can notice when you get the order wrong, which > should help prevent writing one when you meant the other. Very sensible consideration. I like commits that careful thinking behind them shows through them. > Signed-off-by: Jeff King > --- > I hope the word "strip" is OK, as it does not actually NUL-terminate > (doing so would make it unusable for many cases). Between the comment > below and the "const" in the parameter, I think it should be pretty > clear that it does not touch the string. And I could not think of a > better word. All other words I can think of offhand, trim, chomp, etc., hint shortening of the input string, and by definition shortening of a string implies NUL-termination. The "mem" variant deals with a counted string, however, so its shortening implies NUL-termination a lot less [*1*] and should be fine. If we want to avoid implying NUL-termination, the only way to do so would be to use wording that does not hint shortening. At least for the C-string variant, which is measuring the length of the basename part (i.e. `basename $str $suffix`) without touching anything else, e.g. basename_length("hello.c", ".c", &len), but at the same time you want to make it a boolean to signal if the string ends with the suffix, so perhaps has_suffix("hello.c", ".c", &len)? [Footnote] *1* ... but not entirely, because we often NUL-terminate even counted strings (the buffer returned from read_sha1_file() and the payload of strbuf are two examples). > git-compat-util.h | 27 +++ > 1 file changed, 27 insertions(+) > > diff --git a/git-compat-util.h b/git-compat-util.h > index b6f03b3..d044c42 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -358,6 +358,33 @@ static inline const char *skip_prefix(const char *str, > const char *prefix) > return NULL; > } > > +/* > + * If buf ends with suffix, return 1 and subtract the length of the suffix > + * from *len. Otherwise, return 0 and leave *len untouched. > + */ > +static inline int strip_suffix_mem(const char *buf, size_t *len, > +const char *suffix) > +{ > + size_t suflen = strlen(suffix); > + if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen)) > + return 0; > + *len -= suflen; > + return 1; > +} > + > +/* > + * If str ends with suffix, return 1 and set *len to the size of the string > + * without the suffix. Otherwise, return 0 and set *len to the size of the > + * string. > + * > + * Note that we do _not_ NUL-terminate str to the new length. > + */ > +static inline int strip_suffix(const char *str, const char *suffix, size_t > *len) > +{ > + *len = strlen(str); > + return strip_suffix_mem(str, len, suffix); > +} > + > #if defined(NO_MMAP) || defined(USE_WIN32_MMAP) > > #ifndef PROT_READ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/9] add strip_suffix function
Many callers of ends_with want to not only find out whether a string has a suffix, but want to also strip it off. Doing that separately has two minor problems: 1. We often run over the string twice (once to find the suffix, and then once more to find its length to subtract the suffix length). 2. We have to specify the suffix length again, which means either a magic number, or repeating ourselves with strlen("suffix"). Just as we have skip_prefix to avoid these cases with starts_with, we can add a strip_suffix to avoid them with ends_with. Note that we add two forms of strip_suffix here: one that takes a string, with the resulting length as an out-parameter; and one that takes a pointer/length pair, and reuses the length as an out-parameter. The latter is more efficient when the caller already has the length (e.g., when using strbufs), but it can be easy to confuse the two, as they take the same number and types of parameters. For that reason, the "mem" form puts its length parameter next to the buffer (since they are a pair), and the string form puts it at the end (since it is an out-parameter). The compiler can notice when you get the order wrong, which should help prevent writing one when you meant the other. Signed-off-by: Jeff King --- I hope the word "strip" is OK, as it does not actually NUL-terminate (doing so would make it unusable for many cases). Between the comment below and the "const" in the parameter, I think it should be pretty clear that it does not touch the string. And I could not think of a better word. git-compat-util.h | 27 +++ 1 file changed, 27 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index b6f03b3..d044c42 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -358,6 +358,33 @@ static inline const char *skip_prefix(const char *str, const char *prefix) return NULL; } +/* + * If buf ends with suffix, return 1 and subtract the length of the suffix + * from *len. Otherwise, return 0 and leave *len untouched. + */ +static inline int strip_suffix_mem(const char *buf, size_t *len, + const char *suffix) +{ + size_t suflen = strlen(suffix); + if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen)) + return 0; + *len -= suflen; + return 1; +} + +/* + * If str ends with suffix, return 1 and set *len to the size of the string + * without the suffix. Otherwise, return 0 and set *len to the size of the + * string. + * + * Note that we do _not_ NUL-terminate str to the new length. + */ +static inline int strip_suffix(const char *str, const char *suffix, size_t *len) +{ + *len = strlen(str); + return strip_suffix_mem(str, len, suffix); +} + #if defined(NO_MMAP) || defined(USE_WIN32_MMAP) #ifndef PROT_READ -- 2.0.0.566.gfe3e6b2 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html