Re: [PATCH 2/9] add strip_suffix function

2014-07-02 Thread Junio C Hamano
Jeff King  writes:

> Having had a day to mull it over, and having read your email, I think I
> still prefer strip_suffix.

That's OK.  I was only trying to help you explore different avenues,
without having strong preference myself either way.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] add strip_suffix function

2014-07-02 Thread Jeff King
On Wed, Jul 02, 2014 at 08:54:44AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > For that reason, the "mem" form puts its length parameter
> > next to the buffer (since they are a pair), and the string
> > form puts it at the end (since it is an out-parameter). The
> > compiler can notice when you get the order wrong, which
> > should help prevent writing one when you meant the other.
> 
> Very sensible consideration.  I like commits that careful thinking
> behind them shows through them.

I would like to take credit for advanced thinking, but I actually did
what felt natural, and only noticed the "compiler will tell you when you
are wrong" effect when I got it wrong while writing a later patch in the
series. :)

> If we want to avoid implying NUL-termination, the only way to do so
> would be to use wording that does not hint shortening.  At least for
> the C-string variant, which is measuring the length of the basename
> part (i.e. `basename $str $suffix`) without touching anything else,
> e.g. basename_length("hello.c", ".c", &len), but at the same time
> you want to make it a boolean to signal if the string ends with the
> suffix, so perhaps has_suffix("hello.c", ".c", &len)?

I think that invites some confusion with "ends_with", which is the same
thing (but just does not take the "len" parameter). We could just add
this feature to ends_with, and ask callers who do not care to pass NULL,
but that makes those call sites uglier.

Having had a day to mull it over, and having read your email, I think I
still prefer strip_suffix.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] add strip_suffix function

2014-07-02 Thread Junio C Hamano
Jeff King  writes:

> For that reason, the "mem" form puts its length parameter
> next to the buffer (since they are a pair), and the string
> form puts it at the end (since it is an out-parameter). The
> compiler can notice when you get the order wrong, which
> should help prevent writing one when you meant the other.

Very sensible consideration.  I like commits that careful thinking
behind them shows through them.

> Signed-off-by: Jeff King 
> ---
> I hope the word "strip" is OK, as it does not actually NUL-terminate
> (doing so would make it unusable for many cases). Between the comment
> below and the "const" in the parameter, I think it should be pretty
> clear that it does not touch the string. And I could not think of a
> better word.

All other words I can think of offhand, trim, chomp, etc., hint
shortening of the input string, and by definition shortening of a
string implies NUL-termination.

The "mem" variant deals with a counted string, however, so its
shortening implies NUL-termination a lot less [*1*] and should be
fine.

If we want to avoid implying NUL-termination, the only way to do so
would be to use wording that does not hint shortening.  At least for
the C-string variant, which is measuring the length of the basename
part (i.e. `basename $str $suffix`) without touching anything else,
e.g. basename_length("hello.c", ".c", &len), but at the same time
you want to make it a boolean to signal if the string ends with the
suffix, so perhaps has_suffix("hello.c", ".c", &len)?

[Footnote]

 *1* ... but not entirely, because we often NUL-terminate even
 counted strings (the buffer returned from read_sha1_file() and
 the payload of strbuf are two examples).

>  git-compat-util.h | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/git-compat-util.h b/git-compat-util.h
> index b6f03b3..d044c42 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -358,6 +358,33 @@ static inline const char *skip_prefix(const char *str, 
> const char *prefix)
>   return NULL;
>  }
>  
> +/*
> + * If buf ends with suffix, return 1 and subtract the length of the suffix
> + * from *len. Otherwise, return 0 and leave *len untouched.
> + */
> +static inline int strip_suffix_mem(const char *buf, size_t *len,
> +const char *suffix)
> +{
> + size_t suflen = strlen(suffix);
> + if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen))
> + return 0;
> + *len -= suflen;
> + return 1;
> +}
> +
> +/*
> + * If str ends with suffix, return 1 and set *len to the size of the string
> + * without the suffix. Otherwise, return 0 and set *len to the size of the
> + * string.
> + *
> + * Note that we do _not_ NUL-terminate str to the new length.
> + */
> +static inline int strip_suffix(const char *str, const char *suffix, size_t 
> *len)
> +{
> + *len = strlen(str);
> + return strip_suffix_mem(str, len, suffix);
> +}
> +
>  #if defined(NO_MMAP) || defined(USE_WIN32_MMAP)
>  
>  #ifndef PROT_READ
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/9] add strip_suffix function

2014-06-30 Thread Jeff King
Many callers of ends_with want to not only find out whether
a string has a suffix, but want to also strip it off. Doing
that separately has two minor problems:

  1. We often run over the string twice (once to find
 the suffix, and then once more to find its length to
 subtract the suffix length).

  2. We have to specify the suffix length again, which means
 either a magic number, or repeating ourselves with
 strlen("suffix").

Just as we have skip_prefix to avoid these cases with
starts_with, we can add a strip_suffix to avoid them with
ends_with.

Note that we add two forms of strip_suffix here: one that
takes a string, with the resulting length as an
out-parameter; and one that takes a pointer/length pair, and
reuses the length as an out-parameter. The latter is more
efficient when the caller already has the length (e.g., when
using strbufs), but it can be easy to confuse the two, as
they take the same number and types of parameters.

For that reason, the "mem" form puts its length parameter
next to the buffer (since they are a pair), and the string
form puts it at the end (since it is an out-parameter). The
compiler can notice when you get the order wrong, which
should help prevent writing one when you meant the other.

Signed-off-by: Jeff King 
---
I hope the word "strip" is OK, as it does not actually NUL-terminate
(doing so would make it unusable for many cases). Between the comment
below and the "const" in the parameter, I think it should be pretty
clear that it does not touch the string. And I could not think of a
better word.

 git-compat-util.h | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index b6f03b3..d044c42 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -358,6 +358,33 @@ static inline const char *skip_prefix(const char *str, 
const char *prefix)
return NULL;
 }
 
+/*
+ * If buf ends with suffix, return 1 and subtract the length of the suffix
+ * from *len. Otherwise, return 0 and leave *len untouched.
+ */
+static inline int strip_suffix_mem(const char *buf, size_t *len,
+  const char *suffix)
+{
+   size_t suflen = strlen(suffix);
+   if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen))
+   return 0;
+   *len -= suflen;
+   return 1;
+}
+
+/*
+ * If str ends with suffix, return 1 and set *len to the size of the string
+ * without the suffix. Otherwise, return 0 and set *len to the size of the
+ * string.
+ *
+ * Note that we do _not_ NUL-terminate str to the new length.
+ */
+static inline int strip_suffix(const char *str, const char *suffix, size_t 
*len)
+{
+   *len = strlen(str);
+   return strip_suffix_mem(str, len, suffix);
+}
+
 #if defined(NO_MMAP) || defined(USE_WIN32_MMAP)
 
 #ifndef PROT_READ
-- 
2.0.0.566.gfe3e6b2

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html