Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:36 AM, Jeff King  wrote:
> We sometimes sprintf into static buffers when we know that
> the size of the buffer is large enough to fit the input
> (either because it's a constant, or because it's numeric
> input that is bounded in size). Likewise with strcpy of
> constant strings.
>
> However, these sites make it hard to audit sprintf and
> strcpy calls for buffer overflows, as a reader has to
> cross-reference the size of the array with the input. Let's
> use xsnprintf instead, which communicates to a reader that
> we don't expect this to overflow (and catches the mistake in
> case we do).
>
> Signed-off-by: Jeff King 
> ---
> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index 1a1eafa..1d66111 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -23,7 +23,7 @@ static int merge_entry(int pos, const char *path)
> break;
> found++;
> strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
> -   sprintf(ownbuf[stage], "%o", ce->ce_mode);
> +   xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
> ce->ce_mode);

Interesting. I wonder if there are any (old/broken) compilers which
would barf on this. If we care, perhaps sizeof(ownbuf[0]) instead?

> arguments[stage] = hexbuf[stage];
> arguments[stage + 4] = ownbuf[stage];
> } while (++pos < active_nr);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 26/67] replace trivial malloc + sprintf /strcpy calls to xstrfmt

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:45 AM, Jeff King  wrote:
> replace trivial malloc + sprintf /strcpy calls to xstrfmt

s/to/with/

Also, do you want either to add a space after '/' or drop the one before it?

> It's a common pattern to do:
>
>   foo = xmalloc(strlen(one) + strlen(two) + 1 + 1);
>   sprintf(foo, "%s %s", one, two);
>
> (or possibly some variant with strcpy()s or a more
> complicated length computation).  We can switch these to use
> xstrfmt, which is shorter, involves less error-prone manual
> computation, and removes many sprintf and strcpy calls which
> make it harder to audit the code for real buffer overflows.
>
> Signed-off-by: Jeff King 
> ---
> --- a/imap-send.c
> +++ b/imap-send.c
> @@ -889,9 +889,8 @@ static char *cram(const char *challenge_64, const char 
> *user, const char *pass)
> }
>
> /* response: " " */
> -   resp_len = strlen(user) + 1 + strlen(hex) + 1;
> -   response = xmalloc(resp_len);
> -   sprintf(response, "%s %s", user, hex);
> +   response = xstrfmt("%s %s", user, hex);
> +   resp_len = strlen(response);
>
> response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);

The original resp_len calculation included the NUL but the revised
does not. If I'm reading this correctly, the revised calculation is
correct, and the original was over-allocating response_64, right?

> encoded_len = EVP_EncodeBlock((unsigned char *)response_64,
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] strtoul_ui: actually report error in case of negative input

2015-09-15 Thread Max Kirillov
On Tue, Sep 15, 2015 at 08:50:03AM +0200, Matthieu Moy wrote:
> I think it would be better to just return a long to avoid needless
> limitations, but changing the argument to "long" would interfer with
> in-flight topics. Not worth the trouble.

Sure.

> 
> One potential issue with your patch is that you're forbidding the
> interval [2^31, 2^32[ which was previously allowed, both on 32 and 64
> bits. I'm not sure whether we have a use for this in the codebase.

As far as I could see it was used only for file modes. Which
does not need that big numbers.

> This alternative patch is rather ugly to, but I think it is less
> limiting and does not have the "large negative wrapped to positive"
> issue:
> 
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -814,6 +814,9 @@ static inline int strtoul_ui(char const *s, int base, 
> unsigned int *result)
> char *p;
>  
> errno = 0;
> +   /* negative values would be accepted by strtoul */
> +   if (strchr(s, '-'))
> +   return -1;
> ul = strtoul(s, , base);
> if (errno || *p || p == s || (unsigned int) ul != ul)
> return -1;
> 
> What do you think?

Explicit rejection of '-' is of course useful addition.

I still find "(unsigned int) ul != ul" bad. As far as I
understand it makes no sense for i386. And even for 64-bit
it's too obscure. In form of "(ul & 0xL) == 0" it
would be more clear. Or just make explicit comparison with
intended limit, like I did.

Well, actually I don't have strong preferences as long as
"make -C t" does not alarm me with things I did not break.
Maybe somebody else will comment more.

-- 
Max
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 29/67] use strip_suffix and xstrfmt to replace suffix

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:47 AM, Jeff King  wrote:
> When we want to convert "foo.pack" to "foo.idx", we do it by
> duplicating the original string and then munging the bytes
> in place. Let's use strip_suffix and xstrfmt instead, which
> has several advantages:
>
>   1. It's more clear what the intent is.
>
>   2. It does not implicitly rely on the fact that
>  strlen(".idx") <= strlen(".pack") to avoid an overflow.
>
>   3. We communicate the assumption that the input file ends
>  with ".pack" (and get a run-time check that this is so).
>
>   4. We drop calls to strcpy, which makes auditing the code
>  base easier.
>
> Likewise, we can do this to convert ".pack" to ".bitmap",
> avoiding some manual memory computation.
>
> Signed-off-by: Jeff King 
> ---
> diff --git a/http.c b/http.c
> index 7b02259..e0ff876 100644
> --- a/http.c
> +++ b/http.c
> @@ -1511,6 +1511,7 @@ int finish_http_pack_request(struct http_pack_request 
> *preq)
> struct packed_git **lst;
> struct packed_git *p = preq->target;
> char *tmp_idx;
> +   size_t len;
> struct child_process ip = CHILD_PROCESS_INIT;
> const char *ip_argv[8];
>
> @@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct http_pack_request 
> *preq)
> lst = &((*lst)->next);
> *lst = (*lst)->next;
>
> -   tmp_idx = xstrdup(preq->tmpfile);
> -   strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"),
> -  ".idx.temp");
> +   if (!strip_suffix(preq->tmpfile, ".pack.temp", ))
> +   die("BUG: pack tmpfile does not end in .pack.temp?");
> +   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile);

These instances of repeated replacement code may argue in favor of a
general purpose replace_suffix() function:

char *replace_suffix(const char *s, const char *old, const char *new)
{
size_t n;
if (!strip_suffix(s, old, ))
die("BUG: '%s' does not end with '%s', s, old);
return xstrfmt("%.*s%s", (int)n, s, new);
}

or something.

> ip_argv[0] = "index-pack";
> ip_argv[1] = "-o";
> diff --git a/pack-bitmap.c b/pack-bitmap.c
> index 637770a..7dfcb34 100644
> --- a/pack-bitmap.c
> +++ b/pack-bitmap.c
> @@ -252,16 +252,11 @@ static int load_bitmap_entries_v1(struct bitmap_index 
> *index)
>
>  static char *pack_bitmap_filename(struct packed_git *p)
>  {
> -   char *idx_name;
> -   int len;
> -
> -   len = strlen(p->pack_name) - strlen(".pack");
> -   idx_name = xmalloc(len + strlen(".bitmap") + 1);
> -
> -   memcpy(idx_name, p->pack_name, len);
> -   memcpy(idx_name + len, ".bitmap", strlen(".bitmap") + 1);
> +   size_t len;
>
> -   return idx_name;
> +   if (!strip_suffix(p->pack_name, ".pack", ))
> +   die("BUG: pack_name does not end in .pack");
> +   return xstrfmt("%.*s.bitmap", (int)len, p->pack_name);
>  }
>
>  static int open_pack_bitmap_1(struct packed_git *packfile)
> diff --git a/sha1_file.c b/sha1_file.c
> index 28352a5..88996f0 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -671,13 +671,15 @@ static int check_packed_git_idx(const char *path, 
> struct packed_git *p)
>  int open_pack_index(struct packed_git *p)
>  {
> char *idx_name;
> +   size_t len;
> int ret;
>
> if (p->index_data)
> return 0;
>
> -   idx_name = xstrdup(p->pack_name);
> -   strcpy(idx_name + strlen(idx_name) - strlen(".pack"), ".idx");
> +   if (!strip_suffix(p->pack_name, ".pack", ))
> +   die("BUG: pack_name does not end in .pack");
> +   idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
> ret = check_packed_git_idx(idx_name, p);
> free(idx_name);
> return ret;
> --
> 2.6.0.rc2.408.ga2926b9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Junio C Hamano
Ramsay Jones  writes:

> How about using strlcpy() instead? Thus:
>
> - strcpy(header.name, "pax_global_header");
> + strlcpy(header.name, "pax_global_header", sizeof(header.name));
>
> Ditto for other similar (strcpy->xsnprintf) hunks below.

Please do not advocate use of strlcpy(), which substitutes
overwriting beyond the end of the buffer (which is a bug) with a
silent truncation (which is almost always another bug, unless in a
very narrow case of producing a non-essential string result where
loss of information does not matter).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-15 Thread Junio C Hamano
Jeff King  writes:

>> Hmm, I haven't read any other patches yet (including those which use these
>> new '_to' functions), but I can't help feeling they should be named something
>> like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead.  i.e. I 
>> don't get
>> the '_to' thing - not that I'm any good at naming things ...
>
> I meant it as a contrast with their original. sha1_to_hex() formats into
> an internal buffer and returns it. But sha1_to_hex_to() formats "to" a
> buffer of your choice.

I think that naming makes perfect sense.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic

2015-09-15 Thread Junio C Hamano
Eric Sunshine  writes:

>> +static inline void strbuf_complete(struct strbuf *sb, char term)
>> +{
>> +   if (sb->len && sb->buf[sb->len - 1] != term)
>> +   strbuf_addch(sb, term);
>> +}
>
> Hmm, so this only adds 'term' if not already present *and* if 'sb' is
> not empty, which doesn't seem to match the documentation which says
> that it "ensures" termination.
>
> But, is that reasonable behavior? Intuitively, I'd expect 'term' to be
> added when 'sb' is empty:
>
> if (!sb->len || sb->buf[sb->len - 1] != term)
> strbuf_addch(sb, term);
>
> strbuf_complete_line()'s existing behavior of not adding '\n' to an
> empty string may have been intentional, but actually smells like a
> bug.

I would expect two different scenarios for which this function would
be useful.

One is when dealing with a text file and want to avoid incomplete
lines at the end.  In this scenario, an empty file with zero lines
should be left as-is, instead of getting turned into a file with one
empty line.  "Leave the empty input as-is" is the behaviour the
callers want.

The other is when you are given a directory name in the strbuf, you
have a name of a file you want to be in that directory, and want to
have the full path to the file in the strbuf.  In this scenario,
what does it mean for the caller to give you an empty "directory
name"?  I think at least in our codebase, that almost always would
mean that "the path is relative to $CWD", i.e. you would want to see
the "complete" to leave the input intact and then append the
filename there.

So to these two plausible and different set of callers that would be
helped by this function, the behaviour Peff gives it would match
what the callers want better than your version.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/67] war on sprintf, strcpy, etc

2015-09-15 Thread Junio C Hamano
Jeff King  writes:

> Obviously this is not intended for v2.6.0. But all of the spots touched
> here are relatively quiet right now, so I wanted to get it out onto the
> list.  There are a few minor conflicts against "pu", but they're all
> just from touching nearby lines.

Thanks.  Looks like a lot of good work you did ;-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] strtoul_ui: actually report error in case of negative input

2015-09-15 Thread Junio C Hamano
Matthieu Moy  writes:

> Not just the return type (which is the error status), but also the type
> of the result argument indeed. It's not clear to me whether this is
> intentional (09f2825 (git-grep: don't use sscanf, 2007-03-12) introduced
> it, the commit message doesn't help). I first read strtoul_ui as
> "strtoul with a better UI (user interface)", but maybe the name was
> meant to say "a fuction that uses strtoul and returns an ui (unsigned
> int)".

Just for this part.  Yes, ui does not mean user interface but "we
are grabbing an unsigned int and as its internal implementation we
happen to use strtoul" is where the name comes from.

> I went through the thread quickly, my understanding is that there were
> more work to do, but no objection to merging.

Yes, there were some in-flight topics that interfered with it and
the topic quickly went stale without getting rerolled.  There wasn't
any fundamental issue with the topic itself.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: "Medium" log format: change proposal for author != committer

2015-09-15 Thread Junio C Hamano
"Robin H. Johnson"  writes:

> Specifically, if the author is NOT the same as the committer, then
> display both in the header. Otherwise continue to display only the
> author.

I too found myself wanting to see both of the names sometimes, and
the "fuller" format was added explicitly for that purpose.

Even though I agree "show only one, and both only when they are
different" is a reasonable and possibly useful format, it is out of
question to change what "--pretty=medium" does.  It has been with us
forever and people and their scripts do rely on it.

It would be good if we can say

$ git log --pretty=robinsformat

but with a better name to show such an output.


Having said that, I'm moderately negative about adding it as yet
another hard-coded format.  We simply have too many, and we do not
need one more.  What we need instead is a flexible framework to let
users get what they want.

I think what needs to happen is:

 * Enhance the "--pretty=format:" thing so that the current set of
   hardcoded --pretty=medium,short,... formats and your modified
   "medium" can be expressed as a custom format string.

 * Introduce a configuration mechanism to allow users to define new
   short-hand, e.g. if you have this in your $HOME/.gitconfig:

[pretty "robin"]
format = "commit %H%nAuthor: %an <%ae>%n..."

   and run "git log --pretty=robin", it would behave as if you said
   "git log --pretty="format:commit %H%nAuthor: %an <%ae>%n...".

 * (optional) Replace the hardcoded implementations of pretty
   formats with short-hand names like "medium", "short", etc. with a
   built-in set of pretty.$name.format using the configuration
   mechanism.  But we need to make sure this does not hurt
   performance for common cases.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5] remote: add get-url subcommand

2015-09-15 Thread Ben Boeckel
Expanding `insteadOf` is a part of ls-remote --url and there is no way
to expand `pushInsteadOf` as well. Add a get-url subcommand to be able
to query both as well as a way to get all configured urls.

Signed-off-by: Ben Boeckel 
---
 Documentation/git-remote.txt | 10 
 builtin/remote.c | 59 
 t/t5505-remote.sh| 39 +
 3 files changed, 108 insertions(+)

diff --git a/Documentation/git-remote.txt b/Documentation/git-remote.txt
index 4c6d6de..3c9bf45 100644
--- a/Documentation/git-remote.txt
+++ b/Documentation/git-remote.txt
@@ -15,6 +15,7 @@ SYNOPSIS
 'git remote remove' 
 'git remote set-head'  (-a | --auto | -d | --delete | )
 'git remote set-branches' [--add]  ...
+'git remote get-url' [--push] [--all] 
 'git remote set-url' [--push]   []
 'git remote set-url --add' [--push]  
 'git remote set-url --delete' [--push]  
@@ -131,6 +132,15 @@ The named branches will be interpreted as if specified 
with the
 With `--add`, instead of replacing the list of currently tracked
 branches, adds to that list.
 
+'get-url'::
+
+Retrieves the URLs for a remote. Configurations for `insteadOf` and
+`pushInsteadOf` are expanded here. By default, only the first URL is listed.
++
+With '--push', push URLs are queried rather than fetch URLs.
++
+With '--all', all URLs for the remote will be listed.
+
 'set-url'::
 
 Changes URLs for the remote. Sets first URL for remote  that matches
diff --git a/builtin/remote.c b/builtin/remote.c
index 181668d..e4c3ea1 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -18,6 +18,7 @@ static const char * const builtin_remote_usage[] = {
N_("git remote prune [-n | --dry-run] "),
N_("git remote [-v | --verbose] update [-p | --prune] [( | 
)...]"),
N_("git remote set-branches [--add]  ..."),
+   N_("git remote get-url [--push] [--all] "),
N_("git remote set-url [--push]   []"),
N_("git remote set-url --add  "),
N_("git remote set-url --delete  "),
@@ -65,6 +66,11 @@ static const char * const builtin_remote_update_usage[] = {
NULL
 };
 
+static const char * const builtin_remote_geturl_usage[] = {
+   N_("git remote get-url [--push] [--all] "),
+   NULL
+};
+
 static const char * const builtin_remote_seturl_usage[] = {
N_("git remote set-url [--push]   []"),
N_("git remote set-url --add  "),
@@ -1467,6 +1473,57 @@ static int set_branches(int argc, const char **argv)
return set_remote_branches(argv[0], argv + 1, add_mode);
 }
 
+static int get_url(int argc, const char **argv)
+{
+   int i, push_mode = 0, all_mode = 0;
+   const char *remotename = NULL;
+   struct remote *remote;
+   const char **url;
+   int url_nr;
+   struct option options[] = {
+   OPT_BOOL('\0', "push", _mode,
+N_("query push URLs rather than fetch URLs")),
+   OPT_BOOL('\0', "all", _mode,
+N_("return all URLs")),
+   OPT_END()
+   };
+   argc = parse_options(argc, argv, NULL, options, 
builtin_remote_geturl_usage, 0);
+
+   if (argc != 1)
+   usage_with_options(builtin_remote_geturl_usage, options);
+
+   remotename = argv[0];
+
+   if (!remote_is_configured(remotename))
+   die(_("No such remote '%s'"), remotename);
+   remote = remote_get(remotename);
+
+   url_nr = 0;
+   if (push_mode) {
+   url = remote->pushurl;
+   url_nr = remote->pushurl_nr;
+   }
+   /* else fetch mode */
+
+   /* Use the fetch URL when no push URLs were found or requested. */
+   if (!url_nr) {
+   url = remote->url;
+   url_nr = remote->url_nr;
+   }
+
+   if (!url_nr)
+   die(_("no URLs configured for remote '%s'"), remotename);
+
+   if (all_mode) {
+   for (i = 0; i < url_nr; i++)
+   printf_ln("%s", url[i]);
+   } else {
+   printf_ln("%s", *url);
+   }
+
+   return 0;
+}
+
 static int set_url(int argc, const char **argv)
 {
int i, push_mode = 0, add_mode = 0, delete_mode = 0;
@@ -1576,6 +1633,8 @@ int cmd_remote(int argc, const char **argv, const char 
*prefix)
result = set_head(argc, argv);
else if (!strcmp(argv[0], "set-branches"))
result = set_branches(argc, argv);
+   else if (!strcmp(argv[0], "get-url"))
+   result = get_url(argc, argv);
else if (!strcmp(argv[0], "set-url"))
result = set_url(argc, argv);
else if (!strcmp(argv[0], "show"))
diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index 7a8499c..f03ba4c 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -919,6 +919,21 @@ test_expect_success 'new remote' '
cmp expect actual
 '
 
+get_url_test () {
+   cat >expect &&
+   test_expect_success 

Re: [PATCH 07/67] strbuf: make strbuf_complete_line more generic

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:25 AM, Jeff King  wrote:
> The strbuf_complete_line function make sure that a buffer

s/make/makes/

> ends in a newline. But we may want to do this for any
> character (e.g., "/" on the end of a path). Let's factor out
> a generic version, and keep strbuf_complete_line as a thin
> wrapper.
>
> Signed-off-by: Jeff King 
> ---
> +/**
> + * Ensure that `sb` ends with the character `term`, if it does not
> + * already.
> + */
> +static inline void strbuf_complete(struct strbuf *sb, char term)
> +{
> +   if (sb->len && sb->buf[sb->len - 1] != term)
> +   strbuf_addch(sb, term);
> +}

Hmm, so this only adds 'term' if not already present *and* if 'sb' is
not empty, which doesn't seem to match the documentation which says
that it "ensures" termination.

But, is that reasonable behavior? Intuitively, I'd expect 'term' to be
added when 'sb' is empty:

if (!sb->len || sb->buf[sb->len - 1] != term)
strbuf_addch(sb, term);

strbuf_complete_line()'s existing behavior of not adding '\n' to an
empty string may have been intentional, but actually smells like a
bug.

> +
> +/**
> + * Ensure that `sb` ends with a newline.
> + */
>  static inline void strbuf_complete_line(struct strbuf *sb)
>  {
> -   if (sb->len && sb->buf[sb->len - 1] != '\n')
> -   strbuf_addch(sb, '\n');
> +   strbuf_complete(sb, '\n');
>  }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:28 AM, Jeff King  wrote:
> There are several static PATH_MAX-sized buffers in
> mailsplit, along with some questionable uses of sprintf.
> These are not really of security interest, as local
> mailsplit pathnames are not typically under control of an
> attacker.  But it does not hurt to be careful, and as a
> bonus we lift some limits for systems with too-small
> PATH_MAX varibles.
>
> Signed-off-by: Jeff King 
> ---
> diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
> index 9de06e3..fb0bc08 100644
> --- a/builtin/mailsplit.c
> +++ b/builtin/mailsplit.c
> @@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char 
> *b)
>  static int split_maildir(const char *maildir, const char *dir,
> int nr_prec, int skip)
>  {
> -   char file[PATH_MAX];
> -   char name[PATH_MAX];
> +   struct strbuf file = STRBUF_INIT;
> FILE *f = NULL;
> int ret = -1;
> int i;
> @@ -161,20 +167,25 @@ static int split_maildir(const char *maildir, const 
> char *dir,
> goto out;
>
> for (i = 0; i < list.nr; i++) {
> -   snprintf(file, sizeof(file), "%s/%s", maildir, 
> list.items[i].string);
> -   f = fopen(file, "r");
> +   char *name;
> +
> +   strbuf_reset();
> +   strbuf_addf(, "%s/%s", maildir, list.items[i].string);
> +
> +   f = fopen(file.buf, "r");
> if (!f) {
> -   error("cannot open mail %s (%s)", file, 
> strerror(errno));
> +   error("cannot open mail %s (%s)", file.buf, 
> strerror(errno));
> goto out;
> }
>
> if (strbuf_getwholeline(, f, '\n')) {
> -   error("cannot read mail %s (%s)", file, 
> strerror(errno));
> +   error("cannot read mail %s (%s)", file.buf, 
> strerror(errno));
> goto out;
> }
>
> -   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
> +   name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
> split_one(f, name, 1);
> +   free(name);

Hmm, why does 'file' become a strbuf which is re-used each time
through the loop, but 'name' is treated differently and gets
re-allocated upon each iteration? Why doesn't 'name' deserve the same
treatment as 'file'?

> fclose(f);
> f = NULL;
> @@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char 
> *dir,
>  out:
> if (f)
> fclose(f);
> +   strbuf_release();
> string_list_clear(, 1);
> return ret;
>  }
> @@ -191,7 +203,6 @@ out:
>  static int split_mbox(const char *file, const char *dir, int allow_bare,
>   int nr_prec, int skip)
>  {
> -   char name[PATH_MAX];
> int ret = -1;
> int peek;
>
> @@ -218,8 +229,9 @@ static int split_mbox(const char *file, const char *dir, 
> int allow_bare,
> }
>
> while (!file_done) {
> -   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
> +   char *name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
> file_done = split_one(f, name, allow_bare);
> +   free(name);

Same question, pretty much: Why not make 'name' a strbuf which is
re-used in the loop? (I don't have a strong preference; I'm just
trying to understand the apparent inconsistency of treatment.)

> }
>
> if (f != stdin)
> --
> 2.6.0.rc2.408.ga2926b9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/67] trace: use strbuf for quote_crnl output

2015-09-15 Thread Eric Sunshine
On Tue, Sep 15, 2015 at 11:28 AM, Jeff King  wrote:
> When we output GIT_TRACE_SETUP paths, we quote any
> meta-characters. But our buffer to hold the result is only
> PATH_MAX bytes, and we could double the size of the input
> path (if every character needs quoted). We could use a

s/quoted/to be &/ ...or... s/quoted/quoting/

> 2*PATH_MAX buffer, if we assume the input will never be more
> than PATH_MAX. But it's easier still to just switch to a
> strbuf and not worry about whether the input can exceed
> PATH_MAX or not.
>
> Signed-off-by: Jeff King 
> ---
> diff --git a/trace.c b/trace.c
> index 7393926..0c06d71 100644
> --- a/trace.c
> +++ b/trace.c
> @@ -277,25 +277,25 @@ void trace_performance_fl(const char *file, int line, 
> uint64_t nanos,
>
>  static const char *quote_crnl(const char *path)
>  {
> -   static char new_path[PATH_MAX];
> +   static struct strbuf new_path = STRBUF_INIT;
> const char *p2 = path;
> -   char *p1 = new_path;

It's a little sad that this leaves a variable named 'p2' when there is
no corresponding 'p1'. Would this deserve a cleanup patch which
renames 'p2' to 'p' or do we not care enough?

> if (!path)
> return NULL;
>
> +   strbuf_reset(_path);
> +
> while (*p2) {
> switch (*p2) {
> -   case '\\': *p1++ = '\\'; *p1++ = '\\'; break;
> -   case '\n': *p1++ = '\\'; *p1++ = 'n'; break;
> -   case '\r': *p1++ = '\\'; *p1++ = 'r'; break;
> +   case '\\': strbuf_addstr(_path, ""); break;
> +   case '\n': strbuf_addstr(_path, "\\n"); break;
> +   case '\r': strbuf_addstr(_path, "\\r"); break;
> default:
> -   *p1++ = *p2;
> +   strbuf_addch(_path, *p2);
> }
> p2++;
> }
> -   *p1 = '\0';
> -   return new_path;
> +   return new_path.buf;
>  }
>
>  /* FIXME: move prefix to startup_info struct and get rid of this arg */
> --
> 2.6.0.rc2.408.ga2926b9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Feature Request] git blame showing only revisions from git rev-list --first-parent

2015-09-15 Thread Junio C Hamano
Jeff King  writes:

> It seems like nobody is actually that interested in what "blame
> --first-parent --reverse" does in the first place, though, and there's
> no reason for its complexity to hold up vanilla --first-parent. So what
> do you think of:

I like the part that explicitly disables the combination of the two
;-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


"Medium" log format: change proposal for author != committer

2015-09-15 Thread Robin H. Johnson
Hi,

I want to propose a change to the 'medium' log output format, to improve
readability.

Specifically, if the author is NOT the same as the committer, then
display both in the header. Otherwise continue to display only the
author.

This would aid quick review of changes in git-log & git-show output.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"

2015-09-15 Thread Luke Diamand

On 15/09/15 16:38, Lars Schneider wrote:


On 15 Sep 2015, at 08:43, Luke Diamand  wrote:





Do we know the mechanism by which we end up in this state?

Unfortunately no. I tried hard to reproduce the error with “conventional” 
methods. As you can see I ended up manipulating the P4 database…

However, it looks like this error happens in the wild, too:
https://stackoverflow.com/questions/5156909/translation-of-file-content-failed-error-in-perforce
https://stackoverflow.com/questions/887006/perforce-translation-of-file-content-failed-error


It's described in the Perforce FAQ here:

http://answers.perforce.com/articles/KB/3117

i.e. it looks to be caused by mixing old and new P4 clients.



Known issue: This works only if git-p4 is executed in verbose mode.
In normal mode no exceptions are thrown and git-p4 just exits.


Does that mean that the error will only be detected in verbose mode? That 
doesn't seem right!

Correct. I don’t like this either but I also don’t want to make huge changes to 
git-p4.
You can see the root problem here:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L110-L114

Any idea how to approach that best?


I guess what we have is not ideal but probably good enough.



+try:
+text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % 
(file['depotFile'], file['change'])])
+except Exception as e:


Would it be better to specify which kind of Exception you are catching? Looks 
like you could get OSError, ValueError and CalledProcessError; it's the last of 
these you want (I think).

I think it is just a plain exception. See here:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L111


OK, you're right (probably less than ideal behaviour from read_pipe() 
and die() but let's not try to fix that).




+if p4_version_string().find('/NT') >= 0:
+text = text.replace('\r\n', '\n')
+contents = [ text ]


The indentation on this bit doesn't look right to me.

I believe it is exactly how it was:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L2397-L2399


OK.




In general, what is the appropriate way to reference code in this email list? 
Are GitHub links OK?


I'm not an expert, but it feels possibly a bit ephemeral - if someone is 
digging through email archives in a future where that github project has 
been moved elsewhere, the links will all be dead.


Luke
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Git Deployment using existing multiple environments

2015-09-15 Thread Johannes Schindelin
Hi,

please stop top-posting. It is quite irritating by now. Thank you.

On 2015-09-15 08:50, Sukhwinder Singh wrote:

> Now lets say we set up a repository at github which has the latest
> code (all test code)., Now at each of our own servers we already have
> existing code, that is Test, UAT and Live. For example, first he'll
> pull code from github to our Test Server and then move branches to UAT
> and then to Live. Can it work?

Yes, it can work.

The major problem here will be that you have only a single, central repository 
that every developer has write access to, but you expect only a single trusted 
person to deploy that code via three different stages. That can be tricky, in 
particular if your trusted person is a newbie.

> If it can work then can I please get some some example commands or the 
> procedure to set it up?

Sorry, this feels a bit too much like "could you please do all my work for me?" 
to me. And if I really provide exact commands here, I may even be liable when 
it does not work out in the end? I am not going to do that.

Instead, I will strongly suggest to learn enough about Git to answer those 
quite simple questions ("How do I pull?", "What is a pull?", "How do I update a 
test machine's working directory with the newest branch tip?") yourself. I 
usually recommend https://try.github.io/

> Time is a bit of problem right now or I would have read book suggested by 
> Johannes. I have searched on the internet but couldn't find any similar case.

If I was in your shoes, I would spend the time *now*, rather than quite 
possibly spending 10x as much time later when I have to clean up a mess. In my 
experience, 
what looks like the cheap route (copying commands without understanding them, 
really), invariably turns out to be the most expensive route possible.

In any case, this was as much useful feedback as I had to give to your 
questions.

Ciao,
Johannes
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Sep 2015, #03; Mon, 14)

2015-09-15 Thread Philip Oakley

From: "Junio C Hamano" 
Sent: Monday, September 14, 2015 11:43 PM

--



[New Topics]



* po/doc-branch-desc (2015-09-14) 1 commit
 (merged to 'next' on 2015-09-14 at 4934a96)
+ doc: show usage of branch description

The branch descriptions that are set with "git 
branch --edit-description"

option were used in many places but they weren't clearly documented.

Will merge to 'master'.


Thanks.
Shall I just rework/resend the V2 patch-up ($gmane/277829) that also 
links to 'merge's' usage as a fresh patch (would be tonight UK)?


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"

2015-09-15 Thread Luke Diamand

On 14/09/15 17:55, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

A P4 repository can get into a state where it contains a file with
type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4


Sorry - what's a BOM? I'm assuming it's not a Bill of Materials?

Do we know the mechanism by which we end up in this state?




attempts to retrieve the file then the process crashes with a
"Translation of file content failed" error.

Fix this by detecting this error and retrieving the file as binary
instead. The result in Git is the same.

Known issue: This works only if git-p4 is executed in verbose mode.
In normal mode no exceptions are thrown and git-p4 just exits.


Does that mean that the error will only be detected in verbose mode? 
That doesn't seem right!




Signed-off-by: Lars Schneider 
---
  git-p4.py | 27 ---
  1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 073f87b..5ae25a6 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -134,13 +134,11 @@ def read_pipe(c, ignore_error=False):
  sys.stderr.write('Reading pipe: %s\n' % str(c))

  expand = isinstance(c,basestring)
-p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand)
-pipe = p.stdout
-val = pipe.read()
-if p.wait() and not ignore_error:
-die('Command failed: %s' % str(c))
-
-return val
+p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, 
shell=expand)
+(out, err) = p.communicate()
+if p.returncode != 0 and not ignore_error:
+die('Command failed: %s\nError: %s' % (str(c), err))
+return out

  def p4_read_pipe(c, ignore_error=False):
  real_cmd = p4_build_cmd(c)
@@ -2186,10 +2184,17 @@ class P4Sync(Command, P4UserMap):
  # them back too.  This is not needed to the cygwin windows 
version,
  # just the native "NT" type.
  #
-text = p4_read_pipe(['print', '-q', '-o', '-', "%s@%s" % 
(file['depotFile'], file['change']) ])
-if p4_version_string().find("/NT") >= 0:
-text = text.replace("\r\n", "\n")
-contents = [ text ]
+try:
+text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % 
(file['depotFile'], file['change'])])
+except Exception as e:


Would it be better to specify which kind of Exception you are catching? 
Looks like you could get OSError, ValueError and CalledProcessError; 
it's the last of these you want (I think).



+if 'Translation of file content failed' in str(e):
+type_base = 'binary'
+else:
+raise e
+else:
+if p4_version_string().find('/NT') >= 0:
+text = text.replace('\r\n', '\n')
+contents = [ text ]


The indentation on this bit doesn't look right to me.



  if type_base == "apple":
  # Apple filetype files will be streamed as a concatenation of



Luke

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 1/7] git-p4: add optional type specifier to gitConfig reader

2015-09-15 Thread Luke Diamand

On 14/09/15 14:26, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

The functions “gitConfig” and “gitConfigBool” are almost identical. Make 
“gitConfig” more generic by adding an optional type specifier. Use the type 
specifier “—bool” with “gitConfig” to implement “gitConfigBool. This prepares 
the implementation of other type specifiers such as “—int”.


Looks good to me, Ack.




Signed-off-by: Lars Schneider 
---
  git-p4.py | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 073f87b..c139cab 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -604,9 +604,12 @@ def gitBranchExists(branch):

  _gitConfig = {}

-def gitConfig(key):
+def gitConfig(key, typeSpecifier=None):
  if not _gitConfig.has_key(key):
-cmd = [ "git", "config", key ]
+cmd = [ "git", "config" ]
+if typeSpecifier:
+cmd += [ typeSpecifier ]
+cmd += [ key ]
  s = read_pipe(cmd, ignore_error=True)
  _gitConfig[key] = s.strip()
  return _gitConfig[key]
@@ -617,10 +620,7 @@ def gitConfigBool(key):
 in the config."""

  if not _gitConfig.has_key(key):
-cmd = [ "git", "config", "--bool", key ]
-s = read_pipe(cmd, ignore_error=True)
-v = s.strip()
-_gitConfig[key] = v == "true"
+_gitConfig[key] = gitConfig(key, '--bool') == "true"
  return _gitConfig[key]

  def gitConfigList(key):



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Luke Diamand

On 15/09/15 08:31, Luke Diamand wrote:

On 14/09/15 18:10, larsxschnei...@gmail.com wrote:

It would be better to query this once at startup. Otherwise we're
potentially forking "git config" twice per file which on a large repo
could become significant. Make it an instance variable perhaps?


This is of course complete nonsense since gitConfig caches its results!


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Luke Diamand

On 14/09/15 18:10, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

If a path with non-ASCII characters is detected then print always the


s/print always/print/



encoding and the encoded string in verbose mode.

Signed-off-by: Lars Schneider 
---
  git-p4.py | 19 +--
  1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index d45cf2b..da25d3f 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
  text = regexp.sub(r'$\1$', text)
  contents = [ text ]

-if gitConfig("git-p4.pathEncoding"):
-relPath = 
relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
-elif self.verbose:
-try:
-relPath.decode('ascii')
-except:
-print (
-"Path with Non-ASCII characters detected and no path encoding 
defined. "
-"Please check the encoding: %s" % relPath
-)
+try:
+relPath.decode('ascii')
+except:
+encoding = 'utf8'
+if gitConfig('git-p4.pathEncoding'):
+encoding = gitConfig('git-p4.pathEncoding')


It would be better to query this once at startup. Otherwise we're 
potentially forking "git config" twice per file which on a large repo 
could become significant. Make it an instance variable perhaps?



+relPath = relPath.decode(encoding).encode('utf8', 'replace')
+if self.verbose:
+print 'Path with non-ASCII characters detected. Used %s to 
encode: %s ' % (encoding, relPath)

  self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Git Deployment using existing multiple environments

2015-09-15 Thread Sukhwinder Singh
Thank you very much for the detailed reply. There is going to be just one 
person who'll manage all the moving of the code. On approval he'll move code 
from one environment to other on our own servers (right now it is being done by 
manual merging). Development team is not that big right now. They'll be 
committing all new code to the git repository we have setup up at github using 
all over test environment code. And the person handling the code at our server 
will move code from one environment to other by USING Git and not by manual 
merge.

Now lets say we set up a repository at github which has the latest code (all 
test code)., Now at each of our own servers we already have existing code, that 
is Test, UAT and Live. For example, first he'll pull code from github to our 
Test Server and then move branches to UAT and then to Live. Can it work? If it 
can work then can I please get some some example commands or the procedure to 
set it up? Time is a bit of problem right now or I would have read book 
suggested by Johannes. I have searched on the internet but couldn't find any 
similar case.

Regards,
Sukhwinder Singh


> From: jacob.kel...@gmail.com
> Date: Mon, 14 Sep 2015 01:32:39 -0700
> Subject: Re: Git Deployment using existing multiple environments
> To: php_programmer_in...@hotmail.com
> CC: johannes.schinde...@gmx.de; git@vger.kernel.org
>
> On Sun, Sep 13, 2015 at 10:55 PM, Sukhwinder Singh
>  wrote:
>> Thank you for the reply. Let's say I do setup three different repositories 
>> then how can we move work from one repository to the other. For example, 
>> from Test Environment to UAT. If there are any links that you can provide me 
>> that I can check, it'll be great.
>>
>> Regards,
>> Sukhwinder Singh
>>
>
> Generally speaking there are two ways of moving work from one
> repository to another. The first is the "pull" where you request data
> from a remote repository and then merge that data into your own. This
> is what you're doing when you perform a clone, a fetch, or a pull.
> It's what everyone does all the time when working with a local copy of
> a "trusted" remote repository. It can also be done between two
> "trusted" remotes, if your workflow is more distributed. (ie: more
> than one "official" source).
>
> The second form of moving work is the "push" where you upload your
> work into another repository. This is most commonly used when the
> workflow is "centralized". By that I mean there is a single
> authoritative repository. Or when you are moving your own work on a
> local machine into a remotely accessible machine for others to pull
> from.
>
> As Johannes said above, you really need to determine the work flow and
> team style you want before you can really understand the best way to
> setup repositories. For example, if you setup using a distributed
> chain of command, you can have one person be the "maintainer" of each
> given trusted repository. Then, maintainers can pull (or
> equivalent-ly, pull-request) between each other. This is generally how
> a project would work when using github. One person is the maintainer,
> then a developer "forks" the project, makes some changes, then
> requests that the maintainer pull these changes. The maintainer has
> final say and will perform the final merge in cases of conflict. In
> addition, maintainer is the one who says "this is ok to go into this
> repository".
>
> You can also instead opt to use a single centralized repository. Thus,
> developers would work on code and get it ready to submit, and then
> simply perform a push. If the push requires a merge git will tell the
> user to update. There are many tools such as server side hooks in
> order to enforce various behaviors.
>
> This flow generally doesn't use sole maintainers, as each developer
> has access to push directly. It may work well for smaller teams or for
> dedicated teams who don't change developers often.
>
> A lot comes down to how your team is structured. Do you have one
> person who's job can be to maintain the repository? Do you have
> several developers who don't want to be the sole owner? Is your team
> willing to function much more distributed?
>
> In the end, it's generally always a good idea to designate at least
> one repository as the "authority" so that everyone knows where to look
> for release tags and other such data.
>
> Myself, I would say that I prefer to use the pull-request model so
> that code gets more review, as "push" based models tend not to do
> review. (Exception: Gerrit, but this uses "git push" on the command
> line to do something very much not like a push)
>
> Regards,
> Jake
  --
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] strtoul_ui: actually report error in case of negative input

2015-09-15 Thread Matthieu Moy
[ Cc-ing Michael Haggerty who wrote the numparse module ]

Max Kirillov  writes:

> On Mon, Sep 14, 2015 at 08:30:54AM +0200, Matthieu Moy wrote:
>>> Fix it by changing the last check to trigger earlier, as soon as it
>>> becomes bigger than INT_MAX.
>> 
>> What if the value is actually greater than INT_MAX? The function is
>> returning an unsigned long (64 bits on 64bits architectures), and your
>> version is restricting it to integers smaller than 2^31, right?
>
> the return type of the function is "int", so this is not
> going to work anyway.

Not just the return type (which is the error status), but also the type
of the result argument indeed. It's not clear to me whether this is
intentional (09f2825 (git-grep: don't use sscanf, 2007-03-12) introduced
it, the commit message doesn't help). I first read strtoul_ui as
"strtoul with a better UI (user interface)", but maybe the name was
meant to say "a fuction that uses strtoul and returns an ui (unsigned
int)".

I think it would be better to just return a long to avoid needless
limitations, but changing the argument to "long" would interfer with
in-flight topics. Not worth the trouble.

One potential issue with your patch is that you're forbidding the
interval [2^31, 2^32[ which was previously allowed, both on 32 and 64
bits. I'm not sure whether we have a use for this in the codebase.

This alternative patch is rather ugly to, but I think it is less
limiting and does not have the "large negative wrapped to positive"
issue:

--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -814,6 +814,9 @@ static inline int strtoul_ui(char const *s, int base, 
unsigned int *result)
char *p;
 
errno = 0;
+   /* negative values would be accepted by strtoul */
+   if (strchr(s, '-'))
+   return -1;
ul = strtoul(s, , base);
if (errno || *p || p == s || (unsigned int) ul != ul)
return -1;

What do you think?

> As I mentioned, some negative values are still accepted
> as coresponding mod 2**32 positive numbers (-3221225472 as
> 1073741824), so there really is room for improvement, but it
> cannot be accomplished just by examining strtoul output.

On 64 bits architectures, it's not as bad: you need to go really far in
the negatives to wrap to positive values.

> I saw in the list archives an attempt to abandon the
> function in favor of more accurate parser [1], but seems
> like it did not make it into the project.
>
> [1] http://thread.gmane.org/gmane.comp.version-control.git/265635

I went through the thread quickly, my understanding is that there were
more work to do, but no objection to merging.

Michael, any plan to resurect the topic?

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Ramsay Jones


On 15/09/15 16:36, Jeff King wrote:
> We sometimes sprintf into static buffers when we know that
> the size of the buffer is large enough to fit the input
> (either because it's a constant, or because it's numeric
> input that is bounded in size). Likewise with strcpy of
> constant strings.
>
> However, these sites make it hard to audit sprintf and
> strcpy calls for buffer overflows, as a reader has to
> cross-reference the size of the array with the input. Let's
> use xsnprintf instead, which communicates to a reader that
> we don't expect this to overflow (and catches the mistake in
> case we do).
>
> Signed-off-by: Jeff King 
> ---
> These are all pretty trivial; the obvious thing to get wrong is that
> "sizeof(buf)" is not the correct length if "buf" is a pointer. I
> considered a macro wrapper like:
>
>   #define xsnprintf_array(dst, fmt, ...) \
>   xsnprintf(dst, sizeof(dst) + BARF_UNLESS_AN_ARRAY(dst), \
> fmt, __VA_ARGS__)
>
> but obviously that requires variadic macro support.
>
>  archive-tar.c |  2 +-
>  builtin/gc.c  |  2 +-
>  builtin/init-db.c | 11 ++-
>  builtin/ls-tree.c |  9 +
>  builtin/merge-index.c |  2 +-
>  builtin/merge-recursive.c |  2 +-
>  builtin/read-tree.c   |  2 +-
>  builtin/unpack-file.c |  2 +-
>  compat/mingw.c|  8 +---
>  compat/winansi.c  |  2 +-
>  connect.c |  2 +-
>  convert.c |  3 ++-
>  daemon.c  |  4 ++--
>  diff.c| 12 ++--
>  http-push.c   |  2 +-
>  http.c|  6 +++---
>  ll-merge.c| 12 ++--
>  refs.c|  8 
>  sideband.c|  4 ++--
>  strbuf.c  |  4 ++--
>  20 files changed, 52 insertions(+), 47 deletions(-)
>
> diff --git a/archive-tar.c b/archive-tar.c
> index b6b30bb..d543f93 100644
> --- a/archive-tar.c
> +++ b/archive-tar.c
> @@ -301,7 +301,7 @@ static int write_global_extended_header(struct 
> archiver_args *args)
>   memset(, 0, sizeof(header));
>   *header.typeflag = TYPEFLAG_GLOBAL_HEADER;
>   mode = 0100666;
> - strcpy(header.name, "pax_global_header");
> + xsnprintf(header.name, sizeof(header.name), "pax_global_header");

How about using strlcpy() instead? Thus:

-   strcpy(header.name, "pax_global_header");
+   strlcpy(header.name, "pax_global_header", sizeof(header.name));

Ditto for other similar (strcpy->xsnprintf) hunks below.

ATB,
Ramsay Jones

>   prepare_header(args, , mode, ext_header.len);
>   write_blocked(, sizeof(header));
>   write_blocked(ext_header.buf, ext_header.len);
> diff --git a/builtin/gc.c b/builtin/gc.c
> index 0ad8d30..57584bc 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -194,7 +194,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
> ret_pid)
>   return NULL;
>  
>   if (gethostname(my_host, sizeof(my_host)))
> - strcpy(my_host, "unknown");
> + xsnprintf(my_host, sizeof(my_host), "unknown");
>  
>   pidfile_path = git_pathdup("gc.pid");
>   fd = hold_lock_file_for_update(, pidfile_path,
> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index 69323e1..e7d0e31 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -262,7 +262,8 @@ static int create_default_files(const char *template_path)
>   }
>  
>   /* This forces creation of new config file */
> - sprintf(repo_version_string, "%d", GIT_REPO_VERSION);
> + xsnprintf(repo_version_string, sizeof(repo_version_string),
> +   "%d", GIT_REPO_VERSION);
>   git_config_set("core.repositoryformatversion", repo_version_string);
>  
>   path[len] = 0;
> @@ -414,13 +415,13 @@ int init_db(const char *template_dir, unsigned int 
> flags)
>*/
>   if (shared_repository < 0)
>   /* force to the mode value */
> - sprintf(buf, "0%o", -shared_repository);
> + xsnprintf(buf, sizeof(buf), "0%o", -shared_repository);
>   else if (shared_repository == PERM_GROUP)
> - sprintf(buf, "%d", OLD_PERM_GROUP);
> + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_GROUP);
>   else if (shared_repository == PERM_EVERYBODY)
> - sprintf(buf, "%d", OLD_PERM_EVERYBODY);
> + xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_EVERYBODY);
>   else
> - die("oops");
> + die("BUG: invalid value for shared_repository");
>   git_config_set("core.sharedrepository", buf);
>   git_config_set("receive.denyNonFastforwards", "true");
>   }
> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
> index 3b04a0f..0e30d86 100644
> --- a/builtin/ls-tree.c
> +++ b/builtin/ls-tree.c
> @@ -96,12 +96,13 @@ static int 

Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Jeff King
On Tue, Sep 15, 2015 at 07:32:29PM +0100, Ramsay Jones wrote:

> > diff --git a/archive-tar.c b/archive-tar.c
> > index b6b30bb..d543f93 100644
> > --- a/archive-tar.c
> > +++ b/archive-tar.c
> > @@ -301,7 +301,7 @@ static int write_global_extended_header(struct 
> > archiver_args *args)
> > memset(, 0, sizeof(header));
> > *header.typeflag = TYPEFLAG_GLOBAL_HEADER;
> > mode = 0100666;
> > -   strcpy(header.name, "pax_global_header");
> > +   xsnprintf(header.name, sizeof(header.name), "pax_global_header");
> 
> How about using strlcpy() instead? Thus:
> 
> - strcpy(header.name, "pax_global_header");
> + strlcpy(header.name, "pax_global_header", sizeof(header.name));
> 
> Ditto for other similar (strcpy->xsnprintf) hunks below.

That misses the "assert" behavior of xsnprintf. We are preventing
overflow here, but also truncation. What should happen if
"pax_global_header" does not fit in header.name? I think complaining
loudly and immediately is the most helpful thing, because it is surely a
programming error.

We could make xstrlcpy(), of course, but I don't see much point when
xsnprintf does the same thing (and more).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Sep 2015, #03; Mon, 14)

2015-09-15 Thread Philip Oakley

Oops..
From: "Philip Oakley" 

[New Topics]



* po/doc-branch-desc (2015-09-14) 1 commit
 (merged to 'next' on 2015-09-14 at 4934a96)
+ doc: show usage of branch description

The branch descriptions that are set with "git
branch --edit-description"
option were used in many places but they weren't clearly documented.

Will merge to 'master'.


Thanks.
Shall I just rework/resend the V2 patch-up ($gmane/277829) that also
links to 'merge's' usage as a fresh patch (would be tonight UK)?


I now see that the full V2 patch is already there at 4934a96.
I'd mistakenly just compared your note to the slightly fuller V2 commit
message and in the morning rush hadn't had time to check.

Sorry for the noise.

Philip

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] doc: show usage of branch description

2015-09-15 Thread Philip Oakley

From: "Junio C Hamano" 

"Philip Oakley"  writes:


It still means that my patch is incomplete in its aim to bring out
these possible broader usages.

I haven't yet looked at the mail archives to see if there is more
around the time of those introductions.


I guess this is largely my fault, but I think "git grep" is an
easier source of truth to work with than the list archive.

It eventually boils down to branch.*.description configuration and
all users of that would call read_branch_desc(), so if you check
callers of that helper function and see which commit introduced that
call for what purpose ("blame" is your friend), you would know how
they use the information under what condition.


$ git grep -n read_branch_desc -- \*.c
branch.c:143:int read_branch_desc(struct strbuf *buf, const char
*branch_name)
builtin/branch.c:771:   read_branch_desc(, branch_name);
builtin/fmt-merge-msg.c:211:if (!read_branch_desc(, name)) {
builtin/log.c:888:  read_branch_desc(, branch_name);

$ git blame -L210,212 -s builtin/fmt-merge-msg.c
898eacd8 210)
898eacd8 211)   if (!read_branch_desc(, name)) {
898eacd8 212)   const char *bp = desc.buf;

$ git show -s 898eacd8
commit 898eacd8ada2d012f977948350ed60845e238037
Author: Junio C Hamano 

[...]>

   Signed-off-by: Junio C Hamano 

etc. etc.


---
Thanks.
That was very useful seeing a few more of the options in combination.
That, combined with the updated G4W, is a lot better/faster.

I've also searched for:
$ git grep -n "\.description" -- \*.sh

which only came up with
git-request-pull.sh:74: ! git config "branch.$branch_name.description" 
>/dev/null

git-request-pull.sh:156: git config "branch.$branch_name.description"

as relevant hits:

Sometimes one can be a bit feart to try out some command options..

Philip

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-15 Thread Johannes Schindelin
Hi Peff,

On 2015-09-15 17:24, Jeff King wrote:
> Commit 02976bf (fsck: introduce `git fsck --connectivity-only`,
> 2015-06-22) recently gave fsck an option to perform only a
> subset of the checks, by skipping the fsck_object_dir()
> call. However, it does so only for the local object
> directory, and we still do expensive checks on any alternate
> repos. We should skip them in this case, too.
> 
> Signed-off-by: Jeff King 

ACK!

Sorry for missing this spot.
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 22/67] entry.c: convert strcpy to xsnprintf

2015-09-15 Thread Ramsay Jones


On 15/09/15 16:40, Jeff King wrote:
> This particular conversion is non-obvious, because nobody
> has passed our function the length of the destination
> buffer. However, the interface to checkout_entry specifies
> that the buffer must be at least TEMPORARY_FILENAME_LENGTH
> bytes long, so we can check that (meaning the existing code
> was not buggy, but merely worrisome to somebody reading it).
>
> Signed-off-by: Jeff King 
> ---
>  entry.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/entry.c b/entry.c
> index 1eda8e9..582c400 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -96,8 +96,8 @@ static int open_output_fd(char *path, const struct 
> cache_entry *ce, int to_tempf
>  {
>   int symlink = (ce->ce_mode & S_IFMT) != S_IFREG;
>   if (to_tempfile) {
> - strcpy(path, symlink
> -? ".merge_link_XX" : ".merge_file_XX");
> + xsnprintf(path, TEMPORARY_FILENAME_LENGTH, "%s",
> +   symlink ? ".merge_link_XX" : 
> ".merge_file_XX");
>   return mkstemp(path);
>   } else {
>   return create_file(path, !symlink ? ce->ce_mode : 0666);

Hmm, I was going to suggest strlcpy() again. However, if you expect an overflow 
to
occur, then xsnprintf() will at least bring it to your attention!  Checking for 
overflow
with strlcpy() is not rocket science either, and I guess we could add 
xstrlcpy() ... :-D

dunno.

ATB,
Ramsay Jones




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] doc: show usage of branch description

2015-09-15 Thread Philip Oakley

From: "Junio C Hamano" 

"Philip Oakley"  writes:


It still means that my patch is incomplete in its aim to bring out
these possible broader usages.

I haven't yet looked at the mail archives to see if there is more
around the time of those introductions.


I guess this is largely my fault, but I think "git grep" is an
easier source of truth to work with than the list archive.

It eventually boils down to branch.*.description configuration and
all users of that would call read_branch_desc(), so if you check
callers of that helper function and see which commit introduced that
call for what purpose ("blame" is your friend), you would know how
they use the information under what condition.


$ git grep -n read_branch_desc -- \*.c
branch.c:143:int read_branch_desc(struct strbuf *buf, const char
*branch_name)
builtin/branch.c:771:   read_branch_desc(, branch_name);
builtin/fmt-merge-msg.c:211:if (!read_branch_desc(, name)) {
builtin/log.c:888:  read_branch_desc(, branch_name);

$ git blame -L210,212 -s builtin/fmt-merge-msg.c
898eacd8 210)
898eacd8 211)   if (!read_branch_desc(, name)) {
898eacd8 212)   const char *bp = desc.buf;

$ git show -s 898eacd8
commit 898eacd8ada2d012f977948350ed60845e238037
Author: Junio C Hamano 

[...]>

   Signed-off-by: Junio C Hamano 

etc. etc.


---
Thanks.
That was very useful seeing a few more of the options in combination.
That, combined with the updated G4W, is a lot better/faster.
Sometimes one can be a bit feart to try out some command options..


I've also searched for:

$ git grep -n "\.description" -- \*.sh

which only came up with

git-request-pull.sh:74: ! git config "branch.$branch_name.description" 
>/dev/null

git-request-pull.sh:156: git config "branch.$branch_name.description"

as relevant hits.

Philip

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Ramsay Jones


On 15/09/15 19:42, Jeff King wrote:
> On Tue, Sep 15, 2015 at 07:32:29PM +0100, Ramsay Jones wrote:
>
>>> diff --git a/archive-tar.c b/archive-tar.c
>>> index b6b30bb..d543f93 100644
>>> --- a/archive-tar.c
>>> +++ b/archive-tar.c
>>> @@ -301,7 +301,7 @@ static int write_global_extended_header(struct 
>>> archiver_args *args)
>>> memset(, 0, sizeof(header));
>>> *header.typeflag = TYPEFLAG_GLOBAL_HEADER;
>>> mode = 0100666;
>>> -   strcpy(header.name, "pax_global_header");
>>> +   xsnprintf(header.name, sizeof(header.name), "pax_global_header");
>> How about using strlcpy() instead? Thus:
>>
>> -strcpy(header.name, "pax_global_header");
>> +strlcpy(header.name, "pax_global_header", sizeof(header.name));
>>
>> Ditto for other similar (strcpy->xsnprintf) hunks below.
> That misses the "assert" behavior of xsnprintf. We are preventing
> overflow here, but also truncation. What should happen if
> "pax_global_header" does not fit in header.name? I think complaining
> loudly and immediately is the most helpful thing, because it is surely a
> programming error.
>
> We could make xstrlcpy(), of course, but I don't see much point when
> xsnprintf does the same thing (and more).

Heh, I just sent an email about patch 22/67 which says similar things. I don't 
feel
too strongly, either way, but I have a slight preference for the use of 
[x]strlcpy()
in these cases.

I have to stop at patch #22 for now.

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Git Deployment using existing multiple environments

2015-09-15 Thread Sukhwinder Singh
---
> Date: Tue, 15 Sep 2015 09:48:34 +0200
> From: johannes.schinde...@gmx.de
> To: php_programmer_in...@hotmail.com
> CC: jacob.kel...@gmail.com; git@vger.kernel.org
> Subject: RE: Git Deployment using existing multiple environments
>
> Hi,
>
> please stop top-posting. It is quite irritating by now. Thank you.

Sorry about that. Haven't used mailing lists since almost a decade. Forgot how 
it should be used.

Thank you very much everyone for replying.  The advice from everyone is much 
appreciated.

Regards,
Sukhwinder Singh
  --
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [bug] .git/hooks/commit-msg.sample test is reversed

2015-09-15 Thread Matthieu Moy
Federico Fissore  writes:

> Hello everyone
>
> In file commit-msg.sample, grep test is reversed. It greps
> '^Signed-off-by: '
> while it should grep
> 'Signed-off-by: '
>
> If you run the sample against attached file, it won't complain. Remove
> the ^ and it will work as expected

I think the test is correct.

In grep, ^ can have two meanings:

'^foo' means "foo at the start of a line"
'[^abc]' means "one character but not a, b or c"

The Signed-off-by: trailer is meant to be at the start of a line, hence
the ^.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Feature Request] git blame showing only revisions from git rev-list --first-parent

2015-09-15 Thread Jeff King
On Sun, Sep 13, 2015 at 10:19:33PM -0700, Junio C Hamano wrote:

> For that matter, I am not sure how "blame A..B" with first-parent &
> reverse should behave when A is not an ancestor on the first-parent
> chain. Wouldn't we try to find a cut-point on the first-parent chain by
> traversing first-parent chain from B and painting them as positive,
> while traversing _all_ parents from B and painting them as negative,
> until the traversal intersect? And wouldn't we discover at least two
> children (one positive and one negative) for the cut point we discover
> by that traversal? That cut point would be the (fake) latest state the
> blame traversal starts at, and then we try to use the first (fake) parent
> that in real life is the first child (which we do not have a good definition
> for). And at that point a simple panda brain explodes ;-)
> 
> We might end up doing the right thing even in that case, but I haven't
> convinced myself about that (yet).  If the change were limited to "blame",
> the change may be much less problematic.

Good point.

It seems like nobody is actually that interested in what "blame
--first-parent --reverse" does in the first place, though, and there's
no reason for its complexity to hold up vanilla --first-parent. So what
do you think of:

-- >8 --
Subject: [PATCH] blame: handle --first-parent

The revision.c options-parser will parse "--first-parent"
for us, but the blame code does not actually respect it, as
we simply iterate over the whole list returned by
first_scapegoat(). We can fix this by returning a
truncated parent list.

Note that we could technically also do so by limiting the
return value of num_scapegoats(), but that is less robust.
We would rely on nobody ever looking at the "next" pointer
from the returned list.

Combining "--reverse" with "--first-parent" is more
complicated, and will probably involve cooperation from
revision.c. Since the desired semantics are not even clear,
let's punt on this for now, but explicitly disallow it to
avoid confusing users (this is not really a regression,
since it did something nonsensical before).

Signed-off-by: Jeff King 
---
 builtin/blame.c | 11 ++-
 t/annotate-tests.sh |  4 
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 4db01c1..ae4301c 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1366,8 +1366,15 @@ static void pass_whole_blame(struct scoreboard *sb,
  */
 static struct commit_list *first_scapegoat(struct rev_info *revs, struct 
commit *commit)
 {
-   if (!reverse)
+   if (!reverse) {
+   if (revs->first_parent_only &&
+   commit->parents &&
+   commit->parents->next) {
+   free_commit_list(commit->parents->next);
+   commit->parents->next = NULL;
+   }
return commit->parents;
+   }
return lookup_decoration(>children, >object);
 }
 
@@ -2680,6 +2687,8 @@ parse_done:
}
else if (contents_from)
die("--contents and --children do not blend well.");
+   else if (revs.first_parent_only)
+   die("combining --first-parent and --reverse is not supported");
else {
final_commit_name = prepare_initial();
sb.commits.compare = compare_commits_by_reverse_commit_date;
diff --git a/t/annotate-tests.sh b/t/annotate-tests.sh
index f5c0175..b1673b3 100644
--- a/t/annotate-tests.sh
+++ b/t/annotate-tests.sh
@@ -111,6 +111,10 @@ test_expect_success 'blame 2 authors + 2 merged-in 
authors' '
check_count A 2 B 1 B1 2 B2 1
 '
 
+test_expect_success 'blame --first-parent blames merge for branch1' '
+   check_count --first-parent A 2 B 1 "A U Thor" 2 B2 1
+'
+
 test_expect_success 'blame ancestor' '
check_count -h master A 2 B 2
 '
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Developing- Where to Start

2015-09-15 Thread Matthieu Moy
Breanna Devore-McDonald  writes:

> Hello all,

Hello,

> I'm a third year Computer Science student at the University of Notre
> Dame, and for the final project of my Data Structures class, my group
> and I have to find a way to contribute our (hopefully) newly-found
> data structures and optimization knowledge to a well-known open source
> project. We thought Git would be a great project to work on, but we
> have no idea where to start.

I am a teacher and I offer my students a similar project. I maintain a
list of small projects as source of inspiration for students:

  https://git.wiki.kernel.org/index.php/SmallProjectsIdeas

Some advices before you get started:

* Start with a very, very small contribution, as soon as possible. The
  list of microprojects from GSoC is a good place to start, and the
  easiest projects in the page above can be of interest too.

  This way, you will:

  - Understand how submitting contributions work (if you haven't done
so, read 
https://github.com/git/git/blob/master/Documentation/SubmittingPatches )

  - Get an idea of how productive you are when working with the Git
codebase.

* Even for the actual project, don't try something hard. The difficulty
  here is not to write the first draft, but to get it properly tested,
  reviewed, to respond to reviewers, and finally get the code accepted
  into git.git. Unlike traditional school projects, you can't anticipate
  difficulties (no teacher wrote the correct version for you in
  advance...), and the quality standard is much higher than what you're
  probably used to. A rough heuristics: estimate how long you are going
  to take for your project, multiply by 2 or 3, and you're still getting
  an underestimation.

  As a comparison, for Google summer of code, a very good student works
  full-time for 2 months and writes about 2000 to 3000 lines of code.

  BTW, how long is your project? How many students in your group?

* Interact with the mailing-list as much as you can and as early as you
  can. What you want to avoid is to write a large amount of code and get
  it reviewed at the end of your project. That would almost certainly
  result in requests for changes that you wouldn't have time to apply.

OK, that may all be a bit discourraging ;-). But it shouldn't: it's
_really_ very interesting to contribute to a project like Git.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[bug] .git/hooks/commit-msg.sample test is reversed

2015-09-15 Thread Federico Fissore

Hello everyone

In file commit-msg.sample, grep test is reversed. It greps
'^Signed-off-by: '
while it should grep
'Signed-off-by: '

If you run the sample against attached file, it won't complain. Remove 
the ^ and it will work as expected


Regards

Federico
Commit message

Signed-off-by: Me 
Signed-off-by: Me 



[PATCH 19/67] stop_progress_msg: convert sprintf to xsnprintf

2015-09-15 Thread Jeff King
The usual arguments for using xsnprintf over sprintf apply,
but this case is a little tricky. We print to a static
buffer if we have room, and otherwise to an allocated
buffer. So there should be no overflow here, but it is still
good to communicate our intention, as well as to check our
earlier math for how much space the string will need.

Signed-off-by: Jeff King 
---
 progress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/progress.c b/progress.c
index a3efcfd..353bd37 100644
--- a/progress.c
+++ b/progress.c
@@ -254,7 +254,7 @@ void stop_progress_msg(struct progress **p_progress, const 
char *msg)
throughput_string(>display, tp->curr_total, rate);
}
progress_update = 1;
-   sprintf(bufp, ", %s.\n", msg);
+   xsnprintf(bufp, len + 1, ", %s.\n", msg);
display(progress, progress->last_value, bufp);
if (buf != bufp)
free(bufp);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/2] git-p4: handle "Translation of file content failed"

2015-09-15 Thread Lars Schneider

On 15 Sep 2015, at 08:43, Luke Diamand  wrote:

> On 14/09/15 17:55, larsxschnei...@gmail.com wrote:
>> From: Lars Schneider 
>> 
>> A P4 repository can get into a state where it contains a file with
>> type UTF-16 that does not contain a valid UTF-16 BOM. If git-p4
> 
> Sorry - what's a BOM? I'm assuming it's not a Bill of Materials?
BOM stands for Byte Order Mark. The UTF-16 BOM is a two byte sequence at the 
beginning of a UTF-16 file. It is not part of the actual content. It is only 
used to define the encoding of the remaining file. FEFF stands for UTF-16 
big-endian encoding and FFFE for little-endian encoding.

More info here: http://www.unicode.org/faq/utf_bom.html#bom1


> Do we know the mechanism by which we end up in this state?
Unfortunately no. I tried hard to reproduce the error with “conventional” 
methods. As you can see I ended up manipulating the P4 database…

However, it looks like this error happens in the wild, too:
https://stackoverflow.com/questions/5156909/translation-of-file-content-failed-error-in-perforce
https://stackoverflow.com/questions/887006/perforce-translation-of-file-content-failed-error


>> attempts to retrieve the file then the process crashes with a
>> "Translation of file content failed" error.
>> 
>> Fix this by detecting this error and retrieving the file as binary
>> instead. The result in Git is the same.
>> 
>> Known issue: This works only if git-p4 is executed in verbose mode.
>> In normal mode no exceptions are thrown and git-p4 just exits.
> 
> Does that mean that the error will only be detected in verbose mode? That 
> doesn't seem right!
Correct. I don’t like this either but I also don’t want to make huge changes to 
git-p4.
You can see the root problem here:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L110-L114

Any idea how to approach that best?


> 
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>>  git-p4.py | 27 ---
>>  1 file changed, 16 insertions(+), 11 deletions(-)
>> 
>> diff --git a/git-p4.py b/git-p4.py
>> index 073f87b..5ae25a6 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -134,13 +134,11 @@ def read_pipe(c, ignore_error=False):
>>  sys.stderr.write('Reading pipe: %s\n' % str(c))
>> 
>>  expand = isinstance(c,basestring)
>> -p = subprocess.Popen(c, stdout=subprocess.PIPE, shell=expand)
>> -pipe = p.stdout
>> -val = pipe.read()
>> -if p.wait() and not ignore_error:
>> -die('Command failed: %s' % str(c))
>> -
>> -return val
>> +p = subprocess.Popen(c, stdout=subprocess.PIPE, stderr=subprocess.PIPE, 
>> shell=expand)
>> +(out, err) = p.communicate()
>> +if p.returncode != 0 and not ignore_error:
>> +die('Command failed: %s\nError: %s' % (str(c), err))
>> +return out
>> 
>>  def p4_read_pipe(c, ignore_error=False):
>>  real_cmd = p4_build_cmd(c)
>> @@ -2186,10 +2184,17 @@ class P4Sync(Command, P4UserMap):
>>  # them back too.  This is not needed to the cygwin windows 
>> version,
>>  # just the native "NT" type.
>>  #
>> -text = p4_read_pipe(['print', '-q', '-o', '-', "%s@%s" % 
>> (file['depotFile'], file['change']) ])
>> -if p4_version_string().find("/NT") >= 0:
>> -text = text.replace("\r\n", "\n")
>> -contents = [ text ]
>> +try:
>> +text = p4_read_pipe(['print', '-q', '-o', '-', '%s@%s' % 
>> (file['depotFile'], file['change'])])
>> +except Exception as e:
> 
> Would it be better to specify which kind of Exception you are catching? Looks 
> like you could get OSError, ValueError and CalledProcessError; it's the last 
> of these you want (I think).
I think it is just a plain exception. See here:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L111


> 
>> +if 'Translation of file content failed' in str(e):
>> +type_base = 'binary'
>> +else:
>> +raise e
>> +else:
>> +if p4_version_string().find('/NT') >= 0:
>> +text = text.replace('\r\n', '\n')
>> +contents = [ text ]
> 
> The indentation on this bit doesn't look right to me.
I believe it is exactly how it was:
https://github.com/git/git/blob/97d7ad75b6fe74960d2a12e4a9151a55a5a87d6d/git-p4.py#L2397-L2399


In general, what is the appropriate way to reference code in this email list? 
Are GitHub links OK?


Thanks,
Lars

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 18/67] find_short_object_filename: convert sprintf to xsnprintf

2015-09-15 Thread Jeff King
We use sprintf() to format some hex data into a buffer. The
buffer is clearly long enough, and using snprintf here is
not necessary. And in fact, it does not really make anything
easier to audit, as the size we feed to snprintf accounts
for the magic extra 42 bytes found in each alt->name field
of struct alternate_object_database (which is there exactly
to do this formatting).

Still, it is nice to remove an sprintf call and replace it
with an xsnprintf and explanatory comment, which makes it
easier to audit the code base for overflows.

Signed-off-by: Jeff King 
---
 sha1_name.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index 416e408..ed42f79 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -96,11 +96,15 @@ static void find_short_object_filename(int len, const char 
*hex_pfx, struct disa
}
fakeent->next = alt_odb_list;
 
-   sprintf(hex, "%.2s", hex_pfx);
+   xsnprintf(hex, sizeof(hex), "%.2s", hex_pfx);
for (alt = fakeent; alt && !ds->ambiguous; alt = alt->next) {
struct dirent *de;
DIR *dir;
-   sprintf(alt->name, "%.2s/", hex_pfx);
+   /*
+* every alt_odb struct has 42 extra bytes after the base
+* for exactly this purpose
+*/
+   xsnprintf(alt->name, 42, "%.2s/", hex_pfx);
dir = opendir(alt->base);
if (!dir)
continue;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 20/67] compat/hstrerror: convert sprintf to snprintf

2015-09-15 Thread Jeff King
This is a trivially correct use of sprintf, as our error
number should not be excessively long. But it's still nice
to drop an sprintf call.

Note that we cannot use xsnprintf here, because this is
compat code which does not load git-compat-util.h.

Signed-off-by: Jeff King 
---
 compat/hstrerror.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compat/hstrerror.c b/compat/hstrerror.c
index 069c555..b85a2fa 100644
--- a/compat/hstrerror.c
+++ b/compat/hstrerror.c
@@ -16,6 +16,6 @@ const char *githstrerror(int err)
case TRY_AGAIN:
return "Non-authoritative \"host not found\", or SERVERFAIL";
}
-   sprintf(buffer, "Name resolution error %d", err);
+   snprintf(buffer, sizeof(buffer), "Name resolution error %d", err);
return buffer;
 }
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 13/67] test-dump-cache-tree: avoid overflow of cache-tree name

2015-09-15 Thread Jeff King
When dumping a cache-tree, we sprintf sub-tree names directly
into a fixed-size buffer, which can overflow. We can
trivially fix this by converting to xsnprintf to at least
notice and die.

This probably should handle arbitrary-sized names, but
there's not much point. It's used only by the test scripts,
so the trivial fix is enough.

Signed-off-by: Jeff King 
---
 test-dump-cache-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c
index 54c0872..bb53c0a 100644
--- a/test-dump-cache-tree.c
+++ b/test-dump-cache-tree.c
@@ -47,7 +47,7 @@ static int dump_cache_tree(struct cache_tree *it,
struct cache_tree_sub *rdwn;
 
rdwn = cache_tree_sub(ref, down->name);
-   sprintf(path, "%s%.*s/", pfx, down->namelen, down->name);
+   xsnprintf(path, sizeof(path), "%s%.*s/", pfx, down->namelen, 
down->name);
if (dump_cache_tree(down->cache_tree, rdwn->cache_tree, path))
errs = 1;
}
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 21/67] grep: use xsnprintf to format failure message

2015-09-15 Thread Jeff King
This looks at first glance like the sprintf can overflow our
buffer, but it's actually fine; the p->origin string is
something constant and small, like "command line" or "-e
option".

Signed-off-by: Jeff King 
---
 grep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index b58c7c6..6c68d5b 100644
--- a/grep.c
+++ b/grep.c
@@ -306,9 +306,9 @@ static NORETURN void compile_regexp_failed(const struct 
grep_pat *p,
char where[1024];
 
if (p->no)
-   sprintf(where, "In '%s' at %d, ", p->origin, p->no);
+   xsnprintf(where, sizeof(where), "In '%s' at %d, ", p->origin, 
p->no);
else if (p->origin)
-   sprintf(where, "%s, ", p->origin);
+   xsnprintf(where, sizeof(where), "%s, ", p->origin);
else
where[0] = 0;
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 41/67] apply: convert root string to strbuf

2015-09-15 Thread Jeff King
We use manual computation and strcpy to allocate the "root"
variable. This would be much simpler using xstrfmt.  But
since we store the length, too, we can just use a strbuf,
which handles that for us.

Note that we stop distinguishing between "no root" and
"empty root" in some cases, but that's OK; the results are
the same (e.g., inserting an empty string is a noop).

Signed-off-by: Jeff King 
---
 builtin/apply.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 094a20f..1d4d439 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -77,8 +77,7 @@ static enum ws_ignore {
 
 
 static const char *patch_input_file;
-static const char *root;
-static int root_len;
+struct strbuf root = STRBUF_INIT;
 static int read_stdin = 1;
 static int options;
 
@@ -494,8 +493,8 @@ static char *find_name_gnu(const char *line, const char 
*def, int p_value)
}
 
strbuf_remove(, 0, cp - name.buf);
-   if (root)
-   strbuf_insert(, 0, root, root_len);
+   if (root.len)
+   strbuf_insert(, 0, root.buf, root.len);
return squash_slash(strbuf_detach(, NULL));
 }
 
@@ -697,8 +696,8 @@ static char *find_name_common(const char *line, const char 
*def,
return squash_slash(xstrdup(def));
}
 
-   if (root) {
-   char *ret = xstrfmt("%s%.*s", root, len, start);
+   if (root.len) {
+   char *ret = xstrfmt("%s%.*s", root.buf, len, start);
return squash_slash(ret);
}
 
@@ -1274,8 +1273,8 @@ static int parse_git_header(const char *line, int len, 
unsigned int size, struct
 * the default name from the header.
 */
patch->def_name = git_header_name(line, len);
-   if (patch->def_name && root) {
-   char *s = xstrfmt("%s%s", root, patch->def_name);
+   if (patch->def_name && root.len) {
+   char *s = xstrfmt("%s%s", root.buf, patch->def_name);
free(patch->def_name);
patch->def_name = s;
}
@@ -4498,14 +4497,9 @@ static int option_parse_whitespace(const struct option 
*opt,
 static int option_parse_directory(const struct option *opt,
  const char *arg, int unset)
 {
-   root_len = strlen(arg);
-   if (root_len && arg[root_len - 1] != '/') {
-   char *new_root;
-   root = new_root = xmalloc(root_len + 2);
-   strcpy(new_root, arg);
-   strcpy(new_root + root_len++, "/");
-   } else
-   root = arg;
+   strbuf_reset();
+   strbuf_addstr(, arg);
+   strbuf_complete(, '/');
return 0;
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 42/67] transport: use strbufs for status table "quickref" strings

2015-09-15 Thread Jeff King
We generate range strings like "1234abcd...5678efab" for use
in the the fetch and push status tables. We use fixed-size
buffers along with strcat to do so. These aren't buggy, as
our manual size computation is correct, but there's nothing
checking that this is so.  Let's switch them to strbufs
instead, which are obviously correct, and make it easier to
audit the code base for problematic calls to strcat().

Signed-off-by: Jeff King 
---
 builtin/fetch.c | 22 --
 transport.c | 13 +++--
 2 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 4703725..841880e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -528,36 +528,38 @@ static int update_local_ref(struct ref *ref,
}
 
if (in_merge_bases(current, updated)) {
-   char quickref[83];
+   struct strbuf quickref = STRBUF_INIT;
int r;
-   strcpy(quickref, find_unique_abbrev(current->object.sha1, 
DEFAULT_ABBREV));
-   strcat(quickref, "..");
-   strcat(quickref, find_unique_abbrev(ref->new_sha1, 
DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(, current->object.sha1, 
DEFAULT_ABBREV);
+   strbuf_addstr(, "..");
+   strbuf_add_unique_abbrev(, ref->new_sha1, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(ref->new_sha1);
r = s_update_ref("fast-forward", ref, 1);
strbuf_addf(display, "%c %-*s %-*s -> %s%s",
r ? '!' : ' ',
-   TRANSPORT_SUMMARY_WIDTH, quickref,
+   TRANSPORT_SUMMARY_WIDTH, quickref.buf,
REFCOL_WIDTH, remote, pretty_ref,
r ? _("  (unable to update local ref)") : "");
+   strbuf_release();
return r;
} else if (force || ref->force) {
-   char quickref[84];
+   struct strbuf quickref = STRBUF_INIT;
int r;
-   strcpy(quickref, find_unique_abbrev(current->object.sha1, 
DEFAULT_ABBREV));
-   strcat(quickref, "...");
-   strcat(quickref, find_unique_abbrev(ref->new_sha1, 
DEFAULT_ABBREV));
+   strbuf_add_unique_abbrev(, current->object.sha1, 
DEFAULT_ABBREV);
+   strbuf_addstr(, "...");
+   strbuf_add_unique_abbrev(, ref->new_sha1, 
DEFAULT_ABBREV);
if ((recurse_submodules != RECURSE_SUBMODULES_OFF) &&
(recurse_submodules != RECURSE_SUBMODULES_ON))
check_for_new_submodule_commits(ref->new_sha1);
r = s_update_ref("forced-update", ref, 1);
strbuf_addf(display, "%c %-*s %-*s -> %s  (%s)",
r ? '!' : '+',
-   TRANSPORT_SUMMARY_WIDTH, quickref,
+   TRANSPORT_SUMMARY_WIDTH, quickref.buf,
REFCOL_WIDTH, remote, pretty_ref,
r ? _("unable to update local ref") : _("forced 
update"));
+   strbuf_release();
return r;
} else {
strbuf_addf(display, "! %-*s %-*s -> %s  %s",
diff --git a/transport.c b/transport.c
index 2d51348..3b47d49 100644
--- a/transport.c
+++ b/transport.c
@@ -654,23 +654,24 @@ static void print_ok_ref_status(struct ref *ref, int 
porcelain)
"[new branch]"),
ref, ref->peer_ref, NULL, porcelain);
else {
-   char quickref[84];
+   struct strbuf quickref = STRBUF_INIT;
char type;
const char *msg;
 
-   strcpy(quickref, status_abbrev(ref->old_sha1));
+   strbuf_addstr(, status_abbrev(ref->old_sha1));
if (ref->forced_update) {
-   strcat(quickref, "...");
+   strbuf_addstr(, "...");
type = '+';
msg = "forced update";
} else {
-   strcat(quickref, "..");
+   strbuf_addstr(, "..");
type = ' ';
msg = NULL;
}
-   strcat(quickref, status_abbrev(ref->new_sha1));
+   strbuf_addstr(, status_abbrev(ref->new_sha1));
 
-   print_ref_status(type, quickref, ref, ref->peer_ref, msg, 
porcelain);
+   print_ref_status(type, quickref.buf, ref, ref->peer_ref, msg, 
porcelain);
+   strbuf_release();
}
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  

[PATCH 48/67] fetch-pack: use argv_array for index-pack / unpack-objects

2015-09-15 Thread Jeff King
This cleans up a magic number that must be kept in sync with
the rest of the code (the number of argv slots). It also
lets us drop some fixed buffers and an sprintf (since we
can now use argv_array_pushf).

We do still have to keep one fixed buffer for calling
gethostname, but at least now the size computations for it
are much simpler.

Signed-off-by: Jeff King 
---
 fetch-pack.c | 56 +++-
 1 file changed, 27 insertions(+), 29 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 820251a..2dabee9 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -681,11 +681,10 @@ static int get_pack(struct fetch_pack_args *args,
int xd[2], char **pack_lockfile)
 {
struct async demux;
-   const char *argv[22];
-   char keep_arg[256];
-   char hdr_arg[256];
-   const char **av, *cmd_name;
int do_keep = args->keep_pack;
+   const char *cmd_name;
+   struct pack_header header;
+   int pass_header = 0;
struct child_process cmd = CHILD_PROCESS_INIT;
int ret;
 
@@ -705,17 +704,11 @@ static int get_pack(struct fetch_pack_args *args,
else
demux.out = xd[0];
 
-   cmd.argv = argv;
-   av = argv;
-   *hdr_arg = 0;
if (!args->keep_pack && unpack_limit) {
-   struct pack_header header;
 
if (read_pack_header(demux.out, ))
die("protocol error: bad pack header");
-   snprintf(hdr_arg, sizeof(hdr_arg),
-"--pack_header=%"PRIu32",%"PRIu32,
-ntohl(header.hdr_version), ntohl(header.hdr_entries));
+   pass_header = 1;
if (ntohl(header.hdr_entries) < unpack_limit)
do_keep = 0;
else
@@ -723,44 +716,49 @@ static int get_pack(struct fetch_pack_args *args,
}
 
if (alternate_shallow_file) {
-   *av++ = "--shallow-file";
-   *av++ = alternate_shallow_file;
+   argv_array_push(, "--shallow-file");
+   argv_array_push(, alternate_shallow_file);
}
 
if (do_keep) {
if (pack_lockfile)
cmd.out = -1;
-   *av++ = cmd_name = "index-pack";
-   *av++ = "--stdin";
+   cmd_name = "index-pack";
+   argv_array_push(, cmd_name);
+   argv_array_push(, "--stdin");
if (!args->quiet && !args->no_progress)
-   *av++ = "-v";
+   argv_array_push(, "-v");
if (args->use_thin_pack)
-   *av++ = "--fix-thin";
+   argv_array_push(, "--fix-thin");
if (args->lock_pack || unpack_limit) {
-   int s = sprintf(keep_arg,
-   "--keep=fetch-pack %"PRIuMAX " on ", 
(uintmax_t) getpid());
-   if (gethostname(keep_arg + s, sizeof(keep_arg) - s))
-   strcpy(keep_arg + s, "localhost");
-   *av++ = keep_arg;
+   char hostname[256];
+   if (gethostname(hostname, sizeof(hostname)))
+   xsnprintf(hostname, sizeof(hostname), 
"localhost");
+   argv_array_pushf(,
+   "--keep=fetch-pack %"PRIuMAX " on %s",
+   (uintmax_t)getpid(), hostname);
}
if (args->check_self_contained_and_connected)
-   *av++ = "--check-self-contained-and-connected";
+   argv_array_push(, 
"--check-self-contained-and-connected");
}
else {
-   *av++ = cmd_name = "unpack-objects";
+   cmd_name = "unpack-objects";
+   argv_array_push(, cmd_name);
if (args->quiet || args->no_progress)
-   *av++ = "-q";
+   argv_array_push(, "-q");
args->check_self_contained_and_connected = 0;
}
-   if (*hdr_arg)
-   *av++ = hdr_arg;
+
+   if (pass_header)
+   argv_array_pushf(, "--pack_header=%"PRIu32",%"PRIu32,
+ntohl(header.hdr_version),
+ntohl(header.hdr_entries));
if (fetch_fsck_objects >= 0
? fetch_fsck_objects
: transfer_fsck_objects >= 0
? transfer_fsck_objects
: 0)
-   *av++ = "--strict";
-   *av++ = NULL;
+   argv_array_push(, "--strict");
 
cmd.in = demux.out;
cmd.git_cmd = 1;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 50/67] stat_tracking_info: convert to argv_array

2015-09-15 Thread Jeff King
In addition to dropping the magic number for the fixed-size
argv, we can also drop a fixed-length buffer and some
strcpy's into it.

Signed-off-by: Jeff King 
---
 remote.c | 26 --
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/remote.c b/remote.c
index 1b69751..255d39a 100644
--- a/remote.c
+++ b/remote.c
@@ -8,6 +8,7 @@
 #include "tag.h"
 #include "string-list.h"
 #include "mergesort.h"
+#include "argv-array.h"
 
 enum map_direction { FROM_SRC, FROM_DST };
 
@@ -2027,10 +2028,9 @@ int stat_tracking_info(struct branch *branch, int 
*num_ours, int *num_theirs,
 {
unsigned char sha1[20];
struct commit *ours, *theirs;
-   char symmetric[84];
struct rev_info revs;
-   const char *rev_argv[10], *base;
-   int rev_argc;
+   const char *base;
+   struct argv_array argv = ARGV_ARRAY_INIT;
 
/* Cannot stat unless we are marked to build on top of somebody else. */
base = branch_get_upstream(branch, NULL);
@@ -2059,19 +2059,15 @@ int stat_tracking_info(struct branch *branch, int 
*num_ours, int *num_theirs,
}
 
/* Run "rev-list --left-right ours...theirs" internally... */
-   rev_argc = 0;
-   rev_argv[rev_argc++] = NULL;
-   rev_argv[rev_argc++] = "--left-right";
-   rev_argv[rev_argc++] = symmetric;
-   rev_argv[rev_argc++] = "--";
-   rev_argv[rev_argc] = NULL;
-
-   strcpy(symmetric, sha1_to_hex(ours->object.sha1));
-   strcpy(symmetric + 40, "...");
-   strcpy(symmetric + 43, sha1_to_hex(theirs->object.sha1));
+   argv_array_push(, ""); /* ignored */
+   argv_array_push(, "--left-right");
+   argv_array_pushf(, "%s...%s",
+sha1_to_hex(ours->object.sha1),
+sha1_to_hex(theirs->object.sha1));
+   argv_array_push(, "--");
 
init_revisions(, NULL);
-   setup_revisions(rev_argc, rev_argv, , NULL);
+   setup_revisions(argv.argc, argv.argv, , NULL);
if (prepare_revision_walk())
die("revision walk setup failed");
 
@@ -2091,6 +2087,8 @@ int stat_tracking_info(struct branch *branch, int 
*num_ours, int *num_theirs,
/* clear object flags smudged by the above traversal */
clear_commit_marks(ours, ALL_REV_FLAGS);
clear_commit_marks(theirs, ALL_REV_FLAGS);
+
+   argv_array_clear();
return 0;
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 44/67] enter_repo: convert fixed-size buffers to strbufs

2015-09-15 Thread Jeff King
We use two PATH_MAX-sized buffers to represent the repo
path, and must make sure not to overflow them. We do take
care to check the lengths, but the logic is rather hard to
follow, as we use several magic numbers (e.g., "PATH_MAX -
10"). And in fact you _can_ overflow the buffer if you have
a ".git" file with an extremely long path in it.

By switching to strbufs, these problems all go away. We do,
however, retain the check that the initial input we get is
no larger than PATH_MAX. This function is an entry point for
untrusted repo names from the network, and it's a good idea
to keep a sanity check (both to avoid allocating arbitrary
amounts of memory, and also as a layer of defense against
any downstream users of the names).

Signed-off-by: Jeff King 
---
 path.c | 57 +
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/path.c b/path.c
index 46a4d27..60e0390 100644
--- a/path.c
+++ b/path.c
@@ -391,8 +391,8 @@ return_null:
  */
 const char *enter_repo(const char *path, int strict)
 {
-   static char used_path[PATH_MAX];
-   static char validated_path[PATH_MAX];
+   static struct strbuf validated_path = STRBUF_INIT;
+   static struct strbuf used_path = STRBUF_INIT;
 
if (!path)
return NULL;
@@ -407,46 +407,47 @@ const char *enter_repo(const char *path, int strict)
while ((1 < len) && (path[len-1] == '/'))
len--;
 
+   /*
+* We can handle arbitrary-sized buffers, but this remains as a
+* sanity check on untrusted input.
+*/
if (PATH_MAX <= len)
return NULL;
-   strncpy(used_path, path, len); used_path[len] = 0 ;
-   strcpy(validated_path, used_path);
 
-   if (used_path[0] == '~') {
-   char *newpath = expand_user_path(used_path);
-   if (!newpath || (PATH_MAX - 10 < strlen(newpath))) {
-   free(newpath);
+   strbuf_reset(_path);
+   strbuf_reset(_path);
+   strbuf_add(_path, path, len);
+   strbuf_add(_path, path, len);
+
+   if (used_path.buf[0] == '~') {
+   char *newpath = expand_user_path(used_path.buf);
+   if (!newpath)
return NULL;
-   }
-   /*
-* Copy back into the static buffer. A pity
-* since newpath was not bounded, but other
-* branches of the if are limited by PATH_MAX
-* anyway.
-*/
-   strcpy(used_path, newpath); free(newpath);
+   strbuf_attach(_path, newpath, strlen(newpath),
+ strlen(newpath));
}
-   else if (PATH_MAX - 10 < len)
-   return NULL;
-   len = strlen(used_path);
for (i = 0; suffix[i]; i++) {
struct stat st;
-   strcpy(used_path + len, suffix[i]);
-   if (!stat(used_path, ) &&
+   size_t baselen = used_path.len;
+   strbuf_addstr(_path, suffix[i]);
+   if (!stat(used_path.buf, ) &&
(S_ISREG(st.st_mode) ||
-   (S_ISDIR(st.st_mode) && 
is_git_directory(used_path {
-   strcat(validated_path, suffix[i]);
+   (S_ISDIR(st.st_mode) && 
is_git_directory(used_path.buf {
+   strbuf_addstr(_path, suffix[i]);
break;
}
+   strbuf_setlen(_path, baselen);
}
if (!suffix[i])
return NULL;
-   gitfile = read_gitfile(used_path) ;
-   if (gitfile)
-   strcpy(used_path, gitfile);
-   if (chdir(used_path))
+   gitfile = read_gitfile(used_path.buf) ;
+   if (gitfile) {
+   strbuf_reset(_path);
+   strbuf_addstr(_path, gitfile);
+   }
+   if (chdir(used_path.buf))
return NULL;
-   path = validated_path;
+   path = validated_path.buf;
}
else if (chdir(path))
return NULL;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 46/67] write_loose_object: convert to strbuf

2015-09-15 Thread Jeff King
When creating a loose object tempfile, we use a fixed
PATH_MAX-sized buffer, and strcpy directly into it. This
isn't buggy, because we do a rough check of the size, but
there's no verification that our guesstimate of the required
space is enough (in fact, it's several bytes too big for the
current naming scheme).

Let's switch to a strbuf, which makes this much easier to
verify. The allocation overhead should be negligible, since
we are replacing a static buffer with a static strbuf, and
we'll only need to allocate on the first call.

While we're here, we can also document a subtle interaction
with mkstemp that would be easy to overlook.

Signed-off-by: Jeff King 
---
 sha1_file.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 9d87b69..374a996 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3007,29 +3007,30 @@ static inline int directory_size(const char *filename)
  * We want to avoid cross-directory filename renames, because those
  * can have problems on various filesystems (FAT, NFS, Coda).
  */
-static int create_tmpfile(char *buffer, size_t bufsiz, const char *filename)
+static int create_tmpfile(struct strbuf *tmp, const char *filename)
 {
int fd, dirlen = directory_size(filename);
 
-   if (dirlen + 20 > bufsiz) {
-   errno = ENAMETOOLONG;
-   return -1;
-   }
-   memcpy(buffer, filename, dirlen);
-   strcpy(buffer + dirlen, "tmp_obj_XX");
-   fd = git_mkstemp_mode(buffer, 0444);
+   strbuf_reset(tmp);
+   strbuf_add(tmp, filename, dirlen);
+   strbuf_addstr(tmp, "tmp_obj_XX");
+   fd = git_mkstemp_mode(tmp->buf, 0444);
if (fd < 0 && dirlen && errno == ENOENT) {
-   /* Make sure the directory exists */
-   memcpy(buffer, filename, dirlen);
-   buffer[dirlen-1] = 0;
-   if (mkdir(buffer, 0777) && errno != EEXIST)
+   /*
+* Make sure the directory exists; note that mkstemp will have
+* put a NUL in our buffer, so we have to rewrite the path,
+* rather than just chomping the length.
+*/
+   strbuf_reset(tmp);
+   strbuf_add(tmp, filename, dirlen - 1);
+   if (mkdir(tmp->buf, 0777) && errno != EEXIST)
return -1;
-   if (adjust_shared_perm(buffer))
+   if (adjust_shared_perm(tmp->buf))
return -1;
 
/* Try again */
-   strcpy(buffer + dirlen - 1, "/tmp_obj_XX");
-   fd = git_mkstemp_mode(buffer, 0444);
+   strbuf_addstr(tmp, "/tmp_obj_XX");
+   fd = git_mkstemp_mode(tmp->buf, 0444);
}
return fd;
 }
@@ -3042,10 +3043,10 @@ static int write_loose_object(const unsigned char 
*sha1, char *hdr, int hdrlen,
git_zstream stream;
git_SHA_CTX c;
unsigned char parano_sha1[20];
-   static char tmp_file[PATH_MAX];
+   static struct strbuf tmp_file = STRBUF_INIT;
const char *filename = sha1_file_name(sha1);
 
-   fd = create_tmpfile(tmp_file, sizeof(tmp_file), filename);
+   fd = create_tmpfile(_file, filename);
if (fd < 0) {
if (errno == EACCES)
return error("insufficient permission for adding an 
object to repository database %s", get_object_directory());
@@ -3094,12 +3095,12 @@ static int write_loose_object(const unsigned char 
*sha1, char *hdr, int hdrlen,
struct utimbuf utb;
utb.actime = mtime;
utb.modtime = mtime;
-   if (utime(tmp_file, ) < 0)
+   if (utime(tmp_file.buf, ) < 0)
warning("failed utime() on %s: %s",
-   tmp_file, strerror(errno));
+   tmp_file.buf, strerror(errno));
}
 
-   return finalize_object_file(tmp_file, filename);
+   return finalize_object_file(tmp_file.buf, filename);
 }
 
 static int freshen_loose_object(const unsigned char *sha1)
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 49/67] http-push: use an argv_array for setup_revisions

2015-09-15 Thread Jeff King
This drops the magic number for the fixed-size argv arrays,
so we do not have to wonder if we are overflowing it. We can
also drop some confusing sha1_to_hex memory allocation
(which seems to predate the ring of buffers allowing
multiple calls), and get rid of an unchecked sprintf call.

Signed-off-by: Jeff King 
---
 http-push.c | 32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/http-push.c b/http-push.c
index e501c28..43a9036 100644
--- a/http-push.c
+++ b/http-push.c
@@ -10,6 +10,7 @@
 #include "remote.h"
 #include "list-objects.h"
 #include "sigchain.h"
+#include "argv-array.h"
 
 #ifdef EXPAT_NEEDS_XMLPARSE_H
 #include 
@@ -1856,9 +1857,7 @@ int main(int argc, char **argv)
new_refs = 0;
for (ref = remote_refs; ref; ref = ref->next) {
char old_hex[60], *new_hex;
-   const char *commit_argv[5];
-   int commit_argc;
-   char *new_sha1_hex, *old_sha1_hex;
+   struct argv_array commit_argv = ARGV_ARRAY_INIT;
 
if (!ref->peer_ref)
continue;
@@ -1937,27 +1936,15 @@ int main(int argc, char **argv)
}
 
/* Set up revision info for this refspec */
-   commit_argc = 3;
-   new_sha1_hex = xstrdup(sha1_to_hex(ref->new_sha1));
-   old_sha1_hex = NULL;
-   commit_argv[1] = "--objects";
-   commit_argv[2] = new_sha1_hex;
-   if (!push_all && !is_null_sha1(ref->old_sha1)) {
-   old_sha1_hex = xmalloc(42);
-   sprintf(old_sha1_hex, "^%s",
-   sha1_to_hex(ref->old_sha1));
-   commit_argv[3] = old_sha1_hex;
-   commit_argc++;
-   }
-   commit_argv[commit_argc] = NULL;
+   argv_array_push(_argv, ""); /* ignored */
+   argv_array_push(_argv, "--objects");
+   argv_array_push(_argv, sha1_to_hex(ref->new_sha1));
+   if (!push_all && !is_null_sha1(ref->old_sha1))
+   argv_array_pushf(_argv, "^%s",
+sha1_to_hex(ref->old_sha1));
init_revisions(, setup_git_directory());
-   setup_revisions(commit_argc, commit_argv, , NULL);
+   setup_revisions(commit_argv.argc, commit_argv.argv, , 
NULL);
revs.edge_hint = 0; /* just in case */
-   free(new_sha1_hex);
-   if (old_sha1_hex) {
-   free(old_sha1_hex);
-   commit_argv[1] = NULL;
-   }
 
/* Generate a list of objects that need to be pushed */
pushing = 0;
@@ -1986,6 +1973,7 @@ int main(int argc, char **argv)
printf("%s %s\n", !rc ? "ok" : "error", ref->name);
unlock_remote(ref_lock);
check_locks();
+   argv_array_clear(_argv);
}
 
/* Update remote server info if appropriate */
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 45/67] remove_leading_path: use a strbuf for internal storage

2015-09-15 Thread Jeff King
This function strcpy's directly into a PATH_MAX-sized
buffer. There's only one caller, which feeds the git_dir into
it, so it's not easy to trigger in practice (even if you fed
a large $GIT_DIR through the environment or .git file, it
would have to actually exist and be accessible on the
filesystem to get to this point). We can fix it by moving to
a strbuf.

Signed-off-by: Jeff King 
---
 path.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/path.c b/path.c
index 60e0390..c597473 100644
--- a/path.c
+++ b/path.c
@@ -632,7 +632,7 @@ const char *relative_path(const char *in, const char 
*prefix,
  */
 const char *remove_leading_path(const char *in, const char *prefix)
 {
-   static char buf[PATH_MAX + 1];
+   static struct strbuf buf = STRBUF_INIT;
int i = 0, j = 0;
 
if (!prefix || !prefix[0])
@@ -661,11 +661,13 @@ const char *remove_leading_path(const char *in, const 
char *prefix)
return in;
while (is_dir_sep(in[j]))
j++;
+
+   strbuf_reset();
if (!in[j])
-   strcpy(buf, ".");
+   strbuf_addstr(, ".");
else
-   strcpy(buf, in + j);
-   return buf;
+   strbuf_addstr(, in + j);
+   return buf.buf;
 }
 
 /*
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 43/67] merge-recursive: convert malloc / strcpy to strbuf

2015-09-15 Thread Jeff King
This would be a fairly routine use of xstrfmt, except that
we need to remember the length of the result to pass to
cache_name_pos. So just use a strbuf, which makes this
simple.

As a bonus, this gets rid of confusing references to
"pathlen+1". The "1" is for the trailing slash we added, but
that is automatically accounted for in the strbuf's len
parameter.

Signed-off-by: Jeff King 
---
 merge-recursive.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 44d85be..a5e74d8 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -630,25 +630,24 @@ static char *unique_path(struct merge_options *o, const 
char *path, const char *
 
 static int dir_in_way(const char *path, int check_working_copy)
 {
-   int pos, pathlen = strlen(path);
-   char *dirpath = xmalloc(pathlen + 2);
+   int pos;
+   struct strbuf dirpath = STRBUF_INIT;
struct stat st;
 
-   strcpy(dirpath, path);
-   dirpath[pathlen] = '/';
-   dirpath[pathlen+1] = '\0';
+   strbuf_addstr(, path);
+   strbuf_addch(, '/');
 
-   pos = cache_name_pos(dirpath, pathlen+1);
+   pos = cache_name_pos(dirpath.buf, dirpath.len);
 
if (pos < 0)
pos = -1 - pos;
if (pos < active_nr &&
-   !strncmp(dirpath, active_cache[pos]->name, pathlen+1)) {
-   free(dirpath);
+   !strncmp(dirpath.buf, active_cache[pos]->name, dirpath.len)) {
+   strbuf_release();
return 1;
}
 
-   free(dirpath);
+   strbuf_release();
return check_working_copy && !lstat(path, ) && S_ISDIR(st.st_mode);
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Lars Schneider

On 15 Sep 2015, at 09:31, Luke Diamand  wrote:

> On 14/09/15 18:10, larsxschnei...@gmail.com wrote:
>> From: Lars Schneider 
>> 
>> If a path with non-ASCII characters is detected then print always the
> 
> s/print always/print/
I will fix it.

> 
> 
>> encoding and the encoded string in verbose mode.
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>>  git-p4.py | 19 +--
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>> 
>> diff --git a/git-p4.py b/git-p4.py
>> index d45cf2b..da25d3f 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>>  text = regexp.sub(r'$\1$', text)
>>  contents = [ text ]
>> 
>> -if gitConfig("git-p4.pathEncoding"):
>> -relPath = 
>> relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
>> -elif self.verbose:
>> -try:
>> -relPath.decode('ascii')
>> -except:
>> -print (
>> -"Path with Non-ASCII characters detected and no path 
>> encoding defined. "
>> -"Please check the encoding: %s" % relPath
>> -)
>> +try:
>> +relPath.decode('ascii')
>> +except:
>> +encoding = 'utf8'
>> +if gitConfig('git-p4.pathEncoding'):
>> +encoding = gitConfig('git-p4.pathEncoding')
> 
> It would be better to query this once at startup. Otherwise we're potentially 
> forking "git config" twice per file which on a large repo could become 
> significant. Make it an instance variable perhaps?
solved in other email

> 
>> +relPath = relPath.decode(encoding).encode('utf8', 'replace')
>> +if self.verbose:
>> +print 'Path with non-ASCII characters detected. Used %s to 
>> encode: %s ' % (encoding, relPath)
>> 
>>  self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))

Thanks!--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 47/67] diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat

2015-09-15 Thread Jeff King
We dynamically allocate a buffer and then strcpy and strcat
into it. This isn't buggy, but we'd prefer to avoid these
suspicious functions.

This would be a good candidate for converstion to xstrfmt,
but we need to record the length for dealing with index
entries. A strbuf handles that for us.

Signed-off-by: Jeff King 
---
 sha1_name.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/sha1_name.c b/sha1_name.c
index ed42f79..0f14ea2 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -1293,8 +1293,7 @@ static void diagnose_invalid_index_path(int stage,
const struct cache_entry *ce;
int pos;
unsigned namelen = strlen(filename);
-   unsigned fullnamelen;
-   char *fullname;
+   struct strbuf fullname = STRBUF_INIT;
 
if (!prefix)
prefix = "";
@@ -1314,21 +1313,19 @@ static void diagnose_invalid_index_path(int stage,
}
 
/* Confusion between relative and absolute filenames? */
-   fullnamelen = namelen + strlen(prefix);
-   fullname = xmalloc(fullnamelen + 1);
-   strcpy(fullname, prefix);
-   strcat(fullname, filename);
-   pos = cache_name_pos(fullname, fullnamelen);
+   strbuf_addstr(, prefix);
+   strbuf_addstr(, filename);
+   pos = cache_name_pos(fullname.buf, fullname.len);
if (pos < 0)
pos = -pos - 1;
if (pos < active_nr) {
ce = active_cache[pos];
-   if (ce_namelen(ce) == fullnamelen &&
-   !memcmp(ce->name, fullname, fullnamelen))
+   if (ce_namelen(ce) == fullname.len &&
+   !memcmp(ce->name, fullname.buf, fullname.len))
die("Path '%s' is in the index, but not '%s'.\n"
"Did you mean ':%d:%s' aka ':%d:./%s'?",
-   fullname, filename,
-   ce_stage(ce), fullname,
+   fullname.buf, filename,
+   ce_stage(ce), fullname.buf,
ce_stage(ce), filename);
}
 
@@ -1338,7 +1335,7 @@ static void diagnose_invalid_index_path(int stage,
die("Path '%s' does not exist (neither on disk nor in the 
index).",
filename);
 
-   free(fullname);
+   strbuf_release();
 }
 
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 60/67] color: add color_set helper for copying raw colors

2015-09-15 Thread Jeff King
To set up default colors, we sometimes strcpy() from the
default string literals into our color buffers. This isn't a
bug (assuming the destination is COLOR_MAXLEN bytes), but
makes it harder to audit the code for problematic strcpy
calls.

Let's introduce a color_set which copies under the
assumption that there are COLOR_MAXLEN bytes in the
destination (of course you can call it on a smaller buffer,
so this isn't providing a huge amount of safety, but it's
more convenient than calling xsnprintf yourself).

Signed-off-by: Jeff King 
---
 color.c |  5 +
 color.h |  7 +++
 grep.c  | 32 
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/color.c b/color.c
index 3a3fa33..42f19be 100644
--- a/color.c
+++ b/color.c
@@ -145,6 +145,11 @@ int color_parse(const char *value, char *dst)
return color_parse_mem(value, strlen(value), dst);
 }
 
+void color_set(char *dst, const char *color_bytes)
+{
+   xsnprintf(dst, COLOR_MAXLEN, "%s", color_bytes);
+}
+
 /*
  * Write the ANSI color codes for "c" to "out"; the string should
  * already have the ANSI escape code in it. "out" should have enough
diff --git a/color.h b/color.h
index 7fe77fb..e155d13 100644
--- a/color.h
+++ b/color.h
@@ -75,6 +75,13 @@ extern int color_stdout_is_tty;
 int git_color_config(const char *var, const char *value, void *cb);
 int git_color_default_config(const char *var, const char *value, void *cb);
 
+/*
+ * Set the color buffer (which must be COLOR_MAXLEN bytes)
+ * to the raw color bytes; this is useful for initializing
+ * default color variables.
+ */
+void color_set(char *dst, const char *color_bytes);
+
 int git_config_colorbool(const char *var, const char *value);
 int want_color(int var);
 int color_parse(const char *value, char *dst);
diff --git a/grep.c b/grep.c
index 6c68d5b..7b2b96a 100644
--- a/grep.c
+++ b/grep.c
@@ -31,14 +31,14 @@ void init_grep_defaults(void)
opt->max_depth = -1;
opt->pattern_type_option = GREP_PATTERN_TYPE_UNSPECIFIED;
opt->extended_regexp_option = 0;
-   strcpy(opt->color_context, "");
-   strcpy(opt->color_filename, "");
-   strcpy(opt->color_function, "");
-   strcpy(opt->color_lineno, "");
-   strcpy(opt->color_match_context, GIT_COLOR_BOLD_RED);
-   strcpy(opt->color_match_selected, GIT_COLOR_BOLD_RED);
-   strcpy(opt->color_selected, "");
-   strcpy(opt->color_sep, GIT_COLOR_CYAN);
+   color_set(opt->color_context, "");
+   color_set(opt->color_filename, "");
+   color_set(opt->color_function, "");
+   color_set(opt->color_lineno, "");
+   color_set(opt->color_match_context, GIT_COLOR_BOLD_RED);
+   color_set(opt->color_match_selected, GIT_COLOR_BOLD_RED);
+   color_set(opt->color_selected, "");
+   color_set(opt->color_sep, GIT_COLOR_CYAN);
opt->color = -1;
 }
 
@@ -151,14 +151,14 @@ void grep_init(struct grep_opt *opt, const char *prefix)
opt->regflags = def->regflags;
opt->relative = def->relative;
 
-   strcpy(opt->color_context, def->color_context);
-   strcpy(opt->color_filename, def->color_filename);
-   strcpy(opt->color_function, def->color_function);
-   strcpy(opt->color_lineno, def->color_lineno);
-   strcpy(opt->color_match_context, def->color_match_context);
-   strcpy(opt->color_match_selected, def->color_match_selected);
-   strcpy(opt->color_selected, def->color_selected);
-   strcpy(opt->color_sep, def->color_sep);
+   color_set(opt->color_context, def->color_context);
+   color_set(opt->color_filename, def->color_filename);
+   color_set(opt->color_function, def->color_function);
+   color_set(opt->color_lineno, def->color_lineno);
+   color_set(opt->color_match_context, def->color_match_context);
+   color_set(opt->color_match_selected, def->color_match_selected);
+   color_set(opt->color_selected, def->color_selected);
+   color_set(opt->color_sep, def->color_sep);
 }
 
 void grep_commit_pattern_type(enum grep_pattern_type pattern_type, struct 
grep_opt *opt)
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/67] show-branch: avoid segfault with --reflog of unborn branch

2015-09-15 Thread Jeff King
When no branch is given to the "--reflog" option, we resolve
HEAD to get the default branch. However, if HEAD points to
an unborn branch, resolve_ref returns NULL, and we later
segfault trying to access it.

Signed-off-by: Jeff King 
---
 builtin/show-branch.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 408ce70..092b59b 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -743,6 +743,8 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
fake_av[1] = NULL;
av = fake_av;
ac = 1;
+   if (!*av)
+   die("no branches given, and HEAD is not valid");
}
if (ac != 1)
die("--reflog option needs one branch name");
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/67] mailsplit: make PATH_MAX buffers dynamic

2015-09-15 Thread Jeff King
There are several static PATH_MAX-sized buffers in
mailsplit, along with some questionable uses of sprintf.
These are not really of security interest, as local
mailsplit pathnames are not typically under control of an
attacker.  But it does not hurt to be careful, and as a
bonus we lift some limits for systems with too-small
PATH_MAX varibles.

Signed-off-by: Jeff King 
---
 builtin/mailsplit.c | 46 +-
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 9de06e3..fb0bc08 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -98,30 +98,37 @@ static int populate_maildir_list(struct string_list *list, 
const char *path)
 {
DIR *dir;
struct dirent *dent;
-   char name[PATH_MAX];
+   struct strbuf name = STRBUF_INIT;
char *subs[] = { "cur", "new", NULL };
char **sub;
+   int ret = -1;
 
for (sub = subs; *sub; ++sub) {
-   snprintf(name, sizeof(name), "%s/%s", path, *sub);
-   if ((dir = opendir(name)) == NULL) {
+   strbuf_reset();
+   strbuf_addf(, "%s/%s", path, *sub);
+   if ((dir = opendir(name.buf)) == NULL) {
if (errno == ENOENT)
continue;
-   error("cannot opendir %s (%s)", name, strerror(errno));
-   return -1;
+   error("cannot opendir %s (%s)", name.buf, 
strerror(errno));
+   goto out;
}
 
while ((dent = readdir(dir)) != NULL) {
if (dent->d_name[0] == '.')
continue;
-   snprintf(name, sizeof(name), "%s/%s", *sub, 
dent->d_name);
-   string_list_insert(list, name);
+   strbuf_reset();
+   strbuf_addf(, "%s/%s", *sub, dent->d_name);
+   string_list_insert(list, name.buf);
}
 
closedir(dir);
}
 
-   return 0;
+   ret = 0;
+
+out:
+   strbuf_release();
+   return ret;
 }
 
 static int maildir_filename_cmp(const char *a, const char *b)
@@ -148,8 +155,7 @@ static int maildir_filename_cmp(const char *a, const char 
*b)
 static int split_maildir(const char *maildir, const char *dir,
int nr_prec, int skip)
 {
-   char file[PATH_MAX];
-   char name[PATH_MAX];
+   struct strbuf file = STRBUF_INIT;
FILE *f = NULL;
int ret = -1;
int i;
@@ -161,20 +167,25 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
 
for (i = 0; i < list.nr; i++) {
-   snprintf(file, sizeof(file), "%s/%s", maildir, 
list.items[i].string);
-   f = fopen(file, "r");
+   char *name;
+
+   strbuf_reset();
+   strbuf_addf(, "%s/%s", maildir, list.items[i].string);
+
+   f = fopen(file.buf, "r");
if (!f) {
-   error("cannot open mail %s (%s)", file, 
strerror(errno));
+   error("cannot open mail %s (%s)", file.buf, 
strerror(errno));
goto out;
}
 
if (strbuf_getwholeline(, f, '\n')) {
-   error("cannot read mail %s (%s)", file, 
strerror(errno));
+   error("cannot read mail %s (%s)", file.buf, 
strerror(errno));
goto out;
}
 
-   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
+   name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
split_one(f, name, 1);
+   free(name);
 
fclose(f);
f = NULL;
@@ -184,6 +195,7 @@ static int split_maildir(const char *maildir, const char 
*dir,
 out:
if (f)
fclose(f);
+   strbuf_release();
string_list_clear(, 1);
return ret;
 }
@@ -191,7 +203,6 @@ out:
 static int split_mbox(const char *file, const char *dir, int allow_bare,
  int nr_prec, int skip)
 {
-   char name[PATH_MAX];
int ret = -1;
int peek;
 
@@ -218,8 +229,9 @@ static int split_mbox(const char *file, const char *dir, 
int allow_bare,
}
 
while (!file_done) {
-   sprintf(name, "%s/%0*d", dir, nr_prec, ++skip);
+   char *name = xstrfmt("%s/%0*d", dir, nr_prec, ++skip);
file_done = split_one(f, name, allow_bare);
+   free(name);
}
 
if (f != stdin)
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 09/67] fsck: use strbuf to generate alternate directories

2015-09-15 Thread Jeff King
When fsck-ing alternates, we make a copy of the alternate
directory in a fixed PATH_MAX buffer. We memcpy directly,
without any check whether we are overflowing the buffer.
This is OK if PATH_MAX is a true representation of the
maximum path on the system, because any path here will have
already been vetted by the alternates subsystem. But that is
not true on every system, so we should be more careful.

Signed-off-by: Jeff King 
---
 builtin/fsck.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 46c7235..a019f4a 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -683,11 +683,12 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
 
prepare_alt_odb();
for (alt = alt_odb_list; alt; alt = alt->next) {
-   char namebuf[PATH_MAX];
-   int namelen = alt->name - alt->base;
-   memcpy(namebuf, alt->base, namelen);
-   namebuf[namelen - 1] = 0;
-   fsck_object_dir(namebuf);
+   /* directory name, minus trailing slash */
+   size_t namelen = alt->name - alt->base - 1;
+   struct strbuf name = STRBUF_INIT;
+   strbuf_add(, alt->base, namelen);
+   fsck_object_dir(name.buf);
+   strbuf_release();
}
}
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/67] trace: use strbuf for quote_crnl output

2015-09-15 Thread Jeff King
When we output GIT_TRACE_SETUP paths, we quote any
meta-characters. But our buffer to hold the result is only
PATH_MAX bytes, and we could double the size of the input
path (if every character needs quoted). We could use a
2*PATH_MAX buffer, if we assume the input will never be more
than PATH_MAX. But it's easier still to just switch to a
strbuf and not worry about whether the input can exceed
PATH_MAX or not.

Signed-off-by: Jeff King 
---
 trace.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/trace.c b/trace.c
index 7393926..0c06d71 100644
--- a/trace.c
+++ b/trace.c
@@ -277,25 +277,25 @@ void trace_performance_fl(const char *file, int line, 
uint64_t nanos,
 
 static const char *quote_crnl(const char *path)
 {
-   static char new_path[PATH_MAX];
+   static struct strbuf new_path = STRBUF_INIT;
const char *p2 = path;
-   char *p1 = new_path;
 
if (!path)
return NULL;
 
+   strbuf_reset(_path);
+
while (*p2) {
switch (*p2) {
-   case '\\': *p1++ = '\\'; *p1++ = '\\'; break;
-   case '\n': *p1++ = '\\'; *p1++ = 'n'; break;
-   case '\r': *p1++ = '\\'; *p1++ = 'r'; break;
+   case '\\': strbuf_addstr(_path, ""); break;
+   case '\n': strbuf_addstr(_path, "\\n"); break;
+   case '\r': strbuf_addstr(_path, "\\r"); break;
default:
-   *p1++ = *p2;
+   strbuf_addch(_path, *p2);
}
p2++;
}
-   *p1 = '\0';
-   return new_path;
+   return new_path.buf;
 }
 
 /* FIXME: move prefix to startup_info struct and get rid of this arg */
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 64/67] Makefile: drop D_INO_IN_DIRENT build knob

2015-09-15 Thread Jeff King
Now that fsck has dropped its inode-sorting, there are no
longer any users of this knob, and it can go away.

Signed-off-by: Jeff King 
---
 Makefile | 5 -
 config.mak.uname | 3 ---
 configure.ac | 7 ---
 3 files changed, 15 deletions(-)

diff --git a/Makefile b/Makefile
index 8d5df7e..2f350ca 100644
--- a/Makefile
+++ b/Makefile
@@ -74,8 +74,6 @@ all::
 # Define HAVE_PATHS_H if you have paths.h and want to use the default PATH
 # it specifies.
 #
-# Define NO_D_INO_IN_DIRENT if you don't have d_ino in your struct dirent.
-#
 # Define NO_D_TYPE_IN_DIRENT if your platform defines DT_UNKNOWN but lacks
 # d_type in struct dirent (Cygwin 1.5, fixed in Cygwin 1.7).
 #
@@ -1160,9 +1158,6 @@ endif
 ifdef NO_D_TYPE_IN_DIRENT
BASIC_CFLAGS += -DNO_D_TYPE_IN_DIRENT
 endif
-ifdef NO_D_INO_IN_DIRENT
-   BASIC_CFLAGS += -DNO_D_INO_IN_DIRENT
-endif
 ifdef NO_GECOS_IN_PWENT
BASIC_CFLAGS += -DNO_GECOS_IN_PWENT
 endif
diff --git a/config.mak.uname b/config.mak.uname
index 943c439..f34dcaa 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -166,7 +166,6 @@ endif
 ifeq ($(uname_O),Cygwin)
ifeq ($(shell expr "$(uname_R)" : '1\.[1-6]\.'),4)
NO_D_TYPE_IN_DIRENT = YesPlease
-   NO_D_INO_IN_DIRENT = YesPlease
NO_STRCASESTR = YesPlease
NO_MEMMEM = YesPlease
NO_MKSTEMPS = YesPlease
@@ -370,7 +369,6 @@ ifeq ($(uname_S),Windows)
NO_POSIX_GOODIES = UnfortunatelyYes
NATIVE_CRLF = YesPlease
DEFAULT_HELP_FORMAT = html
-   NO_D_INO_IN_DIRENT = YesPlease
 
CC = compat/vcbuild/scripts/clink.pl
AR = compat/vcbuild/scripts/lib.pl
@@ -520,7 +518,6 @@ ifneq (,$(findstring MINGW,$(uname_S)))
NO_INET_NTOP = YesPlease
NO_POSIX_GOODIES = UnfortunatelyYes
DEFAULT_HELP_FORMAT = html
-   NO_D_INO_IN_DIRENT = YesPlease
COMPAT_CFLAGS += -D__USE_MINGW_ACCESS -D_USE_32BIT_TIME_T -DNOGDI 
-Icompat -Icompat/win32
COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\"
COMPAT_OBJS += compat/mingw.o compat/winansi.o \
diff --git a/configure.ac b/configure.ac
index 14012fa..3fcca61 100644
--- a/configure.ac
+++ b/configure.ac
@@ -767,13 +767,6 @@ elif test x$ac_cv_member_struct_stat_st_mtim_tv_nsec != 
xyes; then
GIT_CONF_SUBST([NO_NSEC])
 fi
 #
-# Define NO_D_INO_IN_DIRENT if you don't have d_ino in your struct dirent.
-AC_CHECK_MEMBER(struct dirent.d_ino,
-[NO_D_INO_IN_DIRENT=],
-[NO_D_INO_IN_DIRENT=YesPlease],
-[#include ])
-GIT_CONF_SUBST([NO_D_INO_IN_DIRENT])
-#
 # Define NO_D_TYPE_IN_DIRENT if your platform defines DT_UNKNOWN but lacks
 # d_type in struct dirent (latest Cygwin -- will be fixed soonish).
 AC_CHECK_MEMBER(struct dirent.d_type,
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 65/67] fsck: use for_each_loose_file_in_objdir

2015-09-15 Thread Jeff King
Since 27e1e22 (prune: factor out loose-object directory
traversal, 2014-10-15), we now have a generic callback
system for iterating over the loose object directories. This
is used by prune, count-objects, etc.

We did not convert git-fsck at the time because it
implemented an inode-sorting scheme that was not part of the
generic code. Now that the inode-sorting code is gone, we
can reuse the generic code.  The result is shorter,
hopefully more readable, and drops some unchecked sprintf
calls.

Signed-off-by: Jeff King 
---
 builtin/fsck.c | 69 --
 1 file changed, 23 insertions(+), 46 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 73c3596..2fe6a31 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -365,45 +365,6 @@ static int fsck_obj_buffer(const unsigned char *sha1, enum 
object_type type,
return fsck_obj(obj);
 }
 
-static inline int is_loose_object_file(struct dirent *de,
-  char *name, unsigned char *sha1)
-{
-   if (strlen(de->d_name) != 38)
-   return 0;
-   memcpy(name + 2, de->d_name, 39);
-   return !get_sha1_hex(name, sha1);
-}
-
-static void fsck_dir(int i, char *path)
-{
-   DIR *dir = opendir(path);
-   struct dirent *de;
-   char name[100];
-
-   if (!dir)
-   return;
-
-   if (verbose)
-   fprintf(stderr, "Checking directory %s\n", path);
-
-   sprintf(name, "%02x", i);
-   while ((de = readdir(dir)) != NULL) {
-   unsigned char sha1[20];
-
-   if (is_dot_or_dotdot(de->d_name))
-   continue;
-   if (is_loose_object_file(de, name, sha1)) {
-   if (fsck_sha1(sha1))
-   errors_found |= ERROR_OBJECT;
-   continue;
-   }
-   if (starts_with(de->d_name, "tmp_obj_"))
-   continue;
-   fprintf(stderr, "bad sha1 file: %s/%s\n", path, de->d_name);
-   }
-   closedir(dir);
-}
-
 static int default_refs;
 
 static void fsck_handle_reflog_sha1(const char *refname, unsigned char *sha1)
@@ -491,9 +452,28 @@ static void get_default_heads(void)
}
 }
 
+static int fsck_loose(const unsigned char *sha1, const char *path, void *data)
+{
+   if (fsck_sha1(sha1))
+   errors_found |= ERROR_OBJECT;
+   return 0;
+}
+
+static int fsck_cruft(const char *basename, const char *path, void *data)
+{
+   if (!starts_with(basename, "tmp_obj_"))
+   fprintf(stderr, "bad sha1 file: %s\n", path);
+   return 0;
+}
+
+static int fsck_subdir(int nr, const char *path, void *progress)
+{
+   display_progress(progress, nr + 1);
+   return 0;
+}
+
 static void fsck_object_dir(const char *path)
 {
-   int i;
struct progress *progress = NULL;
 
if (verbose)
@@ -501,12 +481,9 @@ static void fsck_object_dir(const char *path)
 
if (show_progress)
progress = start_progress(_("Checking object directories"), 
256);
-   for (i = 0; i < 256; i++) {
-   static char dir[4096];
-   sprintf(dir, "%s/%02x", path, i);
-   fsck_dir(i, dir);
-   display_progress(progress, i+1);
-   }
+
+   for_each_loose_file_in_objdir(path, fsck_loose, fsck_cruft, fsck_subdir,
+ progress);
stop_progress();
 }
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 63/67] fsck: drop inode-sorting code

2015-09-15 Thread Jeff King
Fsck tries to access loose objects in order of inode number,
with the hope that this would make cold cache access faster
on a spinning disk. This dates back to 7e8c174 (fsck-cache:
sort entries by inode number, 2005-05-02), which predates
the invention of packfiles.

These days, there's not much point in trying to optimize
cold cache for a large number of loose objects. You are much
better off to simply pack the objects, which will reduce the
disk footprint _and_ provide better locality of data access.

So while you can certainly construct pathological cases
where this code might help, it is not worth the trouble
anymore.

Signed-off-by: Jeff King 
---
 builtin/fsck.c | 70 ++
 1 file changed, 2 insertions(+), 68 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index a019f4a..73c3596 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -39,14 +39,6 @@ static int show_dangling = 1;
 #define ERROR_REACHABLE 02
 #define ERROR_PACK 04
 
-#ifdef NO_D_INO_IN_DIRENT
-#define SORT_DIRENT 0
-#define DIRENT_SORT_HINT(de) 0
-#else
-#define SORT_DIRENT 1
-#define DIRENT_SORT_HINT(de) ((de)->d_ino)
-#endif
-
 static int fsck_config(const char *var, const char *value, void *cb)
 {
if (strcmp(var, "fsck.skiplist") == 0) {
@@ -373,64 +365,6 @@ static int fsck_obj_buffer(const unsigned char *sha1, enum 
object_type type,
return fsck_obj(obj);
 }
 
-/*
- * This is the sorting chunk size: make it reasonably
- * big so that we can sort well..
- */
-#define MAX_SHA1_ENTRIES (1024)
-
-struct sha1_entry {
-   unsigned long ino;
-   unsigned char sha1[20];
-};
-
-static struct {
-   unsigned long nr;
-   struct sha1_entry *entry[MAX_SHA1_ENTRIES];
-} sha1_list;
-
-static int ino_compare(const void *_a, const void *_b)
-{
-   const struct sha1_entry *a = _a, *b = _b;
-   unsigned long ino1 = a->ino, ino2 = b->ino;
-   return ino1 < ino2 ? -1 : ino1 > ino2 ? 1 : 0;
-}
-
-static void fsck_sha1_list(void)
-{
-   int i, nr = sha1_list.nr;
-
-   if (SORT_DIRENT)
-   qsort(sha1_list.entry, nr,
- sizeof(struct sha1_entry *), ino_compare);
-   for (i = 0; i < nr; i++) {
-   struct sha1_entry *entry = sha1_list.entry[i];
-   unsigned char *sha1 = entry->sha1;
-
-   sha1_list.entry[i] = NULL;
-   if (fsck_sha1(sha1))
-   errors_found |= ERROR_OBJECT;
-   free(entry);
-   }
-   sha1_list.nr = 0;
-}
-
-static void add_sha1_list(unsigned char *sha1, unsigned long ino)
-{
-   struct sha1_entry *entry = xmalloc(sizeof(*entry));
-   int nr;
-
-   entry->ino = ino;
-   hashcpy(entry->sha1, sha1);
-   nr = sha1_list.nr;
-   if (nr == MAX_SHA1_ENTRIES) {
-   fsck_sha1_list();
-   nr = 0;
-   }
-   sha1_list.entry[nr] = entry;
-   sha1_list.nr = ++nr;
-}
-
 static inline int is_loose_object_file(struct dirent *de,
   char *name, unsigned char *sha1)
 {
@@ -459,7 +393,8 @@ static void fsck_dir(int i, char *path)
if (is_dot_or_dotdot(de->d_name))
continue;
if (is_loose_object_file(de, name, sha1)) {
-   add_sha1_list(sha1, DIRENT_SORT_HINT(de));
+   if (fsck_sha1(sha1))
+   errors_found |= ERROR_OBJECT;
continue;
}
if (starts_with(de->d_name, "tmp_obj_"))
@@ -573,7 +508,6 @@ static void fsck_object_dir(const char *path)
display_progress(progress, i+1);
}
stop_progress();
-   fsck_sha1_list();
 }
 
 static int fsck_head_link(void)
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/67] strbuf: make strbuf_complete_line more generic

2015-09-15 Thread Jeff King
The strbuf_complete_line function make sure that a buffer
ends in a newline. But we may want to do this for any
character (e.g., "/" on the end of a path). Let's factor out
a generic version, and keep strbuf_complete_line as a thin
wrapper.

Signed-off-by: Jeff King 
---
 strbuf.h | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/strbuf.h b/strbuf.h
index aef2794..ba099cd 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -491,10 +491,22 @@ extern void strbuf_add_lines(struct strbuf *sb, const 
char *prefix, const char *
  */
 extern void strbuf_addstr_xml_quoted(struct strbuf *sb, const char *s);
 
+/**
+ * Ensure that `sb` ends with the character `term`, if it does not
+ * already.
+ */
+static inline void strbuf_complete(struct strbuf *sb, char term)
+{
+   if (sb->len && sb->buf[sb->len - 1] != term)
+   strbuf_addch(sb, term);
+}
+
+/**
+ * Ensure that `sb` ends with a newline.
+ */
 static inline void strbuf_complete_line(struct strbuf *sb)
 {
-   if (sb->len && sb->buf[sb->len - 1] != '\n')
-   strbuf_addch(sb, '\n');
+   strbuf_complete(sb, '\n');
 }
 
 extern int strbuf_branchname(struct strbuf *sb, const char *name);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/67] fsck: don't fsck alternates for connectivity-only check

2015-09-15 Thread Jeff King
Commit 02976bf (fsck: introduce `git fsck --connectivity-only`,
2015-06-22) recently gave fsck an option to perform only a
subset of the checks, by skipping the fsck_object_dir()
call. However, it does so only for the local object
directory, and we still do expensive checks on any alternate
repos. We should skip them in this case, too.

Signed-off-by: Jeff King 
---
 builtin/fsck.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 0794703..46c7235 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -678,16 +678,17 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
git_config(fsck_config, NULL);
 
fsck_head_link();
-   if (!connectivity_only)
+   if (!connectivity_only) {
fsck_object_dir(get_object_directory());
 
-   prepare_alt_odb();
-   for (alt = alt_odb_list; alt; alt = alt->next) {
-   char namebuf[PATH_MAX];
-   int namelen = alt->name - alt->base;
-   memcpy(namebuf, alt->base, namelen);
-   namebuf[namelen - 1] = 0;
-   fsck_object_dir(namebuf);
+   prepare_alt_odb();
+   for (alt = alt_odb_list; alt; alt = alt->next) {
+   char namebuf[PATH_MAX];
+   int namelen = alt->name - alt->base;
+   memcpy(namebuf, alt->base, namelen);
+   namebuf[namelen - 1] = 0;
+   fsck_object_dir(namebuf);
+   }
}
 
if (check_full) {
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 06/67] add git_path_buf helper function

2015-09-15 Thread Jeff King
If you have a function that uses git_path a lot, but would
prefer to avoid the static buffers, it's useful to keep a
single scratch buffer locally and reuse it for each call.
You used to be able to do this with git_snpath:

  char buf[PATH_MAX];

  foo(git_snpath(buf, sizeof(buf), "foo"));
  bar(git_snpath(buf, sizeof(buf), "bar"));

but since 1a83c24, git_snpath has been replaced with
strbuf_git_path. This is good, because it removes the
arbitrary PATH_MAX limit. But using strbuf_git_path is more
awkward for two reasons:

  1. It adds to the buffer, rather than replacing it. This
 is consistent with other strbuf functions, but makes
 reuse of a single buffer more tedious.

  2. It doesn't return the buffer, so you can't format
 as part of a function's arguments.

The new git_path_buf solves both of these, so you can use it
like:

  struct strbuf buf = STRBUF_INIT;

  foo(git_path_buf(, "foo"));
  bar(git_path_buf(, "bar"));

  strbuf_release();

Signed-off-by: Jeff King 
---
 cache.h |  2 ++
 path.c  | 10 ++
 2 files changed, 12 insertions(+)

diff --git a/cache.h b/cache.h
index 79066e5..e231e47 100644
--- a/cache.h
+++ b/cache.h
@@ -723,6 +723,8 @@ extern char *mksnpath(char *buf, size_t n, const char *fmt, 
...)
__attribute__((format (printf, 3, 4)));
 extern void strbuf_git_path(struct strbuf *sb, const char *fmt, ...)
__attribute__((format (printf, 2, 3)));
+extern char *git_path_buf(struct strbuf *buf, const char *fmt, ...)
+   __attribute__((format (printf, 2, 3)));
 extern void strbuf_git_path_submodule(struct strbuf *sb, const char *path,
  const char *fmt, ...)
__attribute__((format (printf, 3, 4)));
diff --git a/path.c b/path.c
index 95acbaf..46a4d27 100644
--- a/path.c
+++ b/path.c
@@ -175,6 +175,16 @@ static void do_git_path(struct strbuf *buf, const char 
*fmt, va_list args)
strbuf_cleanup_path(buf);
 }
 
+char *git_path_buf(struct strbuf *buf, const char *fmt, ...)
+{
+   va_list args;
+   strbuf_reset(buf);
+   va_start(args, fmt);
+   do_git_path(buf, fmt, args);
+   va_end(args);
+   return buf->buf;
+}
+
 void strbuf_git_path(struct strbuf *sb, const char *fmt, ...)
 {
va_list args;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/67] add xsnprintf helper function

2015-09-15 Thread Jeff King
There are a number of places in the code where we call
sprintf(), with the assumption that the output will fit into
the buffer. In many cases this is true (e.g., formatting a
number into a large buffer), but it is hard to tell
immediately from looking at the code. It would be nice if we
had some run-time check to make sure that our assumption is
correct (and to communicate to readers of the code that we
are not blindly calling sprintf, but have actually thought
about this case).

This patch introduces xsnprintf, which behaves just like
snprintf, except that it dies whenever the output is
truncated. This acts as a sort of assert() for these cases,
which can help find places where the assumption is violated
(as opposed to truncating and proceeding, which may just
silently give a wrong answer).

Signed-off-by: Jeff King 
---
 git-compat-util.h |  3 +++
 wrapper.c | 16 
 2 files changed, 19 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index f649e81..348b9dc 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -744,6 +744,9 @@ static inline size_t xsize_t(off_t len)
return (size_t)len;
 }
 
+__attribute__((format (printf, 3, 4)))
+extern int xsnprintf(char *dst, size_t max, const char *fmt, ...);
+
 /* in ctype.c, for kwset users */
 extern const unsigned char tolower_trans_tbl[256];
 
diff --git a/wrapper.c b/wrapper.c
index 0e22d43..6fcaa4d 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -621,6 +621,22 @@ char *xgetcwd(void)
return strbuf_detach(, NULL);
 }
 
+int xsnprintf(char *dst, size_t max, const char *fmt, ...)
+{
+   va_list ap;
+   int len;
+
+   va_start(ap, fmt);
+   len = vsnprintf(dst, max, fmt, ap);
+   va_end(ap);
+
+   if (len < 0)
+   die("BUG: your snprintf is broken");
+   if (len >= max)
+   die("BUG: attempt to snprintf into too-small buffer");
+   return len;
+}
+
 static int write_file_v(const char *path, int fatal,
const char *fmt, va_list params)
 {
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 59/67] prefer memcpy to strcpy

2015-09-15 Thread Jeff King
When we already know the length of a string (e.g., because
we just malloc'd to fit it), it's nicer to use memcpy than
strcpy, as it makes it more obvious that we are not going to
overflow the buffer (because the size we pass matches the
size in the allocation).

This also eliminates calls to strcpy, which make auditing
the code base harder.

Signed-off-by: Jeff King 
---
 compat/nedmalloc/nedmalloc.c | 5 +++--
 fast-import.c| 5 +++--
 revision.c   | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/compat/nedmalloc/nedmalloc.c b/compat/nedmalloc/nedmalloc.c
index 609ebba..a0a16eb 100644
--- a/compat/nedmalloc/nedmalloc.c
+++ b/compat/nedmalloc/nedmalloc.c
@@ -957,8 +957,9 @@ char *strdup(const char *s1)
 {
char *s2 = 0;
if (s1) {
-   s2 = malloc(strlen(s1) + 1);
-   strcpy(s2, s1);
+   size_t len = strlen(s1) + 1;
+   s2 = malloc(len);
+   memcpy(s2, s1, len);
}
return s2;
 }
diff --git a/fast-import.c b/fast-import.c
index 895c6b4..cf6d8bc 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -644,8 +644,9 @@ static void *pool_calloc(size_t count, size_t size)
 
 static char *pool_strdup(const char *s)
 {
-   char *r = pool_alloc(strlen(s) + 1);
-   strcpy(r, s);
+   size_t len = strlen(s) + 1;
+   char *r = pool_alloc(len);
+   memcpy(r, s, len);
return r;
 }
 
diff --git a/revision.c b/revision.c
index af2a18e..2236463 100644
--- a/revision.c
+++ b/revision.c
@@ -38,7 +38,7 @@ char *path_name(const struct name_path *path, const char 
*name)
}
n = xmalloc(len);
m = n + len - (nlen + 1);
-   strcpy(m, name);
+   memcpy(m, name, nlen + 1);
for (p = path; p; p = p->up) {
if (p->elem_len) {
m -= p->elem_len + 1;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 67/67] name-rev: use strip_suffix to avoid magic numbers

2015-09-15 Thread Jeff King
The manual size computations here are correct, but using
strip_suffix makes that obvious, and hopefully communicates
the intent of the code more clearly.

Signed-off-by: Jeff King 
---
 builtin/name-rev.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 8a3a0cd..0377fc1 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -55,16 +55,15 @@ copy_data:
parents;
parents = parents->next, parent_number++) {
if (parent_number > 1) {
-   int len = strlen(tip_name);
+   size_t len;
char *new_name;
 
-   if (len > 2 && !strcmp(tip_name + len - 2, "^0"))
-   len -= 2;
+   strip_suffix(tip_name, "^0", );
if (generation > 0)
-   new_name = xstrfmt("%.*s~%d^%d", len, tip_name,
+   new_name = xstrfmt("%.*s~%d^%d", (int)len, 
tip_name,
   generation, parent_number);
else
-   new_name = xstrfmt("%.*s^%d", len, tip_name,
+   new_name = xstrfmt("%.*s^%d", (int)len, 
tip_name,
   parent_number);
 
name_rev(parents->item, new_name, 0,
-- 
2.6.0.rc2.408.ga2926b9
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 03/67] archive-tar: fix minor indentation violation

2015-09-15 Thread Jeff King
This looks like a simple omission from 8539070 (archive-tar:
unindent write_tar_entry by one level, 2012-05-03).

Signed-off-by: Jeff King 
---
 archive-tar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/archive-tar.c b/archive-tar.c
index 0d1e6bd..b6b30bb 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -233,7 +233,7 @@ static int write_tar_entry(struct archiver_args *args,
size_t rest = pathlen - plen - 1;
if (plen > 0 && rest <= sizeof(header.name)) {
memcpy(header.prefix, path, plen);
-   memcpy(header.name, path + plen + 1, rest);
+   memcpy(header.name, path + plen + 1, rest);
} else {
sprintf(header.name, "%s.data",
sha1_to_hex(sha1));
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/67] mailsplit: fix FILE* leak in split_maildir

2015-09-15 Thread Jeff King
If we encounter an error while splitting a maildir, we exit
the function early, leaking the open filehandle. This isn't
a big deal, since we exit the program soon after, but it's
easy enough to be careful.

Signed-off-by: Jeff King 
---
 builtin/mailsplit.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/builtin/mailsplit.c b/builtin/mailsplit.c
index 8e02ea1..9de06e3 100644
--- a/builtin/mailsplit.c
+++ b/builtin/mailsplit.c
@@ -150,6 +150,7 @@ static int split_maildir(const char *maildir, const char 
*dir,
 {
char file[PATH_MAX];
char name[PATH_MAX];
+   FILE *f = NULL;
int ret = -1;
int i;
struct string_list list = STRING_LIST_INIT_DUP;
@@ -160,7 +161,6 @@ static int split_maildir(const char *maildir, const char 
*dir,
goto out;
 
for (i = 0; i < list.nr; i++) {
-   FILE *f;
snprintf(file, sizeof(file), "%s/%s", maildir, 
list.items[i].string);
f = fopen(file, "r");
if (!f) {
@@ -177,10 +177,13 @@ static int split_maildir(const char *maildir, const char 
*dir,
split_one(f, name, 1);
 
fclose(f);
+   f = NULL;
}
 
ret = skip;
 out:
+   if (f)
+   fclose(f);
string_list_clear(, 1);
return ret;
 }
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 22/67] entry.c: convert strcpy to xsnprintf

2015-09-15 Thread Jeff King
This particular conversion is non-obvious, because nobody
has passed our function the length of the destination
buffer. However, the interface to checkout_entry specifies
that the buffer must be at least TEMPORARY_FILENAME_LENGTH
bytes long, so we can check that (meaning the existing code
was not buggy, but merely worrisome to somebody reading it).

Signed-off-by: Jeff King 
---
 entry.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index 1eda8e9..582c400 100644
--- a/entry.c
+++ b/entry.c
@@ -96,8 +96,8 @@ static int open_output_fd(char *path, const struct 
cache_entry *ce, int to_tempf
 {
int symlink = (ce->ce_mode & S_IFMT) != S_IFREG;
if (to_tempfile) {
-   strcpy(path, symlink
-  ? ".merge_link_XX" : ".merge_file_XX");
+   xsnprintf(path, TEMPORARY_FILENAME_LENGTH, "%s",
+ symlink ? ".merge_link_XX" : 
".merge_file_XX");
return mkstemp(path);
} else {
return create_file(path, !symlink ? ce->ce_mode : 0666);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 54/67] color: add overflow checks for parsing colors

2015-09-15 Thread Jeff King
Our color parsing is designed to never exceed COLOR_MAXLEN
bytes. But the relationship between that hand-computed
number and the parsing code is not at all obvious, and we
merely hope that it has been computed correctly for all
cases.

Let's mark the expected "end" pointer for the destination
buffer and make sure that we do not exceed it.

Signed-off-by: Jeff King 
---
 color.c | 39 ---
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/color.c b/color.c
index 9027352..3a3fa33 100644
--- a/color.c
+++ b/color.c
@@ -150,22 +150,24 @@ int color_parse(const char *value, char *dst)
  * already have the ANSI escape code in it. "out" should have enough
  * space in it to fit any color.
  */
-static char *color_output(char *out, const struct color *c, char type)
+static char *color_output(char *out, int len, const struct color *c, char type)
 {
switch (c->type) {
case COLOR_UNSPECIFIED:
case COLOR_NORMAL:
break;
case COLOR_ANSI:
+   if (len < 2)
+   die("BUG: color parsing ran out of space");
*out++ = type;
*out++ = '0' + c->value;
break;
case COLOR_256:
-   out += sprintf(out, "%c8;5;%d", type, c->value);
+   out += xsnprintf(out, len, "%c8;5;%d", type, c->value);
break;
case COLOR_RGB:
-   out += sprintf(out, "%c8;2;%d;%d;%d", type,
-  c->red, c->green, c->blue);
+   out += xsnprintf(out, len, "%c8;2;%d;%d;%d", type,
+c->red, c->green, c->blue);
break;
}
return out;
@@ -180,12 +182,13 @@ int color_parse_mem(const char *value, int value_len, 
char *dst)
 {
const char *ptr = value;
int len = value_len;
+   char *end = dst + COLOR_MAXLEN;
unsigned int attr = 0;
struct color fg = { COLOR_UNSPECIFIED };
struct color bg = { COLOR_UNSPECIFIED };
 
if (!strncasecmp(value, "reset", len)) {
-   strcpy(dst, GIT_COLOR_RESET);
+   xsnprintf(dst, end - dst, GIT_COLOR_RESET);
return 0;
}
 
@@ -224,12 +227,18 @@ int color_parse_mem(const char *value, int value_len, 
char *dst)
goto bad;
}
 
+#define OUT(x) do { \
+   if (dst == end) \
+   die("BUG: color parsing ran out of space"); \
+   *dst++ = (x); \
+} while(0)
+
if (attr || !color_empty() || !color_empty()) {
int sep = 0;
int i;
 
-   *dst++ = '\033';
-   *dst++ = '[';
+   OUT('\033');
+   OUT('[');
 
for (i = 0; attr; i++) {
unsigned bit = (1 << i);
@@ -237,24 +246,24 @@ int color_parse_mem(const char *value, int value_len, 
char *dst)
continue;
attr &= ~bit;
if (sep++)
-   *dst++ = ';';
-   dst += sprintf(dst, "%d", i);
+   OUT(';');
+   dst += xsnprintf(dst, end - dst, "%d", i);
}
if (!color_empty()) {
if (sep++)
-   *dst++ = ';';
+   OUT(';');
/* foreground colors are all in the 3x range */
-   dst = color_output(dst, , '3');
+   dst = color_output(dst, end - dst, , '3');
}
if (!color_empty()) {
if (sep++)
-   *dst++ = ';';
+   OUT(';');
/* background colors are all in the 4x range */
-   dst = color_output(dst, , '4');
+   dst = color_output(dst, end - dst, , '4');
}
-   *dst++ = 'm';
+   OUT('m');
}
-   *dst = 0;
+   OUT(0);
return 0;
 bad:
return error(_("invalid color value: %.*s"), value_len, value);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 55/67] use alloc_ref rather than hand-allocating "struct ref"

2015-09-15 Thread Jeff King
This saves us some manual computation, and eliminates a call
to strcpy.

Signed-off-by: Jeff King 
---
 builtin/fetch.c | 3 +--
 remote-curl.c   | 5 +
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 841880e..ed84963 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -639,8 +639,7 @@ static int store_updated_refs(const char *raw_url, const 
char *remote_name,
continue;
 
if (rm->peer_ref) {
-   ref = xcalloc(1, sizeof(*ref) + 
strlen(rm->peer_ref->name) + 1);
-   strcpy(ref->name, rm->peer_ref->name);
+   ref = alloc_ref(rm->peer_ref->name);
hashcpy(ref->old_sha1, rm->peer_ref->old_sha1);
hashcpy(ref->new_sha1, rm->old_sha1);
ref->force = rm->peer_ref->force;
diff --git a/remote-curl.c b/remote-curl.c
index 71fbbb6..cc7a8a6 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -168,10 +168,7 @@ static struct ref *parse_info_refs(struct discovery *heads)
url.buf);
data[i] = 0;
ref_name = mid + 1;
-   ref = xmalloc(sizeof(struct ref) +
- strlen(ref_name) + 1);
-   memset(ref, 0, sizeof(struct ref));
-   strcpy(ref->name, ref_name);
+   ref = alloc_ref(ref_name);
get_sha1_hex(start, ref->old_sha1);
if (!refs)
refs = ref;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 57/67] receive-pack: simplify keep_arg computation

2015-09-15 Thread Jeff King
To generate "--keep=receive-pack $pid on $host", we write
progressively into a single buffer, which requires keeping
track of how much we've written so far. But since the result
is destined to go into our argv array, we can simply use
argv_array_pushf.

Unfortunately we still have to have a static buffer for the
gethostname() call, but at least it now doesn't involve any
extra size computation. And as a bonus, we drop an sprintf
and a strcpy call.

Signed-off-by: Jeff King 
---
 builtin/receive-pack.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 8b50e48..2c82274 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1524,15 +1524,18 @@ static const char *unpack(int err_fd, struct 
shallow_info *si)
if (status)
return "unpack-objects abnormal exit";
} else {
-   int s;
-   char keep_arg[256];
-
-   s = sprintf(keep_arg, "--keep=receive-pack %"PRIuMAX" on ", 
(uintmax_t) getpid());
-   if (gethostname(keep_arg + s, sizeof(keep_arg) - s))
-   strcpy(keep_arg + s, "localhost");
+   char hostname[256];
 
argv_array_pushl(, "index-pack",
-"--stdin", hdr_arg, keep_arg, NULL);
+"--stdin", hdr_arg, NULL);
+
+   if (gethostname(hostname, sizeof(hostname)))
+   xsnprintf(hostname, sizeof(hostname), "localhost");
+   argv_array_pushf(,
+"--keep=receive-pack %"PRIuMAX" on %s",
+(uintmax_t)getpid(),
+hostname);
+
if (fsck_objects)
argv_array_pushf(, "--strict%s",
fsck_msg_types.buf);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 58/67] help: clean up kfmclient munging

2015-09-15 Thread Jeff King
When we are going to launch "/path/to/konqueror", we instead
rewrite this into "/path/to/kfmclient" by duplicating the
original string and writing over the ending bits. This can
be done more obviously with strip_suffix and xstrfmt.

Note that we also fix a subtle bug with the "filename"
parameter, which is passed as argv[0] to the child. If the
user has configured a program name with no directory
component, we always pass the string "kfmclient", even if
your program is called something else. But if you give a
full path, we give the basename of that path. But more
bizarrely, if we rewrite "konqueror" to "kfmclient", we
still pass "konqueror".

The history of this function doesn't reveal anything
interesting, so it looks like just an oversight from
combining the suffix-munging with the basename-finding.
Let's just call basename on the munged path, which produces
consistent results (if you gave a program, whether a full
path or not, we pass its basename).

Probably this doesn't matter at all in practice, but it
makes the code slightly less confusing to read.

Signed-off-by: Jeff King 
---
 builtin/help.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/builtin/help.c b/builtin/help.c
index fba8c01..e1650ab 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -140,17 +140,10 @@ static void exec_man_konqueror(const char *path, const 
char *page)
 
/* It's simpler to launch konqueror using kfmclient. */
if (path) {
-   const char *file = strrchr(path, '/');
-   if (file && !strcmp(file + 1, "konqueror")) {
-   char *new = xstrdup(path);
-   char *dest = strrchr(new, '/');
-
-   /* strlen("konqueror") == strlen("kfmclient") */
-   strcpy(dest + 1, "kfmclient");
-   path = new;
-   }
-   if (file)
-   filename = file;
+   size_t len;
+   if (strip_suffix(path, "/konqueror", ))
+   path = xstrfmt("%.*s/kfmclient", (int)len, 
path);
+   filename = basename((char *)path);
} else
path = "kfmclient";
strbuf_addf(_page, "man:%s(1)", page);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/67] war on sprintf, strcpy, etc

2015-09-15 Thread Jeff King
The git code contains a lot of calls to sprintf, strcpy, and other
unchecked string functions. In many cases, these aren't actually
overflows, because some earlier part of the code implies that the copied
content is smaller than the destination buffer. But it's often hard to
tell, because the code enforcing that assumption is far away, or there's
a complicated expression to create a buffer. This makes it difficult to
audit git for buffer overflows, because you can spend a lot of time
chasing down false positives.

My goal with this series was to not only audit each of these sites for
overflows, but to convert them to more modern constructs (e.g.,
strbufs), so that it's easier to do more audits going forward.

There are quite a large number of changes. I've tried to group similar
changes together to make reviewing easier. Here's a rough breakdown:

  [01/67]: show-branch: avoid segfault with --reflog of unborn branch
  [02/67]: mailsplit: fix FILE* leak in split_maildir
  [03/67]: archive-tar: fix minor indentation violation
  [04/67]: fsck: don't fsck alternates for connectivity-only check

These are minor bugfixes that I found while digging into various
call-sites. There's no semantic dependency, but some of the later
patches depend on these textually.

  [05/67]: add xsnprintf helper function
  [06/67]: add git_path_buf helper function
  [07/67]: strbuf: make strbuf_complete_line more generic
  [08/67]: add reentrant variants of sha1_to_hex and find_unique_abbrev

These four patches introduce infrastructure that will help later
cleanups (alongside existing tools like strbuf, xstrfmt, etc).

  [09/67]: fsck: use strbuf to generate alternate directories
  [10/67]: mailsplit: make PATH_MAX buffers dynamic
  [11/67]: trace: use strbuf for quote_crnl output
  [12/67]: progress: store throughput display in a strbuf
  [13/67]: test-dump-cache-tree: avoid overflow of cache-tree name
  [14/67]: compat/inet_ntop: fix off-by-one in inet_ntop4

These cases are all things that _can_ overflow, given the right
input. But none of them is interesting security-wise because,
because their input is not typically attacker-controlled.

  [15/67]: convert trivial sprintf / strcpy calls to xsnprintf
  [16/67]: archive-tar: use xsnprintf for trivial formatting
  [17/67]: use xsnprintf for generating git object headers
  [18/67]: find_short_object_filename: convert sprintf to xsnprintf
  [19/67]: stop_progress_msg: convert sprintf to xsnprintf
  [20/67]: compat/hstrerror: convert sprintf to snprintf
  [21/67]: grep: use xsnprintf to format failure message
  [22/67]: entry.c: convert strcpy to xsnprintf
  [23/67]: add_packed_git: convert strcpy into xsnprintf
  [24/67]: http-push: replace strcat with xsnprintf
  [25/67]: receive-pack: convert strncpy to xsnprintf

These cases can all be fixed by using the newly-added xsnprintf. The
trivial conversions are in patch 15, and the rest are cases that
needed a little more cleanup or explanation.

  [26/67]: replace trivial malloc + sprintf /strcpy calls to xstrfmt
  [27/67]: config: use xstrfmt in normalize_value
  [28/67]: fetch: replace static buffer with xstrfmt
  [29/67]: use strip_suffix and xstrfmt to replace suffix
  [30/67]: ref-filter: drop sprintf and strcpy calls
  [31/67]: help: drop prepend function in favor of xstrfmt
  [32/67]: mailmap: replace strcpy with xstrdup
  [33/67]: read_branches_file: replace strcpy with xstrdup

Ditto, but for xstrfmt/xstrdup.

  [34/67]: resolve_ref: use strbufs for internal buffers
  [35/67]: upload-archive: convert sprintf to strbuf
  [36/67]: remote-ext: simplify git pkt-line generation
  [37/67]: http-push: use strbuf instead of fwrite_buffer
  [38/67]: http-walker: store url in a strbuf
  [39/67]: sha1_get_pack_name: use a strbuf
  [40/67]: init: use strbufs to store paths
  [41/67]: apply: convert root string to strbuf
  [42/67]: transport: use strbufs for status table "quickref" strings
  [43/67]: merge-recursive: convert malloc / strcpy to strbuf
  [44/67]: enter_repo: convert fixed-size buffers to strbufs
  [45/67]: remove_leading_path: use a strbuf for internal storage
  [46/67]: write_loose_object: convert to strbuf
  [47/67]: diagnose_invalid_index_path: use strbuf to avoid strcpy/strcat

Ditto, but for strbufs. I generally used xstrfmt over a strbuf where
it was feasible, since the former is shorter. These cases typically
did something a little more complicated than xstrfmt could handle.

  [48/67]: fetch-pack: use argv_array for index-pack / unpack-objects
  [49/67]: http-push: use an argv_array for setup_revisions
  [50/67]: stat_tracking_info: convert to argv_array
  [51/67]: daemon: use cld->env_array when re-spawning

Ditto, but for argv_array. This helps regular overflows, because
argv_array_pushf uses a strbuf internally. But it also prevents
overflowing the array-of-pointers.

  [52/67]: use sha1_to_hex_to() instead of strcpy
  [53/67]: drop strcpy in 

[PATCH 12/67] progress: store throughput display in a strbuf

2015-09-15 Thread Jeff King
Coverity noticed that we strncpy() into a fixed-size buffer
without making sure that it actually ended up
NUL-terminated. This is unlikely to be a bug in practice,
since throughput strings rarely hit 32 characters, but it
would be nice to clean it up.

The most obvious way to do so is to add a NUL-terminator.
But instead, this patch switches the fixed-size buffer out
for a strbuf. At first glance this seems much less
efficient, until we realize that filling in the fixed-size
buffer is done by writing into a strbuf and copying the
result!

By writing straight to the buffer, we actually end up more
efficient:

  1. We avoid an extra copy of the bytes.

  2. Rather than malloc/free each time progress is shown, we
 can strbuf_reset and use the same buffer each time.

Signed-off-by: Jeff King 
---
I actually sent this one to the list in June:

  http://thread.gmane.org/gmane.comp.version-control.git/271880

but it got overlooked. Good luck overlooking this 67-patch
monstrosity.  :)

 progress.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/progress.c b/progress.c
index 2e31bec..a3efcfd 100644
--- a/progress.c
+++ b/progress.c
@@ -25,7 +25,7 @@ struct throughput {
unsigned int last_bytes[TP_IDX_MAX];
unsigned int last_misecs[TP_IDX_MAX];
unsigned int idx;
-   char display[32];
+   struct strbuf display;
 };
 
 struct progress {
@@ -98,7 +98,7 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
}
 
progress->last_value = n;
-   tp = (progress->throughput) ? progress->throughput->display : "";
+   tp = (progress->throughput) ? progress->throughput->display.buf : "";
eol = done ? done : "   \r";
if (progress->total) {
unsigned percent = n * 100 / progress->total;
@@ -129,6 +129,7 @@ static int display(struct progress *progress, unsigned n, 
const char *done)
 static void throughput_string(struct strbuf *buf, off_t total,
  unsigned int rate)
 {
+   strbuf_reset(buf);
strbuf_addstr(buf, ", ");
strbuf_humanise_bytes(buf, total);
strbuf_addstr(buf, " | ");
@@ -141,7 +142,6 @@ void display_throughput(struct progress *progress, off_t 
total)
struct throughput *tp;
uint64_t now_ns;
unsigned int misecs, count, rate;
-   struct strbuf buf = STRBUF_INIT;
 
if (!progress)
return;
@@ -154,6 +154,7 @@ void display_throughput(struct progress *progress, off_t 
total)
if (tp) {
tp->prev_total = tp->curr_total = total;
tp->prev_ns = now_ns;
+   strbuf_init(>display, 0);
}
return;
}
@@ -193,9 +194,7 @@ void display_throughput(struct progress *progress, off_t 
total)
tp->last_misecs[tp->idx] = misecs;
tp->idx = (tp->idx + 1) % TP_IDX_MAX;
 
-   throughput_string(, total, rate);
-   strncpy(tp->display, buf.buf, sizeof(tp->display));
-   strbuf_release();
+   throughput_string(>display, total, rate);
if (progress->last_value != -1 && progress_update)
display(progress, progress->last_value, NULL);
 }
@@ -250,12 +249,9 @@ void stop_progress_msg(struct progress **p_progress, const 
char *msg)
 
bufp = (len < sizeof(buf)) ? buf : xmalloc(len + 1);
if (tp) {
-   struct strbuf strbuf = STRBUF_INIT;
unsigned int rate = !tp->avg_misecs ? 0 :
tp->avg_bytes / tp->avg_misecs;
-   throughput_string(, tp->curr_total, rate);
-   strncpy(tp->display, strbuf.buf, sizeof(tp->display));
-   strbuf_release();
+   throughput_string(>display, tp->curr_total, rate);
}
progress_update = 1;
sprintf(bufp, ", %s.\n", msg);
@@ -264,6 +260,8 @@ void stop_progress_msg(struct progress **p_progress, const 
char *msg)
free(bufp);
}
clear_progress_signal();
+   if (progress->throughput)
+   strbuf_release(>throughput->display);
free(progress->throughput);
free(progress);
 }
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 24/67] http-push: replace strcat with xsnprintf

2015-09-15 Thread Jeff King
We account for these strcats in our initial allocation, but
the code is confusing to follow and verify. Let's remember
our original allocation length, and then xsnprintf can
verify that we don't exceed it.

Note that we can't just use xstrfmt here (which would be
even cleaner) because the code tries to grow the buffer only
when necessary.

Signed-off-by: Jeff King 
---
It would probably be a good match for a strbuf, but it's
hard to over-emphasize how little interest I have in
refactoring the http-push webdav code.

 http-push.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/http-push.c b/http-push.c
index 1f3788f..37baff8 100644
--- a/http-push.c
+++ b/http-push.c
@@ -786,21 +786,21 @@ xml_start_tag(void *userData, const char *name, const 
char **atts)
 {
struct xml_ctx *ctx = (struct xml_ctx *)userData;
const char *c = strchr(name, ':');
-   int new_len;
+   int old_namelen, new_len;
 
if (c == NULL)
c = name;
else
c++;
 
-   new_len = strlen(ctx->name) + strlen(c) + 2;
+   old_namelen = strlen(ctx->name);
+   new_len = old_namelen + strlen(c) + 2;
 
if (new_len > ctx->len) {
ctx->name = xrealloc(ctx->name, new_len);
ctx->len = new_len;
}
-   strcat(ctx->name, ".");
-   strcat(ctx->name, c);
+   xsnprintf(ctx->name + old_namelen, ctx->len - old_namelen, ".%s", c);
 
free(ctx->cdata);
ctx->cdata = NULL;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 25/67] receive-pack: convert strncpy to xsnprintf

2015-09-15 Thread Jeff King
This strncpy is pointless; we pass the strlen() of the src
string, meaning that it works just like a memcpy. Worse,
though, is that the size has no relation to the destination
buffer, meaning it is a potential overflow.  In practice,
it's not. We pass only short constant strings like
"warning: " and "error: ", which are much smaller than the
destination buffer.

We can make this much simpler by just using xsnprintf, which
will check for overflow and return the size for our next
vsnprintf, without us having to run a separate strlen().

Signed-off-by: Jeff King 
---
 builtin/receive-pack.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e6b93d0..04d2bdf 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -280,10 +280,10 @@ static void rp_warning(const char *err, ...) 
__attribute__((format (printf, 1, 2
 
 static void report_message(const char *prefix, const char *err, va_list params)
 {
-   int sz = strlen(prefix);
+   int sz;
char msg[4096];
 
-   strncpy(msg, prefix, sz);
+   sz = xsnprintf(msg, sizeof(msg), "%s", prefix);
sz += vsnprintf(msg + sz, sizeof(msg) - sz, err, params);
if (sz > (sizeof(msg) - 1))
sz = sizeof(msg) - 1;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 23/67] add_packed_git: convert strcpy into xsnprintf

2015-09-15 Thread Jeff King
We have the path "foo.idx", and we create a buffer big
enough to hold "foo.pack" and "foo.keep", and then strcpy
straight into it. This isn't a bug (we have enough space),
but it's very hard to tell from the strcpy that this is so.

Let's instead use strip_suffix to take off the ".idx",
record the size of our allocation, and use xsnprintf to make
sure we don't violate our assumptions.

Signed-off-by: Jeff King 
---
 cache.h |  2 +-
 sha1_file.c | 19 ++-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index cc59aba..11372ef 100644
--- a/cache.h
+++ b/cache.h
@@ -1305,7 +1305,7 @@ extern void close_pack_windows(struct packed_git *);
 extern void unuse_pack(struct pack_window **);
 extern void free_pack_by_name(const char *);
 extern void clear_delta_base_cache(void);
-extern struct packed_git *add_packed_git(const char *, int, int);
+extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
int local);
 
 /*
  * Return the SHA-1 of the nth object within the specified packfile.
diff --git a/sha1_file.c b/sha1_file.c
index f106091..28352a5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1146,11 +1146,12 @@ static void try_to_free_pack_memory(size_t size)
release_pack_memory(size);
 }
 
-struct packed_git *add_packed_git(const char *path, int path_len, int local)
+struct packed_git *add_packed_git(const char *path, size_t path_len, int local)
 {
static int have_set_try_to_free_routine;
struct stat st;
-   struct packed_git *p = alloc_packed_git(path_len + 2);
+   size_t alloc;
+   struct packed_git *p;
 
if (!have_set_try_to_free_routine) {
have_set_try_to_free_routine = 1;
@@ -1161,18 +1162,18 @@ struct packed_git *add_packed_git(const char *path, int 
path_len, int local)
 * Make sure a corresponding .pack file exists and that
 * the index looks sane.
 */
-   path_len -= strlen(".idx");
-   if (path_len < 1) {
-   free(p);
+   if (!strip_suffix_mem(path, _len, ".idx"))
return NULL;
-   }
-   memcpy(p->pack_name, path, path_len);
 
-   strcpy(p->pack_name + path_len, ".keep");
+   alloc = path_len + strlen(".pack") + 1;
+   p = alloc_packed_git(alloc);
+   memcpy(p->pack_name, path, path_len); /* NUL from zero-ed struct */
+
+   xsnprintf(p->pack_name + path_len, alloc - path_len, ".keep");
if (!access(p->pack_name, F_OK))
p->pack_keep = 1;
 
-   strcpy(p->pack_name + path_len, ".pack");
+   xsnprintf(p->pack_name + path_len, alloc - path_len, ".pack");
if (stat(p->pack_name, ) || !S_ISREG(st.st_mode)) {
free(p);
return NULL;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 52/67] use sha1_to_hex_to() instead of strcpy

2015-09-15 Thread Jeff King
Before sha1_to_hex_to() existed, a simple way to get a hex
sha1 into a buffer was with:

  strcpy(buf, sha1_to_hex(sha1));

This isn't wrong (assuming the buf is 41 characters), but it
makes auditing the code base for bad strcpy() calls harder,
as these become false positives.

Let's convert them to sha1_to_hex_to(), and likewise for
some calls to find_unique_abbrev(). While we're here, we'll
double-check that all of the buffers are correctly sized,
and use the more obvious GIT_SHA1_HEXSZ constant.

Signed-off-by: Jeff King 
---
 builtin/blame.c|  8 
 builtin/merge-index.c  |  4 ++--
 builtin/merge.c| 20 ++--
 builtin/receive-pack.c | 15 +--
 builtin/rev-list.c |  4 ++--
 diff.c |  9 -
 6 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 4db01c1..f8ec7ff 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1867,9 +1867,9 @@ static void emit_porcelain(struct scoreboard *sb, struct 
blame_entry *ent,
int cnt;
const char *cp;
struct origin *suspect = ent->suspect;
-   char hex[41];
+   char hex[GIT_SHA1_HEXSZ + 1];
 
-   strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));
+   sha1_to_hex_to(hex, suspect->commit->object.sha1);
printf("%s %d %d %d\n",
   hex,
   ent->s_lno + 1,
@@ -1905,11 +1905,11 @@ static void emit_other(struct scoreboard *sb, struct 
blame_entry *ent, int opt)
const char *cp;
struct origin *suspect = ent->suspect;
struct commit_info ci;
-   char hex[41];
+   char hex[GIT_SHA1_HEXSZ + 1];
int show_raw_time = !!(opt & OUTPUT_RAW_TIMESTAMP);
 
get_commit_info(suspect->commit, , 1);
-   strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));
+   sha1_to_hex_to(hex, suspect->commit->object.sha1);
 
cp = nth_line(sb, ent->lno);
for (cnt = 0; cnt < ent->num_lines; cnt++) {
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 1d66111..4ed0a83 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -9,7 +9,7 @@ static int merge_entry(int pos, const char *path)
 {
int found;
const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-   char hexbuf[4][60];
+   char hexbuf[4][GIT_SHA1_HEXSZ + 1];
char ownbuf[4][60];
 
if (pos >= active_nr)
@@ -22,7 +22,7 @@ static int merge_entry(int pos, const char *path)
if (strcmp(ce->name, path))
break;
found++;
-   strcpy(hexbuf[stage], sha1_to_hex(ce->sha1));
+   sha1_to_hex_to(hexbuf[stage], ce->sha1);
xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", 
ce->ce_mode);
arguments[stage] = hexbuf[stage];
arguments[stage + 4] = ownbuf[stage];
diff --git a/builtin/merge.c b/builtin/merge.c
index a0edaca..af2a2ae 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -1319,13 +1319,13 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
if (verify_signatures) {
for (p = remoteheads; p; p = p->next) {
struct commit *commit = p->item;
-   char hex[41];
+   char hex[GIT_SHA1_HEXSZ + 1];
struct signature_check signature_check;
memset(_check, 0, sizeof(signature_check));
 
check_commit_signature(commit, _check);
 
-   strcpy(hex, find_unique_abbrev(commit->object.sha1, 
DEFAULT_ABBREV));
+   find_unique_abbrev_to(hex, commit->object.sha1, 
DEFAULT_ABBREV);
switch (signature_check.result) {
case 'G':
break;
@@ -1415,15 +1415,15 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
/* Again the most common case of merging one remote. */
struct strbuf msg = STRBUF_INIT;
struct commit *commit;
-   char hex[41];
 
-   strcpy(hex, find_unique_abbrev(head_commit->object.sha1, 
DEFAULT_ABBREV));
-
-   if (verbosity >= 0)
-   printf(_("Updating %s..%s\n"),
-   hex,
-   
find_unique_abbrev(remoteheads->item->object.sha1,
-   DEFAULT_ABBREV));
+   if (verbosity >= 0) {
+   char from[GIT_SHA1_HEXSZ + 1], to[GIT_SHA1_HEXSZ + 1];
+   find_unique_abbrev_to(from, head_commit->object.sha1,
+ DEFAULT_ABBREV);
+   find_unique_abbrev_to(to, 
remoteheads->item->object.sha1,
+ DEFAULT_ABBREV);
+   printf(_("Updating %s..%s\n"), from, to);

[PATCH 51/67] daemon: use cld->env_array when re-spawning

2015-09-15 Thread Jeff King
This avoids an ugly strcat into a fixed-size buffer. It's
not wrong (the buffer is plenty large enough for an IPv6
address plus some minor formatting), but it takes some
effort to verify that.

Unfortunately we are still stuck with some fixed-size
buffers to hold the output of inet_ntop. But at least we now
pass very easy-to-verify parameters, rather than doing a
manual computation to account for other data in the buffer.

As a side effect, this also fixes the case where we might
pass an uninitialized portbuf buffer through the
environment. This probably couldn't happen in practice, as
it would mean that addr->sa_family was neither AF_INET nor
AF_INET6 (and that is all we are listening on).

Signed-off-by: Jeff King 
---
 daemon.c | 26 ++
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/daemon.c b/daemon.c
index 5218a3f..56679a1 100644
--- a/daemon.c
+++ b/daemon.c
@@ -811,8 +811,6 @@ static char **cld_argv;
 static void handle(int incoming, struct sockaddr *addr, socklen_t addrlen)
 {
struct child_process cld = CHILD_PROCESS_INIT;
-   char addrbuf[300] = "REMOTE_ADDR=", portbuf[300];
-   char *env[] = { addrbuf, portbuf, NULL };
 
if (max_connections && live_children >= max_connections) {
kill_some_child();
@@ -826,27 +824,23 @@ static void handle(int incoming, struct sockaddr *addr, 
socklen_t addrlen)
}
 
if (addr->sa_family == AF_INET) {
+   char buf[128] = "";
struct sockaddr_in *sin_addr = (void *) addr;
-   inet_ntop(addr->sa_family, _addr->sin_addr, addrbuf + 12,
-   sizeof(addrbuf) - 12);
-   snprintf(portbuf, sizeof(portbuf), "REMOTE_PORT=%d",
-   ntohs(sin_addr->sin_port));
+   inet_ntop(addr->sa_family, _addr->sin_addr, buf, 
sizeof(buf));
+   argv_array_pushf(_array, "REMOTE_ADDR=%s", buf);
+   argv_array_pushf(_array, "REMOTE_PORT=%d",
+ntohs(sin_addr->sin_port));
 #ifndef NO_IPV6
} else if (addr->sa_family == AF_INET6) {
+   char buf[128] = "";
struct sockaddr_in6 *sin6_addr = (void *) addr;
-
-   char *buf = addrbuf + 12;
-   *buf++ = '['; *buf = '\0'; /* stpcpy() is cool */
-   inet_ntop(AF_INET6, _addr->sin6_addr, buf,
-   sizeof(addrbuf) - 13);
-   strcat(buf, "]");
-
-   snprintf(portbuf, sizeof(portbuf), "REMOTE_PORT=%d",
-   ntohs(sin6_addr->sin6_port));
+   inet_ntop(AF_INET6, _addr->sin6_addr, buf, sizeof(buf));
+   argv_array_pushf(_array, "REMOTE_ADDR=[%s]", buf);
+   argv_array_pushf(_array, "REMOTE_PORT=%d",
+ntohs(sin6_addr->sin6_port));
 #endif
}
 
-   cld.env = (const char **)env;
cld.argv = (const char **)cld_argv;
cld.in = incoming;
cld.out = dup(incoming);
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 61/67] notes: document length of fanout path with a constant

2015-09-15 Thread Jeff King
We know that a fanned-out sha1 in a notes tree cannot be
more than "aa/bb/cc/...", and we have an assert() to confirm
that. But let's factor out that length into a constant so we
can be sure it is used consistently. And even though we
assert() earlier, let's replace a strcpy with xsnprintf, so
it is clear to a reader that all cases are covered.

Signed-off-by: Jeff King 
---
 notes.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/notes.c b/notes.c
index eacd2a6..db77922 100644
--- a/notes.c
+++ b/notes.c
@@ -539,6 +539,9 @@ static unsigned char determine_fanout(struct int_node 
*tree, unsigned char n,
return fanout + 1;
 }
 
+/* hex SHA1 + 19 * '/' + NUL */
+#define FANOUT_PATH_MAX 40 + 19 + 1
+
 static void construct_path_with_fanout(const unsigned char *sha1,
unsigned char fanout, char *path)
 {
@@ -551,7 +554,7 @@ static void construct_path_with_fanout(const unsigned char 
*sha1,
path[i++] = '/';
fanout--;
}
-   strcpy(path + i, hex_sha1 + j);
+   xsnprintf(path + i, FANOUT_PATH_MAX - i, "%s", hex_sha1 + j);
 }
 
 static int for_each_note_helper(struct notes_tree *t, struct int_node *tree,
@@ -562,7 +565,7 @@ static int for_each_note_helper(struct notes_tree *t, 
struct int_node *tree,
void *p;
int ret = 0;
struct leaf_node *l;
-   static char path[40 + 19 + 1];  /* hex SHA1 + 19 * '/' + NUL */
+   static char path[FANOUT_PATH_MAX];
 
fanout = determine_fanout(tree, n, fanout);
for (i = 0; i < 16; i++) {
@@ -595,7 +598,7 @@ redo:
/* invoke callback with subtree */
unsigned int path_len =
l->key_sha1[19] * 2 + fanout;
-   assert(path_len < 40 + 19);
+   assert(path_len < FANOUT_PATH_MAX - 1);
construct_path_with_fanout(l->key_sha1, fanout,
   path);
/* Create trailing slash, if needed */
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-15 Thread Jeff King
The sha1_to_hex and find_unique_abbrev functions always
write into reusable static buffers. There are a few problems
with this:

  - future calls overwrite our result. This is especially
annoying with find_unique_abbrev, which does not have a
ring of buffers, so you cannot even printf() a result
that has two abbreviated sha1s.

  - if you want to put the result into another buffer, we
often strcpy, which looks suspicious when auditing for
overflows.

This patch introduces sha1_to_hex_to and find_unique_abbrev_to,
which write into a user-provided buffer. Of course this is
just punting on the overflow-auditing, as the buffer
obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
much easier to audit, since that is a well-known size.

We retain the non-reentrant forms, which just become thin
wrappers around the reentrant ones. This patch also adds a
strbuf variant of find_unique_abbrev, which will be handy in
later patches.

Signed-off-by: Jeff King 
---
If we wanted to be really meticulous, these functions could
take a size for the output buffer, and complain if it is not
GIT_SHA1_HEXSZ+1 bytes. But that would bloat every call
like:

  sha1_to_hex_to(buf, sizeof(buf), sha1);

 cache.h | 27 ++-
 hex.c   | 13 +
 sha1_name.c | 16 +++-
 strbuf.c|  9 +
 strbuf.h|  8 
 5 files changed, 63 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index e231e47..cc59aba 100644
--- a/cache.h
+++ b/cache.h
@@ -785,7 +785,21 @@ extern char *sha1_pack_name(const unsigned char *sha1);
  */
 extern char *sha1_pack_index_name(const unsigned char *sha1);
 
-extern const char *find_unique_abbrev(const unsigned char *sha1, int);
+/*
+ * Return an abbreviated sha1 unique within this repository's object database.
+ * The result will be at least `len` characters long, and will be NUL
+ * terminated.
+ *
+ * The non-`_to` version returns a static buffer which will be overwritten by
+ * subsequent calls.
+ *
+ * The `_to` variant writes to a buffer supplied by the caller, which must be
+ * at least `GIT_SHA1_HEXSZ + 1` bytes. The return value is the number of bytes
+ * written (excluding the NUL terminator).
+ */
+extern const char *find_unique_abbrev(const unsigned char *sha1, int len);
+extern int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int 
len);
+
 extern const unsigned char null_sha1[GIT_SHA1_RAWSZ];
 
 static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
@@ -1067,6 +1081,17 @@ extern int for_each_abbrev(const char *prefix, 
each_abbrev_fn, void *);
 extern int get_sha1_hex(const char *hex, unsigned char *sha1);
 extern int get_oid_hex(const char *hex, struct object_id *sha1);
 
+/*
+ * Convert a binary sha1 to its hex equivalent. The `_to` variant writes
+ * the NUL-terminated output to the buffer `out`, which must be at least
+ * `GIT_SHA1_HEXSZ + 1` bytes, and returns a pointer to out for convenience.
+ *
+ * The non-`_to` variant returns a static buffer, but uses a ring of 4
+ * buffers, making it safe to make multiple calls for a single statement, like:
+ *
+ *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
+ */
+extern char *sha1_to_hex_to(char *out, const unsigned char *sha1);
 extern char *sha1_to_hex(const unsigned char *sha1);   /* static buffer 
result! */
 extern char *oid_to_hex(const struct object_id *oid);  /* same static buffer 
as sha1_to_hex */
 
diff --git a/hex.c b/hex.c
index 899b74a..004fdea 100644
--- a/hex.c
+++ b/hex.c
@@ -61,12 +61,10 @@ int get_oid_hex(const char *hex, struct object_id *oid)
return get_sha1_hex(hex, oid->hash);
 }
 
-char *sha1_to_hex(const unsigned char *sha1)
+char *sha1_to_hex_to(char *buffer, const unsigned char *sha1)
 {
-   static int bufno;
-   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
static const char hex[] = "0123456789abcdef";
-   char *buffer = hexbuffer[3 & ++bufno], *buf = buffer;
+   char *buf = buffer;
int i;
 
for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
@@ -79,6 +77,13 @@ char *sha1_to_hex(const unsigned char *sha1)
return buffer;
 }
 
+char *sha1_to_hex(const unsigned char *sha1)
+{
+   static int bufno;
+   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
+   return sha1_to_hex_to(hexbuffer[3 & ++bufno], sha1);
+}
+
 char *oid_to_hex(const struct object_id *oid)
 {
return sha1_to_hex(oid->hash);
diff --git a/sha1_name.c b/sha1_name.c
index da6874c..416e408 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -368,14 +368,13 @@ int for_each_abbrev(const char *prefix, each_abbrev_fn 
fn, void *cb_data)
return ds.ambiguous;
 }
 
-const char *find_unique_abbrev(const unsigned char *sha1, int len)
+int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int len)
 {
int status, exists;
-   static char hex[41];
 
-   memcpy(hex, sha1_to_hex(sha1), 40);
+   sha1_to_hex_to(hex, sha1);
if 

[PATCH 14/67] compat/inet_ntop: fix off-by-one in inet_ntop4

2015-09-15 Thread Jeff King
Our compat inet_ntop4 function writes to a temporary buffer
with snprintf, and then uses strcpy to put the result into
the final "dst" buffer. We check the return value of
snprintf against the size of "dst", but fail to account for
the NUL terminator. As a result, we may overflow "dst" with
a single NUL. In practice, this doesn't happen because the
output of inet_ntop is limited, and we provide buffers that
are way oversized.

We can fix the off-by-one check easily, but while we are
here let's also use strlcpy for increased safety, just in
case there are other bugs lurking.

As a side note, this compat code seems to be BSD-derived.
Searching for "vixie inet_ntop" turns up NetBSD's latest
version of the same code, which has an identical fix (and
switches to strlcpy, too!).

Signed-off-by: Jeff King 
---
 compat/inet_ntop.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/compat/inet_ntop.c b/compat/inet_ntop.c
index 90b7cc4..6830726 100644
--- a/compat/inet_ntop.c
+++ b/compat/inet_ntop.c
@@ -53,11 +53,11 @@ inet_ntop4(const u_char *src, char *dst, size_t size)
nprinted = snprintf(tmp, sizeof(tmp), fmt, src[0], src[1], src[2], 
src[3]);
if (nprinted < 0)
return (NULL);  /* we assume "errno" was set by "snprintf()" */
-   if ((size_t)nprinted > size) {
+   if ((size_t)nprinted >= size) {
errno = ENOSPC;
return (NULL);
}
-   strcpy(dst, tmp);
+   strlcpy(dst, tmp, size);
return (dst);
 }
 
@@ -154,7 +154,7 @@ inet_ntop6(const u_char *src, char *dst, size_t size)
errno = ENOSPC;
return (NULL);
}
-   strcpy(dst, tmp);
+   strlcpy(dst, tmp, size);
return (dst);
 }
 #endif
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 15/67] convert trivial sprintf / strcpy calls to xsnprintf

2015-09-15 Thread Jeff King
We sometimes sprintf into static buffers when we know that
the size of the buffer is large enough to fit the input
(either because it's a constant, or because it's numeric
input that is bounded in size). Likewise with strcpy of
constant strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King 
---
These are all pretty trivial; the obvious thing to get wrong is that
"sizeof(buf)" is not the correct length if "buf" is a pointer. I
considered a macro wrapper like:

  #define xsnprintf_array(dst, fmt, ...) \
xsnprintf(dst, sizeof(dst) + BARF_UNLESS_AN_ARRAY(dst), \
  fmt, __VA_ARGS__)

but obviously that requires variadic macro support.

 archive-tar.c |  2 +-
 builtin/gc.c  |  2 +-
 builtin/init-db.c | 11 ++-
 builtin/ls-tree.c |  9 +
 builtin/merge-index.c |  2 +-
 builtin/merge-recursive.c |  2 +-
 builtin/read-tree.c   |  2 +-
 builtin/unpack-file.c |  2 +-
 compat/mingw.c|  8 +---
 compat/winansi.c  |  2 +-
 connect.c |  2 +-
 convert.c |  3 ++-
 daemon.c  |  4 ++--
 diff.c| 12 ++--
 http-push.c   |  2 +-
 http.c|  6 +++---
 ll-merge.c| 12 ++--
 refs.c|  8 
 sideband.c|  4 ++--
 strbuf.c  |  4 ++--
 20 files changed, 52 insertions(+), 47 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index b6b30bb..d543f93 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -301,7 +301,7 @@ static int write_global_extended_header(struct 
archiver_args *args)
memset(, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
mode = 0100666;
-   strcpy(header.name, "pax_global_header");
+   xsnprintf(header.name, sizeof(header.name), "pax_global_header");
prepare_header(args, , mode, ext_header.len);
write_blocked(, sizeof(header));
write_blocked(ext_header.buf, ext_header.len);
diff --git a/builtin/gc.c b/builtin/gc.c
index 0ad8d30..57584bc 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -194,7 +194,7 @@ static const char *lock_repo_for_gc(int force, pid_t* 
ret_pid)
return NULL;
 
if (gethostname(my_host, sizeof(my_host)))
-   strcpy(my_host, "unknown");
+   xsnprintf(my_host, sizeof(my_host), "unknown");
 
pidfile_path = git_pathdup("gc.pid");
fd = hold_lock_file_for_update(, pidfile_path,
diff --git a/builtin/init-db.c b/builtin/init-db.c
index 69323e1..e7d0e31 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -262,7 +262,8 @@ static int create_default_files(const char *template_path)
}
 
/* This forces creation of new config file */
-   sprintf(repo_version_string, "%d", GIT_REPO_VERSION);
+   xsnprintf(repo_version_string, sizeof(repo_version_string),
+ "%d", GIT_REPO_VERSION);
git_config_set("core.repositoryformatversion", repo_version_string);
 
path[len] = 0;
@@ -414,13 +415,13 @@ int init_db(const char *template_dir, unsigned int flags)
 */
if (shared_repository < 0)
/* force to the mode value */
-   sprintf(buf, "0%o", -shared_repository);
+   xsnprintf(buf, sizeof(buf), "0%o", -shared_repository);
else if (shared_repository == PERM_GROUP)
-   sprintf(buf, "%d", OLD_PERM_GROUP);
+   xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_GROUP);
else if (shared_repository == PERM_EVERYBODY)
-   sprintf(buf, "%d", OLD_PERM_EVERYBODY);
+   xsnprintf(buf, sizeof(buf), "%d", OLD_PERM_EVERYBODY);
else
-   die("oops");
+   die("BUG: invalid value for shared_repository");
git_config_set("core.sharedrepository", buf);
git_config_set("receive.denyNonFastforwards", "true");
}
diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
index 3b04a0f..0e30d86 100644
--- a/builtin/ls-tree.c
+++ b/builtin/ls-tree.c
@@ -96,12 +96,13 @@ static int show_tree(const unsigned char *sha1, struct 
strbuf *base,
if (!strcmp(type, blob_type)) {
unsigned long size;
if (sha1_object_info(sha1, ) == OBJ_BAD)
-   strcpy(size_text, "BAD");
+   xsnprintf(size_text, sizeof(size_text),
+

[PATCH 26/67] replace trivial malloc + sprintf /strcpy calls to xstrfmt

2015-09-15 Thread Jeff King
It's a common pattern to do:

  foo = xmalloc(strlen(one) + strlen(two) + 1 + 1);
  sprintf(foo, "%s %s", one, two);

(or possibly some variant with strcpy()s or a more
complicated length computation).  We can switch these to use
xstrfmt, which is shorter, involves less error-prone manual
computation, and removes many sprintf and strcpy calls which
make it harder to audit the code for real buffer overflows.

Signed-off-by: Jeff King 
---
 builtin/apply.c |  5 +
 builtin/ls-remote.c |  8 ++--
 builtin/name-rev.c  | 13 +
 environment.c   |  7 ++-
 imap-send.c |  5 ++---
 reflog-walk.c   |  7 +++
 remote.c|  7 +--
 setup.c | 12 +++-
 unpack-trees.c  |  4 +---
 9 files changed, 20 insertions(+), 48 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 4aa53f7..094a20f 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -698,10 +698,7 @@ static char *find_name_common(const char *line, const char 
*def,
}
 
if (root) {
-   char *ret = xmalloc(root_len + len + 1);
-   strcpy(ret, root);
-   memcpy(ret + root_len, start, len);
-   ret[root_len + len] = '\0';
+   char *ret = xstrfmt("%s%.*s", root, len, start);
return squash_slash(ret);
}
 
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index 4554dbc..5b6d679 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -93,12 +93,8 @@ int cmd_ls_remote(int argc, const char **argv, const char 
*prefix)
if (argv[i]) {
int j;
pattern = xcalloc(argc - i + 1, sizeof(const char *));
-   for (j = i; j < argc; j++) {
-   int len = strlen(argv[j]);
-   char *p = xmalloc(len + 3);
-   sprintf(p, "*/%s", argv[j]);
-   pattern[j - i] = p;
-   }
+   for (j = i; j < argc; j++)
+   pattern[j - i] = xstrfmt("*/%s", argv[j]);
}
remote = remote_get(dest);
if (!remote) {
diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index 248a3eb..8a3a0cd 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -56,19 +56,16 @@ copy_data:
parents = parents->next, parent_number++) {
if (parent_number > 1) {
int len = strlen(tip_name);
-   char *new_name = xmalloc(len +
-   1 + decimal_length(generation) +  /* ~ */
-   1 + 2 +   /* ^NN */
-   1);
+   char *new_name;
 
if (len > 2 && !strcmp(tip_name + len - 2, "^0"))
len -= 2;
if (generation > 0)
-   sprintf(new_name, "%.*s~%d^%d", len, tip_name,
-   generation, parent_number);
+   new_name = xstrfmt("%.*s~%d^%d", len, tip_name,
+  generation, parent_number);
else
-   sprintf(new_name, "%.*s^%d", len, tip_name,
-   parent_number);
+   new_name = xstrfmt("%.*s^%d", len, tip_name,
+  parent_number);
 
name_rev(parents->item, new_name, 0,
distance + MERGE_TRAVERSAL_WEIGHT, 0);
diff --git a/environment.c b/environment.c
index a533aed..c5b65f5 100644
--- a/environment.c
+++ b/environment.c
@@ -143,11 +143,8 @@ static char *git_path_from_env(const char *envvar, const 
char *git_dir,
   const char *path, int *fromenv)
 {
const char *value = getenv(envvar);
-   if (!value) {
-   char *buf = xmalloc(strlen(git_dir) + strlen(path) + 2);
-   sprintf(buf, "%s/%s", git_dir, path);
-   return buf;
-   }
+   if (!value)
+   return xstrfmt("%s/%s", git_dir, path);
if (fromenv)
*fromenv = 1;
return xstrdup(value);
diff --git a/imap-send.c b/imap-send.c
index 37ac4aa..01aa227 100644
--- a/imap-send.c
+++ b/imap-send.c
@@ -889,9 +889,8 @@ static char *cram(const char *challenge_64, const char 
*user, const char *pass)
}
 
/* response: " " */
-   resp_len = strlen(user) + 1 + strlen(hex) + 1;
-   response = xmalloc(resp_len);
-   sprintf(response, "%s %s", user, hex);
+   response = xstrfmt("%s %s", user, hex);
+   resp_len = strlen(response);
 
response_64 = xmalloc(ENCODED_SIZE(resp_len) + 1);
encoded_len = EVP_EncodeBlock((unsigned char *)response_64,
diff --git a/reflog-walk.c b/reflog-walk.c
index 

[PATCH 30/67] ref-filter: drop sprintf and strcpy calls

2015-09-15 Thread Jeff King
The ref-filter code comes from for-each-ref, and inherited a
number of raw sprintf and strcpy calls. These are generally
all safe, as we custom-size the buffers, or are formatting
numbers into sufficiently large buffers. But we can make the
resulting code even simpler and more obviously correct by
using some of our helper functions.

Signed-off-by: Jeff King 
---
 ref-filter.c | 70 +++-
 1 file changed, 22 insertions(+), 48 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f38dee4..1f71870 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -192,9 +192,7 @@ static int grab_objectname(const char *name, const unsigned 
char *sha1,
struct atom_value *v)
 {
if (!strcmp(name, "objectname")) {
-   char *s = xmalloc(41);
-   strcpy(s, sha1_to_hex(sha1));
-   v->s = s;
+   v->s = xstrdup(sha1_to_hex(sha1));
return 1;
}
if (!strcmp(name, "objectname:short")) {
@@ -219,10 +217,8 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   char *s = xmalloc(40);
-   sprintf(s, "%lu", sz);
v->ul = sz;
-   v->s = s;
+   v->s = xstrfmt("%lu", sz);
}
else if (deref)
grab_objectname(name, obj->sha1, v);
@@ -246,11 +242,8 @@ static void grab_tag_values(struct atom_value *val, int 
deref, struct object *ob
v->s = tag->tag;
else if (!strcmp(name, "type") && tag->tagged)
v->s = typename(tag->tagged->type);
-   else if (!strcmp(name, "object") && tag->tagged) {
-   char *s = xmalloc(41);
-   strcpy(s, sha1_to_hex(tag->tagged->sha1));
-   v->s = s;
-   }
+   else if (!strcmp(name, "object") && tag->tagged)
+   v->s = xstrdup(sha1_to_hex(tag->tagged->sha1));
}
 }
 
@@ -268,32 +261,22 @@ static void grab_commit_values(struct atom_value *val, 
int deref, struct object
if (deref)
name++;
if (!strcmp(name, "tree")) {
-   char *s = xmalloc(41);
-   strcpy(s, sha1_to_hex(commit->tree->object.sha1));
-   v->s = s;
+   v->s = xstrdup(sha1_to_hex(commit->tree->object.sha1));
}
-   if (!strcmp(name, "numparent")) {
-   char *s = xmalloc(40);
+   else if (!strcmp(name, "numparent")) {
v->ul = commit_list_count(commit->parents);
-   sprintf(s, "%lu", v->ul);
-   v->s = s;
+   v->s = xstrfmt("%lu", v->ul);
}
else if (!strcmp(name, "parent")) {
-   int num = commit_list_count(commit->parents);
-   int i;
struct commit_list *parents;
-   char *s = xmalloc(41 * num + 1);
-   v->s = s;
-   for (i = 0, parents = commit->parents;
-parents;
-parents = parents->next, i = i + 41) {
+   struct strbuf s = STRBUF_INIT;
+   for (parents = commit->parents; parents; parents = 
parents->next) {
struct commit *parent = parents->item;
-   strcpy(s+i, sha1_to_hex(parent->object.sha1));
-   if (parents->next)
-   s[i+40] = ' ';
+   if (parents != commit->parents)
+   strbuf_addch(, ' ');
+   strbuf_addstr(, 
sha1_to_hex(parent->object.sha1));
}
-   if (!i)
-   *s = '\0';
+   v->s = strbuf_detach(, NULL);
}
}
 }
@@ -700,7 +683,6 @@ static void populate_value(struct ref_array_item *ref)
else if (!strcmp(formatp, "track") &&
 (starts_with(name, "upstream") ||
  starts_with(name, "push"))) {
-   char buf[40];
 
if (stat_tracking_info(branch, _ours,
   _theirs, NULL))
@@ -708,17 +690,13 @@ static void populate_value(struct ref_array_item *ref)
 
if (!num_ours && !num_theirs)

[PATCH 28/67] fetch: replace static buffer with xstrfmt

2015-09-15 Thread Jeff King
We parse the INFINITE_DEPTH constant into a static,
fixed-size buffer using sprintf. This buffer is sufficiently
large for the current constant, but it's a suspicious
pattern, as the constant is defined far away, and it's not
immediately obvious that 12 bytes are large enough to hold
it.

We can just use xstrfmt here, which gets rid of any question
of the buffer size. It also removes any concerns with object
lifetime, which means we do not have to wonder why this
buffer deep within a conditional is marked "static" (we
never free our newly allocated result, of course, but that's
OK; it's global that lasts the lifetime of the whole program
anyway).

Signed-off-by: Jeff King 
---
 builtin/fetch.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 9a3869f..4703725 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1156,11 +1156,8 @@ int cmd_fetch(int argc, const char **argv, const char 
*prefix)
die(_("--depth and --unshallow cannot be used 
together"));
else if (!is_repository_shallow())
die(_("--unshallow on a complete repository does not 
make sense"));
-   else {
-   static char inf_depth[12];
-   sprintf(inf_depth, "%d", INFINITE_DEPTH);
-   depth = inf_depth;
-   }
+   else
+   depth = xstrfmt("%d", INFINITE_DEPTH);
}
 
/* no need to be strict, transport_set_option() will validate it again 
*/
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 31/67] help: drop prepend function in favor of xstrfmt

2015-09-15 Thread Jeff King
This function predates xstrfmt, and its functionality is a
subset. Let's just use xstrfmt.

Signed-off-by: Jeff King 
---
 builtin/help.c | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/builtin/help.c b/builtin/help.c
index 3422e73..fba8c01 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -295,16 +295,6 @@ static int is_git_command(const char *s)
is_in_cmdlist(_cmds, s);
 }
 
-static const char *prepend(const char *prefix, const char *cmd)
-{
-   size_t pre_len = strlen(prefix);
-   size_t cmd_len = strlen(cmd);
-   char *p = xmalloc(pre_len + cmd_len + 1);
-   memcpy(p, prefix, pre_len);
-   strcpy(p + pre_len, cmd);
-   return p;
-}
-
 static const char *cmd_to_page(const char *git_cmd)
 {
if (!git_cmd)
@@ -312,9 +302,9 @@ static const char *cmd_to_page(const char *git_cmd)
else if (starts_with(git_cmd, "git"))
return git_cmd;
else if (is_git_command(git_cmd))
-   return prepend("git-", git_cmd);
+   return xstrfmt("git-%s", git_cmd);
else
-   return prepend("git", git_cmd);
+   return xstrfmt("git%s", git_cmd);
 }
 
 static void setup_man_path(void)
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 29/67] use strip_suffix and xstrfmt to replace suffix

2015-09-15 Thread Jeff King
When we want to convert "foo.pack" to "foo.idx", we do it by
duplicating the original string and then munging the bytes
in place. Let's use strip_suffix and xstrfmt instead, which
has several advantages:

  1. It's more clear what the intent is.

  2. It does not implicitly rely on the fact that
 strlen(".idx") <= strlen(".pack") to avoid an overflow.

  3. We communicate the assumption that the input file ends
 with ".pack" (and get a run-time check that this is so).

  4. We drop calls to strcpy, which makes auditing the code
 base easier.

Likewise, we can do this to convert ".pack" to ".bitmap",
avoiding some manual memory computation.

Signed-off-by: Jeff King 
---
 http.c|  7 ---
 pack-bitmap.c | 13 -
 sha1_file.c   |  6 --
 3 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/http.c b/http.c
index 7b02259..e0ff876 100644
--- a/http.c
+++ b/http.c
@@ -1511,6 +1511,7 @@ int finish_http_pack_request(struct http_pack_request 
*preq)
struct packed_git **lst;
struct packed_git *p = preq->target;
char *tmp_idx;
+   size_t len;
struct child_process ip = CHILD_PROCESS_INIT;
const char *ip_argv[8];
 
@@ -1524,9 +1525,9 @@ int finish_http_pack_request(struct http_pack_request 
*preq)
lst = &((*lst)->next);
*lst = (*lst)->next;
 
-   tmp_idx = xstrdup(preq->tmpfile);
-   strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"),
-  ".idx.temp");
+   if (!strip_suffix(preq->tmpfile, ".pack.temp", ))
+   die("BUG: pack tmpfile does not end in .pack.temp?");
+   tmp_idx = xstrfmt("%.*s.idx.temp", (int)len, preq->tmpfile);
 
ip_argv[0] = "index-pack";
ip_argv[1] = "-o";
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 637770a..7dfcb34 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -252,16 +252,11 @@ static int load_bitmap_entries_v1(struct bitmap_index 
*index)
 
 static char *pack_bitmap_filename(struct packed_git *p)
 {
-   char *idx_name;
-   int len;
-
-   len = strlen(p->pack_name) - strlen(".pack");
-   idx_name = xmalloc(len + strlen(".bitmap") + 1);
-
-   memcpy(idx_name, p->pack_name, len);
-   memcpy(idx_name + len, ".bitmap", strlen(".bitmap") + 1);
+   size_t len;
 
-   return idx_name;
+   if (!strip_suffix(p->pack_name, ".pack", ))
+   die("BUG: pack_name does not end in .pack");
+   return xstrfmt("%.*s.bitmap", (int)len, p->pack_name);
 }
 
 static int open_pack_bitmap_1(struct packed_git *packfile)
diff --git a/sha1_file.c b/sha1_file.c
index 28352a5..88996f0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -671,13 +671,15 @@ static int check_packed_git_idx(const char *path, struct 
packed_git *p)
 int open_pack_index(struct packed_git *p)
 {
char *idx_name;
+   size_t len;
int ret;
 
if (p->index_data)
return 0;
 
-   idx_name = xstrdup(p->pack_name);
-   strcpy(idx_name + strlen(idx_name) - strlen(".pack"), ".idx");
+   if (!strip_suffix(p->pack_name, ".pack", ))
+   die("BUG: pack_name does not end in .pack");
+   idx_name = xstrfmt("%.*s.idx", (int)len, p->pack_name);
ret = check_packed_git_idx(idx_name, p);
free(idx_name);
return ret;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 27/67] config: use xstrfmt in normalize_value

2015-09-15 Thread Jeff King
We xmalloc a fixed-size buffer and sprintf into it; this is
OK because the size of our formatting types is finite, but
that's not immediately clear to a reader auditing sprintf
calls. Let's switch to xstrfmt, which is shorter and
obviously correct.

Note that just dropping the common xmalloc here causes gcc
to complain with -Wmaybe-uninitialized. That's because if
"types" does not match any of our known types, we never
write anything into the "normalized" pointer. With the
current code, gcc doesn't notice because we always return a
valid pointer (just one which might point to uninitialized
data, but the compiler doesn't know that). In other words,
the current code is potentially buggy if new types are added
without updating this spot.

So let's take this opportunity to clean up the function a
bit more. We can drop the "normalized" pointer entirely, and
just return directly from each code path. And then add an
assertion at the end in case we haven't covered any cases.

Signed-off-by: Jeff King 
---
 builtin/config.c | 34 +-
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 71acc44..adc7727 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -246,8 +246,6 @@ free_strings:
 
 static char *normalize_value(const char *key, const char *value)
 {
-   char *normalized;
-
if (!value)
return NULL;
 
@@ -258,27 +256,21 @@ static char *normalize_value(const char *key, const char 
*value)
 * "~/foobar/" in the config file, and to expand the ~
 * when retrieving the value.
 */
-   normalized = xstrdup(value);
-   else {
-   normalized = xmalloc(64);
-   if (types == TYPE_INT) {
-   int64_t v = git_config_int64(key, value);
-   sprintf(normalized, "%"PRId64, v);
-   }
-   else if (types == TYPE_BOOL)
-   sprintf(normalized, "%s",
-   git_config_bool(key, value) ? "true" : "false");
-   else if (types == TYPE_BOOL_OR_INT) {
-   int is_bool, v;
-   v = git_config_bool_or_int(key, value, _bool);
-   if (!is_bool)
-   sprintf(normalized, "%d", v);
-   else
-   sprintf(normalized, "%s", v ? "true" : "false");
-   }
+   return xstrdup(value);
+   if (types == TYPE_INT)
+   return xstrfmt("%"PRId64, git_config_int64(key, value));
+   if (types == TYPE_BOOL)
+   return xstrdup(git_config_bool(key, value) ?  "true" : "false");
+   if (types == TYPE_BOOL_OR_INT) {
+   int is_bool, v;
+   v = git_config_bool_or_int(key, value, _bool);
+   if (!is_bool)
+   return xstrfmt("%d", v);
+   else
+   return xstrdup(v ? "true" : "false");
}
 
-   return normalized;
+   die("BUG: cannot normalize type %d", types);
 }
 
 static int get_color_found;
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 33/67] read_branches_file: replace strcpy with xstrdup

2015-09-15 Thread Jeff King
This code is exactly replicating strdup, so let's just use
that. It's shorter, and eliminates some confusion (such as
whether "p - s" is really enough to hold the result; it is,
because we write NULs as we shrink "p").

Signed-off-by: Jeff King 
---
 remote.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/remote.c b/remote.c
index 5ab0f7f..1b69751 100644
--- a/remote.c
+++ b/remote.c
@@ -297,7 +297,6 @@ static void read_branches_file(struct remote *remote)
int n = 1000;
FILE *f = fopen(git_path("branches/%.*s", n, remote->name), "r");
char *s, *p;
-   int len;
 
if (!f)
return;
@@ -313,9 +312,7 @@ static void read_branches_file(struct remote *remote)
p = s + strlen(s);
while (isspace(p[-1]))
*--p = 0;
-   len = p - s;
-   p = xmalloc(len + 1);
-   strcpy(p, s);
+   p = xstrdup(s);
 
/*
 * The branches file would have URL and optionally
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 53/67] drop strcpy in favor of raw sha1_to_hex

2015-09-15 Thread Jeff King
In some cases where we strcpy() the result of sha1_to_hex(),
there's no need; the result goes directly into a printf
statement, and we can simply pass the return value from
sha1_to_hex() directly.

Signed-off-by: Jeff King 
---
 http-push.c | 6 ++
 walker.c| 5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/http-push.c b/http-push.c
index 43a9036..48f39b7 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1856,7 +1856,6 @@ int main(int argc, char **argv)
 
new_refs = 0;
for (ref = remote_refs; ref; ref = ref->next) {
-   char old_hex[60], *new_hex;
struct argv_array commit_argv = ARGV_ARRAY_INIT;
 
if (!ref->peer_ref)
@@ -1911,13 +1910,12 @@ int main(int argc, char **argv)
}
hashcpy(ref->new_sha1, ref->peer_ref->new_sha1);
new_refs++;
-   strcpy(old_hex, sha1_to_hex(ref->old_sha1));
-   new_hex = sha1_to_hex(ref->new_sha1);
 
fprintf(stderr, "updating '%s'", ref->name);
if (strcmp(ref->name, ref->peer_ref->name))
fprintf(stderr, " using '%s'", ref->peer_ref->name);
-   fprintf(stderr, "\n  from %s\n  to   %s\n", old_hex, new_hex);
+   fprintf(stderr, "\n  from %s\n  to   %s\n",
+   sha1_to_hex(ref->old_sha1), sha1_to_hex(ref->new_sha1));
if (dry_run) {
if (helper_status)
printf("ok %s\n", ref->name);
diff --git a/walker.c b/walker.c
index 44a936c..cdeb63f 100644
--- a/walker.c
+++ b/walker.c
@@ -17,10 +17,9 @@ void walker_say(struct walker *walker, const char *fmt, 
const char *hex)
 
 static void report_missing(const struct object *obj)
 {
-   char missing_hex[41];
-   strcpy(missing_hex, sha1_to_hex(obj->sha1));
fprintf(stderr, "Cannot obtain needed %s %s\n",
-   obj->type ? typename(obj->type): "object", missing_hex);
+   obj->type ? typename(obj->type): "object",
+   sha1_to_hex(obj->sha1));
if (!is_null_sha1(current_commit_sha1))
fprintf(stderr, "while processing commit %s.\n",
sha1_to_hex(current_commit_sha1));
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 56/67] avoid sprintf and strcpy with flex arrays

2015-09-15 Thread Jeff King
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.

But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.

It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:

 - the call signature is a little bit unwieldy:

  d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);

   You need offsetof here instead of just writing to the
   end of the base size, because we don't know how the
   struct is packed (partially this is because FLEX_ARRAY
   might not be zero, though we can account for that; but
   the size of the struct may actually be rounded up for
   alignment, and we can't know that).

 - some sites do clever things, like over-allocating because
   they know they will write larger things into the buffer
   later (e.g., struct packed_git here).

So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).

Signed-off-by: Jeff King 
---
Actually, I have a malloc-hardening series (which I'll post after this),
in which I _do_ break down and add a FLEX_ALLOC() macro. But we still
cannot use it for these cases anyway, because we don't assume all
platforms support variadic macros.

 archive.c   | 5 +++--
 builtin/blame.c | 5 +++--
 fast-import.c   | 6 --
 refs.c  | 8 
 sha1_file.c | 5 +++--
 submodule.c | 6 --
 6 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/archive.c b/archive.c
index 01b0899..4ac86c8 100644
--- a/archive.c
+++ b/archive.c
@@ -171,13 +171,14 @@ static void queue_directory(const unsigned char *sha1,
unsigned mode, int stage, struct archiver_context *c)
 {
struct directory *d;
-   d = xmallocz(sizeof(*d) + base->len + 1 + strlen(filename));
+   size_t len = base->len + 1 + strlen(filename) + 1;
+   d = xmalloc(sizeof(*d) + len);
d->up  = c->bottom;
d->baselen = base->len;
d->mode= mode;
d->stage   = stage;
c->bottom  = d;
-   d->len = sprintf(d->path, "%.*s%s/", (int)base->len, base->buf, 
filename);
+   d->len = xsnprintf(d->path, len, "%.*s%s/", (int)base->len, base->buf, 
filename);
hashcpy(d->oid.hash, sha1);
 }
 
diff --git a/builtin/blame.c b/builtin/blame.c
index f8ec7ff..dbec516 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -459,12 +459,13 @@ static void queue_blames(struct scoreboard *sb, struct 
origin *porigin,
 static struct origin *make_origin(struct commit *commit, const char *path)
 {
struct origin *o;
-   o = xcalloc(1, sizeof(*o) + strlen(path) + 1);
+   size_t pathlen = strlen(path) + 1;
+   o = xcalloc(1, sizeof(*o) + pathlen);
o->commit = commit;
o->refcnt = 1;
o->next = commit->util;
commit->util = o;
-   strcpy(o->path, path);
+   memcpy(o->path, path, pathlen); /* includes NUL */
return o;
 }
 
diff --git a/fast-import.c b/fast-import.c
index d0c2502..895c6b4 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -863,13 +863,15 @@ static void start_packfile(void)
 {
static char tmp_file[PATH_MAX];
struct packed_git *p;
+   int namelen;
struct pack_header hdr;
int pack_fd;
 
pack_fd = odb_mkstemp(tmp_file, sizeof(tmp_file),
  "pack/tmp_pack_XX");
-   p = xcalloc(1, sizeof(*p) + strlen(tmp_file) + 2);
-   strcpy(p->pack_name, tmp_file);
+   namelen = strlen(tmp_file) + 2;
+   p = xcalloc(1, sizeof(*p) + namelen);
+   xsnprintf(p->pack_name, namelen, "%s", tmp_file);
p->pack_fd = pack_fd;
p->do_not_close = 1;
pack_file = sha1fd(pack_fd, p->pack_name);
diff --git a/refs.c b/refs.c
index c2709de..df6c41a 100644
--- a/refs.c
+++ b/refs.c
@@ -2695,7 +2695,7 @@ static int pack_if_possible_fn(struct ref_entry *entry, 
void *cb_data)
int namelen = strlen(entry->name) + 1;
struct ref_to_prune *n = xcalloc(1, sizeof(*n) + namelen);
hashcpy(n->sha1, entry->u.value.oid.hash);
-   strcpy(n->name, entry->name);
+   memcpy(n->name, entry->name, namelen); /* includes NUL */
n->next = cb->ref_to_prune;
cb->ref_to_prune = n;
}
@@ -3984,10 +3984,10 @@ void ref_transaction_free(struct ref_transaction 
*transaction)
 static struct ref_update *add_update(struct ref_transaction *transaction,
 const char *refname)
 {
-   size_t len = 

[PATCH 62/67] convert strncpy to memcpy

2015-09-15 Thread Jeff King
strncpy is known to be a confusing function because of its
termination semantics.  These calls are all correct, but it
takes some examination to see why. In particular, every one
of them expects to copy up to the length limit, and then
makes some arrangement for terminating the result.

We can just use memcpy, along with noting explicitly how the
result is terminated (if it is not already obvious). That
should make it more clear to a reader that we are doing the
right thing.

Signed-off-by: Jeff King 
---
 builtin/help.c | 4 ++--
 fast-import.c  | 2 +-
 tag.c  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/help.c b/builtin/help.c
index e1650ab..1cd0c1e 100644
--- a/builtin/help.c
+++ b/builtin/help.c
@@ -176,7 +176,7 @@ static void add_man_viewer(const char *name)
while (*p)
p = &((*p)->next);
*p = xcalloc(1, (sizeof(**p) + len + 1));
-   strncpy((*p)->name, name, len);
+   memcpy((*p)->name, name, len); /* NUL-terminated by xcalloc */
 }
 
 static int supported_man_viewer(const char *name, size_t len)
@@ -192,7 +192,7 @@ static void do_add_man_viewer_info(const char *name,
 {
struct man_viewer_info_list *new = xcalloc(1, sizeof(*new) + len + 1);
 
-   strncpy(new->name, name, len);
+   memcpy(new->name, name, len); /* NUL-terminated by xcalloc */
new->info = xstrdup(value);
new->next = man_viewer_info_list;
man_viewer_info_list = new;
diff --git a/fast-import.c b/fast-import.c
index cf6d8bc..4d01efc 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -703,7 +703,7 @@ static struct atom_str *to_atom(const char *s, unsigned 
short len)
 
c = pool_alloc(sizeof(struct atom_str) + len + 1);
c->str_len = len;
-   strncpy(c->str_dat, s, len);
+   memcpy(c->str_dat, s, len);
c->str_dat[len] = 0;
c->next_atom = atom_table[hc];
atom_table[hc] = c;
diff --git a/tag.c b/tag.c
index 5b0ac62..5b2a06d 100644
--- a/tag.c
+++ b/tag.c
@@ -82,7 +82,7 @@ int parse_tag_buffer(struct tag *item, const void *data, 
unsigned long size)
nl = memchr(bufptr, '\n', tail - bufptr);
if (!nl || sizeof(type) <= (nl - bufptr))
return -1;
-   strncpy(type, bufptr, nl - bufptr);
+   memcpy(type, bufptr, nl - bufptr);
type[nl - bufptr] = '\0';
bufptr = nl + 1;
 
-- 
2.6.0.rc2.408.ga2926b9

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-15 Thread Ramsay Jones


On 15/09/15 16:26, Jeff King wrote:
> The sha1_to_hex and find_unique_abbrev functions always
> write into reusable static buffers. There are a few problems
> with this:
>
>   - future calls overwrite our result. This is especially
> annoying with find_unique_abbrev, which does not have a
> ring of buffers, so you cannot even printf() a result
> that has two abbreviated sha1s.
>
>   - if you want to put the result into another buffer, we
> often strcpy, which looks suspicious when auditing for
> overflows.
>
> This patch introduces sha1_to_hex_to and find_unique_abbrev_to,
> which write into a user-provided buffer. Of course this is
> just punting on the overflow-auditing, as the buffer
> obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
> much easier to audit, since that is a well-known size.

Hmm, I haven't read any other patches yet (including those which use these
new '_to' functions), but I can't help feeling they should be named something
like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead.  i.e. I don't 
get
the '_to' thing - not that I'm any good at naming things ...

ATB,
Ramsay Jones

>
> We retain the non-reentrant forms, which just become thin
> wrappers around the reentrant ones. This patch also adds a
> strbuf variant of find_unique_abbrev, which will be handy in
> later patches.
>
> Signed-off-by: Jeff King 
> ---
> If we wanted to be really meticulous, these functions could
> take a size for the output buffer, and complain if it is not
> GIT_SHA1_HEXSZ+1 bytes. But that would bloat every call
> like:
>
>   sha1_to_hex_to(buf, sizeof(buf), sha1);
>
>  cache.h | 27 ++-
>  hex.c   | 13 +
>  sha1_name.c | 16 +++-
>  strbuf.c|  9 +
>  strbuf.h|  8 
>  5 files changed, 63 insertions(+), 10 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index e231e47..cc59aba 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -785,7 +785,21 @@ extern char *sha1_pack_name(const unsigned char *sha1);
>   */
>  extern char *sha1_pack_index_name(const unsigned char *sha1);
>  
> -extern const char *find_unique_abbrev(const unsigned char *sha1, int);
> +/*
> + * Return an abbreviated sha1 unique within this repository's object 
> database.
> + * The result will be at least `len` characters long, and will be NUL
> + * terminated.
> + *
> + * The non-`_to` version returns a static buffer which will be overwritten by
> + * subsequent calls.
> + *
> + * The `_to` variant writes to a buffer supplied by the caller, which must be
> + * at least `GIT_SHA1_HEXSZ + 1` bytes. The return value is the number of 
> bytes
> + * written (excluding the NUL terminator).
> + */
> +extern const char *find_unique_abbrev(const unsigned char *sha1, int len);
> +extern int find_unique_abbrev_to(char *hex, const unsigned char *sha1, int 
> len);
> +
>  extern const unsigned char null_sha1[GIT_SHA1_RAWSZ];
>  
>  static inline int hashcmp(const unsigned char *sha1, const unsigned char 
> *sha2)
> @@ -1067,6 +1081,17 @@ extern int for_each_abbrev(const char *prefix, 
> each_abbrev_fn, void *);
>  extern int get_sha1_hex(const char *hex, unsigned char *sha1);
>  extern int get_oid_hex(const char *hex, struct object_id *sha1);
>  
> +/*
> + * Convert a binary sha1 to its hex equivalent. The `_to` variant writes
> + * the NUL-terminated output to the buffer `out`, which must be at least
> + * `GIT_SHA1_HEXSZ + 1` bytes, and returns a pointer to out for convenience.
> + *
> + * The non-`_to` variant returns a static buffer, but uses a ring of 4
> + * buffers, making it safe to make multiple calls for a single statement, 
> like:
> + *
> + *   printf("%s -> %s", sha1_to_hex(one), sha1_to_hex(two));
> + */
> +extern char *sha1_to_hex_to(char *out, const unsigned char *sha1);
>  extern char *sha1_to_hex(const unsigned char *sha1); /* static buffer 
> result! */
>  extern char *oid_to_hex(const struct object_id *oid);/* same static 
> buffer as sha1_to_hex */
>  
> diff --git a/hex.c b/hex.c
> index 899b74a..004fdea 100644
> --- a/hex.c
> +++ b/hex.c
> @@ -61,12 +61,10 @@ int get_oid_hex(const char *hex, struct object_id *oid)
>   return get_sha1_hex(hex, oid->hash);
>  }
>  
> -char *sha1_to_hex(const unsigned char *sha1)
> +char *sha1_to_hex_to(char *buffer, const unsigned char *sha1)
>  {
> - static int bufno;
> - static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>   static const char hex[] = "0123456789abcdef";
> - char *buffer = hexbuffer[3 & ++bufno], *buf = buffer;
> + char *buf = buffer;
>   int i;
>  
>   for (i = 0; i < GIT_SHA1_RAWSZ; i++) {
> @@ -79,6 +77,13 @@ char *sha1_to_hex(const unsigned char *sha1)
>   return buffer;
>  }
>  
> +char *sha1_to_hex(const unsigned char *sha1)
> +{
> + static int bufno;
> + static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
> + return sha1_to_hex_to(hexbuffer[3 & ++bufno], sha1);
> +}
> +
>  char *oid_to_hex(const struct 

Re: [PATCH 08/67] add reentrant variants of sha1_to_hex and find_unique_abbrev

2015-09-15 Thread Jeff King
On Tue, Sep 15, 2015 at 05:55:55PM +0100, Ramsay Jones wrote:

> On 15/09/15 16:26, Jeff King wrote:
> > The sha1_to_hex and find_unique_abbrev functions always
> > write into reusable static buffers. There are a few problems
> > with this:
> >
> >   - future calls overwrite our result. This is especially
> > annoying with find_unique_abbrev, which does not have a
> > ring of buffers, so you cannot even printf() a result
> > that has two abbreviated sha1s.
> >
> >   - if you want to put the result into another buffer, we
> > often strcpy, which looks suspicious when auditing for
> > overflows.
> >
> > This patch introduces sha1_to_hex_to and find_unique_abbrev_to,
> > which write into a user-provided buffer. Of course this is
> > just punting on the overflow-auditing, as the buffer
> > obviously needs to be GIT_SHA1_HEXSZ + 1 bytes. But it is
> > much easier to audit, since that is a well-known size.
> 
> Hmm, I haven't read any other patches yet (including those which use these
> new '_to' functions), but I can't help feeling they should be named something
> like 'sha1_to_hex_str()' and 'find_unique_abbrev_str()' instead.  i.e. I 
> don't get
> the '_to' thing - not that I'm any good at naming things ...

I meant it as a contrast with their original. sha1_to_hex() formats into
an internal buffer and returns it. But sha1_to_hex_to() formats "to" a
buffer of your choice.

I'm happy to switch the names to something else, but I don't think
_str() conveys the difference. If I were starting from scratch, I would
probably have just called my variant sha1_to_hex(), and called the
original sha1_to_hex_unsafe(). :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 17/67] use xsnprintf for generating git object headers

2015-09-15 Thread Jeff King
We generally use 32-byte buffers to format git's "type size"
header fields. These should not generally overflow unless
you can produce some truly gigantic objects (and our types
come from our internal array of constant strings). But it is
a good idea to use xsnprintf to make sure this is the case.

Note that we slightly modify the interface to
write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
parameter as well as an "out" (on the way in it stores the
allocated size of the header, and on the way out it returns
the ultimate size of the header).

Signed-off-by: Jeff King 
---
 builtin/index-pack.c |  2 +-
 bulk-checkin.c   |  4 ++--
 fast-import.c|  4 ++--
 http-push.c  |  2 +-
 sha1_file.c  | 13 +++--
 5 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 3431de2..1ad1bde 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -441,7 +441,7 @@ static void *unpack_entry_data(unsigned long offset, 
unsigned long size,
int hdrlen;
 
if (!is_delta_type(type)) {
-   hdrlen = sprintf(hdr, "%s %lu", typename(type), size) + 1;
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), 
size) + 1;
git_SHA1_Init();
git_SHA1_Update(, hdr, hdrlen);
} else
diff --git a/bulk-checkin.c b/bulk-checkin.c
index 7cffc3a..4347f5c 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -200,8 +200,8 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
if (seekback == (off_t) -1)
return error("cannot find the current offset");
 
-   header_len = sprintf((char *)obuf, "%s %" PRIuMAX,
-typename(type), (uintmax_t)size) + 1;
+   header_len = xsnprintf((char *)obuf, sizeof(obuf), "%s %" PRIuMAX,
+  typename(type), (uintmax_t)size) + 1;
git_SHA1_Init();
git_SHA1_Update(, obuf, header_len);
 
diff --git a/fast-import.c b/fast-import.c
index 6c7c3c9..d0c2502 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1035,8 +1035,8 @@ static int store_object(
git_SHA_CTX c;
git_zstream s;
 
-   hdrlen = sprintf((char *)hdr,"%s %lu", typename(type),
-   (unsigned long)dat->len) + 1;
+   hdrlen = xsnprintf((char *)hdr, sizeof(hdr), "%s %lu",
+  typename(type), (unsigned long)dat->len) + 1;
git_SHA1_Init();
git_SHA1_Update(, hdr, hdrlen);
git_SHA1_Update(, dat->buf, dat->len);
diff --git a/http-push.c b/http-push.c
index 154e67b..1f3788f 100644
--- a/http-push.c
+++ b/http-push.c
@@ -361,7 +361,7 @@ static void start_put(struct transfer_request *request)
git_zstream stream;
 
unpacked = read_sha1_file(request->obj->sha1, , );
-   hdrlen = sprintf(hdr, "%s %lu", typename(type), len) + 1;
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(type), len) + 1;
 
/* Set it up */
git_deflate_init(, zlib_compression_level);
diff --git a/sha1_file.c b/sha1_file.c
index d295a32..f106091 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1464,7 +1464,7 @@ int check_sha1_signature(const unsigned char *sha1, void 
*map,
return -1;
 
/* Generate the header */
-   hdrlen = sprintf(hdr, "%s %lu", typename(obj_type), size) + 1;
+   hdrlen = xsnprintf(hdr, sizeof(hdr), "%s %lu", typename(obj_type), 
size) + 1;
 
/* Sha1.. */
git_SHA1_Init();
@@ -2930,7 +2930,7 @@ static void write_sha1_file_prepare(const void *buf, 
unsigned long len,
git_SHA_CTX c;
 
/* Generate the header */
-   *hdrlen = sprintf(hdr, "%s %lu", type, len)+1;
+   *hdrlen = xsnprintf(hdr, *hdrlen, "%s %lu", type, len)+1;
 
/* Sha1.. */
git_SHA1_Init();
@@ -2993,7 +2993,7 @@ int hash_sha1_file(const void *buf, unsigned long len, 
const char *type,
unsigned char *sha1)
 {
char hdr[32];
-   int hdrlen;
+   int hdrlen = sizeof(hdr);
write_sha1_file_prepare(buf, len, type, sha1, hdr, );
return 0;
 }
@@ -3139,7 +3139,7 @@ static int freshen_packed_object(const unsigned char 
*sha1)
 int write_sha1_file(const void *buf, unsigned long len, const char *type, 
unsigned char *sha1)
 {
char hdr[32];
-   int hdrlen;
+   int hdrlen = sizeof(hdr);
 
/* Normally if we have it in the pack then we do not bother writing
 * it out into .git/objects/??/?{38} file.
@@ -3157,7 +3157,8 @@ int hash_sha1_file_literally(const void *buf, unsigned 
long len, const char *typ
int hdrlen, status = 0;
 
/* type string, SP, %lu of the length plus NUL must fit this */
-   header = xmalloc(strlen(type) + 32);
+   hdrlen = strlen(type) + 32;
+   header = xmalloc(hdrlen);
write_sha1_file_prepare(buf, len, type, sha1, header, );
 
if (!(flags & HASH_WRITE_OBJECT))
@@ -3185,7 

  1   2   >