Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-19 Thread Jeff King
On Mon, Dec 19, 2016 at 05:59:06PM +0100, Johannes Schindelin wrote:

> > > > +   sprintf((char *)p, "%d", ++count);
> > > 
> > > Do we know the area pointed at p (which is inside buf) long enough
> > > not to overflow?  If the original were 9 and you incremented to get
> > > 10, you would need one extra byte.
> > 
> > Even if it is enough, I'd ask to please use xsnprintf(). In the off
> > chance that there's a programming error, we'd get a nice die("BUG")
> > instead of a buffer overflow (and it makes the code base easier to audit
> > for other overflows).
> 
> I ended up with more verbose, easier-to-read code that does not try to do
> things in-place, in favor of being slightly more wasteful with strbufs.

Great. I agree that should make the whole thing way more readable.

-Peff


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-19 Thread Johannes Schindelin
Hi Peff,

On Thu, 15 Dec 2016, Jeff King wrote:

> On Tue, Dec 13, 2016 at 04:30:01PM +0100, Johannes Schindelin wrote:
> 
> > +   else {
> > +   unsigned char head[20];
> > +   struct commit *head_commit;
> > +   const char *head_message, *body;
> > +
> > +   if (get_sha1("HEAD", head))
> > +   return error(_("need a HEAD to fixup"));
> > +   if (!(head_commit = lookup_commit_reference(head)))
> > +   return error(_("could not read HEAD"));
> > +   if (!(head_message = get_commit_buffer(head_commit, NULL)))
> > +   return error(_("could not read HEAD's commit message"));
> 
> This get_commit_buffer() may allocate a fresh buffer...
> 
> > +   body = strstr(head_message, "\n\n");
> > +   if (!body)
> > +   body = "";
> > +   else
> > +   body = skip_blank_lines(body + 2);
> > +   if (write_message(body, strlen(body),
> > + rebase_path_fixup_msg(), 0))
> > +   return error(_("cannot write '%s'"),
> > +rebase_path_fixup_msg());
> 
> ...and then this return leaks the result (the other code path hits
> unuse_commit_buffer(), and is fine).

Good point.

I found another leaked commit buffer in make_patch() and fixed it, too.

> This leak was noticed by Coverity. It has a _ton_ of false positives
> across the whole project, but it sends out a mail with new ones every
> few days, which is usually short enough that I can process it in 30
> seconds or so.

Yeah, I get these mails now, thanks to Stephan adding me in response to
some issues I introduced with the builtin difftool (and hence I did not
get the warnings when I introduced the problems with sequencer-i).

Ciao,
Dscho


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-19 Thread Johannes Schindelin
Hi Peff,

On Thu, 15 Dec 2016, Jeff King wrote:

> On Thu, Dec 15, 2016 at 10:42:53AM -0800, Junio C Hamano wrote:
> 
> > > + sprintf((char *)p, "%d", ++count);
> > 
> > Do we know the area pointed at p (which is inside buf) long enough
> > not to overflow?  If the original were 9 and you incremented to get
> > 10, you would need one extra byte.
> 
> Even if it is enough, I'd ask to please use xsnprintf(). In the off
> chance that there's a programming error, we'd get a nice die("BUG")
> instead of a buffer overflow (and it makes the code base easier to audit
> for other overflows).

I ended up with more verbose, easier-to-read code that does not try to do
things in-place, in favor of being slightly more wasteful with strbufs.

Ciao,
Dscho


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-19 Thread Johannes Schindelin
Hi Junio,

On Thu, 15 Dec 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > diff --git a/sequencer.c b/sequencer.c
> > index f6e20b142a..271c21581d 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -45,6 +45,35 @@ static GIT_PATH_FUNC(rebase_path_todo, 
> > "rebase-merge/git-rebase-todo")
> >   */
> >  static GIT_PATH_FUNC(rebase_path_done, "rebase-merge/done")
> >  /*
> > + * The commit message that is planned to be used for any changes that
> > + * need to be committed following a user interaction.
> > + */
> > +static GIT_PATH_FUNC(rebase_path_message, "rebase-merge/message")
> > +/*
> > + * The file into which is accumulated the suggested commit message for
> > + * squash/fixup commands. When the first of a series of squash/fixups
> 
> The same comment as 03/34 applies here, regarding blank line to
> separate logical unit.

Same rationale here: the path functions are one big continuous block, with
comments obviously applying to the immediately following line only.

> > +static int update_squash_messages(enum todo_command command,
> > +   struct commit *commit, struct replay_opts *opts)
> > +{
> > +   struct strbuf buf = STRBUF_INIT;
> > +   int count, res;
> > +   const char *message, *body;
> > +
> > +   if (file_exists(rebase_path_squash_msg())) {
> > +   char *p, *p2;
> > +
> > +   if (strbuf_read_file(, rebase_path_squash_msg(), 2048) <= 0)
> > +   return error(_("could not read '%s'"),
> > +   rebase_path_squash_msg());
> > +
> > +   if (buf.buf[0] != comment_line_char ||
> > +   !skip_prefix(buf.buf + 1, " This is a combination of ",
> > +(const char **)))
> > +   return error(_("unexpected 1st line of squash message:"
> > +  "\n\n\t%.*s"),
> > +(int)(strchrnul(buf.buf, '\n') - buf.buf),
> > +buf.buf);
> > +   count = strtol(p, , 10);
> > +
> > +   if (count < 1 || *p2 != ' ')
> > +   return error(_("invalid 1st line of squash message:\n"
> > +  "\n\t%.*s"),
> > +(int)(strchrnul(buf.buf, '\n') - buf.buf),
> > +buf.buf);
> > +
> > +   sprintf((char *)p, "%d", ++count);
> 
> Do we know the area pointed at p (which is inside buf) long enough
> not to overflow?

Yes, we know that: p2 points to the byte after the parsed number, and said
byte is a space (ASCII 0x20), as verified by the if() above.

> If the original were 9 and you incremented to get 10, you would need one
> extra byte.

Exactly. That extra byte (if needed) is 0x20, as verified above, and can
be overwritten.

If that extra byte (to which p2 points) is *not* overwritten, i.e. if the
new count requires the same amount of space in decimal representation as
the previous count, it is now NUL, as tested here:

> > +   if (!*p2)
> > +   *p2 = ' ';
> > +   else {
> > +   *(++p2) = 'c';
> 
> p2 points into buf; do we know this increment does not step beyond
> its end?  What is the meaning of a letter 'c' here (I do not see a
> corresponding one in the scripted update_squash_messages)?

This clause is entered only when the space needed by the previous count
was not sufficient to hold the new count, at which point we know that not
only the space after the old count was overwritten, but also the 'c' of
the "commits" string.

Therefore, this clause reinstates the 'c' and inserts the space, so that
everything is groovy again:

> > +   strbuf_insert(, p2 - buf.buf, " ", 1);
> > +   }

Having said that, I just realized that this code was only safe as long as
the squash messages were not localized.

I changed the code to imitate more closely what the shell script does. It
made the code a little more verbose, but it should work better as a
consequence, and I am pretty certain you will find it easier to verify
that it is correct.

> > +   }
> > +   else {
> 
> Style: "} else {" (I won't repeat this, as it will become too noisy).

It is not necessary to repeat this, either, as I took the first such
comment as a strong hint to look at the entire patch series and fix it
appropriately.

> 
> > +   unsigned char head[20];
> > +   struct commit *head_commit;
> > +   const char *head_message, *body;
> > +
> > +   if (get_sha1("HEAD", head))
> > +   return error(_("need a HEAD to fixup"));
> > +   if (!(head_commit = lookup_commit_reference(head)))
> > +   return error(_("could not read HEAD"));
> > +   if (!(head_message = get_commit_buffer(head_commit, NULL)))
> > +   return error(_("could not read HEAD's commit message"));
> > +
> > +   body = strstr(head_message, "\n\n");
> > +

Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Jeff King
On Thu, Dec 15, 2016 at 11:07:34AM -0800, Stefan Beller wrote:

> On Thu, Dec 15, 2016 at 11:03 AM, Jeff King  wrote:
> > wonder if it would be helpful to send that output to the list.
> 
> Sure we can try.
> 
> Another project I used to run through coverity (Gerrit), shows
> similar characteristics w.r.t. false positives, so people complained
> when I was force feeding them the niceties of static analysis.
> 
> I'll just try to set it up and see how the mailing list reacts.
> (Not sure if you can just add emails there or if the email has
> to be verified or such.)

I see you added it, but I don't see the confirmation email on the list.
I wonder if it was HTML mail and vger ate it.

-Peff


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Stefan Beller
On Thu, Dec 15, 2016 at 11:20 AM, Jeff King  wrote:
> On Thu, Dec 15, 2016 at 11:07:34AM -0800, Stefan Beller wrote:
>
>> On Thu, Dec 15, 2016 at 11:03 AM, Jeff King  wrote:
>> > wonder if it would be helpful to send that output to the list.
>>
>> Sure we can try.
>>
>> Another project I used to run through coverity (Gerrit), shows
>> similar characteristics w.r.t. false positives, so people complained
>> when I was force feeding them the niceties of static analysis.
>>
>> I'll just try to set it up and see how the mailing list reacts.
>> (Not sure if you can just add emails there or if the email has
>> to be verified or such.)
>
> I see you added it, but I don't see the confirmation email on the list.
> I wonder if it was HTML mail and vger ate it.
>
> -Peff

I think I'll setup a dummy gmail account to use for subscription
and then I'll configure that to forward all email from coverity to the list.
(the actual complaints about memleaks etc are plain text, so that
forwarding then should work)


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Stefan Beller
On Thu, Dec 15, 2016 at 11:03 AM, Jeff King  wrote:
> wonder if it would be helpful to send that output to the list.

Sure we can try.

Another project I used to run through coverity (Gerrit), shows
similar characteristics w.r.t. false positives, so people complained
when I was force feeding them the niceties of static analysis.

I'll just try to set it up and see how the mailing list reacts.
(Not sure if you can just add emails there or if the email has
to be verified or such.)

Thanks,
Stefan


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Jeff King
On Tue, Dec 13, 2016 at 04:30:01PM +0100, Johannes Schindelin wrote:

> + else {
> + unsigned char head[20];
> + struct commit *head_commit;
> + const char *head_message, *body;
> +
> + if (get_sha1("HEAD", head))
> + return error(_("need a HEAD to fixup"));
> + if (!(head_commit = lookup_commit_reference(head)))
> + return error(_("could not read HEAD"));
> + if (!(head_message = get_commit_buffer(head_commit, NULL)))
> + return error(_("could not read HEAD's commit message"));

This get_commit_buffer() may allocate a fresh buffer...

> + body = strstr(head_message, "\n\n");
> + if (!body)
> + body = "";
> + else
> + body = skip_blank_lines(body + 2);
> + if (write_message(body, strlen(body),
> +   rebase_path_fixup_msg(), 0))
> + return error(_("cannot write '%s'"),
> +  rebase_path_fixup_msg());

...and then this return leaks the result (the other code path hits
unuse_commit_buffer(), and is fine).

This leak was noticed by Coverity. It has a _ton_ of false positives
across the whole project, but it sends out a mail with new ones every
few days, which is usually short enough that I can process it in 30
seconds or so.

I _think_ that email just goes to me and Stefan right now. You can add
yourself at:

  https://scan.coverity.com/projects/git?tab=project_settings

if you already have admin access to the project (which I think you
(Dscho) do).  I wonder if it would be helpful to send that output to the
list.

-Peff


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Jeff King
On Thu, Dec 15, 2016 at 10:42:53AM -0800, Junio C Hamano wrote:

> > +   sprintf((char *)p, "%d", ++count);
> 
> Do we know the area pointed at p (which is inside buf) long enough
> not to overflow?  If the original were 9 and you incremented to get
> 10, you would need one extra byte.

Even if it is enough, I'd ask to please use xsnprintf(). In the off
chance that there's a programming error, we'd get a nice die("BUG")
instead of a buffer overflow (and it makes the code base easier to audit
for other overflows).

-Peff


Re: [PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-15 Thread Junio C Hamano
Johannes Schindelin  writes:

> diff --git a/sequencer.c b/sequencer.c
> index f6e20b142a..271c21581d 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -45,6 +45,35 @@ static GIT_PATH_FUNC(rebase_path_todo, 
> "rebase-merge/git-rebase-todo")
>   */
>  static GIT_PATH_FUNC(rebase_path_done, "rebase-merge/done")
>  /*
> + * The commit message that is planned to be used for any changes that
> + * need to be committed following a user interaction.
> + */
> +static GIT_PATH_FUNC(rebase_path_message, "rebase-merge/message")
> +/*
> + * The file into which is accumulated the suggested commit message for
> + * squash/fixup commands. When the first of a series of squash/fixups

The same comment as 03/34 applies here, regarding blank line to
separate logical unit.

> +static int update_squash_messages(enum todo_command command,
> + struct commit *commit, struct replay_opts *opts)
> +{
> + struct strbuf buf = STRBUF_INIT;
> + int count, res;
> + const char *message, *body;
> +
> + if (file_exists(rebase_path_squash_msg())) {
> + char *p, *p2;
> +
> + if (strbuf_read_file(, rebase_path_squash_msg(), 2048) <= 0)
> + return error(_("could not read '%s'"),
> + rebase_path_squash_msg());
> +
> + if (buf.buf[0] != comment_line_char ||
> + !skip_prefix(buf.buf + 1, " This is a combination of ",
> +  (const char **)))
> + return error(_("unexpected 1st line of squash message:"
> +"\n\n\t%.*s"),
> +  (int)(strchrnul(buf.buf, '\n') - buf.buf),
> +  buf.buf);
> + count = strtol(p, , 10);
> +
> + if (count < 1 || *p2 != ' ')
> + return error(_("invalid 1st line of squash message:\n"
> +"\n\t%.*s"),
> +  (int)(strchrnul(buf.buf, '\n') - buf.buf),
> +  buf.buf);
> +
> + sprintf((char *)p, "%d", ++count);

Do we know the area pointed at p (which is inside buf) long enough
not to overflow?  If the original were 9 and you incremented to get
10, you would need one extra byte.

> + if (!*p2)
> + *p2 = ' ';
> + else {
> + *(++p2) = 'c';

p2 points into buf; do we know this increment does not step beyond
its end?  What is the meaning of a letter 'c' here (I do not see a
corresponding one in the scripted update_squash_messages)?

> + strbuf_insert(, p2 - buf.buf, " ", 1);
> + }
> + }
> + else {

Style: "} else {" (I won't repeat this, as it will become too noisy).

> + unsigned char head[20];
> + struct commit *head_commit;
> + const char *head_message, *body;
> +
> + if (get_sha1("HEAD", head))
> + return error(_("need a HEAD to fixup"));
> + if (!(head_commit = lookup_commit_reference(head)))
> + return error(_("could not read HEAD"));
> + if (!(head_message = get_commit_buffer(head_commit, NULL)))
> + return error(_("could not read HEAD's commit message"));
> +
> + body = strstr(head_message, "\n\n");
> + if (!body)
> + body = "";
> + else
> + body = skip_blank_lines(body + 2);

I think I saw you used a helper function find_commit_subject() to do
the above in an earlier patch for "edit" in this series.  Would it
make this part (and another one for "commit" we have after this
if/else) shorter?

>  static int do_pick_commit(enum todo_command command, struct commit *commit,
> - struct replay_opts *opts)
> + struct replay_opts *opts, int final_fixup)
>  {
> + int edit = opts->edit, cleanup_commit_message = 0;
> + const char *msg_file = edit ? NULL : git_path_merge_msg();
>   unsigned char head[20];
>   struct commit *base, *next, *parent;
>   const char *base_label, *next_label;
>   struct commit_message msg = { NULL, NULL, NULL, NULL };
>   struct strbuf msgbuf = STRBUF_INIT;
> - int res, unborn = 0, allow;
> + int res, unborn = 0, amend = 0, allow;
>  
>   if (opts->no_commit) {
>   /*
> @@ -749,7 +885,7 @@ static int do_pick_commit(enum todo_command command, 
> struct commit *commit,
>   else
>   parent = commit->parents->item;
>  
> - if (opts->allow_ff &&
> + if (opts->allow_ff && !is_fixup(command) &&
>   ((parent && !hashcmp(parent->object.oid.hash, head)) ||
>(!parent && unborn)))
>   return fast_forward_to(commit->object.oid.hash, head, unborn, 
> opts);
> @@ -813,6 +949,28 @@ static int do_pick_commit(enum todo_command command, 
> struct 

[PATCH v2 07/34] sequencer (rebase -i): add support for the 'fixup' and 'squash' commands

2016-12-13 Thread Johannes Schindelin
This is a huge patch, and at the same time a huge step forward to
execute the performance-critical parts of the interactive rebase in a
builtin command.

Since 'fixup' and 'squash' are not only similar, but also need to know
about each other (we want to reduce a series of fixups/squashes into a
single, final commit message edit, from the user's point of view), we
really have to implement them both at the same time.

Most of the actual work is done by the existing code path that already
handles the "pick" and the "edit" commands; We added support for other
features (e.g. to amend the commit message) in the patches leading up to
this one, yet there are still quite a few bits in this patch that simply
would not make sense as individual patches (such as: determining whether
there was anything to "fix up" in the "todo" script, etc).

In theory, it would be possible to reuse the fast-forward code path also
for the fixup and the squash code paths, but in practice this would make
the code less readable. The end result cannot be fast-forwarded anyway,
therefore let's just extend the cherry-picking code path for now.

Since the sequencer parses the entire `git-rebase-todo` script in one go,
fixup or squash commands without a preceding pick can be reported early
(in git-rebase--interactive, we could only report such errors just before
executing the fixup/squash).

Signed-off-by: Johannes Schindelin 
---
 sequencer.c | 232 +---
 1 file changed, 222 insertions(+), 10 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index f6e20b142a..271c21581d 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -45,6 +45,35 @@ static GIT_PATH_FUNC(rebase_path_todo, 
"rebase-merge/git-rebase-todo")
  */
 static GIT_PATH_FUNC(rebase_path_done, "rebase-merge/done")
 /*
+ * The commit message that is planned to be used for any changes that
+ * need to be committed following a user interaction.
+ */
+static GIT_PATH_FUNC(rebase_path_message, "rebase-merge/message")
+/*
+ * The file into which is accumulated the suggested commit message for
+ * squash/fixup commands. When the first of a series of squash/fixups
+ * is seen, the file is created and the commit message from the
+ * previous commit and from the first squash/fixup commit are written
+ * to it. The commit message for each subsequent squash/fixup commit
+ * is appended to the file as it is processed.
+ *
+ * The first line of the file is of the form
+ * # This is a combination of $count commits.
+ * where $count is the number of commits whose messages have been
+ * written to the file so far (including the initial "pick" commit).
+ * Each time that a commit message is processed, this line is read and
+ * updated. It is deleted just before the combined commit is made.
+ */
+static GIT_PATH_FUNC(rebase_path_squash_msg, "rebase-merge/message-squash")
+/*
+ * If the current series of squash/fixups has not yet included a squash
+ * command, then this file exists and holds the commit message of the
+ * original "pick" commit.  (If the series ends without a "squash"
+ * command, then this can be used as the commit message of the combined
+ * commit without opening the editor.)
+ */
+static GIT_PATH_FUNC(rebase_path_fixup_msg, "rebase-merge/message-fixup")
+/*
  * A script to set the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, and
  * GIT_AUTHOR_DATE that will be used for the commit that is currently
  * being rebased.
@@ -673,6 +702,8 @@ enum todo_command {
TODO_PICK = 0,
TODO_REVERT,
TODO_EDIT,
+   TODO_FIXUP,
+   TODO_SQUASH,
/* commands that do something else than handling a single commit */
TODO_EXEC,
/* commands that do nothing but are counted for reporting progress */
@@ -683,6 +714,8 @@ static const char *todo_command_strings[] = {
"pick",
"revert",
"edit",
+   "fixup",
+   "squash",
"exec",
"noop"
 };
@@ -694,16 +727,119 @@ static const char *command_to_string(const enum 
todo_command command)
die("Unknown command: %d", command);
 }
 
+static int is_fixup(enum todo_command command)
+{
+   return command == TODO_FIXUP || command == TODO_SQUASH;
+}
+
+static int update_squash_messages(enum todo_command command,
+   struct commit *commit, struct replay_opts *opts)
+{
+   struct strbuf buf = STRBUF_INIT;
+   int count, res;
+   const char *message, *body;
+
+   if (file_exists(rebase_path_squash_msg())) {
+   char *p, *p2;
+
+   if (strbuf_read_file(, rebase_path_squash_msg(), 2048) <= 0)
+   return error(_("could not read '%s'"),
+   rebase_path_squash_msg());
+
+   if (buf.buf[0] != comment_line_char ||
+   !skip_prefix(buf.buf + 1, " This is a combination of ",
+(const char **)))
+   return error(_("unexpected 1st