Re: [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper

2016-09-04 Thread Johannes Schindelin
Hi Josh,

On Sat, 3 Sep 2016, Josh Triplett wrote:

> On Fri, Sep 02, 2016 at 06:23:42PM +0200, Johannes Schindelin wrote:
> > Let's reimplement this with linear complexity (using a hash map to
> > match the commits' subject lines) for the common case; Sadly, the
> > fixup/squash feature's design neglected performance considerations,
> > allowing arbitrary prefixes (read: `fixup! hell` will match the
> > commit subject `hello world`), which means that we are stuck with
> > quadratic performance in the worst case.
> 
> If the performance of that case matters enough, we can do better than
> quadratic complexity: maintain a trie of the subjects, allowing prefix
> lookups.  (Or hash all the prefixes, which you can do in linear time on
> a string: hash next char, save hash, repeat.)  However, that would
> pessimize the normal case of either a complete subject or a sha1, due to
> the extra time taken constructing the data structure.  Probably not
> worth it, if you assume that most "fixup!" subjects come from `git
> commit --fixup` or similar automated means.

Right. My reaction to finding our that subject prefixes were allowed, too,
was "WTF?". And then: who uses that? And then: that's gonna hurt
performance! And then: but I can optimize for the common case!

The point is: only when people specify a strict prefix will the
performance be hurt. Meaning that the performance is linear in the most
common cases.

That is good enough for me, and probably good enough for the vast majority
of the users. If it ain't broke, don't fix it.

In the case that somebody needs strict prefixes to be handled more
efficiently, which I do not expect, the "hash all prefixes" approach may
work well, but it would slow down the common case, so I'd suggest doing
that only as a fallback (i.e. if a fixup! could not be matched up, fall
back to hashing the prefixes, re-hashing the commit subjects that were
already seen so far). If this needs to be implemented at all, I would also
suggest that the person in need of that improvement also needs to take
charge of this: I will not spend more time thinking about this.

Ciao,
Johannes


Re: [PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper

2016-09-03 Thread Josh Triplett
On Fri, Sep 02, 2016 at 06:23:42PM +0200, Johannes Schindelin wrote:
> Let's reimplement this with linear complexity (using a hash map to
> match the commits' subject lines) for the common case; Sadly, the
> fixup/squash feature's design neglected performance considerations,
> allowing arbitrary prefixes (read: `fixup! hell` will match the
> commit subject `hello world`), which means that we are stuck with
> quadratic performance in the worst case.

If the performance of that case matters enough, we can do better than
quadratic complexity: maintain a trie of the subjects, allowing prefix
lookups.  (Or hash all the prefixes, which you can do in linear time on
a string: hash next char, save hash, repeat.)  However, that would
pessimize the normal case of either a complete subject or a sha1, due to
the extra time taken constructing the data structure.  Probably not
worth it, if you assume that most "fixup!" subjects come from `git
commit --fixup` or similar automated means.


[PATCH 9/9] rebase -i: rearrange fixup/squash lines using the rebase--helper

2016-09-02 Thread Johannes Schindelin
This operation has quadratic complexity, which is especially painful
on Windows, where shell scripts are *already* slow (mainly due to the
overhead of the POSIX emulation layer).

Let's reimplement this with linear complexity (using a hash map to
match the commits' subject lines) for the common case; Sadly, the
fixup/squash feature's design neglected performance considerations,
allowing arbitrary prefixes (read: `fixup! hell` will match the
commit subject `hello world`), which means that we are stuck with
quadratic performance in the worst case.

The reimplemented logic also happens to fix a bug where commented-out
lines (representing empty patches) were dropped by the previous code.

Signed-off-by: Johannes Schindelin 
---
 builtin/rebase--helper.c |   6 +-
 git-rebase--interactive.sh   |  90 +---
 sequencer.c  | 197 +++
 sequencer.h  |   1 +
 t/t3415-rebase-autosquash.sh |   2 +-
 5 files changed, 205 insertions(+), 91 deletions(-)

diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c
index de3ccd9..e6591f0 100644
--- a/builtin/rebase--helper.c
+++ b/builtin/rebase--helper.c
@@ -14,7 +14,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
int keep_empty = 0;
enum {
CONTINUE = 1, ABORT, MAKE_SCRIPT, SHORTEN_SHA1S, EXPAND_SHA1S,
-   CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS
+   CHECK_TODO_LIST, SKIP_UNNECESSARY_PICKS, REARRANGE_SQUASH
} command = 0;
struct option options[] = {
OPT_BOOL(0, "ff", _ff, N_("allow fast-forward")),
@@ -33,6 +33,8 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
N_("check the todo list"), CHECK_TODO_LIST),
OPT_CMDMODE(0, "skip-unnecessary-picks", ,
N_("skip unnecessary picks"), SKIP_UNNECESSARY_PICKS),
+   OPT_CMDMODE(0, "rearrange-squash", ,
+   N_("rearrange fixup/squash lines"), REARRANGE_SQUASH),
OPT_END()
};
 
@@ -59,5 +61,7 @@ int cmd_rebase__helper(int argc, const char **argv, const 
char *prefix)
return !!check_todo_list();
if (command == SKIP_UNNECESSARY_PICKS && argc == 1)
return !!skip_unnecessary_picks();
+   if (command == REARRANGE_SQUASH && argc == 1)
+   return !!rearrange_squash();
usage_with_options(builtin_rebase_helper_usage, options);
 }
diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index a34ebdc..68a6e6a 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -734,94 +734,6 @@ collapse_todo_ids() {
git rebase--helper --shorten-sha1s
 }
 
-# Rearrange the todo list that has both "pick sha1 msg" and
-# "pick sha1 fixup!/squash! msg" appears in it so that the latter
-# comes immediately after the former, and change "pick" to
-# "fixup"/"squash".
-#
-# Note that if the config has specified a custom instruction format
-# each log message will be re-retrieved in order to normalize the
-# autosquash arrangement
-rearrange_squash () {
-   format=$(git config --get rebase.instructionFormat)
-   # extract fixup!/squash! lines and resolve any referenced sha1's
-   while read -r pick sha1 message
-   do
-   test -z "${format}" || message=$(git log -n 1 --format="%s" 
${sha1})
-   case "$message" in
-   "squash! "*|"fixup! "*)
-   action="${message%%!*}"
-   rest=$message
-   prefix=
-   # skip all squash! or fixup! (but save for later)
-   while :
-   do
-   case "$rest" in
-   "squash! "*|"fixup! "*)
-   prefix="$prefix${rest%%!*},"
-   rest="${rest#*! }"
-   ;;
-   *)
-   break
-   ;;
-   esac
-   done
-   printf '%s %s %s %s\n' "$sha1" "$action" "$prefix" 
"$rest"
-   # if it's a single word, try to resolve to a full sha1 
and
-   # emit a second copy. This allows us to match on both 
message
-   # and on sha1 prefix
-   if test "${rest#* }" = "$rest"; then
-   fullsha="$(git rev-parse -q --verify "$rest" 
2>/dev/null)"
-   if test -n "$fullsha"; then
-   # prefix the action to uniquely 
identify this line as
-   # intended for full sha1 match
-