Re: [PATCH/RFC V2] stash: implement builtin stash

2017-04-30 Thread Johannes Schindelin
Hi Peff & Joel,

On Tue, 21 Mar 2017, Jeff King wrote:

> On Mon, Mar 13, 2017 at 03:23:18PM -0700, Joel Teichroeb wrote:
> 
> > I've been working on rewriting git stash as a c builtin and I have all
> > but three tests passing. I'm having a bit of trouble fixing them, as
> > well as a few other issues, so I'd really appreciate some help. Don't
> > bother commenting on the small details yet as I still need to go
> > though the code to make sure it matches the code style guidelines.
> 
> Be careful that this is a bit of a moving target. There's been some
> feature work and some bug-fixes in git-stash.sh the past few weeks.

Maybe more encouraging words would have been in order... converting
scripts to builtins is tedious, hard, and unrewarding. And it is also
highly important, despite the obstacles that are partially technical,
partially social.

So let me try:

Joel, this is awesome work you have done there! It looks already pretty
good, and the fact that it almost passes the test suite is no small feat.

I worked a little bit on this patch and pushed the non-squashed version
here:

https://github.com/git/git/compare/master...dscho:stash-builtin

Only now, after pushing this branch, did I realize that you have your own
git.git fork with the patch available right there. Oh well... ;-)

Essentially, I changed the patch in the following aspects:

- instead of deleting git-stash.sh, move it into contrib/examples/ (as is
  the recommended way, although I get doubts about that, given that some
  of that code may not even work anymore, let alone follows our style)

- I rebased to the current `master` of git.git

- I forward-ported three changes that git-stash.sh saw in the meantime

- I briefly looked at the style and changed just a couple of instances,
  but then dropped the ball as I agree with your statement that it is not
  yet time to "clean the patch up" according to the coding conventions of
  git.git

- most importantly, I inserted a `read_cache_preload()` before any
  invocation of refresh_index(). This is needed because an empty in-memory
  index would be refreshed, which is not what we want (don't feel bad
  about it, I feel into that trap uncountable times during my work on
  accelerating rebase -i)

It is that last change that fixes t3903.69 for me.

I hope that it is not too hard for you to read my patches; to document
what exactly I did, and in which order, I used what I call "merging
rebases", i.e. rebases that start by a fake merge of the previous history
("fake" as in: it only merges the commit history, the tree is actually
left untouched, in essence starting from a clean slate again).

> > git stash uses the GIT_INDEX_FILE environment variable in order to not
> > trash the main index. I ended up doing things like this:
> > 
> > discard_cache();
> > ret = read_cache_from(index_path);
> > write_index_as_tree(orig_tree.hash, _index, index_path, 0, NULL);
> > discard_cache();
> > ret = read_cache_from(index_path);
> > 
> > in order to have an empty cache. Could someone take a look at my uses
> > of the index and point out better ways to do it?
> 
> I suspect in the long run that you could pull this off without writing
> the intermediate index to disk at all. You can store multiple indices if
> you need to (read_cache_from is just a wrapper to operate on the_index).
> But while you're executing sub-programs, you're still probably going to
> need to do a lot of re-reading of the index.

I agree with that, but that is definitely something to leave for later. At
this stage, we still need to work on getting the code correct, and
fiddling with in-memory index (or for that matter, with piping output of
one subprocess to another process directly, instead of using an
intermediate strbuf) would constitute premature optimization.

> Two things that I think might help break up the work:
> 
>   1. Rather than convert stash all at once, you can implement a "stash
>  helper" in C that does individual sub-commands (or even smaller
>  chunks), and call that from git-stash.sh. Eventually it
>  git-stash.sh will just be a skeleton, and you can move that over to
>  C and call the functions directly.
> 
>   2. You may want to start purely as a C wrapper that calls the
>  sub-programs, which communicate with each other via GIT_INDEX_FILE,
>  etc. Then you don't need to worry about manipulating the index
>  inside the C program at first, and can focus on all the other
>  boilerplate.
> 
>  Then piece by piece, you can replace sub-program calls with C
>  calls. But you know you'll be working on top of a functional base.

The patch actually already goes the second route. It does convert a couple
of `git reset-tree()` calls into internal API calls, but other Git commands
are called via the run-command API.

(Interesting side note: a couple of Git commands are called directly via
cmd_(), which I did not think would work, but it does, apparently, at
least in 

Re: [PATCH/RFC V2] stash: implement builtin stash

2017-03-20 Thread Jeff King
On Mon, Mar 13, 2017 at 03:23:18PM -0700, Joel Teichroeb wrote:

> I've been working on rewriting git stash as a c builtin and I have all
> but three tests passing. I'm having a bit of trouble fixing them, as
> well as a few other issues, so I'd really appreciate some help. Don't
> bother commenting on the small details yet as I still need to go
> though the code to make sure it matches the code style guidelines.

Be careful that this is a bit of a moving target. There's been some
feature work and some bug-fixes in git-stash.sh the past few weeks.

> git stash uses the GIT_INDEX_FILE environment variable in order to not
> trash the main index. I ended up doing things like this:
> 
> discard_cache();
> ret = read_cache_from(index_path);
> write_index_as_tree(orig_tree.hash, _index, index_path, 0, NULL);
> discard_cache();
> ret = read_cache_from(index_path);
> 
> in order to have an empty cache. Could someone take a look at my uses
> of the index and point out better ways to do it?

I suspect in the long run that you could pull this off without writing
the intermediate index to disk at all. You can store multiple indices if
you need to (read_cache_from is just a wrapper to operate on the_index).
But while you're executing sub-programs, you're still probably going to
need to do a lot of re-reading of the index.

Two things that I think might help break up the work:

  1. Rather than convert stash all at once, you can implement a "stash
 helper" in C that does individual sub-commands (or even smaller
 chunks), and call that from git-stash.sh. Eventually it
 git-stash.sh will just be a skeleton, and you can move that over to
 C and call the functions directly.

  2. You may want to start purely as a C wrapper that calls the
 sub-programs, which communicate with each other via GIT_INDEX_FILE,
 etc. Then you don't need to worry about manipulating the index
 inside the C program at first, and can focus on all the other
 boilerplate.

 Then piece by piece, you can replace sub-program calls with C
 calls. But you know you'll be working on top of a functional base.

I don't know if that's helpful or not.

-Peff


[PATCH/RFC V2] stash: implement builtin stash

2017-03-13 Thread Joel Teichroeb
Implement all git stash functionality as a builtin command

Signed-off-by: Joel Teichroeb 
---
I've been working on rewriting git stash as a c builtin and I have all
but three tests passing. I'm having a bit of trouble fixing them, as
well as a few other issues, so I'd really appreciate some help. Don't
bother commenting on the small details yet as I still need to go
though the code to make sure it matches the code style guidelines.

Test Summary Report
---
t7601-merge-pull-config.sh   (Wstat: 256 Tests: 14
Failed: 2)
  Failed tests:  11-12
  Non-zero exit status: 1
t3903-stash.sh   (Wstat: 256 Tests: 74
Failed: 1)
  Failed test:  69
  Non-zero exit status: 1

It looks to be the same issue for both of these cases where
merge-recursive reports:
error: Your local changes to the following files would be overwritten by merge:
file
other-file

which doesn't make sense as those files didn't exist before the merge.
Furthermore if I take the existing git stash implementation and have
it stop before running the merge-recursive command and then run it on
the commandline manually, I get the same issue. I've tried setting all
the same environment variables that the existing git stash
implementation does, but it doesn't help. It seems like there could be
a bug in merge-recursive, but I'm not sure how to track it down.

git stash uses the GIT_INDEX_FILE environment variable in order to not
trash the main index. I ended up doing things like this:

discard_cache();
ret = read_cache_from(index_path);
write_index_as_tree(orig_tree.hash, _index, index_path, 0, NULL);
discard_cache();
ret = read_cache_from(index_path);

in order to have an empty cache. Could someone take a look at my uses
of the index and point out better ways to do it?

My main goal right now is replace as many of the cmd_* calls as
possible.

changes in v2:
* Better follow coding guidelines
* Improve error handling

 Makefile  |2 +-
 builtin.h |1 +
 builtin/stash.c   | 1266 +
 git-stash.sh  |  731 ---
 git.c |1 +
 merge-recursive.c |5 +-
 6 files changed, 1271 insertions(+), 735 deletions(-)
 create mode 100644 builtin/stash.c
 delete mode 100755 git-stash.sh

diff --git a/Makefile b/Makefile
index ba524f3a7..73915a2e0 100644
--- a/Makefile
+++ b/Makefile
@@ -516,7 +516,6 @@ SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-rebase.sh
 SCRIPT_SH += git-remote-testgit.sh
 SCRIPT_SH += git-request-pull.sh
-SCRIPT_SH += git-stash.sh
 SCRIPT_SH += git-submodule.sh
 SCRIPT_SH += git-web--browse.sh
 
@@ -949,6 +948,7 @@ BUILTIN_OBJS += builtin/send-pack.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
+BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
diff --git a/builtin.h b/builtin.h
index 9e4a89816..16e8a437d 100644
--- a/builtin.h
+++ b/builtin.h
@@ -121,6 +121,7 @@ extern int cmd_shortlog(int argc, const char **argv, const 
char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
 extern int cmd_status(int argc, const char **argv, const char *prefix);
+extern int cmd_stash(int argc, const char **argv, const char *prefix);
 extern int cmd_stripspace(int argc, const char **argv, const char *prefix);
 extern int cmd_submodule__helper(int argc, const char **argv, const char 
*prefix);
 extern int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
diff --git a/builtin/stash.c b/builtin/stash.c
new file mode 100644
index 0..785fc18d5
--- /dev/null
+++ b/builtin/stash.c
@@ -0,0 +1,1266 @@
+#include "builtin.h"
+#include "parse-options.h"
+#include "refs.h"
+#include "tree.h"
+#include "lockfile.h"
+#include "object.h"
+#include "tree-walk.h"
+#include "cache-tree.h"
+#include "unpack-trees.h"
+#include "diff.h"
+#include "revision.h"
+#include "commit.h"
+#include "diffcore.h"
+#include "merge-recursive.h"
+#include "argv-array.h"
+#include "run-command.h"
+
+static const char * const git_stash_usage[] = {
+   N_("git stash list []"),
+   N_("git stash show []"),
+   N_("git stash drop [-q|--quiet] []"),
+   N_("git stash ( pop | apply ) [--index] [-q|--quiet] []"),
+   N_("git stash branch  []"),
+   N_("git stash [save [--patch] [-k|--[no-]keep-index] [-q|--quiet]"),
+   N_("[-u|--include-untracked] [-a|--all] []]"),
+   N_("git stash clear"),
+   N_("git stash create []"),
+   N_("git stash store [-m|--message ] [-q|--quiet] "),
+   NULL
+};
+
+static const char * const git_stash_list_usage[] = {
+   N_("git stash list []"),
+   NULL
+};
+
+static const