AW: [PATCH 1/2] stash--helper: implement "git stash--helper"

Matthias Aßhauer Fri, 29 Jan 2016 11:33:08 -0800

> Hmph, why not have everything inside builtin/stash--helper.c?  I do not quite 
> see a point of having the other two "library-ish" looking files.
> 
> Also I personally feel that it would be easier to review when these two 
> patches are squashed into one.  I had to go back and forth while reading the 
> "non-patch" C function to see what logic from the scripted version it is 
> trying to replace.


I can certainly do that.

> This is meant to replace this sequence:
>
>       git read-tree --index-output="$TMPindex" -m $i_tree &&
>       GIT_INDEX_FILE="$TMPindex" &&
>       export GIT_INDEX_FILE &&
>       git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
>       git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
>       git write-tree &&
>       rm -f "$TMPindex"
>
> And outside of this section of the script, $TMPindex is never looked at after 
> this part finishes (which is obvious as the last thing the section does is to 
> remove it).  As you are rewriting this whole section in C, you should wonder 
> if you can do it without using a temporary file in the filesystem at all.

I assumed the path to $TMPindex was still available in GIT_INDEX_FILE after 
this, as it gets exported, but I guess the surrounding $() imply a subshell.

> Honestly, I had high hopes after seeing the "we are rewriting it in C" but I 
> am not enthused after seeing this.  I was hoping that the rewritten version 
> would do this all in-core, by calling these functions that we already have:

These functions might be obvious to you, but I'm new to git's source code, so I 
worked with the things I found documented and those that were brought to my 
attention by Johannes Schindelin. The run-command API is documented and seemed 
to be the "official" method of calling any git commands from within native git  
code. On the other hand, the documentation for the in-core index API boils down 
to "TODO: document this". This lead me to believe I did this the intended way 
and just calling random functions that sound  similar to the  original command 
may be frowned upon at best.

> A C rewrite that works all in-core does not even need to write out a 
> temporary; it can just read the current index and do various things up to 
> writing the contents of the in-core index as a tree, and the result would be 
> correct as long as you do not forget *NOT* to write the in-core index out to 
> $GIT_INDEX_FILE.

I'll be working on a v2 that incorporates the feedback from you, Thomas 
Gummerer  and Stefan Beller then. Further feedback is of course welcome.

-----Ursprüngliche Nachricht-----
Von: Junio C Hamano [mailto:gits...@pobox.com] 
Gesendet: Freitag, 29. Januar 2016 00:06
An: Matthias Asshauer <mha1...@live.de>
Cc: git@vger.kernel.org
Betreff: Re: [PATCH 1/2] stash--helper: implement "git stash--helper"

Matthias Asshauer <mha1...@live.de> writes:

> From: Matthias Aßhauer <mha1...@live.de>
>
> This patch implements a new "git stash--helper" builtin plumbing 
> command that will be used to migrate "git-stash.sh" to C.
>
> We start by implementing only the "--non-patch" option that will 
> handle the core part of the non-patch stashing.
>
> Signed-off-by: Matthias Aßhauer <mha1...@live.de>
> ---
>  Makefile                |  2 ++
>  builtin.h               |  1 +
>  builtin/stash--helper.c | 13 ++++++++++
>  git.c                   |  1 +
>  stash.c                 | 65 
> +++++++++++++++++++++++++++++++++++++++++++++++++
>  stash.h                 | 11 +++++++++

Hmph, why not have everything inside builtin/stash--helper.c?  I do not quite 
see a point of having the other two "library-ish" looking files.

Also I personally feel that it would be easier to review when these two patches 
are squashed into one.  I had to go back and forth while reading the 
"non-patch" C function to see what logic from the scripted version it is trying 
to replace.

> diff --git a/builtin/stash--helper.c b/builtin/stash--helper.c new 
> file mode 100644 index 0000000..542e782
> --- /dev/null
> +++ b/builtin/stash--helper.c
> @@ -0,0 +1,13 @@
> +#include "../stash.h"
> +#include <string.h>
> +
> +static const char builtin_stash__helper_usage[] = {
> +     "Usage: git stash--helper --non-patch <tmp_indexfile> <i_tree>"
> +};
> +
> +int cmd_stash__helper(int argc, const char **argv, const char 
> +*prefix) {
> +     if (argc == 4 && !strcmp("--non-patch", argv[1]))
> +             return stash_non_patch(argv[2], argv[3], prefix);
> +     usage(builtin_stash__helper_usage);
> +}

This is meant to replace this sequence:

        git read-tree --index-output="$TMPindex" -m $i_tree &&
        GIT_INDEX_FILE="$TMPindex" &&
        export GIT_INDEX_FILE &&
        git diff --name-only -z HEAD -- >"$TMP-stagenames" &&
        git update-index -z --add --remove --stdin <"$TMP-stagenames" &&
        git write-tree &&
        rm -f "$TMPindex"

And outside of this section of the script, $TMPindex is never looked at after 
this part finishes (which is obvious as the last thing the section does is to 
remove it).  As you are rewriting this whole section in C, you should wonder if 
you can do it without using a temporary file in the filesystem at all.

> diff --git a/stash.c b/stash.c
> new file mode 100644
> index 0000000..c3d6e67
> --- /dev/null
> +++ b/stash.c
> @@ -0,0 +1,65 @@
> +#include "stash.h"
> +#include "strbuf.h"
> +
> +static int prepare_update_index_argv(struct argv_array *args,
> +     struct strbuf *buf)
> +{
> +     struct strbuf **bufs, **b;
> +
> +     bufs = strbuf_split(buf, '\0');
> +     for (b = bufs; *b; b++)
> +             argv_array_pushf(args, "%s", (*b)->buf);
> +     argv_array_push(args, "--");
> +     strbuf_list_free(bufs);
> +
> +     return 0;
> +}
> +
> +int stash_non_patch(const char *tmp_indexfile, const char *i_tree,
> +     const char *prefix)
> +{
> +     int result;
> +     struct child_process read_tree = CHILD_PROCESS_INIT;
> +     struct child_process diff = CHILD_PROCESS_INIT;
> +     struct child_process update_index = CHILD_PROCESS_INIT;
> +     struct child_process write_tree = CHILD_PROCESS_INIT;
> +     struct strbuf buf = STRBUF_INIT;
> +
> +     argv_array_push(&read_tree.args, "read-tree");
> +     argv_array_pushf(&read_tree.args, "--index-output=%s", tmp_indexfile);
> +     argv_array_pushl(&read_tree.args, "-m", i_tree, NULL);
> +
> +     argv_array_pushl(&diff.args, "diff", "--name-only", "-z", "HEAD", "--",
> +             NULL);
> +
> +     argv_array_pushl(&update_index.args, "update-index", "--add",
> +             "--remove", NULL);
> +
> +     argv_array_push(&write_tree.args, "write-tree");

Honestly, I had high hopes after seeing the "we are rewriting it in C" but I am 
not enthused after seeing this.  I was hoping that the rewritten version would 
do this all in-core, by calling these functions that we already have:

 * read_index() to read the current index into the_index;

 * unpack_trees() to overlay the contents of i_tree to the contents
   of the index;

 * run_diff_index() to make the comparison between the result of the
   above and HEAD to collect the paths that are different (you'd use
   DIFF_FORMAT_CALLBACK mechanism to collect paths---see wt-status.c
   for existing code that already does this for hints);

 * add_to_index() to add or remove paths you found in the previous
   step to the in-core index;

 * write_cache_as_tree() to write out the resulting index of the
   above sequence of calls to a new tree object;

 * sha1_to_hex() to convert that resulting tree object name to hex
   format;

 * puts() to output the result.


Actually, because i_tree is coming from $(git write-tree) that was done earlier 
on the current index, the unpack_trees() step should not even be necessary.

The first three lines of the scripted version:

        git read-tree --index-output="$TMPindex" -m $i_tree &&
        GIT_INDEX_FILE="$TMPindex" &&
        export GIT_INDEX_FILE &&

are creating a new file $TMPindex on the filesystem while preserving the cached 
stat info when it can, which is a glorified version of:

        cp -p $GIT_INDEX_FILE $TMPindex

In fact, versions of "git stash" before 3ba2e865 (stash: copy the index using 
--index-output instead of cp -p, 2011-03-16) simply did a "cp -p".

A C rewrite that works all in-core does not even need to write out a temporary; 
it can just read the current index and do various things up to writing the 
contents of the in-core index as a tree, and the result would be correct as 
long as you do not forget *NOT* to write the in-core index out to 
$GIT_INDEX_FILE.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

AW: [PATCH 1/2] stash--helper: implement "git stash--helper"

Reply via email to