Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-10 Thread Junio C Hamano
Jacob Keller  writes:

>>> + diff = fdopen(cp.out, "r");
>>> +
>>> + c = fgetc(diff);
>>> + while (c != EOF) {
>>> + fputc(c, f);
>>> + c = fgetc(diff);
>>> + }
>>> +
>>> + fclose(diff);
>>> + finish_command();
>>
>> I do not think you need to do this.  If "f" is already a FILE * to
>> which the output must go, then instead of reading from the pipe and
>> writing it, you can just let the "diff" spit out its output to the
>> same file descriptor, by doing something like:
>>
>> fflush(f);
>> cp.out = dup(fileno(f));
>> ... other setup ...
>> run_command();
>>
>> no?  I do not offhand recall run_command() closes cp.out after done,
>> and if it doesn't you may have to close it yourself after the above
>> sequence.
>
> That makes a lot more sense, yes. I hadn't thought of dup, (long time
> since I've had to use file descriptors).
>
> the child_process api does close the descriptor itself. That's a much
> better way to get the result we want, and it is less code, so I'll try
> this out.

One caveat.  If the superproject is doing "log -p --graph", the
output will be broken with this "splice in 'git -C submodule diff'
output in place" approach.  I personally do not care if it is broken
as I do not use "-p" and "--graph" together (for that matter, I do
not think I'll use the "splice in 'git -C submodule diff'" feature
added by this patch, either, so I do not think I'd deeply care ;-),
but I am reasonably sure those who would use these would.

The way to solve it is to capture diff output and send out with the
current graph prefix (i.e. the set of vertial lines), and the
easiest and most expensive (probably too expensive to be practical)
way to do so would be to capture the whole thing into a strbuf, but
I think it is OK to teach a new option to "git diff" that tells to
prefix a fixed string given as its argument to each and every line
of its output, and that would be a more practical solution, which
can also be reused if you choose to later run "diff" internally
without spawning it as a separate process.

The graph machinery may have to be taught a way to tell what the
current graph prefix is if you go that route.

> I looked at trying to call diff for this, but I think it's not worth
> it. Using the child process API makes more sense and is a cleaner
> method. I'll go this route as it appears to work pretty well. The only
> major concern I have is that options may not get forwarded
> correctly...

You'd need to deal with options either way.  You do not want to
assume all options should be passed down.  You now know you need to
tweak the prefix, but you do not know if that is all that is
required.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-10 Thread Jacob Keller
On Tue, Aug 9, 2016 at 5:53 PM, Stefan Beller  wrote:
> This is not used any more, but the child is run directly below?
>   unsigned char one[20], unsigned char two[20])
>> +{
>

Yea I meant to take it all out and forgot. Will be gone in v3.

>
> This pattern seems familar, do we have a function for that?
> (get a submodules git dir? As I was mainly working on the helper
> I do not know off hand)
>

I copied it from somewhere, I didn't see a helper. It's probably worth
making one.

> diff is a FILE* pointer. cp.out is a standard int file descriptor
>
> Maybe you'd want a similar thing as strbuf_getwholeline_fd()
> just instead of adding it to the strbuf, adding it to `f` ?
>

Junio has a much better suggestion in another thread, which I will use.

Thanks,
Jake

> (Which is what you have here? I just wonder about the buffer size,
> as I think reading 1 by 1 is not beneficial)
>
>> +
>> +   c = fgetc(diff);
>> +   while (c != EOF) {
>> +   fputc(c, f);
>> +   c = fgetc(diff);
>> +   }
>> +
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-10 Thread Jacob Keller
On Tue, Aug 9, 2016 at 8:37 PM, Junio C Hamano  wrote:
> Jacob Keller  writes:
>> + cp.dir = path;
>> + cp.out = -1;
>> + cp.no_stdin = 1;
>> + argv_array_push(, "diff");
>> + argv_array_pushf(, "--src-prefix=a/%s/", path);
>> + argv_array_pushf(, "--dst-prefix=b/%s/", path);
>
> I think this is wrong.  Imagine when the caller gave you prefixes
> that are different from a/ and b/ (think: the extreme case is that
> you are a sub-sub-module, and the caller is recursing into you with
> its own prefix, perhaps set to a/sub and b/sub).  Use the prefixes
> you got for a/ and b/ instead of hardcoding them and you'd be fine,
> I'd guess.

I'll have to get these passed, but yes this makes more sense, will look into it.

>
>> + argv_array_push(, sha1_to_hex(one));
>> +
>> + /*
>> +  * If the submodule has untracked or modified contents, diff between
>> +  * the old base and the current tip. This had the advantage of showing
>> +  * the full difference of the submodule contents.
>> +  */
>> + if (!create_dirty_diff)
>> + argv_array_push(, sha1_to_hex(two));
>> +
>> + if (start_command())
>> + die("Could not run 'git diff' command in submodule %s", path);
>> +
>> + diff = fdopen(cp.out, "r");
>> +
>> + c = fgetc(diff);
>> + while (c != EOF) {
>> + fputc(c, f);
>> + c = fgetc(diff);
>> + }
>> +
>> + fclose(diff);
>> + finish_command();
>
> I do not think you need to do this.  If "f" is already a FILE * to
> which the output must go, then instead of reading from the pipe and
> writing it, you can just let the "diff" spit out its output to the
> same file descriptor, by doing something like:
>
> fflush(f);
> cp.out = dup(fileno(f));
> ... other setup ...
> run_command();
>
> no?  I do not offhand recall run_command() closes cp.out after done,
> and if it doesn't you may have to close it yourself after the above
> sequence.


That makes a lot more sense, yes. I hadn't thought of dup, (long time
since I've had to use file descriptors).

the child_process api does close the descriptor itself. That's a much
better way to get the result we want, and it is less code, so I'll try
this out.

>
> While I personally do not want to see code to access submodule's
> object store by temporarily adding it as alternates, the "show
> left-right log between the commits" code already does so, so I
> should point out that running "diff" internally without spawning it
> as a subprocess shouldn't be too hard.  The diff API is reasonably
> libified and there shouldn't be additional "libification" preparation
> work required to do this, if you really wanted to.

I looked at trying to call diff for this, but I think it's not worth
it. Using the child process API makes more sense and is a cleaner
method. I'll go this route as it appears to work pretty well. The only
major concern I have is that options may not get forwarded
correctly...

Thanks,
Jake
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Junio C Hamano
Jacob Keller  writes:

> +void show_submodule_diff(FILE *f, const char *path,
> + const char *line_prefix,
> + unsigned char one[20], unsigned char two[20],
> + unsigned dirty_submodule, const char *meta,
> + const char *reset)
> +{
> + struct child_process cp = CHILD_PROCESS_INIT;
> + struct strbuf sb = STRBUF_INIT;
> + struct strbuf submodule_git_dir = STRBUF_INIT;
> + const char *git_dir, *message = NULL;
> + int create_dirty_diff = 0;
> + FILE *diff;
> + char c;
> +
> + if ((dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) ||
> + (dirty_submodule & DIRTY_SUBMODULE_MODIFIED))
> + create_dirty_diff = 1;
> +
> + strbuf_addf(, "%s%sSubmodule %s %s..", line_prefix, meta, path,
> + find_unique_abbrev(one, DEFAULT_ABBREV));
> + strbuf_addf(, "%s:%s\n", !create_dirty_diff ?
> + find_unique_abbrev(two, DEFAULT_ABBREV) : "", reset);
> + fwrite(sb.buf, sb.len, 1, f);
> +
> + strbuf_addf(_git_dir, "%s/.git", path);
> + git_dir = read_gitfile(submodule_git_dir.buf);
> + if (!git_dir)
> + git_dir = submodule_git_dir.buf;
> + if (!is_directory(git_dir)) {
> + strbuf_reset(_git_dir);
> + strbuf_addf(_git_dir, ".git/modules/%s", path);
> + git_dir = submodule_git_dir.buf;
> + }
> +
> + cp.git_cmd = 1;
> + if (!create_dirty_diff)
> + cp.dir = git_dir;
> + else
> + cp.dir = path;
> + cp.out = -1;
> + cp.no_stdin = 1;
> + argv_array_push(, "diff");
> + argv_array_pushf(, "--src-prefix=a/%s/", path);
> + argv_array_pushf(, "--dst-prefix=b/%s/", path);

I think this is wrong.  Imagine when the caller gave you prefixes
that are different from a/ and b/ (think: the extreme case is that
you are a sub-sub-module, and the caller is recursing into you with
its own prefix, perhaps set to a/sub and b/sub).  Use the prefixes
you got for a/ and b/ instead of hardcoding them and you'd be fine,
I'd guess.

> + argv_array_push(, sha1_to_hex(one));
> +
> + /*
> +  * If the submodule has untracked or modified contents, diff between
> +  * the old base and the current tip. This had the advantage of showing
> +  * the full difference of the submodule contents.
> +  */
> + if (!create_dirty_diff)
> + argv_array_push(, sha1_to_hex(two));
> +
> + if (start_command())
> + die("Could not run 'git diff' command in submodule %s", path);
> +
> + diff = fdopen(cp.out, "r");
> +
> + c = fgetc(diff);
> + while (c != EOF) {
> + fputc(c, f);
> + c = fgetc(diff);
> + }
> +
> + fclose(diff);
> + finish_command();

I do not think you need to do this.  If "f" is already a FILE * to
which the output must go, then instead of reading from the pipe and
writing it, you can just let the "diff" spit out its output to the
same file descriptor, by doing something like:

fflush(f);
cp.out = dup(fileno(f));
... other setup ...
run_command();

no?  I do not offhand recall run_command() closes cp.out after done,
and if it doesn't you may have to close it yourself after the above
sequence.

While I personally do not want to see code to access submodule's
object store by temporarily adding it as alternates, the "show
left-right log between the commits" code already does so, so I
should point out that running "diff" internally without spawning it
as a subprocess shouldn't be too hard.  The diff API is reasonably
libified and there shouldn't be additional "libification" preparation
work required to do this, if you really wanted to.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 5:23 PM, Jacob Keller  wrote:
>
> +static int prepare_submodule_diff(struct strbuf *buf, const char *path,
> +   unsigned char one[20], unsigned char two[20])
> +{

This is not used any more, but the child is run directly below?

> +   strbuf_addf(_git_dir, "%s/.git", path);
> +   git_dir = read_gitfile(submodule_git_dir.buf);
> +   if (!git_dir)
> +   git_dir = submodule_git_dir.buf;
> +   if (!is_directory(git_dir)) {
> +   strbuf_reset(_git_dir);
> +   strbuf_addf(_git_dir, ".git/modules/%s", path);
> +   git_dir = submodule_git_dir.buf;
> +   }

This pattern seems familar, do we have a function for that?
(get a submodules git dir? As I was mainly working on the helper
I do not know off hand)

> +   if (start_command())
> +   die("Could not run 'git diff' command in submodule %s", path);
> +
> +   diff = fdopen(cp.out, "r");

diff is a FILE* pointer. cp.out is a standard int file descriptor

Maybe you'd want a similar thing as strbuf_getwholeline_fd()
just instead of adding it to the strbuf, adding it to `f` ?

(Which is what you have here? I just wonder about the buffer size,
as I think reading 1 by 1 is not beneficial)

> +
> +   c = fgetc(diff);
> +   while (c != EOF) {
> +   fputc(c, f);
> +   c = fgetc(diff);
> +   }
> +
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Jacob Keller
From: Jacob Keller 

For projects which have frequent updates to submodules it is often
useful to be able to see a submodule update commit as a difference.
Teach diff's --submodule= a new "diff" format which will execute a diff
for the submodule between the old and new commit, and display it as
a standard diff. This allows users to easily see what changed in
a submodule without having to dig into the submodule themselves.

Signed-off-by: Jacob Keller 
---
- v2
* Use fgetc and fputc... I tried to use xread but somehow screwed it up
  and this worked fine.

Use src-prefix and dst-prefix.

Use the dirty_submodule flags to indicate whether we should check a diff
between two sources or not. This way we check "current checkout" if
there is any dirty changes so the user sees the entire diff.

 Documentation/diff-config.txt  |  3 +-
 Documentation/diff-options.txt |  6 ++-
 diff.c | 21 --
 diff.h |  2 +-
 submodule.c| 95 ++
 submodule.h|  5 +++
 6 files changed, 125 insertions(+), 7 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index d5a5b17d5088..f5d693afad6c 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -123,7 +123,8 @@ diff.suppressBlankEmpty::
 diff.submodule::
Specify the format in which differences in submodules are
shown.  The "log" format lists the commits in the range like
-   linkgit:git-submodule[1] `summary` does.  The "short" format
+   linkgit:git-submodule[1] `summary` does.  The "diff" format shows the
+   diff between the beginning and end of the range. The "short" format
format just shows the names of the commits at the beginning
and end of the range.  Defaults to short.
 
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 705a87394200..b17348407805 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -215,8 +215,10 @@ any of those replacements occurred.
the commits in the range like linkgit:git-submodule[1] `summary` does.
Omitting the `--submodule` option or specifying `--submodule=short`,
uses the 'short' format. This format just shows the names of the commits
-   at the beginning and end of the range.  Can be tweaked via the
-   `diff.submodule` configuration variable.
+   at the beginning and end of the range. When `--submodule=diff` is
+   given, the 'diff' format is used. This format shows the diff between
+   the old and new submodule commmit from the perspective of the
+   submodule.  Can be tweaked via the `diff.submodule` configuration 
variable.
 
 --color[=]::
Show colored diff.
diff --git a/diff.c b/diff.c
index b43d3dd2ecb7..8d620c5d2250 100644
--- a/diff.c
+++ b/diff.c
@@ -130,12 +130,18 @@ static int parse_dirstat_params(struct diff_options 
*options, const char *params
 
 static int parse_submodule_params(struct diff_options *options, const char 
*value)
 {
-   if (!strcmp(value, "log"))
+   if (!strcmp(value, "log")) {
DIFF_OPT_SET(options, SUBMODULE_LOG);
-   else if (!strcmp(value, "short"))
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   } else if (!strcmp(value, "diff")) {
+   DIFF_OPT_SET(options, SUBMODULE_DIFF);
DIFF_OPT_CLR(options, SUBMODULE_LOG);
-   else
+   } else if (!strcmp(value, "short")) {
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   DIFF_OPT_CLR(options, SUBMODULE_LOG);
+   } else {
return -1;
+   }
return 0;
 }
 
@@ -2310,6 +2316,15 @@ static void builtin_diff(const char *name_a,
two->dirty_submodule,
meta, del, add, reset);
return;
+   } else if (DIFF_OPT_TST(o, SUBMODULE_DIFF) &&
+   (!one->mode || S_ISGITLINK(one->mode)) &&
+   (!two->mode || S_ISGITLINK(two->mode))) {
+   show_submodule_diff(o->file, one->path ? one->path : two->path,
+   line_prefix,
+   one->oid.hash, two->oid.hash,
+   two->dirty_submodule,
+   meta, reset);
+   return;
}
 
if (DIFF_OPT_TST(o, ALLOW_TEXTCONV)) {
diff --git a/diff.h b/diff.h
index 125447be09eb..a80f3e5bafe9 100644
--- a/diff.h
+++ b/diff.h
@@ -69,7 +69,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct 
diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES  (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY(1 <<  8)
-/* (1 <<  9) unused */
+#define DIFF_OPT_SUBMODULE_DIFF  (1 <<  9)
 #define DIFF_OPT_HAS_CHANGES (1 << 10)