Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks, will queue with a minimum fix.

So here are two squashable patches, one is the "minimum" one, the
other is a bit more invasive one to use "a pointer to an optional
setting is set to NULL" convention.  I am undecided, and I'll stay
to be without further comments from others, on the latter one.

I understand that many internal changes in your work environment
titles their changes like "DOing X", but our convention around here
is to label them "DO X", as if you are giving an order to somebody
else, either telling the codebase "to be like so", or telling the
patch-monkey maintainer "to make it so".  So I'd retitle it

ls-files: optionally recurse into submodules

or something like that.  It is an added advantage of being a lot
more descriptive than "adding support", which does not say what kind
of support it is adding.


Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
If we were to follow the convention to leave an optional string
variable to NULL, we'd need to do this on top.  I am not sure if it
is a good change, though.
---
 builtin/ls-files.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 6e78c71..687e475 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -29,7 +29,7 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
-static const char *output_path_prefix = "";
+static const char *output_path_prefix;
 static int recurse_submodules;
 
 static const char *prefix;
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (*output_path_prefix) {
+   if (output_path_prefix && *output_path_prefix) {
strbuf_reset(_name);
strbuf_addstr(_name, output_path_prefix);
strbuf_addstr(_name, name);
@@ -181,7 +181,8 @@ static void show_gitlink(const struct cache_entry *ce)
argv_array_push(, "ls-files");
argv_array_push(, "--recurse-submodules");
argv_array_pushf(, "--output-path-prefix=%s%s/",
-output_path_prefix, ce->name);
+output_path_prefix ? output_path_prefix : "",
+ce->name);
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command();
-- 
2.10.0-458-g97b4043



Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Here is an absolute mininum fix ;-)

 builtin/ls-files.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index c0bce00..6e78c71 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (output_path_prefix != '\0') {
+   if (*output_path_prefix) {
strbuf_reset(_name);
strbuf_addstr(_name, output_path_prefix);
strbuf_addstr(_name, name);
-- 
2.10.0-458-g97b4043



Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks, will queue with a minimum fix.

So here are two squashable patches, one is the "minimum" one, the
other is a bit more invasive one to use "a pointer to an optional
setting is set to NULL" convention.  I am undecided, and I'll stay
to be without further comments from others, on the latter one.

I understand that many internal changes in your work environment are
titled like "DOing X", but our convention around here is to label
them "DO X", as if you are giving an order to somebody else, either
telling the codebase "to be like so", or telling the patch-monkey
maintainer "to make it so".  So I'd retitle it

ls-files: optionally recurse into submodules

or something like that.  It is an added advantage of being a lot
more descriptive than "adding support", which does not say what kind
of support it is adding.


Re: [PATCH v2] ls-files: adding support for submodules

2016-09-13 Thread Junio C Hamano
Brandon Williams  writes:

> Allow ls-files to recognize submodules in order to retrieve a list of
> files from a repository's submodules.  This is done by forking off a
> process to recursively call ls-files on all submodules. Also added an
> output-path-prefix command in order to prepend paths to child processes.
>
> Signed-off-by: Brandon Williams 

> @@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, 
> const char *path)
>  static void write_name(const char *name)
>  {
>   /*
> +  * NEEDSWORK: To make this thread-safe, full_name would have to be owned
> +  * by the caller.
> +  *
> +  * full_name get reused across output lines to minimize the allocation
> +  * churn.
> +  */
> + static struct strbuf full_name = STRBUF_INIT;
> + if (output_path_prefix != '\0') {
> + strbuf_reset(_name);
> + strbuf_addstr(_name, output_path_prefix);
> + strbuf_addstr(_name, name);
> + name = full_name.buf;
> + }

At first glance it was surprising that no test caught this lack of
dereference; the reason is because you initialize output_path_prefix
to an empty string, not NULL, causing full_name.buf always used,
which does not have an impact on the output.

I think initializing it to NULL is a more typical way to say "this
option has not been given", and if you took that route, the
condition would become

if (output_path_prefix && *output_path_prefix) {
...

In any case, the fact that only this much change was required to add
output-path-prefix shows two good things: (1) the original code was
already well structured, funneling any pathname we need to emit
through this single function so that we can do this kind of updates,
and (2) the author of the patch was competent to spot this single
point that needs to be updated.

Nice.

> + status = run_command();
> + if (status)
> + exit(status);

run_command()'s return value comes from either start_command() or
finish_command().  These signal failure by returning a non-zero
value, and in practice they are negative small integers.  Feeding
negative value to exit() is not quite kosher.  Perhaps exit(128)
to mimick as if we called die() is better.

If your primary interest is to support the "find in the working tree
files that are tracked, recursively in submodules" grep, I think
this "when we hit a submodule, spawn a separate ls-files in there"
is sufficient and a solid base to build on it.

On the other hand, if you are more ambitious and "grep" is merely an
example of things that can be helped by having a list of paths
across module boundaries, we may want to "libify" ls-files in such a
way that a single process can instantiate one or more instances of
"ls-files machinery", that takes which repository to work in and
other arguments that specifies which paths to report, and instead of
always showing the result to the standard output, makes repeated
calls to a callback function to report the discovered path and other
attributes associated with the path that were asked for (the object
name, values of tag_*, etc.), without spawning a separate "ls-files"
process.

The latter would be a lot bigger task and I do not necessarily think
it is needed, but that is one possible future direction to keep in
mind.

Thanks, will queue with a minimum fix.


Re: [PATCH v2] ls-files: adding support for submodules

2016-09-12 Thread Brandon Williams
>  static void write_name(const char *name)
>  {
> /*
> +* NEEDSWORK: To make this thread-safe, full_name would have to be 
> owned
> +* by the caller.
> +*
> +* full_name get reused across output lines to minimize the allocation
> +* churn.
> +*/
> +   static struct strbuf full_name = STRBUF_INIT;
> +   if (output_path_prefix != '\0') {

It was pointed out to me that this should be:
if (*output_path_prefix != '\0') {


> +   strbuf_reset(_name);
> +   strbuf_addstr(_name, output_path_prefix);
> +   strbuf_addstr(_name, name);
> +   name = full_name.buf;
> +   }


[PATCH v2] ls-files: adding support for submodules

2016-09-12 Thread Brandon Williams
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules.  This is done by forking off a
process to recursively call ls-files on all submodules. Also added an
output-path-prefix command in order to prepend paths to child processes.

Signed-off-by: Brandon Williams 
---
 Documentation/git-ls-files.txt | 11 +++-
 builtin/ls-files.c | 60 +
 t/t3007-ls-files-recurse-submodules.sh | 99 ++
 3 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100755 t/t3007-ls-files-recurse-submodules.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0d933ac..a623ebf 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -18,7 +18,9 @@ SYNOPSIS
[--exclude-per-directory=]
[--exclude-standard]
[--error-unmatch] [--with-tree=]
-   [--full-name] [--abbrev] [--] [...]
+   [--full-name] [--recurse-submodules]
+   [--output-path-prefix=]
+   [--abbrev] [--] [...]
 
 DESCRIPTION
 ---
@@ -137,6 +139,13 @@ a space) at the start of each line:
option forces paths to be output relative to the project
top directory.
 
+--recurse-submodules::
+   Recursively calls ls-files on each submodule in the repository.
+   Currently there is only support for the --cached mode.
+
+--output-path-prefix=::
+   Prepend the provided path to the output of each file
+
 --abbrev[=]::
Instead of showing the full 40-byte hexadecimal object
lines, show only a partial prefix.
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 00ea91a..c0bce00 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "string-list.h"
 #include "pathspec.h"
+#include "run-command.h"
 
 static int abbrev;
 static int show_deleted;
@@ -28,6 +29,8 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
+static const char *output_path_prefix = "";
+static int recurse_submodules;
 
 static const char *prefix;
 static int max_prefix_len;
@@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, 
const char *path)
 static void write_name(const char *name)
 {
/*
+* NEEDSWORK: To make this thread-safe, full_name would have to be owned
+* by the caller.
+*
+* full_name get reused across output lines to minimize the allocation
+* churn.
+*/
+   static struct strbuf full_name = STRBUF_INIT;
+   if (output_path_prefix != '\0') {
+   strbuf_reset(_name);
+   strbuf_addstr(_name, output_path_prefix);
+   strbuf_addstr(_name, name);
+   name = full_name.buf;
+   }
+
+   /*
 * With "--full-name", prefix_len=0; this caller needs to pass
 * an empty string in that case (a NULL is good for "").
 */
@@ -152,6 +170,25 @@ static void show_killed_files(struct dir_struct *dir)
}
 }
 
+/**
+ * Recursively call ls-files on a submodule
+ */
+static void show_gitlink(const struct cache_entry *ce)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   int status;
+
+   argv_array_push(, "ls-files");
+   argv_array_push(, "--recurse-submodules");
+   argv_array_pushf(, "--output-path-prefix=%s%s/",
+output_path_prefix, ce->name);
+   cp.git_cmd = 1;
+   cp.dir = ce->name;
+   status = run_command();
+   if (status)
+   exit(status);
+}
+
 static void show_ce_entry(const char *tag, const struct cache_entry *ce)
 {
int len = max_prefix_len;
@@ -163,6 +200,10 @@ static void show_ce_entry(const char *tag, const struct 
cache_entry *ce)
len, ps_matched,
S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode)))
return;
+   if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) {
+   show_gitlink(ce);
+   return;
+   }
 
if (tag && *tag && show_valid_bit &&
(ce->ce_flags & CE_VALID)) {
@@ -468,6 +509,10 @@ int cmd_ls_files(int argc, const char **argv, const char 
*cmd_prefix)
{ OPTION_SET_INT, 0, "full-name", _len, NULL,
N_("make the output relative to the project top 
directory"),
PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL },
+   OPT_STRING(0, "output-path-prefix", _path_prefix,
+   N_("path"), N_("prepend  to each file")),
+   OPT_BOOL(0, "recurse-submodules", _submodules,
+   N_("recurse through submodules")),
OPT_BOOL(0, "error-unmatch", _unmatch,
N_("if any  is not in the index, treat this as an 
error")),