Re: [PATCH 2/4 v4] ls-files: optionally recurse into submodules

2016-09-27 Thread Brandon Williams
On 09/27, Junio C Hamano wrote:
> Brandon Williams  writes:
> > +static const char *submodule_prefix;
> 
> I would have expected this to added to environment.c in the previous
> step, but it is OK--I'd imagine you'd grab this from the environment
> and carrying a piece of information from git.c to here by setenv()
> followed by getenv() feels somewhat roundabout, though.

If it would make sense to do the caching of prefix string in
environment.c I can move it there and add a get_submodule_prefix()
function which either reads the envvar or the cached value if its
already been read.  Would that be a better route?

> 
> >  static const char *prefix;
> >  static int max_prefix_len;
> > @@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, 
> > const char *path)
> >  static void write_name(const char *name)
> >  {
> > /*
> > +* NEEDSWORK: To make this thread-safe, full_name would have to be owned
> > +* by the caller.
> 
> As Peff mentioned in his review in another thread, a large number of
> functions in git are not reentrant, and I do not think we would want
> to give the impression that those missing a warning are safe to use.
> 
> Other than that, this step looks OK.  3/4 and later would be a lot
> more fun to review ;-)

Oh yes, I can remove the comment.  Seemed to miss that bit while
rerolling the series.

-- 
Brandon Williams


Re: [PATCH 2/4 v4] ls-files: optionally recurse into submodules

2016-09-27 Thread Junio C Hamano
Brandon Williams  writes:

> Allow ls-files to recognize submodules in order to retrieve a list of
> files from a repository's submodules.  This is done by forking off a
> process to recursively call ls-files on all submodules. Use top-level
> --submodule_prefix option to pass a path to the submodule which it can
> use to prepend to output or pathspec matching logic.
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/git-ls-files.txt |   7 +-
>  builtin/ls-files.c | 143 
> -
>  git.c  |   2 +-
>  t/t3007-ls-files-recurse-submodules.sh | 100 +++
>  4 files changed, 212 insertions(+), 40 deletions(-)
>  create mode 100755 t/t3007-ls-files-recurse-submodules.sh
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 0d933ac..446209e 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -18,7 +18,8 @@ SYNOPSIS
>   [--exclude-per-directory=]
>   [--exclude-standard]
>   [--error-unmatch] [--with-tree=]
> - [--full-name] [--abbrev] [--] [...]
> + [--full-name] [--recurse-submodules]
> + [--abbrev] [--] [...]
>  
>  DESCRIPTION
>  ---
> @@ -137,6 +138,10 @@ a space) at the start of each line:
>   option forces paths to be output relative to the project
>   top directory.
>  
> +--recurse-submodules::
> + Recursively calls ls-files on each submodule in the repository.
> + Currently there is only support for the --cached mode.
> +
>  --abbrev[=]::
>   Instead of showing the full 40-byte hexadecimal object
>   lines, show only a partial prefix.
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 00ea91a..d4bfc60 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -14,6 +14,7 @@
>  #include "resolve-undo.h"
>  #include "string-list.h"
>  #include "pathspec.h"
> +#include "run-command.h"
>  
>  static int abbrev;
>  static int show_deleted;
> @@ -28,6 +29,8 @@ static int show_valid_bit;
>  static int line_terminator = '\n';
>  static int debug_mode;
>  static int show_eol;
> +static int recurse_submodules;
> +static const char *submodule_prefix;

I would have expected this to added to environment.c in the previous
step, but it is OK--I'd imagine you'd grab this from the environment
and carrying a piece of information from git.c to here by setenv()
followed by getenv() feels somewhat roundabout, though.

>  static const char *prefix;
>  static int max_prefix_len;
> @@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, 
> const char *path)
>  static void write_name(const char *name)
>  {
>   /*
> +  * NEEDSWORK: To make this thread-safe, full_name would have to be owned
> +  * by the caller.

As Peff mentioned in his review in another thread, a large number of
functions in git are not reentrant, and I do not think we would want
to give the impression that those missing a warning are safe to use.

Other than that, this step looks OK.  3/4 and later would be a lot
more fun to review ;-)


[PATCH 2/4 v4] ls-files: optionally recurse into submodules

2016-09-26 Thread Brandon Williams
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules.  This is done by forking off a
process to recursively call ls-files on all submodules. Use top-level
--submodule_prefix option to pass a path to the submodule which it can
use to prepend to output or pathspec matching logic.

Signed-off-by: Brandon Williams 
---
 Documentation/git-ls-files.txt |   7 +-
 builtin/ls-files.c | 143 -
 git.c  |   2 +-
 t/t3007-ls-files-recurse-submodules.sh | 100 +++
 4 files changed, 212 insertions(+), 40 deletions(-)
 create mode 100755 t/t3007-ls-files-recurse-submodules.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0d933ac..446209e 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -18,7 +18,8 @@ SYNOPSIS
[--exclude-per-directory=]
[--exclude-standard]
[--error-unmatch] [--with-tree=]
-   [--full-name] [--abbrev] [--] [...]
+   [--full-name] [--recurse-submodules]
+   [--abbrev] [--] [...]
 
 DESCRIPTION
 ---
@@ -137,6 +138,10 @@ a space) at the start of each line:
option forces paths to be output relative to the project
top directory.
 
+--recurse-submodules::
+   Recursively calls ls-files on each submodule in the repository.
+   Currently there is only support for the --cached mode.
+
 --abbrev[=]::
Instead of showing the full 40-byte hexadecimal object
lines, show only a partial prefix.
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 00ea91a..d4bfc60 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -14,6 +14,7 @@
 #include "resolve-undo.h"
 #include "string-list.h"
 #include "pathspec.h"
+#include "run-command.h"
 
 static int abbrev;
 static int show_deleted;
@@ -28,6 +29,8 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
+static int recurse_submodules;
+static const char *submodule_prefix;
 
 static const char *prefix;
 static int max_prefix_len;
@@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, 
const char *path)
 static void write_name(const char *name)
 {
/*
+* NEEDSWORK: To make this thread-safe, full_name would have to be owned
+* by the caller.
+*
+* full_name get reused across output lines to minimize the allocation
+* churn.
+*/
+   static struct strbuf full_name = STRBUF_INIT;
+   if (submodule_prefix && *submodule_prefix) {
+   strbuf_reset(&full_name);
+   strbuf_addstr(&full_name, submodule_prefix);
+   strbuf_addstr(&full_name, name);
+   name = full_name.buf;
+   }
+
+   /*
 * With "--full-name", prefix_len=0; this caller needs to pass
 * an empty string in that case (a NULL is good for "").
 */
@@ -152,55 +170,84 @@ static void show_killed_files(struct dir_struct *dir)
}
 }
 
+/**
+ * Recursively call ls-files on a submodule
+ */
+static void show_gitlink(const struct cache_entry *ce)
+{
+   struct child_process cp = CHILD_PROCESS_INIT;
+   int status;
+
+   argv_array_pushf(&cp.args, "--submodule-prefix=%s%s/",
+submodule_prefix ? submodule_prefix : "",
+ce->name);
+   argv_array_push(&cp.args, "ls-files");
+   argv_array_push(&cp.args, "--recurse-submodules");
+
+   cp.git_cmd = 1;
+   cp.dir = ce->name;
+   status = run_command(&cp);
+   if (status)
+   exit(status);
+}
+
 static void show_ce_entry(const char *tag, const struct cache_entry *ce)
 {
+   struct strbuf name = STRBUF_INIT;
int len = max_prefix_len;
+   if (submodule_prefix)
+   strbuf_addstr(&name, submodule_prefix);
+   strbuf_addstr(&name, ce->name);
 
if (len >= ce_namelen(ce))
die("git ls-files: internal error - cache entry not superset of 
prefix");
 
-   if (!match_pathspec(&pathspec, ce->name, ce_namelen(ce),
-   len, ps_matched,
-   S_ISDIR(ce->ce_mode) || S_ISGITLINK(ce->ce_mode)))
-   return;
+   if (recurse_submodules && S_ISGITLINK(ce->ce_mode)) {
+   show_gitlink(ce);
+   } else if (match_pathspec(&pathspec, name.buf, name.len,
+ len, ps_matched,
+ S_ISDIR(ce->ce_mode) ||
+ S_ISGITLINK(ce->ce_mode))) {
+   if (tag && *tag && show_valid_bit &&
+   (ce->ce_flags & CE_VALID)) {
+   static char alttag[4];
+   memcpy(alttag, tag, 3);
+   if (isalpha(tag[0]))
+   altt