Re: [PATCH] all: new command used for multi-repo operations

2013-01-23 Thread Lars Hjemli
On Wed, Jan 23, 2013 at 9:39 AM, Duy Nguyen  wrote:
> On Wed, Jan 23, 2013 at 4:10 AM, Lars Hjemli  wrote:
>> When working with multiple, unrelated (or loosly related) git repos,
>> there is often a need to locate all repos with uncommitted work and
>> perform some action on them (say, commit and push). Before this patch,
>> such tasks would require manually visiting all repositories, running
>> `git status` within each one and then decide what to do next.
>>
>> This mundane task can now be automated by e.g. `git all --dirty status`,
>> which will find all git repositories below the current directory (even
>> nested ones), check if they are dirty (as defined by `git diff --quiet &&
>> git diff --cached --quiet`), and for each dirty repo print the path to the
>> repo and then execute `git status` within the repo.
>
> I think it should leave out the execute part. The command, say
> ls-repo, lists repositories in specified state. The execute part could
> be easily done by
>
> xargs -I{} git --git-dir={} status blah

Not so easily on windows, which I need to use at $WORK :(

--
larsh
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-23 Thread Duy Nguyen
On Wed, Jan 23, 2013 at 4:10 AM, Lars Hjemli  wrote:
> When working with multiple, unrelated (or loosly related) git repos,
> there is often a need to locate all repos with uncommitted work and
> perform some action on them (say, commit and push). Before this patch,
> such tasks would require manually visiting all repositories, running
> `git status` within each one and then decide what to do next.
>
> This mundane task can now be automated by e.g. `git all --dirty status`,
> which will find all git repositories below the current directory (even
> nested ones), check if they are dirty (as defined by `git diff --quiet &&
> git diff --cached --quiet`), and for each dirty repo print the path to the
> repo and then execute `git status` within the repo.

I think it should leave out the execute part. The command, say
ls-repo, lists repositories in specified state. The execute part could
be easily done by

xargs -I{} git --git-dir={} status blah

I haven't thought it through. I know xargs does not support chdir'ing
into a repo, so maybe a new git-xargs could be introduced for that.
But there's still a problem with git-xargs (or git-all), printed paths
are relative to the subrepos, not where the git-all/xargs command is
executed. This could be confusing. Personally I'd like to do it
without chdir. That is

xargs -I{} git --git-dir={}/.git --work-tree={} status blah

should print paths relative to current working directory even if cwd
is outside the repo.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Lars Hjemli
On Wed, Jan 23, 2013 at 7:44 AM, Junio C Hamano  wrote:
> But I do not think the loop structure of this function is right.  If
> $D has ".git" in it, should it even try to feed other subdirectories
> of $D (say "$D/a") to itself in recursion to see if $D/a/.git exists?

Yes, this is needed to meet one of the goals for git-all:  to be an
alternative to git-submodule when submodules doesn't fit (the other
goal is to manage a collection of unrelated (not nested) repos).

-- 
larsh
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Junio C Hamano
Lars Hjemli  writes:

> +static struct option builtin_all_options[] = {
> + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean 
> repositories")),
> + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty 
> repositories")),
> + OPT_END(),
> +};

If you were to go in the OPT_SET_INT route, that would give users
the usual "last one wins" semantics, e.g.

$ git for-each-repo --clean --dirty

will look for only dirty repositories.  For completeness, we would
probably want "all" to defeat either of them, i.e.

$ git for-each-repo --clean --all

> +static int walk(struct strbuf *path, int argc, const char **argv)
> +{
> + DIR *dir;
> + struct dirent *ent;
> + size_t len;
> +
> + dir = opendir(path->buf);
> + if (!dir)
> + return errno;
> + strbuf_addstr(path, "/");
> + len = path->len;
> + while ((ent = readdir(dir))) {
> + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
> + continue;
> + if (!strcmp(ent->d_name, ".git")) {

This only looks for the top of working tree.  Have you considered if
this "iterate over directories and list git repositories in them"
may be useful for collection of bare repositories, and if it is, how
to go about implementing the discovery process?

> + if (ent->d_type != DT_DIR)
> + continue;

I think this is wrong.

On platforms that need a NO_D_TYPE_IN_DIRENT build, your compilation
may fail here (you would need to lstat() it yourself).  See how
dir.c does this without ugly #ifdef's in the code, especially around
the use of get_dtype() and DTYPE() macro.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Junio C Hamano
David Aguilar  writes:

>> +static int walk(struct strbuf *path, int argc, const char **argv)
>> +{
>> +   DIR *dir;
>> +   struct dirent *ent;
>> +   size_t len;
>> +
>> +   dir = opendir(path->buf);
>> +   if (!dir)
>> +   return errno;
>> +   strbuf_addstr(path, "/");
>> +   len = path->len;
>> +   while ((ent = readdir(dir))) {
>> +   if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
>> +   continue;
>> +   if (!strcmp(ent->d_name, ".git")) {
>> +   strbuf_setlen(path, len - 1);
>> +   chdir(path->buf);
>> +   handle_repo(path->buf, argv);
>> +   chdir(root);
>> +   strbuf_addstr(path, "/");
>> +   continue;
>> +   }
>
> Does this section above properly handle .git files (where .git is a
> file, not a directory)?

This scans a directory $D to ask "is there '.git' in you?" and if
the answer is "yes", then hands $D (not "$D/.git") to handle_repo().
That logic will not miss a gitfile that points at the real $GIT_DIR
elsewhere.

There is a recursive call to walk() later in the same loop when the
found entry ent turns out to be a directory, and "$D/" + ent->d_name
is given to this function.

But I do not think the loop structure of this function is right.  If
$D has ".git" in it, should it even try to feed other subdirectories
of $D (say "$D/a") to itself in recursion to see if $D/a/.git exists?

I think it should be more like

walk(struct strbuf *path)
{
size_t dirlen = path->len;
int has_git;

strbuf_addstr(path, "/.git");
has_git = !lstat(path->buf);
strbuf_setlen(path, dirlen);

if (has_git) {
handle_repo(path->buf);
return;
}
dir = opendir(path->buf);
while ((ent = readdir(dir))) {
... skip . and .. ...
strbuf_addstr(path, ent->d_name);
walk(path);
strbuf_setlen(path, dirlen);
}
}

The determination of has_git can be a bit fancier than a simple
!lstat() as you mentioned.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread David Aguilar
On Tue, Jan 22, 2013 at 1:10 PM, Lars Hjemli  wrote:
> When working with multiple, unrelated (or loosly related) git repos,
> there is often a need to locate all repos with uncommitted work and
> perform some action on them (say, commit and push). Before this patch,
> such tasks would require manually visiting all repositories, running
> `git status` within each one and then decide what to do next.
>
> This mundane task can now be automated by e.g. `git all --dirty status`,
> which will find all git repositories below the current directory (even
> nested ones), check if they are dirty (as defined by `git diff --quiet &&
> git diff --cached --quiet`), and for each dirty repo print the path to the
> repo and then execute `git status` within the repo.
>
> The command also honours the option '--clean' which restricts the set of
> repos to those which '--dirty' would skip.
>
> Finally, the command to execute within each repo is optional. If none is
> given, git-all will just print the path to each repo found.
>
> Signed-off-by: Lars Hjemli 
> ---
>  Documentation/git-all.txt |  37 
>  Makefile  |   1 +
>  builtin.h |   1 +
>  builtin/all.c | 105 
> ++
>  command-list.txt  |   1 +
>  git.c |   1 +
>  t/t0064-all.sh|  42 +++
>  7 files changed, 188 insertions(+)
>  create mode 100644 Documentation/git-all.txt
>  create mode 100644 builtin/all.c
>  create mode 100755 t/t0064-all.sh
>
> diff --git a/Documentation/git-all.txt b/Documentation/git-all.txt
> new file mode 100644
> index 000..b25f23c
> --- /dev/null
> +++ b/Documentation/git-all.txt
> @@ -0,0 +1,37 @@
> +git-all(1)
> +==
> +
> +NAME
> +
> +git-all - Execute a git command in multiple repositories
> +
> +SYNOPSIS
> +
> +[verse]
> +'git all' [--dirty|--clean] [command]
> +
> +DESCRIPTION
> +---
> +The git-all command is used to locate all git repositoris within the
> +current directory tree, and optionally execute a git command in each
> +of the found repos.
> +
> +OPTIONS
> +---
> +-c::
> +--clean::
> +   Only include repositories with a clean worktree.
> +
> +-d::
> +--dirty::
> +   Only include repositories with a dirty worktree.
> +
> +NOTES
> +-
> +
> +For the purpose of `git-all`, a dirty worktree is defined as a worktree
> +with uncommitted changes.
> +
> +GIT
> +---
> +Part of the linkgit:git[1] suite
> diff --git a/Makefile b/Makefile
> index 1b30d7b..8bf0583 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -840,6 +840,7 @@ LIB_OBJS += xdiff-interface.o
>  LIB_OBJS += zlib.o
>
>  BUILTIN_OBJS += builtin/add.o
> +BUILTIN_OBJS += builtin/all.o
>  BUILTIN_OBJS += builtin/annotate.o
>  BUILTIN_OBJS += builtin/apply.o
>  BUILTIN_OBJS += builtin/archive.o
> diff --git a/builtin.h b/builtin.h
> index 7e7bbd6..438c265 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -41,6 +41,7 @@ void finish_copy_notes_for_rewrite(struct notes_rewrite_cfg 
> *c);
>  extern int textconv_object(const char *path, unsigned mode, const unsigned 
> char *sha1, int sha1_valid, char **buf, unsigned long *buf_size);
>
>  extern int cmd_add(int argc, const char **argv, const char *prefix);
> +extern int cmd_all(int argc, const char **argv, const char *prefix);
>  extern int cmd_annotate(int argc, const char **argv, const char *prefix);
>  extern int cmd_apply(int argc, const char **argv, const char *prefix);
>  extern int cmd_archive(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/all.c b/builtin/all.c
> new file mode 100644
> index 000..ee9270d
> --- /dev/null
> +++ b/builtin/all.c
> @@ -0,0 +1,105 @@
> +/*
> + * "git all" builtin command.
> + *
> + * Copyright (c) 2013 Lars Hjemli 
> + */
> +#include "cache.h"
> +#include "color.h"
> +#include "builtin.h"
> +#include "run-command.h"
> +#include "parse-options.h"
> +
> +static int only_dirty;
> +static int only_clean;
> +char root[PATH_MAX];
> +
> +static const char * const builtin_all_usage[] = {
> +   N_("git all [options] [cmd]"),
> +   NULL
> +};
> +
> +static struct option builtin_all_options[] = {
> +   OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean 
> repositories")),
> +   OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty 
> repositories")),
> +   OPT_END(),
> +};
> +
> +static int is_dirty()
> +{
> +   const char *diffidx[] = {"diff", "--quiet", "--cached", NULL};
> +   const char *diffwd[] = {"diff", "--quiet", NULL};
> +
> +   if (run_command_v_opt(diffidx, RUN_GIT_CMD) != 0)
> +   return 1;
> +   if (run_command_v_opt(diffwd, RUN_GIT_CMD) != 0)
> +   return 1;
> +   return 0;
> +}
> +
> +static void handle_repo(char *path, const char **argv)
> +{
> +   int dirty;
> +
> +   if (path[0] == '.' && path[1] == '/')
> +   path += 2;
> +   if (only_dirty || only_clean) {
> +   

Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Junio C Hamano
Lars Hjemli  writes:

> In principle I agree with your reasoning on this, but in practice I
> fail to see what other kind of things `git all` could naturally refer
> to.

For example, the designers of 'for-each-ref' could have called it
"git all", as it is to iterate over all refs.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Lars Hjemli
On Tue, Jan 22, 2013 at 11:01 PM, Junio C Hamano  wrote:
> Lars Hjemli  writes:
>
>> +static struct option builtin_all_options[] = {
>> + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean 
>> repositories")),
>> + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty 
>> repositories")),
>> + OPT_END(),
>> +};
>
> Shouldn't this be more like OPT_SET_INT() on a same variable that is
> initialized to "all"?  Alternatively you could validate the input
> and die when both are given.

OPT_SET_INT() seems appropriate, will fix.

>
>> +int cmd_all(int argc, const char **argv, const char *prefix)
>> +{
>> + struct strbuf path = STRBUF_INIT;
>> +
>> + if (!getcwd(root, sizeof(root)))
>> + return 1;
>> +
>> + argc = parse_options(argc, argv, prefix, builtin_all_options,
>> +  builtin_all_usage, PARSE_OPT_STOP_AT_NON_OPTION);
>> +
>> + unsetenv(GIT_DIR_ENVIRONMENT);
>> + unsetenv(GIT_WORK_TREE_ENVIRONMENT);
>
> Don't you need to clear other variables whose uses are closely tied
> to a single repository, like GIT_INDEX_FILE, etc.?
>
> I suspect that explicitly exporting GIT_DIR and GIT_WORK_TREE (and
> nothing else) in handle_repo() to the location you discovered before
> you run the per-repository command via run_command_v_opt(), might be
> a better alternative.  The user could be sharing objects in all
> repositories by permanently setting GIT_OBJECT_DIRECTORY to a single
> place.
>

This sounds like a nice plan, I'll test it and send an updated patch.


>> diff --git a/command-list.txt b/command-list.txt
>> index 7e8cfec..f955895 100644
>> --- a/command-list.txt
>> +++ b/command-list.txt
>> @@ -1,6 +1,7 @@
>>  # List of known git commands.
>>  # command name   category [deprecated] [common]
>>  git-add mainporcelain common
>> +git-all mainporcelain
>>  git-am  mainporcelain
>>  git-annotateancillaryinterrogators
>>  git-apply   plumbingmanipulators
>
> I am not very interested in this topic in the first place, but this
> does not (at least not yet) sound like a main Porcelain to me.

There doesn't seem to be a better category, but I'm open for suggestions.

>
> "all" may be a word other people may want to use to call collections
> of things other than "Git repositories", and that use may turn out
> to be more useful in general.  A name that makes it clear that this
> is about "repositories", i.e. along the lines of "git for-each-repo"
> or something, would be a better name that does not squat on such a
> short and sweet name.
>

In principle I agree with your reasoning on this, but in practice I
fail to see what other kind of things `git all` could naturally refer
to. Also, having a short and sweet way to perform the tasks
implemented by this patch was my main motivation for writing it [1],
hence `git for-each-repo` isn't as compelling (too much typing).
There's always aliases, but I'd prefer it if future git supported `git
all` by default.

-- 
larsh

[1] Originally as a shell script, used at $WORK for ~2 years
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] all: new command used for multi-repo operations

2013-01-22 Thread Junio C Hamano
Lars Hjemli  writes:

> diff --git a/builtin/all.c b/builtin/all.c
> new file mode 100644
> index 000..ee9270d
> --- /dev/null
> +++ b/builtin/all.c
> @@ -0,0 +1,105 @@
> +/*
> + * "git all" builtin command.
> + *
> + * Copyright (c) 2013 Lars Hjemli 
> + */
> +#include "cache.h"
> +#include "color.h"
> +#include "builtin.h"
> +#include "run-command.h"
> +#include "parse-options.h"
> +
> +static int only_dirty;
> +static int only_clean;
> +char root[PATH_MAX];
> +
> +static const char * const builtin_all_usage[] = {
> + N_("git all [options] [cmd]"),
> + NULL
> +};
> +
> +static struct option builtin_all_options[] = {
> + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean 
> repositories")),
> + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty 
> repositories")),
> + OPT_END(),
> +};

Shouldn't this be more like OPT_SET_INT() on a same variable that is
initialized to "all"?  Alternatively you could validate the input
and die when both are given.

> +int cmd_all(int argc, const char **argv, const char *prefix)
> +{
> + struct strbuf path = STRBUF_INIT;
> +
> + if (!getcwd(root, sizeof(root)))
> + return 1;
> +
> + argc = parse_options(argc, argv, prefix, builtin_all_options,
> +  builtin_all_usage, PARSE_OPT_STOP_AT_NON_OPTION);
> +
> + unsetenv(GIT_DIR_ENVIRONMENT);
> + unsetenv(GIT_WORK_TREE_ENVIRONMENT);

Don't you need to clear other variables whose uses are closely tied
to a single repository, like GIT_INDEX_FILE, etc.?

I suspect that explicitly exporting GIT_DIR and GIT_WORK_TREE (and
nothing else) in handle_repo() to the location you discovered before
you run the per-repository command via run_command_v_opt(), might be
a better alternative.  The user could be sharing objects in all
repositories by permanently setting GIT_OBJECT_DIRECTORY to a single
place.

> diff --git a/command-list.txt b/command-list.txt
> index 7e8cfec..f955895 100644
> --- a/command-list.txt
> +++ b/command-list.txt
> @@ -1,6 +1,7 @@
>  # List of known git commands.
>  # command name   category [deprecated] [common]
>  git-add mainporcelain common
> +git-all mainporcelain
>  git-am  mainporcelain
>  git-annotateancillaryinterrogators
>  git-apply   plumbingmanipulators

I am not very interested in this topic in the first place, but this
does not (at least not yet) sound like a main Porcelain to me.

"all" may be a word other people may want to use to call collections
of things other than "Git repositories", and that use may turn out
to be more useful in general.  A name that makes it clear that this
is about "repositories", i.e. along the lines of "git for-each-repo"
or something, would be a better name that does not squat on such a
short and sweet name.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html