Re: [PATCH] all: new command used for multi-repo operations
On Wed, Jan 23, 2013 at 9:39 AM, Duy Nguyen wrote: > On Wed, Jan 23, 2013 at 4:10 AM, Lars Hjemli wrote: >> When working with multiple, unrelated (or loosly related) git repos, >> there is often a need to locate all repos with uncommitted work and >> perform some action on them (say, commit and push). Before this patch, >> such tasks would require manually visiting all repositories, running >> `git status` within each one and then decide what to do next. >> >> This mundane task can now be automated by e.g. `git all --dirty status`, >> which will find all git repositories below the current directory (even >> nested ones), check if they are dirty (as defined by `git diff --quiet && >> git diff --cached --quiet`), and for each dirty repo print the path to the >> repo and then execute `git status` within the repo. > > I think it should leave out the execute part. The command, say > ls-repo, lists repositories in specified state. The execute part could > be easily done by > > xargs -I{} git --git-dir={} status blah Not so easily on windows, which I need to use at $WORK :( -- larsh -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
On Wed, Jan 23, 2013 at 4:10 AM, Lars Hjemli wrote: > When working with multiple, unrelated (or loosly related) git repos, > there is often a need to locate all repos with uncommitted work and > perform some action on them (say, commit and push). Before this patch, > such tasks would require manually visiting all repositories, running > `git status` within each one and then decide what to do next. > > This mundane task can now be automated by e.g. `git all --dirty status`, > which will find all git repositories below the current directory (even > nested ones), check if they are dirty (as defined by `git diff --quiet && > git diff --cached --quiet`), and for each dirty repo print the path to the > repo and then execute `git status` within the repo. I think it should leave out the execute part. The command, say ls-repo, lists repositories in specified state. The execute part could be easily done by xargs -I{} git --git-dir={} status blah I haven't thought it through. I know xargs does not support chdir'ing into a repo, so maybe a new git-xargs could be introduced for that. But there's still a problem with git-xargs (or git-all), printed paths are relative to the subrepos, not where the git-all/xargs command is executed. This could be confusing. Personally I'd like to do it without chdir. That is xargs -I{} git --git-dir={}/.git --work-tree={} status blah should print paths relative to current working directory even if cwd is outside the repo. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
On Wed, Jan 23, 2013 at 7:44 AM, Junio C Hamano wrote: > But I do not think the loop structure of this function is right. If > $D has ".git" in it, should it even try to feed other subdirectories > of $D (say "$D/a") to itself in recursion to see if $D/a/.git exists? Yes, this is needed to meet one of the goals for git-all: to be an alternative to git-submodule when submodules doesn't fit (the other goal is to manage a collection of unrelated (not nested) repos). -- larsh -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
Lars Hjemli writes: > +static struct option builtin_all_options[] = { > + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean > repositories")), > + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty > repositories")), > + OPT_END(), > +}; If you were to go in the OPT_SET_INT route, that would give users the usual "last one wins" semantics, e.g. $ git for-each-repo --clean --dirty will look for only dirty repositories. For completeness, we would probably want "all" to defeat either of them, i.e. $ git for-each-repo --clean --all > +static int walk(struct strbuf *path, int argc, const char **argv) > +{ > + DIR *dir; > + struct dirent *ent; > + size_t len; > + > + dir = opendir(path->buf); > + if (!dir) > + return errno; > + strbuf_addstr(path, "/"); > + len = path->len; > + while ((ent = readdir(dir))) { > + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) > + continue; > + if (!strcmp(ent->d_name, ".git")) { This only looks for the top of working tree. Have you considered if this "iterate over directories and list git repositories in them" may be useful for collection of bare repositories, and if it is, how to go about implementing the discovery process? > + if (ent->d_type != DT_DIR) > + continue; I think this is wrong. On platforms that need a NO_D_TYPE_IN_DIRENT build, your compilation may fail here (you would need to lstat() it yourself). See how dir.c does this without ugly #ifdef's in the code, especially around the use of get_dtype() and DTYPE() macro. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
David Aguilar writes: >> +static int walk(struct strbuf *path, int argc, const char **argv) >> +{ >> + DIR *dir; >> + struct dirent *ent; >> + size_t len; >> + >> + dir = opendir(path->buf); >> + if (!dir) >> + return errno; >> + strbuf_addstr(path, "/"); >> + len = path->len; >> + while ((ent = readdir(dir))) { >> + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) >> + continue; >> + if (!strcmp(ent->d_name, ".git")) { >> + strbuf_setlen(path, len - 1); >> + chdir(path->buf); >> + handle_repo(path->buf, argv); >> + chdir(root); >> + strbuf_addstr(path, "/"); >> + continue; >> + } > > Does this section above properly handle .git files (where .git is a > file, not a directory)? This scans a directory $D to ask "is there '.git' in you?" and if the answer is "yes", then hands $D (not "$D/.git") to handle_repo(). That logic will not miss a gitfile that points at the real $GIT_DIR elsewhere. There is a recursive call to walk() later in the same loop when the found entry ent turns out to be a directory, and "$D/" + ent->d_name is given to this function. But I do not think the loop structure of this function is right. If $D has ".git" in it, should it even try to feed other subdirectories of $D (say "$D/a") to itself in recursion to see if $D/a/.git exists? I think it should be more like walk(struct strbuf *path) { size_t dirlen = path->len; int has_git; strbuf_addstr(path, "/.git"); has_git = !lstat(path->buf); strbuf_setlen(path, dirlen); if (has_git) { handle_repo(path->buf); return; } dir = opendir(path->buf); while ((ent = readdir(dir))) { ... skip . and .. ... strbuf_addstr(path, ent->d_name); walk(path); strbuf_setlen(path, dirlen); } } The determination of has_git can be a bit fancier than a simple !lstat() as you mentioned. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
On Tue, Jan 22, 2013 at 1:10 PM, Lars Hjemli wrote: > When working with multiple, unrelated (or loosly related) git repos, > there is often a need to locate all repos with uncommitted work and > perform some action on them (say, commit and push). Before this patch, > such tasks would require manually visiting all repositories, running > `git status` within each one and then decide what to do next. > > This mundane task can now be automated by e.g. `git all --dirty status`, > which will find all git repositories below the current directory (even > nested ones), check if they are dirty (as defined by `git diff --quiet && > git diff --cached --quiet`), and for each dirty repo print the path to the > repo and then execute `git status` within the repo. > > The command also honours the option '--clean' which restricts the set of > repos to those which '--dirty' would skip. > > Finally, the command to execute within each repo is optional. If none is > given, git-all will just print the path to each repo found. > > Signed-off-by: Lars Hjemli > --- > Documentation/git-all.txt | 37 > Makefile | 1 + > builtin.h | 1 + > builtin/all.c | 105 > ++ > command-list.txt | 1 + > git.c | 1 + > t/t0064-all.sh| 42 +++ > 7 files changed, 188 insertions(+) > create mode 100644 Documentation/git-all.txt > create mode 100644 builtin/all.c > create mode 100755 t/t0064-all.sh > > diff --git a/Documentation/git-all.txt b/Documentation/git-all.txt > new file mode 100644 > index 000..b25f23c > --- /dev/null > +++ b/Documentation/git-all.txt > @@ -0,0 +1,37 @@ > +git-all(1) > +== > + > +NAME > + > +git-all - Execute a git command in multiple repositories > + > +SYNOPSIS > + > +[verse] > +'git all' [--dirty|--clean] [command] > + > +DESCRIPTION > +--- > +The git-all command is used to locate all git repositoris within the > +current directory tree, and optionally execute a git command in each > +of the found repos. > + > +OPTIONS > +--- > +-c:: > +--clean:: > + Only include repositories with a clean worktree. > + > +-d:: > +--dirty:: > + Only include repositories with a dirty worktree. > + > +NOTES > +- > + > +For the purpose of `git-all`, a dirty worktree is defined as a worktree > +with uncommitted changes. > + > +GIT > +--- > +Part of the linkgit:git[1] suite > diff --git a/Makefile b/Makefile > index 1b30d7b..8bf0583 100644 > --- a/Makefile > +++ b/Makefile > @@ -840,6 +840,7 @@ LIB_OBJS += xdiff-interface.o > LIB_OBJS += zlib.o > > BUILTIN_OBJS += builtin/add.o > +BUILTIN_OBJS += builtin/all.o > BUILTIN_OBJS += builtin/annotate.o > BUILTIN_OBJS += builtin/apply.o > BUILTIN_OBJS += builtin/archive.o > diff --git a/builtin.h b/builtin.h > index 7e7bbd6..438c265 100644 > --- a/builtin.h > +++ b/builtin.h > @@ -41,6 +41,7 @@ void finish_copy_notes_for_rewrite(struct notes_rewrite_cfg > *c); > extern int textconv_object(const char *path, unsigned mode, const unsigned > char *sha1, int sha1_valid, char **buf, unsigned long *buf_size); > > extern int cmd_add(int argc, const char **argv, const char *prefix); > +extern int cmd_all(int argc, const char **argv, const char *prefix); > extern int cmd_annotate(int argc, const char **argv, const char *prefix); > extern int cmd_apply(int argc, const char **argv, const char *prefix); > extern int cmd_archive(int argc, const char **argv, const char *prefix); > diff --git a/builtin/all.c b/builtin/all.c > new file mode 100644 > index 000..ee9270d > --- /dev/null > +++ b/builtin/all.c > @@ -0,0 +1,105 @@ > +/* > + * "git all" builtin command. > + * > + * Copyright (c) 2013 Lars Hjemli > + */ > +#include "cache.h" > +#include "color.h" > +#include "builtin.h" > +#include "run-command.h" > +#include "parse-options.h" > + > +static int only_dirty; > +static int only_clean; > +char root[PATH_MAX]; > + > +static const char * const builtin_all_usage[] = { > + N_("git all [options] [cmd]"), > + NULL > +}; > + > +static struct option builtin_all_options[] = { > + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean > repositories")), > + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty > repositories")), > + OPT_END(), > +}; > + > +static int is_dirty() > +{ > + const char *diffidx[] = {"diff", "--quiet", "--cached", NULL}; > + const char *diffwd[] = {"diff", "--quiet", NULL}; > + > + if (run_command_v_opt(diffidx, RUN_GIT_CMD) != 0) > + return 1; > + if (run_command_v_opt(diffwd, RUN_GIT_CMD) != 0) > + return 1; > + return 0; > +} > + > +static void handle_repo(char *path, const char **argv) > +{ > + int dirty; > + > + if (path[0] == '.' && path[1] == '/') > + path += 2; > + if (only_dirty || only_clean) { > +
Re: [PATCH] all: new command used for multi-repo operations
Lars Hjemli writes: > In principle I agree with your reasoning on this, but in practice I > fail to see what other kind of things `git all` could naturally refer > to. For example, the designers of 'for-each-ref' could have called it "git all", as it is to iterate over all refs. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
On Tue, Jan 22, 2013 at 11:01 PM, Junio C Hamano wrote: > Lars Hjemli writes: > >> +static struct option builtin_all_options[] = { >> + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean >> repositories")), >> + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty >> repositories")), >> + OPT_END(), >> +}; > > Shouldn't this be more like OPT_SET_INT() on a same variable that is > initialized to "all"? Alternatively you could validate the input > and die when both are given. OPT_SET_INT() seems appropriate, will fix. > >> +int cmd_all(int argc, const char **argv, const char *prefix) >> +{ >> + struct strbuf path = STRBUF_INIT; >> + >> + if (!getcwd(root, sizeof(root))) >> + return 1; >> + >> + argc = parse_options(argc, argv, prefix, builtin_all_options, >> + builtin_all_usage, PARSE_OPT_STOP_AT_NON_OPTION); >> + >> + unsetenv(GIT_DIR_ENVIRONMENT); >> + unsetenv(GIT_WORK_TREE_ENVIRONMENT); > > Don't you need to clear other variables whose uses are closely tied > to a single repository, like GIT_INDEX_FILE, etc.? > > I suspect that explicitly exporting GIT_DIR and GIT_WORK_TREE (and > nothing else) in handle_repo() to the location you discovered before > you run the per-repository command via run_command_v_opt(), might be > a better alternative. The user could be sharing objects in all > repositories by permanently setting GIT_OBJECT_DIRECTORY to a single > place. > This sounds like a nice plan, I'll test it and send an updated patch. >> diff --git a/command-list.txt b/command-list.txt >> index 7e8cfec..f955895 100644 >> --- a/command-list.txt >> +++ b/command-list.txt >> @@ -1,6 +1,7 @@ >> # List of known git commands. >> # command name category [deprecated] [common] >> git-add mainporcelain common >> +git-all mainporcelain >> git-am mainporcelain >> git-annotateancillaryinterrogators >> git-apply plumbingmanipulators > > I am not very interested in this topic in the first place, but this > does not (at least not yet) sound like a main Porcelain to me. There doesn't seem to be a better category, but I'm open for suggestions. > > "all" may be a word other people may want to use to call collections > of things other than "Git repositories", and that use may turn out > to be more useful in general. A name that makes it clear that this > is about "repositories", i.e. along the lines of "git for-each-repo" > or something, would be a better name that does not squat on such a > short and sweet name. > In principle I agree with your reasoning on this, but in practice I fail to see what other kind of things `git all` could naturally refer to. Also, having a short and sweet way to perform the tasks implemented by this patch was my main motivation for writing it [1], hence `git for-each-repo` isn't as compelling (too much typing). There's always aliases, but I'd prefer it if future git supported `git all` by default. -- larsh [1] Originally as a shell script, used at $WORK for ~2 years -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] all: new command used for multi-repo operations
Lars Hjemli writes: > diff --git a/builtin/all.c b/builtin/all.c > new file mode 100644 > index 000..ee9270d > --- /dev/null > +++ b/builtin/all.c > @@ -0,0 +1,105 @@ > +/* > + * "git all" builtin command. > + * > + * Copyright (c) 2013 Lars Hjemli > + */ > +#include "cache.h" > +#include "color.h" > +#include "builtin.h" > +#include "run-command.h" > +#include "parse-options.h" > + > +static int only_dirty; > +static int only_clean; > +char root[PATH_MAX]; > + > +static const char * const builtin_all_usage[] = { > + N_("git all [options] [cmd]"), > + NULL > +}; > + > +static struct option builtin_all_options[] = { > + OPT_BOOLEAN('c', "clean", &only_clean, N_("only show clean > repositories")), > + OPT_BOOLEAN('d', "dirty", &only_dirty, N_("only show dirty > repositories")), > + OPT_END(), > +}; Shouldn't this be more like OPT_SET_INT() on a same variable that is initialized to "all"? Alternatively you could validate the input and die when both are given. > +int cmd_all(int argc, const char **argv, const char *prefix) > +{ > + struct strbuf path = STRBUF_INIT; > + > + if (!getcwd(root, sizeof(root))) > + return 1; > + > + argc = parse_options(argc, argv, prefix, builtin_all_options, > + builtin_all_usage, PARSE_OPT_STOP_AT_NON_OPTION); > + > + unsetenv(GIT_DIR_ENVIRONMENT); > + unsetenv(GIT_WORK_TREE_ENVIRONMENT); Don't you need to clear other variables whose uses are closely tied to a single repository, like GIT_INDEX_FILE, etc.? I suspect that explicitly exporting GIT_DIR and GIT_WORK_TREE (and nothing else) in handle_repo() to the location you discovered before you run the per-repository command via run_command_v_opt(), might be a better alternative. The user could be sharing objects in all repositories by permanently setting GIT_OBJECT_DIRECTORY to a single place. > diff --git a/command-list.txt b/command-list.txt > index 7e8cfec..f955895 100644 > --- a/command-list.txt > +++ b/command-list.txt > @@ -1,6 +1,7 @@ > # List of known git commands. > # command name category [deprecated] [common] > git-add mainporcelain common > +git-all mainporcelain > git-am mainporcelain > git-annotateancillaryinterrogators > git-apply plumbingmanipulators I am not very interested in this topic in the first place, but this does not (at least not yet) sound like a main Porcelain to me. "all" may be a word other people may want to use to call collections of things other than "Git repositories", and that use may turn out to be more useful in general. A name that makes it clear that this is about "repositories", i.e. along the lines of "git for-each-repo" or something, would be a better name that does not squat on such a short and sweet name. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html