Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-02-03 Thread Junio C Hamano
Jens Lehmann jens.lehm...@web.de writes:

 Am 28.01.2013 21:34, schrieb Junio C Hamano:
 ...
 I was imagining that foreach --untracked could go something like this:
 
  * If you are inside an existing git repository, read its index to
    learn the gitlinks in the directory and its subdirectories.

  * Start from the current directory and recursively apply the
    procedure in this step:

    * Scan the directory and iterate over the ones that have .git in
      them:

      * If it is a gitlinked one, show it, but do not descend into it
        unless --recursive is given (e.g. you start from /home/jens,
        find /home/jens/proj/ directory that has /home/jens/proj/.git
        in it.  /home/jens/.git/index knows that it is a submodule of
        the top-level superproject.  proj is handled, and it is up
        to the --recursive option if its submodules are handled).

      * If it is _not_ a gitlinked one, show it and descend into it
        (e.g. /home/jens/ is not a repository or /home/jens/proj is
        not a tracked submodule) to apply this procedure recursively.
 
 Of course, without --untracked, we have no need to iterate over the
 readdir() return values; instead we just scan the index of the
 top-level superproject.

 Thanks for explaining, that makes tons of sense.

There is a small thinko above, though, and I'd like to correct it
before anybody takes the above too seriously as _the_ outline of the
design and implements it to the letter.

The --recursive option should govern both a tracked submodule and an
untracked one.  When asking to list both existing submodules and
directories that could become submodules, you should be able to say

$ git submodule foreach --untracked

to list the direct submodules and the directories with .git in them
that are not yet submodules of the top-level superproject, but the
latter is limited to those with no parent directories with .git in
them (other than the top-level of the working tree of the
superproject).  With

$ git submodule foreach --untracked --recursive

you would see submodules and their submodules recursively, and also
directories with .git in them (i.e. candidates to become direct
submodules of the superproject) and the directories with .git in
them inside such submodule candidates (i.e. candidates to become
direct submodules of the directories that could become direct
submodules of the superproject) recursively.

If we set things up this way:

mkdir -p a/b c/d &&
for d in . a a/b c c/d
do
    git init $d &&
    ( cd $d && git commit --allow-empty -m initial )
done &&
git add a &&
( cd a && git add b )

The expected results for various combinations are:

 * git submodule foreach would visit 'a' and nothing else;
 * git submodule foreach --recursive would visit 'a' and 'a/b';
 * git submodule foreach --untracked would visit 'a' and 'c'; and
 * git submodule foreach --untracked --recursive would visit all four.
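
The four cases above can be sketched as two small shell functions. This is only an illustration of the semantics described in this message, not code from any patch: the function names are made up, list_gitlinks assumes that "git ls-files -s" reports gitlinks with mode 160000, and path names that need c-quoting are not handled.

```shell
#!/bin/sh
# Hypothetical helper: print the gitlink (mode 160000) paths recorded in the
# index of the repository at $1, one per line.
list_gitlinks() {
    git -C "$1" ls-files -s | awk -F'\t' '$1 ~ /^160000 / { print $2 }'
}

# visit_tracked REPO RECURSIVE
# Plain "submodule foreach": only look at gitlinks in REPO's index.
# The body runs in a subshell so that recursion does not clobber variables.
visit_tracked() (
    repo=$1 recursive=$2
    list_gitlinks "$repo" | while IFS= read -r sub; do
        [ -e "$repo/$sub/.git" ] || continue    # submodule not populated
        printf '%s\n' "$repo/$sub"
        if [ "$recursive" = yes ]; then
            visit_tracked "$repo/$sub" yes
        fi
    done
)

# visit_untracked DIR RECURSIVE
# "--untracked": scan the working tree itself instead of the index.
# Symbolic links to directories are followed, which a real tool may not want.
visit_untracked() (
    dir=$1 recursive=$2
    for entry in "$dir"/*/; do
        entry=${entry%/}
        [ -d "$entry" ] || continue             # the glob matched nothing
        if [ -e "$entry/.git" ]; then
            printf '%s\n' "$entry"
            # Without --recursive, do not look inside a repository we found.
            [ "$recursive" = yes ] || continue
        fi
        visit_untracked "$entry" "$recursive"
    done
)
```

In the layout set up above, "visit_tracked . no" corresponds to the first bullet, "visit_tracked . yes" to the second, "visit_untracked . no" to the third and "visit_untracked . yes" to the fourth. Note that with --untracked the index is not consulted at all in this sketch, because --recursive treats tracked and untracked repositories the same way.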

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Jonathan Nieder
Hi,

Lars Hjemli wrote:

 [1] The 'git -a' rewrite patch shows how I think about this command -
 it's just an option to the 'git' command, modifying the way any
 subcommand is invoked (btw: I don't expect that patch to be applied
 since 'git-all' was deemed too generic, so I'll just carry the patch in
 my own tree).

As one data point, 'git all' also seems too generic to me but 'git -a'
doesn't.  Intuition can be weird.

So if I ran the world, then having commands

git -a diff

and

git for-each-repo git diff

do the same thing would be fine.  Of course I don't run the world. ;-)

[...]
 One more thing that nobody brought up during the previous reviews is
 whether we want to support a subset of repositories by allowing the
 standard pathspec match mechanism.  For example,

 git for-each-repo -d git diff --name-only -- foo/ bar/b\*z

 might be a way to ask "please find repositories that match the given
 pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
 are dirty".  We would need to think about how to mark the end of the
 command though---we could borrow \; from find(1), even though find
 is not the best example of the UI design.

In most non-git commands, -- represents an end-of-options marker,
allowing arbitrary options afterward without having to worry about
escaping minus signs.  So in that spirit, if this weren't a git
command, I'd expect to be able to do

for-each-repo -- git diff -- '*.c'

and have the second '--' passed verbatim to git diff.

Unfortunately in git (imitating commands like grep, I suppose), "--"
means "paths start here".  That means that with the git convention,
there is only one place to pass paths to a given command.

Tracing backwards: it would be really nice to be able to do

git for-each-repo git grep -e foo -- '*.c'

or

git -a grep -e foo -- '*.c'

For this practical reason, it seems that paths listed after the '--'
should go to the command being run.  On the other hand, if I wanted to
limit my for-each-repo run to repositories in two subdirectories of
the cwd, I'd be tempted to try

git for-each-repo git grep -e foo -- src/ doc/

And if I wanted to limit to different file types in the repositories
under each directory, it would be tempting to use

git for-each-repo git grep -e foo -- 'src/*.c' 'doc/*.txt'

Is there a convention that would be usable today that is roughly
forward-compatible with that?  (To throw an example out, requiring
that each pathspec passed to for-each-repo either starts with '*' or
contains no wildcards.)
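
One reading of the restriction thrown out above can be written as a small check. This is a sketch under that reading only; pathspec_ok is a made-up name, and treating "*", "?" and "[" as the wildcard characters is an assumption.

```shell
#!/bin/sh
# pathspec_ok SPEC
# Succeed if SPEC either starts with "*" or contains no wildcard characters.
pathspec_ok() {
    case "$1" in
        \**)              return 0 ;;  # leading "*": means the same in every repository
        *\**|*\?*|*\[*)   return 1 ;;  # wildcard elsewhere: reserved for later
        *)                return 0 ;;  # literal path such as src/ or doc/
    esac
}
```

With this rule, '*.c' and src/ are accepted today, while 'src/*.c' is rejected and so stays free to get a per-directory meaning later.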

Thanks,
Jonathan


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Lars Hjemli
On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder jrnie...@gmail.com wrote:

 Lars Hjemli wrote:

 [1] The 'git -a' rewrite patch shows how I think about this command -
 it's just an option to the 'git' command, modifying the way any
 subcommand is invoked (btw: I don't expect that patch to be applied
 since 'git-all' was deemed too generic, so I'll just carry the patch in
 my own tree).

 As one data point, 'git all' also seems too generic to me but 'git -a'
 doesn't.  Intuition can be weird.

 So if I ran the world, then having commands

 git -a diff

 and

 git for-each-repo git diff

 do the same thing would be fine.  Of course I don't run the world. ;-)

This would make me very happy. Junio?

--
larsh


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Tracing backwards: it would be really nice to be able to do

   git for-each-repo git grep -e foo -- '*.c'

This is a very good example that shows the command that is run in
the repositories found may want pathspecs passed, but at the same
time, makes me realize that these repositories have to be fairly
uniform for this command to be useful.  For example, 'src/*.c' or
'inc/*.h' pathspecs wouldn't be useful unless the majority, if not all,
of the projects the loop finds follow that layout convention.  This is
not necessarily limited to pathspecs, of course.  Unless they all have
the 'next' branch, "git for-each-repo checkout next" would not work,
etc. etc.

As to the pathspec limiting to affect the loop itself, not the
argument given to the command that is run, I don't think it is
absolutely needed; I am perfectly fine with declaring that
for-each-repo goes to repositories in all subdirectories without
limit, especially if doing so will make the UI issues we have to
deal with simpler.

As to the "-a" option, which is an option to the git command itself
and not to a subcommand, I have been assuming that it was a joke
patch, but if "git -a grep" turns out to be really useful, "submodule
foreach" that iterates over the submodules may also want to have such
a short and sweet mechanism.  Between "for-each-repo" and "submodule
foreach", I do not yet have a strong opinion on which one deserves it
more.

Come to think of it, is there a reason why for-each-repo should
not be an extension to submodule foreach?  We can view this as
visiting repositories that _could_ be registered as a submodule, in
addition to iterating over the registered submodules, no?

If these two are unified, then we do not have to even worry about
which one deserves git -a more.




Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Junio C Hamano
Lars Hjemli hje...@gmail.com writes:

 On Mon, Jan 28, 2013 at 9:10 AM, Jonathan Nieder jrnie...@gmail.com wrote:
 ...
 So if I ran the world, then having commands

 git -a diff

 and

 git for-each-repo git diff

 do the same thing would be fine.  Of course I don't run the world. ;-)

 This would make me very happy. Junio?

Ahh, our mails crossed (rather, I responded to the other message I
saw before I saw this one).  I am not completely sold on git -a
yet, but another worry I have is which one between submodule
foreach and for-each-repo should use git -a, if we decide that
it is useful to the users to add it.


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Lars Hjemli
On Mon, Jan 28, 2013 at 6:45 PM, Junio C Hamano gits...@pobox.com wrote:
 As to the pathspec limiting to affect the loop itself, not the
 argument given to the command that is run, I don't think it is
 absolutely needed; I am perfectly fine with declaring that
 for-each-repo goes to repositories in all subdirectories without
 limit, especially if doing so will make the UI issues we have to
 deal with simpler.

Good (since the relative path of each repo will be exported to the
child process, that process can perform path limiting when needed).


 As to the option to the command, not to the subcommand, -a option,
 I have been assuming that it was a joke patch, but if git -a grep
 turns out to be really useful, submodule foreach that iterates
 over the submodules may also want to have such a short and sweet
 mechanism.  Between for-each-repo and submodule foreach, I do
 not yet have a strong opinion on which one deserves it more.

 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

Yes, but I see some possible problems with that approach:
-'git for-each-repo' does not need to be started from within a git worktree
-'git for-each-repo' and 'git submodule foreach' have different
semantics for --dirty and --clean
-'git for-each-repo' is in C because my 'git-all' shell script was
horribly slow on large directory trees (especially on windows)

All of these problems are probably solvable, but it would require
quite some reworking of git-submodule.sh

-- 
larsh


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Junio C Hamano
Lars Hjemli hje...@gmail.com writes:

 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

 Yes, but I see some possible problems with that approach:
 -'git for-each-repo' does not need to be started from within a git worktree

True, but git submodule foreach --untracked can be told that it is
OK not (yet) to be in any superproject, no?

 -'git for-each-repo' and 'git submodule foreach' have different
 semantics for --dirty and --clean

That could be a problem.  Is there a good reason why they should use
different definitions of dirtiness?

 -'git for-each-repo' is in C because my 'git-all' shell script was
 horribly slow on large directory trees (especially on windows)

Your for-each-repo could be a good basis to build a new builtin
submodule--foreach that is a pure helper hidden from the end users
that does both; cmd_foreach() in git-submodule.sh can simply delegate
to it.

 All of these problems are probably solvable, but it would require
 quite some reworking of git-submodule.sh

Of course some work is needed, but we do not have to convert all the
cmd_foo in git-submodule.sh in one step.  For the purpose of
unifying for-each-repo and submodule foreach to deliver the
functionality sooner to the end users, we can take the route of adding
only the submodule--foreach builtin, out of which we will get a
reusable implementation of module_list and other helper functions we
can leverage later to do other cmd_foo functions.


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Lars Hjemli
On Mon, Jan 28, 2013 at 7:51 PM, Junio C Hamano gits...@pobox.com wrote:
 Lars Hjemli hje...@gmail.com writes:

 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

 Yes, but I see some possible problems with that approach:
 -'git for-each-repo' does not need to be started from within a git worktree

 True, but git submodule foreach --untracked can be told that it is
 OK not (yet) to be in any superproject, no?

Yes.


 -'git for-each-repo' and 'git submodule foreach' have different
 semantics for --dirty and --clean

 That could be a problem.  Is there a good reason why they should use
 different definitions of dirtiness?

I suspected that 'submodule foreach --dirty' might want to compare the
HEAD sha1 in the submodule against the one recorded in the
superproject (similar to what 'git submodule status' does), but such a
check could be triggered by a different flag (e.g. --behind/--ahead or
something similar).

 -'git for-each-repo' is in C because my 'git-all' shell script was
 horribly slow on large directory trees (especially on windows)

 Your for-each-repo could be a good basis to build a new builtin
 submodule--foreach that is a pure helper hidden from the end users
 that does both; cmd_foreach() in git-submodule.sh can simply delegate
 to it.

Ok, I'll rework my patches in this direction. Thanks.

--
larsh


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Jens Lehmann
Am 28.01.2013 19:51, schrieb Junio C Hamano:
 Lars Hjemli hje...@gmail.com writes:
 
 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

 Yes, but I see some possible problems with that approach:
 -'git for-each-repo' does not need to be started from within a git worktree
 
 True, but git submodule foreach --untracked can be told that it is
 OK not (yet) to be in any superproject, no?

Hmm, I'm not sure how that would work as it looks for gitlinks
in the index which point to work tree paths.

 -'git for-each-repo' and 'git submodule foreach' have different
 semantics for --dirty and --clean

I'm confused, what semantics of --dirty and --clean does current
'git submodule foreach' have? I can't find any sign of it in the
current code ... did I miss something while skimming through this
thread? Or are you talking about status and diff here?

 That could be a problem.  Is there a good reason why they should use
 different definitions of dirtiness?

I don't see any (except of course for comparing a gitlink with the
HEAD of the submodule, which is an additional condition that only
applies to submodules). But I think the current for-each-repo
proposal doesn't allow traversing repos which contain untracked
content (and it would be nice if the user could somehow combine
that with the current --dirty flag to have both in one go).

 -'git for-each-repo' is in C because my 'git-all' shell script was
 horribly slow on large directory trees (especially on windows)
 
 Your for-each-repo could be a good basis to build a new builtin
 submodule--foreach that is a pure helper hidden from the end users
 that does both; cmd_foreach() in git-submodule.sh can simply delegate
 to it.

I like that approach, because the operations are very similar from
the user's point of view. But please remember that internally they
would work differently, as submodule foreach walks the index and
only descends into those submodules that are populated (and contain
a .git directory or file) while for-each-repo scans the whole work
tree, which makes it a more expensive operation.

 All of these problems are probably solvable, but it would require
 quite some reworking of git-submodule.sh
 
 Of course some work is needed, but we do not have to convert all the
 cmd_foo in git-submodule.sh in one step.  For the purpose of
 unifying for-each-repo and submodule foreach to deliver the
 functionality sooner to the end users, we can go the route to add
 only the submodule--foreach builtin, out of which we will get
 reusable implementation of module_list and other helper functions we
 can leverage later to do other cmd_foo functions.

I really like that idea!


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Junio C Hamano
Jens Lehmann jens.lehm...@web.de writes:

 Am 28.01.2013 19:51, schrieb Junio C Hamano:
 Lars Hjemli hje...@gmail.com writes:
 
 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

 Yes, but I see some possible problems with that approach:
 -'git for-each-repo' does not need to be started from within a git worktree
 
 True, but git submodule foreach --untracked can be told that it is
 OK not (yet) to be in any superproject, no?

 Hmm, I'm not sure how that would work as it looks for gitlinks
 in the index which point to work tree paths.

I was imagining that foreach --untracked could go something like this:

 * If you are inside an existing git repository, read its index to
   learn the gitlinks in the directory and its subdirectories.

 * Start from the current directory and recursively apply the
   procedure in this step:

   * Scan the directory and iterate over the ones that have .git in
     them:

     * If it is a gitlinked one, show it, but do not descend into it
       unless --recursive is given (e.g. you start from /home/jens,
       find /home/jens/proj/ directory that has /home/jens/proj/.git
       in it.  /home/jens/.git/index knows that it is a submodule of
       the top-level superproject.  proj is handled, and it is up
       to the --recursive option if its submodules are handled).

     * If it is _not_ a gitlinked one, show it and descend into it
       (e.g. /home/jens/ is not a repository or /home/jens/proj is
       not a tracked submodule) to apply this procedure recursively.

Of course, without --untracked, we have no need to iterate over the
readdir() return values; instead we just scan the index of the
top-level superproject.

 -'git for-each-repo' and 'git submodule foreach' have different
 semantics for --dirty and --clean

 I'm confused, what semantics of --dirty and --clean does current
 'git submodule foreach' have? I can't find any sign of it in the
 current code ... did I miss something while skimming through this
 thread? Or are you talking about status and diff here?

I think Lars is hinting that submodule foreach could restrict its
operation to a similar --dirty/--clean/--both option he has.  Of
course, the command given to foreach can decide to become no-op by
inspecting the submodule itself, so in that sense, --dirty/--clean
can be done without, but I think it would make sense to have it in
submodule foreach even without the --untracked option.

 But I think the current for-each-repo
 proposal doesn't allow traversing repos which contain untracked
 content (and it would be nice if the user could somehow combine
 that with the current --dirty flag to have both in one go).

Perhaps.  I personally felt it was really strange that submodule
diff and status consider that it is a sin to have untracked and
unignored cruft in the submodule working tree, though.


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-28 Thread Jens Lehmann
Am 28.01.2013 21:34, schrieb Junio C Hamano:
 Jens Lehmann jens.lehm...@web.de writes:
 
 Am 28.01.2013 19:51, schrieb Junio C Hamano:
 Lars Hjemli hje...@gmail.com writes:

 Come to think of it, is there a reason why for-each-repo should
 not be an extension to submodule foreach?  We can view this as
 visiting repositories that _could_ be registered as a submodule, in
 addition to iterating over the registered submodules, no?

 Yes, but I see some possible problems with that approach:
 -'git for-each-repo' does not need to be started from within a git worktree

 True, but git submodule foreach --untracked can be told that it is
 OK not (yet) to be in any superproject, no?

 Hmm, I'm not sure how that would work as it looks for gitlinks
 in the index which point to work tree paths.
 
 I was imagining that foreach --untracked could go something like this:
 
  * If you are inside an existing git repository, read its index to
    learn the gitlinks in the directory and its subdirectories.

  * Start from the current directory and recursively apply the
    procedure in this step:

    * Scan the directory and iterate over the ones that have .git in
      them:

      * If it is a gitlinked one, show it, but do not descend into it
        unless --recursive is given (e.g. you start from /home/jens,
        find /home/jens/proj/ directory that has /home/jens/proj/.git
        in it.  /home/jens/.git/index knows that it is a submodule of
        the top-level superproject.  proj is handled, and it is up
        to the --recursive option if its submodules are handled).

      * If it is _not_ a gitlinked one, show it and descend into it
        (e.g. /home/jens/ is not a repository or /home/jens/proj is
        not a tracked submodule) to apply this procedure recursively.
 
 Of course, without --untracked, we have no need to iterate over the
 readdir() return values; instead we just scan the index of the
 top-level superproject.

Thanks for explaining, that makes tons of sense.

 -'git for-each-repo' and 'git submodule foreach' have different
 semantics for --dirty and --clean

 I'm confused, what semantics of --dirty and --clean does current
 'git submodule foreach' have? I can't find any sign of it in the
 current code ... did I miss something while skimming through this
 thread? Or are you talking about status and diff here?
 
 I think Lars is hinting that submodule foreach could restrict its
 operation to a similar --dirty/--clean/--both option he has.  Of
 course, the command given to foreach can decide to become no-op by
 inspecting the submodule itself, so in that sense, --dirty/--clean
 can be done without, but I think it would make sense to have it in
 submodule foreach even without the --untracked option.

Nice idea. E.g. that would help submodule users to easily script
a workflow which descends only into modified submodules to create
branches and push them there. Or to remove branches which were
created everywhere only in those submodules that weren't changed.

 But I think the current for-each-repo
 proposal doesn't allow traversing repos which contain untracked
 content (and it would be nice if the user could somehow combine
 that with the current --dirty flag to have both in one go).
 
 Perhaps.  I personally felt it was really strange that submodule
 diff and status consider that it is a sin to have untracked and
 unignored cruft in the submodule working tree, though.

The VCS we used at work before Git didn't show us any untracked
files, which caused trouble on a regular basis as people were
breaking builds for others because they forgot to check in new
files. That didn't happen with Git anymore, which was very cool.
But the problem reappeared as we started using submodules. Since
I taught status and diff to show that, we're happy again. So for
us it was everything but strange ;-)

But for for-each-repo I would rather propose that modifications of
tracked files can optionally and/or solely be used to pick the
repos. Maybe: --dirty=modified, --dirty=untracked and --dirty=both
with --dirty defaulting to modified?
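
A rough sketch of how such a --dirty=<mode> selection could be decided from "git status --porcelain" output follows. The function names are invented for illustration, and reading "both" as "either kind of change makes the repo dirty" is an assumption about the proposal, not something settled in this thread.

```shell
#!/bin/sh
# classify_status
# Read "git status --porcelain" (v1) lines on stdin and print one of:
# clean, modified, untracked, both.
classify_status() {
    awk '
        /^\?\? / { untracked = 1; next }   # "?? path" is an untracked file
        /^!! /   { next }                  # ignored files never count
        NF       { modified = 1 }          # anything else touches a tracked path
        END {
            if (modified && untracked) print "both"
            else if (modified)         print "modified"
            else if (untracked)        print "untracked"
            else                       print "clean"
        }'
}

# repo_matches MODE
# Succeed if the status read from stdin satisfies --dirty=MODE, where MODE
# is modified, untracked or both.
repo_matches() {
    state=$(classify_status)
    case "$1:$state" in
        modified:modified|modified:both)     return 0 ;;
        untracked:untracked|untracked:both)  return 0 ;;
        both:modified|both:untracked|both:both) return 0 ;;
    esac
    return 1
}
```

A caller could then use something like: git -C "$repo" status --porcelain | repo_matches untracked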


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-27 Thread Junio C Hamano
Lars Hjemli hje...@gmail.com writes:

 When working with multiple, unrelated (or loosely related) git repos,
 there is often a need to locate all repos with uncommitted work and
 perform some action on them (say, commit and push). Before this patch,
 such tasks would require manually visiting all repositories, running
 `git status` within each one and then decide what to do next.

 This mundane task can now be automated by e.g. `git for-each-repo --dirty
 status`, which will find all non-bare git repositories below the current
 directory (even nested ones), check if they are dirty (as defined by
 `git diff --quiet && git diff --cached --quiet`), and for each dirty repo
 print the path to the repo and then execute `git status` within the repo.

 The command also honours the option '--clean' which restricts the set of
 repos to those which '--dirty' would skip, and '-x' which is used to
 execute non-git commands.

It might make sense to internally use RUN_GIT_CMD flag when the
first word of the command line is 'git' as an optimization, but
I am not sure it is a good idea to force the end users to think
about when to use -x and when not to.

In other words, I think

 git for-each-repo -d diff --name-only
 git for-each-repo -d -x ls '*.c'

is less nice than letting the user say

 git for-each-repo -d git diff --name-only
 git for-each-repo -d ls '*.c'

 Finally, the command to execute within each repo is optional. If none is
 given, git-for-each-repo will just print the path to each repo found. And
 since the command supports -z, this can be used for more advanced scripting
 needs.

It amounts to the same thing, but I would rather describe it as:

To allow scripts to handle paths with shell-unsafe characters,
support -z to show paths with NUL termination.  Otherwise,
such paths are shown with the usual c-quoting.
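
To illustrate why NUL termination matters to scripts, here is a minimal sketch of a producer and a consumer. list_repos_z is only a stand-in built on find(1), not the proposed command, and "xargs -0" is assumed to be available (it is common but was standardized only recently).

```shell
#!/bin/sh
# list_repos_z DIR
# Stand-in for the proposed -z output: print every directory below DIR that
# contains a .git entry, each path terminated by NUL.
list_repos_z() {
    find "$1" -name .git -prune -exec sh -c 'printf "%s\0" "${1%/.git}"' sh {} \;
}

# run_for_each DIR COMMAND [ARG...]
# Run COMMAND once per repository, with the repository path appended as the
# last argument.  Paths with spaces or newlines arrive intact.
# Note: with no repositories at all, some xargs versions still run COMMAND once.
run_for_each() {
    dir=$1
    shift
    list_repos_z "$dir" | xargs -0 -n 1 "$@"
}
```

For example, run_for_each . ls -d lists each repository directory even when its name contains whitespace.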

One more thing that nobody brought up during the previous reviews is
whether we want to support a subset of repositories by allowing the
standard pathspec match mechanism.  For example,

git for-each-repo -d git diff --name-only -- foo/ bar/b\*z

might be a way to ask "please find repositories that match the given
pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
are dirty".  We would need to think about how to mark the end of the
command though---we could borrow \; from find(1), even though find
is not the best example of the UI design.  I.e.

git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z

with or without --.
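
The find(1)-style terminator could be handled roughly as follows. This is a sketch of the argument splitting only; split_at_semicolon is a hypothetical name, and printing tagged lines is just a convenient way to show which half each word ends up in.

```shell
#!/bin/sh
# split_at_semicolon ARG...
# Words before the first ";" belong to the command to run; words after it
# are pathspecs that select repositories.  An optional "--" directly after
# the ";" is dropped.  Each word is printed as "cmd<TAB>word" or
# "path<TAB>word".  The body runs in a subshell to keep variables local.
split_at_semicolon() (
    section=cmd
    after_sep=no
    for arg in "$@"; do
        if [ "$section" = cmd ] && [ "$arg" = ";" ]; then
            section=path
            after_sep=yes
            continue
        fi
        if [ "$after_sep" = yes ]; then
            after_sep=no
            if [ "$arg" = "--" ]; then
                continue
            fi
        fi
        printf '%s\t%s\n' "$section" "$arg"
    done
)
```

Because only the first ";" is special, a "--" inside the command itself (as in "git grep -e foo -- '*.c'") stays with the command and reaches it untouched.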

 diff --git a/Documentation/git-for-each-repo.txt 
 b/Documentation/git-for-each-repo.txt
 new file mode 100644
 index 000..fb12b3f
 --- /dev/null
 +++ b/Documentation/git-for-each-repo.txt
 @@ -0,0 +1,71 @@
 +git-for-each-repo(1)
 +
 +
 +NAME
 +
 +git-for-each-repo - Execute a git command in multiple non-bare repositories

There is a separate topic in flight that turns s/git/Git/ when we
refer to the system as a whole.  In any case, this is no longer
limited to executing a Git command.

Find non-bare Git repositories in subdirectories

or

Find or execute a command in non-bare Git repositories in subdirectories


perhaps?

 +SYNOPSIS
 +
 +[verse]
 +'git for-each-repo' [-acdxz] [command]
 +
 +DESCRIPTION
 +---
 +The git-for-each-repo command is used to locate all non-bare git

Should be sufficient to say s/is used to locate/locates/.

 +repositories within the current directory tree, and optionally
 +execute a git command in each of the found repos.

s/a git command/a command/;

 +OPTIONS
 +---
 ...
 +-x::
 + Execute a generic (non-git) command in each repo.

Drop this option.

 +NOTES
 +-
 +
 +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a
 +worktree with uncommitted changes.

Is it a definition that is different from usual?  If so why does it
need to be inconsistent with the rest of the system?

 diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
 new file mode 100644
 index 000..9333ae0
 --- /dev/null
 +++ b/builtin/for-each-repo.c
 @@ -0,0 +1,145 @@
 +/*
 + * git for-each-repo builtin command.
 + *
 + * Copyright (c) 2013 Lars Hjemli hje...@gmail.com
 + */
 +#include "cache.h"
 +#include "color.h"
 +#include "quote.h"
 +#include "builtin.h"
 +#include "run-command.h"
 +#include "parse-options.h"
 +
 +#define ALL 0
 +#define DIRTY 1
 +#define CLEAN 2
 +
 +static char *color = GIT_COLOR_NORMAL;
 +static int eol = '\n';
 +static int match;
 +static int runopt = RUN_GIT_CMD;
 +
 +static const char * const builtin_foreachrepo_usage[] = {
 +	N_("git for-each-repo [-acdxz] [cmd]"),
 +	NULL
 +};
 +
 +static struct option builtin_foreachrepo_options[] = {
 +	OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL),
 +	OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN),
 +	OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY),
 + 

Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-27 Thread John Keeping
On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
 One more thing that nobody brought up during the previous reviews is
 whether we want to support a subset of repositories by allowing the
 standard pathspec match mechanism.  For example,

   git for-each-repo -d git diff --name-only -- foo/ bar/b\*z

 might be a way to ask "please find repositories that match the given
 pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
 are dirty".  We would need to think about how to mark the end of the
 command though---we could borrow \; from find(1), even though find
 is not the best example of the UI design.  I.e.
 
   git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
 
 with or without --.

Would it be better to make this a (multi-valued) option?

git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

It seems a lot simpler than trying to figure out how the command is
going to handle '--' arguments.
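
For comparison, the find(1)-style terminator from the quoted proposal
amounts to locating the first lone ";" word on the command line. A
minimal shell sketch, assuming a hypothetical helper named
command_length that is not part of the patch:

```shell
# Hypothetical helper (not part of the patch): print how many leading
# words belong to the command, i.e. the number of arguments that come
# before the first lone ";" (or all of them if there is no ";").
command_length () {
	n=0
	for arg
	do
		if [ "$arg" = ";" ]
		then
			break
		fi
		n=$((n + 1))
	done
	echo "$n"
}

# The shell turns \; into a literal ";" argument, as it does for find(1).
# This prints 3: "git", "diff" and "--name-only".
command_length git diff --name-only \; -- foo/ 'bar/b*z'
```

Everything after that position (optionally following a leading --)
would then be treated as pathspecs.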

 Oh, that reminds me of another thing.  Perhaps we would want to
 export the (relative) path to the found repository in some way to
 allow the commands to do this kind of thing in the first place?
 submodule foreach does this with $path, I think.

I think $path is the only variable exported by submodule foreach which
is applicable here, but it doesn't work on Windows, where environment
variables are case-insensitive.

Commit 64394e3 (git-submodule.sh: Don't use $path variable in
eval_gettext string) changed submodule foreach to use $sm_path
internally although I notice that the documentation still uses $path.

Perhaps $repo_path in this case?
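
A minimal sketch of exporting the repository path to the child command
under a name that cannot collide with PATH on platforms where
environment variable names are case-insensitive. The helper name
run_in_repo is hypothetical, and repo_path is only the name suggested
above; neither is an existing git interface.

```shell
# Hypothetical helper: run a command inside a repository directory with
# the directory's path exported as $repo_path. The function body is a
# subshell, so neither the variable nor the cd leaks into the caller.
run_in_repo () (
	repo_path=$1
	shift
	export repo_path
	cd "$repo_path" && "$@"
)

# Example: the child process sees the exported path.
run_in_repo . sh -c 'echo "in $repo_path"'
```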


John
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-27 Thread Junio C Hamano
John Keeping j...@keeping.me.uk writes:

 On Sun, Jan 27, 2013 at 11:04:08AM -0800, Junio C Hamano wrote:
 One more thing that nobody brought up during the previous reviews is
 if we want to support subset of repositories by allowing the
 standard pathspec match mechanism.  For example,
 
  git for-each-repo -d git diff --name-only -- foo/ bar/b\*z
 
 might be a way to ask "please find repositories that match the given
 pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
 are dirty".  We would need to think about how to mark the end of the
 command though---we could borrow \; from find(1), even though find
 is not the best example of the UI design.  I.e.
 
  git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z
 
 with or without --.

 Would it be better to make this a (multi-valued) option?

 git for-each-repo -d --filter=foo/ --filter=bar/b\*z git diff --name-only

The standard way to use filtering based on paths we have is to use
the pathspec parameters at the end of the command line.

I see no reason for such an inconsistency with an option like --filter.

 Oh, that reminds me of another thing.  Perhaps we would want to
 export the (relative) path to the found repository in some way to
 allow the commands to do this kind of thing in the first place?
 submodule foreach does this with $path, I think.

 I think $path is the only variable exported by submodule foreach which
 is applicable here, but it doesn't work on Windows, where environment
 variables are case-insensitive.

 Commit 64394e3 (git-submodule.sh: Don't use $path variable in
 eval_gettext string) changed submodule foreach to use $sm_path
 internally although I notice that the documentation still uses $path.

 Perhaps $repo_path in this case?

I do not care too deeply about the name, as long as the names used
by both mechanisms are the same.



Re: [PATCH v4 1/2] for-each-repo: new command used for multi-repo operations

2013-01-27 Thread Lars Hjemli
On Sun, Jan 27, 2013 at 8:04 PM, Junio C Hamano gits...@pobox.com wrote:
 Lars Hjemli hje...@gmail.com writes:

 The command also honours the option '--clean' which restricts the set of
 repos to those which '--dirty' would skip, and '-x' which is used to
 execute non-git commands.

 It might make sense to internally use RUN_GIT_CMD flag when the
 first word of the command line is 'git' as an optimization, but
 I am not sure it is a good idea to force the end users to think
 about when to use -x and when not to.

 In other words, I think

  git for-each-repo -d diff --name-only
  git for-each-repo -d -x ls '*.c'

 is less nice than letting the user say

  git for-each-repo -d git diff --name-only
  git for-each-repo -d ls '*.c'


The 'git-for-each-repo' command was made to allow any git command to
be executed in all discovered repositories, and I've used it that way
for two years (in the form of a shell-script called 'git-all'). During
this time, I've occasionally thought about forking non-git commands
but the itch hasn't been strong enough for me to scratch. The point
I'm trying to make is that to me, this command acts as a modifier for
other git commands[1]. Having the possibility to execute non-git
commands would be nice, but it is not the main objective of this
command.

[1] The 'git -a' rewrite patch shows how I think about this command -
it's just an option to the 'git' command, modifying the way any
subcommand is invoked (btw: I don't expect that patch to be applied
since 'git-all' was deemed too generic, so I'll just carry the patch in
my own tree).

 Finally, the command to execute within each repo is optional. If none is
 given, git-for-each-repo will just print the path to each repo found. And
 since the command supports -z, this can be used for more advanced scripting
 needs.

 It amounts to the same thing, but I would rather describe it as:

 To allow scripts to handle paths with shell-unsafe characters,
 support -z to show paths with NUL termination.  Otherwise,
 such paths are shown with the usual c-quoting.


Much better, thanks.
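
To illustrate why NUL termination matters to scripts, here is a minimal
sketch that does not use the proposed command at all: it relies on
find(1) to locate .git entries and emits the containing directories
NUL-terminated. The helper name list_repos_z is hypothetical.

```shell
# Hypothetical helper: print every directory below $1 that contains a
# ".git" entry, each path terminated by a NUL byte. ".git" itself is
# pruned, so find does not descend into repository metadata.
list_repos_z () {
	find "$1" -name .git -prune -exec sh -c '
		for dotgit
		do
			printf "%s\0" "${dotgit%/.git}"
		done
	' sh {} +
}
```

A consumer such as "list_repos_z . | xargs -0 -n 1 printf '[%s]\n'"
then receives each path as a single argument even if it contains
spaces or newlines. Note that xargs -0 is a widely available extension
rather than something POSIX guarantees.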


 One more thing that nobody brought up during the previous reviews is
 if we want to support subset of repositories by allowing the
 standard pathspec match mechanism.  For example,

 git for-each-repo -d git diff --name-only -- foo/ bar/b\*z

 might be a way to ask "please find repositories that match the given
 pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that
 are dirty".  We would need to think about how to mark the end of the
 command though---we could borrow \; from find(1), even though find
 is not the best example of the UI design.  I.e.

 git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z

 with or without --.

I don't think this would be very nice to end users, and would prefer
--include and --exclude options (the latter is actually already a part
of git-all, added by one of my coworkers).

 +NOTES
 +-
 +
 +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a
 +worktree with uncommitted changes.

 Is it a definition that is different from usual?  If so why does it
 need to be inconsistent with the rest of the system?

I just wanted to clarify what condition --dirty and --clean will
check. In particular, the lack of checking for untracked files (which
could be added as yet another option).
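
One way to express "uncommitted changes, ignoring untracked files" with
existing commands is a pair of git diff --quiet calls, which exit with
status 1 when they find a difference. This is only a sketch of the
condition described above and not necessarily what the patch does; the
helper name is_dirty is hypothetical.

```shell
# Hypothetical helper: succeed (status 0) if the repository at $1 has
# uncommitted changes to tracked content, fail with status 1 if it is
# clean, and fail with status 2 if the check itself could not run
# (e.g. not a repository, or no commit yet). Untracked files are not
# considered.
is_dirty () (
	cd "$1" || exit 2
	# First compare the index against HEAD, then the working tree
	# against the index; the second runs only if the first is clean.
	git diff --quiet --cached HEAD -- &&
	git diff --quiet --
	case $? in
	0) exit 1 ;;	# no differences anywhere: clean
	1) exit 0 ;;	# differences found: dirty
	*) exit 2 ;;	# git reported an error
	esac
)
```

A caller can then write "if is_dirty proj; then ...; fi" to select
only the dirty repositories.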

 +static void print_repo_path(const char *path, unsigned pretty)
 +{
 + if (path[0] == '.' && path[1] == '/')
 + path += 2;
 + if (pretty)
 + color_fprintf_ln(stdout, color, "[%s]", path);

 This is shown before running a command in that repository.  I am of
 two minds.  It certainly is nice to be able to tell which repository
 each block of output lines comes from, and not requiring the command
 to do this themselves is a good default.  However, I wonder if people
 would want to do something like this:

 git for-each-repo sh -c '
 git diff --name-only |
 sed -e "s|^|$path/|"
 '

 to get a consolidated view, in a way similar to how submodule
 foreach can be used.  This unconditional output will get in the way
 for such a use case.

I guess -q/--quiet could be useful.
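
The consolidated view from the quoted example can also be sketched
without sed, which avoids surprises when the path itself contains
characters that are special inside the s||| expression. The helper
name prefix_lines is hypothetical.

```shell
# Hypothetical helper: prepend "$1/" to every line read from stdin.
prefix_lines () {
	while IFS= read -r line
	do
		printf '%s/%s\n' "$1" "$line"
	done
}

# Prints "proj/a.c" and "proj/sub/b.c".
printf 'a.c\nsub/b.c\n' | prefix_lines proj
```

If the per-repository banner were suppressed, the output of
git diff --name-only in each repository could be piped through such a
filter, with the first argument taken from whatever variable ends up
holding the repository path.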

 +static int walk(struct strbuf *path, int argc, const char **argv)
 +{
 + DIR *dir;
 + struct dirent *ent;
 + struct stat st;
 + size_t len;
 + int has_dotgit = 0;
 + struct string_list list = STRING_LIST_INIT_DUP;
 + struct string_list_item *item;
 +
 + dir = opendir(path->buf);
 + if (!dir)
 + return errno;
 + strbuf_addstr(path, "/");
 + len = path->len;
 + while ((ent = readdir(dir))) {
 + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
 + continue;
 + if (!strcmp(ent->d_name, ".git")) {
 + has_dotgit = 1;
 + continue;
 + }
 + switch (DTYPE(ent)) {
 +