Re: Re: [PATCH] submodule recursion in git-archive

2013-12-03 Thread Heiko Voigt
Hi,

On Mon, Dec 02, 2013 at 03:55:36PM -0800, Nick Townsend wrote:
 
 On 29 Nov 2013, at 14:38, Heiko Voigt hvo...@hvoigt.net wrote:
  FYI, I already started to implement this lookup of submodule paths early
  this year[1] but have not found the time to proceed on that yet. I am
  planning to continue on that topic soonish. We need it to implement a
  correct recursive fetch with clone on-demand as a basis for the future
  recursive checkout.
  
  During the work on this I hit too many open questions. That's why I am
  currently working on a complete plan[2] so we can discuss and define how
  this needs to be implemented. It is an asciidoc document which I will
  send out once I am finished with it.
  
  Cheers Heiko
  
  [1] http://article.gmane.org/gmane.comp.version-control.git/217020
  [2] https://github.com/hvoigt/git/wiki/submodule-fetch-config
 
 It seems to me that the question that you are trying to solve is
 more complex than the problem I faced in git-archive, where we have a
 single commit of the top-level repository that we are chasing. 
 Perhaps we should split the work into two pieces:
 
 a. Identifying the complete submodule configuration for a single commit, and
 b. the complexity of behaviour when fetching and cloning recursively (which 
 of course requires a.)

You are right, the latter (b) is a separate topic. So how about I extract the
submodule config parsing part from the mentioned patch and you can then
use that patch as a basis for your work? As far as I understand, you only
need to parse the .gitmodules file for one commit and then look up the
submodule names from paths, right? That would simplify matters, and we can
postpone the caching of multiple commits until I continue on b.
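
To make the single-commit case concrete, here is a minimal sketch in C of the
path-to-name lookup described above. It is not taken from the mentioned patch:
the names copy_trimmed() and submodule_name_for_path() are hypothetical, and
the parser is deliberately simplified (no comments, escapes, continuation
lines, or case-insensitive keys, all of which git's real config parser
handles). It assumes the .gitmodules blob of the commit being archived has
already been read into memory as a NUL-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Copy src[0..len) into dst (capacity cap), trimming surrounding blanks. */
static void copy_trimmed(char *dst, size_t cap, const char *src, size_t len)
{
    if (!cap)
        return;
    while (len && (*src == ' ' || *src == '\t')) {
        src++;
        len--;
    }
    while (len && (src[len - 1] == ' ' || src[len - 1] == '\t' ||
                   src[len - 1] == '\r'))
        len--;
    if (len >= cap)
        len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Hypothetical helper: scan .gitmodules text for the section
 * [submodule "<name>"] whose "path" value equals `path`.
 * Returns 0 and fills `name` on success, -1 if no section matches.
 */
int submodule_name_for_path(const char *text, const char *path,
                            char *name, size_t cap)
{
    char section[256] = "";
    char key[64], value[256];
    const char *line = text;

    while (*line) {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p = line;
        const char *eq;
        size_t n = len;

        /* Skip leading indentation. */
        while (n && (*p == ' ' || *p == '\t')) {
            p++;
            n--;
        }
        if (n > 12 && !strncmp(p, "[submodule \"", 12)) {
            /* Remember the name of the section we are entering. */
            const char *q = memchr(p + 12, '"', n - 12);
            section[0] = '\0';
            if (q)
                copy_trimmed(section, sizeof(section), p + 12,
                             (size_t)(q - (p + 12)));
        } else if (n && *p == '[') {
            section[0] = '\0';  /* some other section */
        } else if (section[0] && (eq = memchr(p, '=', n)) != NULL) {
            copy_trimmed(key, sizeof(key), p, (size_t)(eq - p));
            copy_trimmed(value, sizeof(value), eq + 1,
                         n - (size_t)(eq - p) - 1);
            if (!strcmp(key, "path") && !strcmp(value, path)) {
                copy_trimmed(name, cap, section, strlen(section));
                return 0;
            }
        }
        line = end ? end + 1 : line + len;
    }
    return -1;
}
```

With a blob containing a section [submodule "kernel"] and the line
"path = linux", looking up "linux" yields the name "kernel", while a path that
is not recorded in that commit (say "linux-3.0") reports a miss.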

 I’m very happy to work on the first, but the second seems to me to require
 more understanding than I currently possess. In order to do this it would
 help to have a place to discuss this. I see you have used the wiki of your
 fork of git on GitHub. Is that the right place to solicit input?

I only used that to collect all information into one place. I am not
sure if that's actually necessary for the .gitmodules parsing you need.

I think we should discuss everything related to the design and patches
here on the list. If you have questions regarding my code, I am also
happy to answer them via private mail.

Cheers Heiko
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] submodule recursion in git-archive

2013-12-02 Thread Nick Townsend

On 27 Nov 2013, at 11:43, Junio C Hamano gits...@pobox.com wrote:

 Nick Townsend nick.towns...@mac.com writes:
 
 On 26 Nov 2013, at 14:18, Junio C Hamano gits...@pobox.com wrote:
 
 Even if the code is run inside a repository with a working tree,
 when producing a tarball out of an ancient commit that had a
 submodule not at its current location, --recurse-submodules option
 should do the right thing, so asking for working tree location of
 that submodule to find its repository is wrong, I think.  It may
 happen to find one if the archived revision is close enough to what
 is currently checked out, but that may not necessarily be the case.
 
 At that point when the code discovers an S_ISGITLINK entry, it
 should have both a pathname to the submodule relative to the
 toplevel and the commit object name bound to that submodule
 location.  What it should do, when it does not find the repository
 at the given path (maybe because there is no working tree, or the
 submodule directory has moved over time) is roughly:
 
 - Read from .gitmodules at the top-level from the tree it is
  creating the tarball out of;
 
 - Find submodule.$name.path entry that records that path to the
  submodule; and then
 
 - Using that $name, find the stashed-away location of the submodule
  repository in $GIT_DIR/modules/$name.
 
 or something like that.
 
 This is a related tangent, but when used in a repository that people
 often use as their remote, the repository discovery may have to
 interact with the relative URL.  People often ship .gitmodules with
 
 [submodule "bar"]
 URL = ../bar.git
 path = barDir
 
 for a top-level project foo that can be cloned thusly:
 
 git clone git://site.xz/foo.git
 
 and host bar.git to be clonable with
 
 git clone git://site.xz/bar.git barDir/
 
 inside the working tree of the foo project.  In such a case, when
 archive --recurse-submodules is running, it would find the
 repository for the bar submodule at ../bar.git, I would think.
 
 So this part needs a bit more thought, I am afraid.
 
 I see that there is a lot of potential complexity around setting up a 
 submodule:
 
 No question about it.
 
 * The .gitmodules file can be dirty (easy to flag, but should we
 allow archive to proceed?)
 
 As we are discussing archive, which takes a tree object from the
 top-level project that is recorded in the object database, the
 information _about_ the submodule in question should come from the
 given tree being archived.  There is no reason for the .gitmodules
 file that happens to be sitting in the working tree of the top-level
 project to be involved in the decision, so its dirtiness should not
 matter, I think.  If the tree being archived has a submodule whose
 name is kernel at path linux/ (relative to the top-level
 project), its repository should be at .git/modules/kernel in the
 layout recent git-submodule prepares, and we should find that
 path-and-name mapping from .gitmodules recorded in that tree object
 we are archiving. The version that happens to be checked out to the
 working tree may have moved the submodule to a new path linux-3.0/
 and linux-3.0/.git may have gitdir: .git/modules/kernel in it,
 but when archiving a tree that has the submodule at linux/, it
 would not help---we would not know to look at linux-3.0/.git to
 learn that information anyway because .gitmodules in the working
 tree would say that the submodule at path linux-3.0/ is with name
 kernel, and would not tell us anything about linux/.
 
 * Users can mess with settings both prior to git submodule init
 and before git submodule update.
 
 I think this is irrelevant for exactly the same reason as above.
 
 What makes this trickier, however, is how to deal with an old-style
 repository, where the submodule repositories are embedded in the
 working tree that happens to be checked out.  In that case, we may
 have to read .gitmodules from two places, i.e.
 
 (1) We are archiving a tree with a submodule at linux/;
 
 (2) We read .gitmodules from that tree and learn that the submodule
 has name kernel;
 
 (3) There is no .git/modules/kernel because the repository uses
 the old layout (if the user never was interested in this
 submodule, .git/modules/kernel may also be missing, and we
 should tell these two cases apart by checking .git/config to
 see if a corresponding entry for the kernel submodule exists
 there);
 
 (4) In a repository that uses the old layout, there must be the
 repository somewhere embedded in the current working tree (this
 inability to remove is why we use the new layout these days).
 We can learn where it is by looking at .gitmodules in the
 working tree---map the name kernel we learned earlier, and
 map it to the current path (linux-3.0/ if you have been
 following this example so far).
 
 And in that fallback context, I would say that reading from a dirty
 (or messed with by the user) .gitmodules is the right thing to
 

Fwd: [PATCH] submodule recursion in git-archive

2013-12-02 Thread Nick Townsend


Begin forwarded message:

 From: Nick Townsend nick.towns...@mac.com
 Subject: Re: [PATCH] submodule recursion in git-archive
 Date: 2 December 2013 16:00:50 GMT-8
 To: Junio C Hamano gits...@pobox.com
 Cc: René Scharfe l@web.de, Jens Lehmann jens.lehm...@web.de, 
 git@vger.kernel.org, Jeff King p...@peff.net
 
 
 On 27 Nov 2013, at 11:43, Junio C Hamano gits...@pobox.com wrote:
 
 Nick Townsend nick.towns...@mac.com writes:
 
 On 26 Nov 2013, at 14:18, Junio C Hamano gits...@pobox.com wrote:
 
 Even if the code is run inside a repository with a working tree,
 when producing a tarball out of an ancient commit that had a
 submodule not at its current location, --recurse-submodules option
 should do the right thing, so asking for working tree location of
 that submodule to find its repository is wrong, I think.  It may
 happen to find one if the archived revision is close enough to what
 is currently checked out, but that may not necessarily be the case.
 
 At that point when the code discovers an S_ISGITLINK entry, it
 should have both a pathname to the submodule relative to the
 toplevel and the commit object name bound to that submodule
 location.  What it should do, when it does not find the repository
 at the given path (maybe because there is no working tree, or the
 submodule directory has moved over time) is roughly:
 
 - Read from .gitmodules at the top-level from the tree it is
 creating the tarball out of;
 
 - Find submodule.$name.path entry that records that path to the
 submodule; and then
 
 - Using that $name, find the stashed-away location of the submodule
 repository in $GIT_DIR/modules/$name.
 
 or something like that.
 
 This is a related tangent, but when used in a repository that people
 often use as their remote, the repository discovery may have to
 interact with the relative URL.  People often ship .gitmodules with
 
[submodule "bar"]
URL = ../bar.git
path = barDir
 
 for a top-level project foo that can be cloned thusly:
 
git clone git://site.xz/foo.git
 
 and host bar.git to be clonable with
 
git clone git://site.xz/bar.git barDir/
 
 inside the working tree of the foo project.  In such a case, when
 archive --recurse-submodules is running, it would find the
 repository for the bar submodule at ../bar.git, I would think.
 
 So this part needs a bit more thought, I am afraid.
 
 I see that there is a lot of potential complexity around setting up a 
 submodule:
 
 No question about it.
 
 * The .gitmodules file can be dirty (easy to flag, but should we
 allow archive to proceed?)
 
 As we are discussing archive, which takes a tree object from the
 top-level project that is recorded in the object database, the
 information _about_ the submodule in question should come from the
 given tree being archived.  There is no reason for the .gitmodules
 file that happens to be sitting in the working tree of the top-level
 project to be involved in the decision, so its dirtiness should not
 matter, I think.  If the tree being archived has a submodule whose
 name is kernel at path linux/ (relative to the top-level
 project), its repository should be at .git/modules/kernel in the
 layout recent git-submodule prepares, and we should find that
 path-and-name mapping from .gitmodules recorded in that tree object
 we are archiving. The version that happens to be checked out to the
 working tree may have moved the submodule to a new path linux-3.0/
 and linux-3.0/.git may have gitdir: .git/modules/kernel in it,
 but when archiving a tree that has the submodule at linux/, it
 would not help---we would not know to look at linux-3.0/.git to
 learn that information anyway because .gitmodules in the working
 tree would say that the submodule at path linux-3.0/ is with name
 kernel, and would not tell us anything about linux/.
 
 * Users can mess with settings both prior to git submodule init
 and before git submodule update.
 
 I think this is irrelevant for exactly the same reason as above.
 
 What makes this trickier, however, is how to deal with an old-style
 repository, where the submodule repositories are embedded in the
 working tree that happens to be checked out.  In that case, we may
 have to read .gitmodules from two places, i.e.
 
 (1) We are archiving a tree with a submodule at linux/;
 
 (2) We read .gitmodules from that tree and learn that the submodule
has name kernel;
 
 (3) There is no .git/modules/kernel because the repository uses
the old layout (if the user never was interested in this
submodule, .git/modules/kernel may also be missing, and we
should tell these two cases apart by checking .git/config to
see if a corresponding entry for the kernel submodule exists
there);
 
 (4) In a repository that uses the old layout, there must be the
repository somewhere embedded in the current working tree (this
inability to remove is why we use the new layout these days).
We can learn where it is by looking at .gitmodules

[PATCH] submodule recursion in git-archive

2013-12-02 Thread Nick Townsend

From: Nick Townsend nick.towns...@mac.com
Subject: Re: [PATCH] submodule recursion in git-archive
Date: 2 December 2013 15:55:36 GMT-8
To: Heiko Voigt hvo...@hvoigt.net
Cc: Junio C Hamano gits...@pobox.com, René Scharfe l@web.de, Jens 
Lehmann jens.lehm...@web.de, git@vger.kernel.org, Jeff King p...@peff.net


On 29 Nov 2013, at 14:38, Heiko Voigt hvo...@hvoigt.net wrote:

 On Wed, Nov 27, 2013 at 11:43:44AM -0800, Junio C Hamano wrote:
 Nick Townsend nick.towns...@mac.com writes:
 * The .gitmodules file can be dirty (easy to flag, but should we
 allow archive to proceed?)
 
 As we are discussing archive, which takes a tree object from the
 top-level project that is recorded in the object database, the
 information _about_ the submodule in question should come from the
 given tree being archived.  There is no reason for the .gitmodules
 file that happens to be sitting in the working tree of the top-level
 project to be involved in the decision, so its dirtiness should not
 matter, I think.  If the tree being archived has a submodule whose
 name is kernel at path linux/ (relative to the top-level
 project), its repository should be at .git/modules/kernel in the
 layout recent git-submodule prepares, and we should find that
 path-and-name mapping from .gitmodules recorded in that tree object
 we are archiving. The version that happens to be checked out to the
 working tree may have moved the submodule to a new path linux-3.0/
 and linux-3.0/.git may have gitdir: .git/modules/kernel in it,
 but when archiving a tree that has the submodule at linux/, it
 would not help---we would not know to look at linux-3.0/.git to
 learn that information anyway because .gitmodules in the working
 tree would say that the submodule at path linux-3.0/ is with name
 kernel, and would not tell us anything about linux/.
 
 * Users can mess with settings both prior to git submodule init
 and before git submodule update.
 
 I think this is irrelevant for exactly the same reason as above.
 
 What makes this trickier, however, is how to deal with an old-style
 repository, where the submodule repositories are embedded in the
 working tree that happens to be checked out.  In that case, we may
 have to read .gitmodules from two places, i.e.
 
 (1) We are archiving a tree with a submodule at linux/;
 
 (2) We read .gitmodules from that tree and learn that the submodule
 has name kernel;
 
 (3) There is no .git/modules/kernel because the repository uses
 the old layout (if the user never was interested in this
 submodule, .git/modules/kernel may also be missing, and we
 should tell these two cases apart by checking .git/config to
 see if a corresponding entry for the kernel submodule exists
 there);
 
 (4) In a repository that uses the old layout, there must be the
 repository somewhere embedded in the current working tree (this
 inability to remove is why we use the new layout these days).
 We can learn where it is by looking at .gitmodules in the
 working tree---map the name kernel we learned earlier, and
 map it to the current path (linux-3.0/ if you have been
 following this example so far).
 
 And in that fallback context, I would say that reading from a dirty
 (or messed with by the user) .gitmodules is the right thing to
 do.  Perhaps the user may be in the process of moving the submodule
 in his working tree with
 
$ mv linux-3.0 linux-3.2
$ git config -f .gitmodules submodule.kernel.path linux-3.2
 
 but hasn't committed the change yet.
 
 For those reasons I deliberately decided not to reproduce the
 above logic all by myself.
 
 As I already hinted, I agree that the question of how to find the
 location of the submodule repository, given a particular tree in the
 top-level project the submodule belongs to and the path to the submodule
 in question, deserves a separate thread to discuss with area experts.
 
 FYI, I already started to implement this lookup of submodule paths early
 this year[1] but have not found the time to proceed on that yet. I am
 planning to continue on that topic soonish. We need it to implement a
 correct recursive fetch with clone on-demand as a basis for the future
 recursive checkout.
 
 During the work on this I hit too many open questions. That's why I am
 currently working on a complete plan[2] so we can discuss and define how
 this needs to be implemented. It is an asciidoc document which I will
 send out once I am finished with it.
 
 Cheers Heiko
 
 [1] http://article.gmane.org/gmane.comp.version-control.git/217020
 [2] https://github.com/hvoigt/git/wiki/submodule-fetch-config

Heiko
It seems to me that the question that you are trying to solve is
more complex than the problem I faced in git-archive, where we have a
single commit of the top-level repository that we are chasing. 
Perhaps we should split the work into two pieces:

a. Identifying the complete submodule configuration for a single commit, and
b. the complexity of behaviour when fetching

Re: Re: [PATCH] submodule recursion in git-archive

2013-11-29 Thread Heiko Voigt
On Wed, Nov 27, 2013 at 11:43:44AM -0800, Junio C Hamano wrote:
 Nick Townsend nick.towns...@mac.com writes:
  * The .gitmodules file can be dirty (easy to flag, but should we
  allow archive to proceed?)
 
 As we are discussing archive, which takes a tree object from the
 top-level project that is recorded in the object database, the
 information _about_ the submodule in question should come from the
 given tree being archived.  There is no reason for the .gitmodules
 file that happens to be sitting in the working tree of the top-level
 project to be involved in the decision, so its dirtiness should not
 matter, I think.  If the tree being archived has a submodule whose
 name is kernel at path linux/ (relative to the top-level
 project), its repository should be at .git/modules/kernel in the
 layout recent git-submodule prepares, and we should find that
 path-and-name mapping from .gitmodules recorded in that tree object
 we are archiving. The version that happens to be checked out to the
 working tree may have moved the submodule to a new path linux-3.0/
 and linux-3.0/.git may have gitdir: .git/modules/kernel in it,
 but when archiving a tree that has the submodule at linux/, it
 would not help---we would not know to look at linux-3.0/.git to
 learn that information anyway because .gitmodules in the working
 tree would say that the submodule at path linux-3.0/ is with name
 kernel, and would not tell us anything about linux/.
 
  * Users can mess with settings both prior to git submodule init
  and before git submodule update.
 
 I think this is irrelevant for exactly the same reason as above.
 
 What makes this trickier, however, is how to deal with an old-style
 repository, where the submodule repositories are embedded in the
 working tree that happens to be checked out.  In that case, we may
 have to read .gitmodules from two places, i.e.
 
  (1) We are archiving a tree with a submodule at linux/;
 
  (2) We read .gitmodules from that tree and learn that the submodule
  has name kernel;
 
  (3) There is no .git/modules/kernel because the repository uses
  the old layout (if the user never was interested in this
  submodule, .git/modules/kernel may also be missing, and we
  should tell these two cases apart by checking .git/config to
  see if a corresponding entry for the kernel submodule exists
  there);
 
  (4) In a repository that uses the old layout, there must be the
  repository somewhere embedded in the current working tree (this
  inability to remove is why we use the new layout these days).
  We can learn where it is by looking at .gitmodules in the
  working tree---map the name kernel we learned earlier, and
  map it to the current path (linux-3.0/ if you have been
  following this example so far).
 
 And in that fallback context, I would say that reading from a dirty
 (or messed with by the user) .gitmodules is the right thing to
 do.  Perhaps the user may be in the process of moving the submodule
 in his working tree with
 
 $ mv linux-3.0 linux-3.2
 $ git config -f .gitmodules submodule.kernel.path linux-3.2
 
 but hasn't committed the change yet.
 
  For those reasons I deliberately decided not to reproduce the
  above logic all by myself.
 
 As I already hinted, I agree that the question of how to find the
 location of the submodule repository, given a particular tree in the
 top-level project the submodule belongs to and the path to the submodule
 in question, deserves a separate thread to discuss with area experts.

FYI, I already started to implement this lookup of submodule paths early
this year[1] but have not found the time to proceed on that yet. I am
planning to continue on that topic soonish. We need it to implement a
correct recursive fetch with clone on-demand as a basis for the future
recursive checkout.

During the work on this I hit too many open questions. That's why I am
currently working on a complete plan[2] so we can discuss and define how
this needs to be implemented. It is an asciidoc document which I will
send out once I am finished with it.

Cheers Heiko

[1] http://article.gmane.org/gmane.comp.version-control.git/217020
[2] https://github.com/hvoigt/git/wiki/submodule-fetch-config


Re: [PATCH] submodule recursion in git-archive

2013-11-27 Thread Junio C Hamano
René Scharfe l@web.de writes:

 OK, but the repetition of cover letter and e-mail messages
 irritates me slightly for some reason.  What about the following?

Looks good to me; will queue, thanks.

 -- >8 --
 Subject: [PATCH] SubmittingPatches: document how to handle multiple patches

 Signed-off-by: Rene Scharfe l@web.de
 ---
  Documentation/SubmittingPatches |   11 +--
  1 files changed, 9 insertions(+), 2 deletions(-)

 diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
 index 7055576..e6d46ed 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -139,8 +139,15 @@ People on the Git mailing list need to be able to read 
 and
  comment on the changes you are submitting.  It is important for
  a developer to be able to quote your changes, using standard
  e-mail tools, so that they may comment on specific portions of
 -your code.  For this reason, all patches should be submitted
 -inline.  If your log message (including your name on the
 +your code.  For this reason, each patch should be submitted
 +inline in a separate message.
 +
 +Multiple related patches should be grouped into their own e-mail
 +thread to help readers find all parts of the series.  To that end,
 +send them as replies to either an additional cover letter message
 +(see below), the first patch, or the respective preceding patch.
 +
 +If your log message (including your name on the
  Signed-off-by line) is not writable in ASCII, make sure that
  you send off a message in the correct encoding.


Re: [PATCH] submodule recursion in git-archive

2013-11-27 Thread Junio C Hamano
Nick Townsend nick.towns...@mac.com writes:

 On 26 Nov 2013, at 14:18, Junio C Hamano gits...@pobox.com wrote:

 Even if the code is run inside a repository with a working tree,
 when producing a tarball out of an ancient commit that had a
 submodule not at its current location, --recurse-submodules option
 should do the right thing, so asking for working tree location of
 that submodule to find its repository is wrong, I think.  It may
 happen to find one if the archived revision is close enough to what
 is currently checked out, but that may not necessarily be the case.
 
 At that point when the code discovers an S_ISGITLINK entry, it
 should have both a pathname to the submodule relative to the
 toplevel and the commit object name bound to that submodule
 location.  What it should do, when it does not find the repository
 at the given path (maybe because there is no working tree, or the
 submodule directory has moved over time) is roughly:
 
 - Read from .gitmodules at the top-level from the tree it is
   creating the tarball out of;
 
 - Find submodule.$name.path entry that records that path to the
   submodule; and then
 
 - Using that $name, find the stashed-away location of the submodule
   repository in $GIT_DIR/modules/$name.
 
 or something like that.
 
 This is a related tangent, but when used in a repository that people
 often use as their remote, the repository discovery may have to
 interact with the relative URL.  People often ship .gitmodules with
 
  [submodule "bar"]
  URL = ../bar.git
  path = barDir
 
 for a top-level project foo that can be cloned thusly:
 
  git clone git://site.xz/foo.git
 
 and host bar.git to be clonable with
 
  git clone git://site.xz/bar.git barDir/
 
 inside the working tree of the foo project.  In such a case, when
 archive --recurse-submodules is running, it would find the
 repository for the bar submodule at ../bar.git, I would think.
 
 So this part needs a bit more thought, I am afraid.
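
As an illustration of the relative-URL case quoted above, the following C
sketch resolves a submodule URL such as ../bar.git against the URL of the
top-level project. It is only a simplified model under stated assumptions:
resolve_relative_url() is a hypothetical name, and only leading "../" and "./"
components are handled; it does not claim to mirror what git-submodule does
internally.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve `rel` (e.g. "../bar.git") against the
 * superproject URL `base` (e.g. "git://site.xz/foo.git").
 * Returns 0 on success, -1 if `base` has no component left to strip
 * or the result does not fit into `out`.
 */
int resolve_relative_url(const char *base, const char *rel,
                         char *out, size_t cap)
{
    size_t len = strlen(base);
    int n;

    /* Ignore trailing slashes on the base URL. */
    while (len && base[len - 1] == '/')
        len--;

    for (;;) {
        if (!strncmp(rel, "../", 3)) {
            /* Drop the last component of the base, then its slash. */
            while (len && base[len - 1] != '/')
                len--;
            if (!len)
                return -1;
            len--;
            rel += 3;
        } else if (!strncmp(rel, "./", 2)) {
            rel += 2;
        } else {
            break;
        }
    }

    n = snprintf(out, cap, "%.*s/%s", (int)len, base, rel);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

Resolving ../bar.git against git://site.xz/foo.git gives
git://site.xz/bar.git, which matches the clone URL in the example. The open
question in the thread, where archive should look for that repository on the
serving side, is not answered by this sketch.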

 I see that there is a lot of potential complexity around setting up a 
 submodule:

No question about it.

 * The .gitmodules file can be dirty (easy to flag, but should we
 allow archive to proceed?)

As we are discussing archive, which takes a tree object from the
top-level project that is recorded in the object database, the
information _about_ the submodule in question should come from the
given tree being archived.  There is no reason for the .gitmodules
file that happens to be sitting in the working tree of the top-level
project to be involved in the decision, so its dirtiness should not
matter, I think.  If the tree being archived has a submodule whose
name is kernel at path linux/ (relative to the top-level
project), its repository should be at .git/modules/kernel in the
layout recent git-submodule prepares, and we should find that
path-and-name mapping from .gitmodules recorded in that tree object
we are archiving. The version that happens to be checked out to the
working tree may have moved the submodule to a new path linux-3.0/
and linux-3.0/.git may have gitdir: .git/modules/kernel in it,
but when archiving a tree that has the submodule at linux/, it
would not help---we would not know to look at linux-3.0/.git to
learn that information anyway because .gitmodules in the working
tree would say that the submodule at path linux-3.0/ is with name
kernel, and would not tell us anything about linux/.

 * Users can mess with settings both prior to git submodule init
 and before git submodule update.

I think this is irrelevant for exactly the same reason as above.

What makes this trickier, however, is how to deal with an old-style
repository, where the submodule repositories are embedded in the
working tree that happens to be checked out.  In that case, we may
have to read .gitmodules from two places, i.e.

 (1) We are archiving a tree with a submodule at linux/;

 (2) We read .gitmodules from that tree and learn that the submodule
 has name kernel;

 (3) There is no .git/modules/kernel because the repository uses
 the old layout (if the user never was interested in this
 submodule, .git/modules/kernel may also be missing, and we
 should tell these two cases apart by checking .git/config to
 see if a corresponding entry for the kernel submodule exists
 there);

 (4) In a repository that uses the old layout, there must be the
 repository somewhere embedded in the current working tree (this
 inability to remove is why we use the new layout these days).
 We can learn where it is by looking at .gitmodules in the
 working tree---map the name kernel we learned earlier, and
 map it to the current path (linux-3.0/ if you have been
 following this example so far).
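
Steps (1) to (4) can be sketched as a small decision function in C. This is
an assumption-laden illustration, not proposed patch code: the enum, the
function names and the two boolean inputs (whether $GIT_DIR/modules/<name>
exists, and whether .git/config has an entry for the submodule) are all
hypothetical, and the caller is assumed to have done the file system and
config checks already.

```c
#include <stdio.h>

enum submodule_repo_kind {
    SUBMODULE_REPO_NEW_LAYOUT,    /* $GIT_DIR/modules/<name> exists */
    SUBMODULE_REPO_OLD_LAYOUT,    /* embedded in the working tree */
    SUBMODULE_REPO_UNINTERESTING  /* user never initialised it */
};

/* Step (3): tell the old layout apart from "never interested". */
enum submodule_repo_kind classify_submodule_repo(int has_modules_dir,
                                                 int has_config_entry)
{
    if (has_modules_dir)
        return SUBMODULE_REPO_NEW_LAYOUT;
    if (has_config_entry)
        return SUBMODULE_REPO_OLD_LAYOUT;
    return SUBMODULE_REPO_UNINTERESTING;
}

/*
 * Write the expected repository location into buf.
 * `name` comes from .gitmodules in the tree being archived;
 * `worktree_path` is the current path for that name taken from
 * .gitmodules in the working tree (only used for the old layout).
 * Returns 0 on success, -1 if there is no repository or buf is too small.
 */
int submodule_repo_location(char *buf, size_t cap,
                            enum submodule_repo_kind kind,
                            const char *git_dir, const char *name,
                            const char *worktree_path)
{
    int n;

    switch (kind) {
    case SUBMODULE_REPO_NEW_LAYOUT:
        n = snprintf(buf, cap, "%s/modules/%s", git_dir, name);
        break;
    case SUBMODULE_REPO_OLD_LAYOUT:
        n = snprintf(buf, cap, "%s/.git", worktree_path);
        break;
    default:
        return -1;
    }
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

For the running example, the new layout yields .git/modules/kernel, while the
old layout falls back to linux-3.0/.git, the path the working tree's
.gitmodules currently records for the name kernel.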

And in that fallback context, I would say that reading from a dirty
(or messed with by the user) .gitmodules is the right thing to
do.  Perhaps the user may be in the process of moving the submodule
in his working tree with

$ mv 

Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread René Scharfe
Am 26.11.2013 01:04, schrieb Nick Townsend:
 My first git patch - so shout out if I’ve got the etiquette wrong! Or
 of course if I’ve missed something.

Thanks for the patches!  Please send only one per message (the second
one as a reply to the first one, or both as replies to a cover letter),
though -- that makes commenting on them much easier.

Side note: Documentation/SubmittingPatches doesn't mention that (yet),
AFAICS.

 Subject: [PATCH 1/2] submodule: add_submodule_odb() usability
 
 Although add_submodule_odb() is documented as being
 externally usable, it is declared static and also
 has incorrect documentation.
 
 This commit fixes those and makes no changes to
 existing code using them. All tests still pass.

Sign-off missing (see Documentation/SubmittingPatches).

 ---
  Documentation/technical/api-ref-iteration.txt | 4 ++--
  submodule.c   | 2 +-
  submodule.h   | 1 +
  3 files changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/Documentation/technical/api-ref-iteration.txt 
 b/Documentation/technical/api-ref-iteration.txt
 index aa1c50f..cbee624 100644
 --- a/Documentation/technical/api-ref-iteration.txt
 +++ b/Documentation/technical/api-ref-iteration.txt
 @@ -50,10 +50,10 @@ submodules object database. You can do this by a 
 code-snippet like
  this:
  
   const char *path = "path/to/submodule"
 - if (!add_submodule_odb(path))
 + if (add_submodule_odb(path))
   die("Error submodule '%s' not populated.", path);
  
 -`add_submodule_odb()` will return an non-zero value on success. If you
 +`add_submodule_odb()` will return a zero value on success. If you

return zero on success instead?

  do not do this you will get an error for each ref that it does not point
  to a valid object.
  
 diff --git a/submodule.c b/submodule.c
 index 1905d75..1ea46be 100644
 --- a/submodule.c
 +++ b/submodule.c
 @@ -143,7 +143,7 @@ void stage_updated_gitmodules(void)
   die(_("staging updated .gitmodules failed"));
  }
  
 -static int add_submodule_odb(const char *path)
 +int add_submodule_odb(const char *path)
  {
   struct strbuf objects_directory = STRBUF_INIT;
   struct alternate_object_database *alt_odb;
 diff --git a/submodule.h b/submodule.h
 index 7beec48..3e3cdca 100644
 --- a/submodule.h
 +++ b/submodule.h
 @@ -41,5 +41,6 @@ int find_unpushed_submodules(unsigned char new_sha1[20], 
 const char *remotes_nam
   struct string_list *needs_pushing);
  int push_unpushed_submodules(unsigned char new_sha1[20], const char 
 *remotes_name);
  void connect_work_tree_and_git_dir(const char *work_tree, const char 
 *git_dir);
 +int add_submodule_odb(const char *path);
  
  #endif

 Subject: [PATCH 2/2] archive: allow submodule recursion on git-archive
 
 When using git-archive to produce a dump of a
 repository, the existing code does not recurse
 into a submodule when it encounters it in the tree
 traversal. These changes add a command line flag
 that permits this.
 
 Note that the submodules must be updated in the
 repository, otherwise this cannot take place.
 
 The feature is disabled for remote repositories as
 the git_work_tree fails. This is a possible future
 enhancement.

Hmm, curious.  Why does it fail?  I guess that happens with bare
repositories, only, right?  (Which are the most likely kind of remote
repos to encounter, of course.)

 Two additional fields are added to archiver_args:
   * recurse  - a boolean indicator
   * treepath - the path part of the tree-ish
eg. the 'www' in HEAD:www
 
 The latter is used within the archive writer to
 determine the correct path for the submodule .git
 file.
 
 Signed-off-by: Nick Townsend nick.towns...@mac.com
 ---
  Documentation/git-archive.txt |  9 +
  archive.c | 38 --
  archive.h |  2 ++
  3 files changed, 47 insertions(+), 2 deletions(-)
 
 diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
 index b97aaab..b4df735 100644
 --- a/Documentation/git-archive.txt
 +++ b/Documentation/git-archive.txt
 @@ -11,6 +11,7 @@ SYNOPSIS
  [verse]
  'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
 [-o <file> | --output=<file>] [--worktree-attributes]
 +   [--recursive|--recurse-submodules]

I'd expect git archive --recurse to add subdirectories and their
contents, which it does right now, and --no-recurse to only archive the
specified objects, which is not implemented.  IAW: I wouldn't normally
associate an option with that name with submodules.  Would
--recurse-submodules alone suffice?

Side note: With only one of the options defined you could shorten them
on the command line to e.g. --rec; with both you'd need to type at least
--recursi or --recurse to disambiguate -- even though they ultimately do
the same.
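
To make the abbreviation point concrete, here is a minimal sketch of unique-prefix matching. It is not git's parse-options code; the table and the helper name `count_prefix_matches` are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option table: the two spellings proposed in the patch. */
static const char *const long_opts[] = {
	"recursive",
	"recurse-submodules",
};

/*
 * Count how many option names start with the abbreviation `abbrev`.
 * An abbreviation is only usable if exactly one name matches.
 */
static int count_prefix_matches(const char *abbrev)
{
	size_t i, len = strlen(abbrev);
	int matches = 0;

	for (i = 0; i < sizeof(long_opts) / sizeof(long_opts[0]); i++)
		if (!strncmp(long_opts[i], abbrev, len))
			matches++;
	return matches;
}
```

With both names defined, `rec` and `recurs` each match two entries and are ambiguous, while `recursi` and `recurse` each match exactly one.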

 [--remote=<repo> [--exec=<git-upload-archive>]] <tree-ish>
 [<path>...]
  
 @@ 

Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Jens Lehmann
Am 26.11.2013 16:17, schrieb René Scharfe:
 Am 26.11.2013 01:04, schrieb Nick Townsend:
 diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
 index b97aaab..b4df735 100644
 --- a/Documentation/git-archive.txt
 +++ b/Documentation/git-archive.txt
 @@ -11,6 +11,7 @@ SYNOPSIS
  [verse]
  'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
[-o <file> | --output=<file>] [--worktree-attributes]
 +  [--recursive|--recurse-submodules]
 
 I'd expect git archive --recurse to add subdirectories and their
 contents, which it does right now, and --no-recurse to only archive the
 specified objects, which is not implemented.  IAW: I wouldn't normally
 associate an option with that name with submodules.  Would
 --recurse-submodules alone suffice?

It should. All new code recursing into submodules should not use
--recursive but always --recurse-submodules, as --recursive means
different things for different commands (the only exception being
git submodule, as --recursive is obvious here, and git clone
for backward compatibility reasons).

But I really like what these patches are aiming at.


Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Junio C Hamano
René Scharfe l@web.de writes:

 Thanks for the patches!  Please send only one per message (the second
 one as a reply to the first one, or both as replies to a cover letter),
 though -- that makes commenting on them much easier.

 Side note: Documentation/SubmittingPatches doesn't mention that (yet),
 AFAICS.

OK, how about doing this then?

 Documentation/SubmittingPatches | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 7055576..304b3c0 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -140,7 +140,12 @@ comment on the changes you are submitting.  It is important for
 a developer to be able to quote your changes, using standard
 e-mail tools, so that they may comment on specific portions of
 your code.  For this reason, all patches should be submitted
-inline.  If your log message (including your name on the
+inline.  A patch series that consists of N commits is sent as N
+separate e-mail messages, or a cover letter message (see below) with
+N separate e-mail messages, each being a response to the cover
+letter.
+
+If your log message (including your name on the
 Signed-off-by line) is not writable in ASCII, make sure that
 you send off a message in the correct encoding.
 

 The feature is disabled for remote repositories as
 the git_work_tree fails. This is a possible future
 enhancement.

 Hmm, curious.  Why does it fail?  I guess that happens with bare
 repositories, only, right?  (Which are the most likely kind of remote
 repos to encounter, of course.)

Yeah, I cannot think of a reason why it should fail in a bare
repository, either. git archive is about writing out the contents
of an already recorded tree, so there shouldn't be a reason to even
call get_git_work_tree() in the first place.

Even if the code is run inside a repository with a working tree,
when producing a tarball out of an ancient commit that had a
submodule not at its current location, --recurse-submodules option
should do the right thing, so asking for working tree location of
that submodule to find its repository is wrong, I think.  It may
happen to find one if the archived revision is close enough to what
is currently checked out, but that may not necessarily be the case.

At that point when the code discovers an S_ISGITLINK entry, it
should have both a pathname to the submodule relative to the
toplevel and the commit object name bound to that submodule
location.  What it should do, when it does not find the repository
at the given path (maybe because there is no working tree, or the
submodule directory has moved over time) is roughly:

 - Read from .gitmodules at the top-level from the tree it is
   creating the tarball out of;

 - Find submodule.$name.path entry that records that path to the
   submodule; and then

 - Using that $name, find the stashed-away location of the submodule
   repository in $GIT_DIR/modules/$name.

or something like that.
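
The three steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the .gitmodules content is already in memory as text, the file uses the plain `key = value` layout, and keys are matched case-sensitively. The helper `submodule_gitdir_for_path` is hypothetical and is not git's config parser.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical lookup: scan .gitmodules text for the
 * [submodule "<name>"] section whose "path" value equals `path`, and
 * write "<git_dir>/modules/<name>" into `out`.
 * Returns 0 on success, -1 if no section records that path.
 */
static int submodule_gitdir_for_path(const char *gitmodules,
				     const char *git_dir,
				     const char *path,
				     char *out, size_t outlen)
{
	char name[256] = "";
	const char *line = gitmodules;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);
		char buf[512], key[64], value[256];

		if (len < sizeof(buf)) {
			const char *p;

			memcpy(buf, line, len);
			buf[len] = '\0';
			p = buf + strspn(buf, " \t");
			if (*p == '[') {
				/* New section: remember its name, if it is a submodule. */
				if (sscanf(p, "[submodule \"%255[^\"]\"]", name) != 1)
					name[0] = '\0';
			} else if (name[0] &&
				   sscanf(p, "%63[^ \t=] = %255[^\n]", key, value) == 2 &&
				   !strcmp(key, "path") && !strcmp(value, path)) {
				snprintf(out, outlen, "%s/modules/%s", git_dir, name);
				return 0;
			}
		}
		line += len + (eol ? 1 : 0);
	}
	return -1;
}
```

A real implementation would read .gitmodules from the tree being archived rather than from the work tree, which is the point of the first step.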

This is a related tangent, but when used in a repository that people
often use as their remote, the repository discovery may have to
interact with the relative URL.  People often ship .gitmodules with

    [submodule "bar"]
        URL = ../bar.git
        path = barDir

for a top-level project foo that can be cloned thusly:

git clone git://site.xz/foo.git

and host bar.git to be clonable with

git clone git://site.xz/bar.git barDir/

inside the working tree of the foo project.  In such a case, when
archive --recurse-submodules is running, it would find the
repository for the bar submodule at ../bar.git, I would think.
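
A minimal sketch of that relative-URL case, assuming a simplified rule where each leading `../` strips one trailing component from the superproject's location. The helper `resolve_relative_url` is hypothetical, is not git's implementation, and does not guard against eating into the URL scheme.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: resolve a submodule URL such as "../bar.git"
 * against the location of the superproject.
 * Returns 0 on success, -1 if `base` runs out of components or the
 * result does not fit into `out`.
 */
static int resolve_relative_url(const char *base, const char *rel,
				char *out, size_t outlen)
{
	char buf[1024];
	size_t len = strlen(base);
	int n;

	if (len >= sizeof(buf))
		return -1;
	memcpy(buf, base, len + 1);

	/* Ignore trailing slashes on the superproject location. */
	while (len && buf[len - 1] == '/')
		buf[--len] = '\0';

	/* Each leading "../" removes the last component of the base. */
	while (!strncmp(rel, "../", 3)) {
		char *slash = strrchr(buf, '/');
		if (!slash)
			return -1;
		*slash = '\0';
		rel += 3;
	}

	n = snprintf(out, outlen, "%s/%s", buf, rel);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under this rule, `git://site.xz/foo.git` plus `../bar.git` gives `git://site.xz/bar.git`, and the same applies to a filesystem path on the hosting side.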

So this part needs a bit more thought, I am afraid.

  'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
[-o <file> | --output=<file>] [--worktree-attributes]
 +  [--recursive|--recurse-submodules]

 I'd expect git archive --recurse to add subdirectories and their
 contents, which it does right now, and --no-recurse to only archive the
 specified objects, which is not implemented.  IAW: I wouldn't normally
 associate an option with that name with submodules.  Would
 --recurse-submodules alone suffice?

Jens already commented on this, and I agree that --recursive should
be dropped from this patch.


Re: Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Heiko Voigt
Hi,

I like where this is going.

On Tue, Nov 26, 2013 at 04:17:43PM +0100, René Scharfe wrote:
 Am 26.11.2013 01:04, schrieb Nick Townsend:
  +   strbuf_addstr(&dotgit, work_tree);
  +   strbuf_addch(&dotgit, '/');
  +   if (args->treepath) {
  +           strbuf_addstr(&dotgit, args->treepath);
  +           strbuf_addch(&dotgit, '/');
  +   }
  +   strbuf_add(&dotgit, path_without_prefix, strlen(path_without_prefix)-1);
  +   if (add_submodule_odb(dotgit.buf))
  +           die("Can't add submodule: %s", dotgit.buf);
 
 Hmm, I wonder if we can traverse the tree and load all submodule object
 databases before traversing it again to actually write file contents.
 That would spare the user from getting half of an archive together with
 that error message.

I am not sure whether we should die here. What about submodules that
have not been initialized and/or cloned? I think that is quite a regular
use case, for example for libraries that not everyone needs or big media
submodules which only the design team uses. How about skipping them (maybe
issuing a warning) by returning 0 here and proceeding?
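
Roughly what I have in mind, as a self-contained sketch. The stub below only stands in for add_submodule_odb() (zero on success), and `should_recurse_into_submodule` is a made-up name, not something from the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for git's add_submodule_odb(), which returns zero on
 * success.  This stub pretends only "lib/populated" has been cloned.
 */
static int add_submodule_odb_stub(const char *path)
{
	return strcmp(path, "lib/populated") ? -1 : 0;
}

/*
 * Hypothetical per-entry check: returns 1 if the archiver should
 * recurse into the submodule, 0 if it should warn, skip the recursion
 * and keep archiving the plain directory entry as it does today.
 */
static int should_recurse_into_submodule(const char *path)
{
	if (add_submodule_odb_stub(path)) {
		fprintf(stderr,
			"warning: skipping unpopulated submodule '%s'\n", path);
		return 0;
	}
	return 1;
}
```

The archive then still completes when an optional submodule is missing, and the warning tells the user which paths were left out.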

Cheers Heiko


Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread René Scharfe
Am 26.11.2013 23:18, schrieb Junio C Hamano:
 René Scharfe l@web.de writes:
 
 Thanks for the patches!  Please send only one per message (the second
 one as a reply to the first one, or both as replies to a cover letter),
 though -- that makes commenting on them much easier.

 Side note: Documentation/SubmittingPatches doesn't mention that (yet),
 AFAICS.
 
 OK, how about doing this then?
 
  Documentation/SubmittingPatches | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)
 
 diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
 index 7055576..304b3c0 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -140,7 +140,12 @@ comment on the changes you are submitting.  It is 
 important for
  a developer to be able to quote your changes, using standard
  e-mail tools, so that they may comment on specific portions of
  your code.  For this reason, all patches should be submitted
 -inline.  If your log message (including your name on the
 +inline.  A patch series that consists of N commits is sent as N
 +separate e-mail messages, or a cover letter message (see below) with
 +N separate e-mail messages, each being a response to the cover
 +letter.
 +
 +If your log message (including your name on the
  Signed-off-by line) is not writable in ASCII, make sure that
  you send off a message in the correct encoding.

OK, but the repetition of cover letter and e-mail messages
irritates me slightly for some reason.  What about the following?

-- >8 --
Subject: [PATCH] SubmittingPatches: document how to handle multiple patches

Signed-off-by: Rene Scharfe l@web.de
---
 Documentation/SubmittingPatches |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 7055576..e6d46ed 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -139,8 +139,15 @@ People on the Git mailing list need to be able to read and
 comment on the changes you are submitting.  It is important for
 a developer to be able to quote your changes, using standard
 e-mail tools, so that they may comment on specific portions of
-your code.  For this reason, all patches should be submitted
-inline.  If your log message (including your name on the
+your code.  For this reason, each patch should be submitted
+inline in a separate message.
+
+Multiple related patches should be grouped into their own e-mail
+thread to help readers find all parts of the series.  To that end,
+send them as replies to either an additional cover letter message
+(see below), the first patch, or the respective preceding patch.
+
+If your log message (including your name on the
 Signed-off-by line) is not writable in ASCII, make sure that
 you send off a message in the correct encoding.
 
-- 
1.7.8




Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Nick Townsend

On 26 Nov 2013, at 16:28, René Scharfe l@web.de wrote:

 Am 26.11.2013 23:18, schrieb Junio C Hamano:
 René Scharfe l@web.de writes:
 
 Thanks for the patches!  Please send only one per message (the second
 one as a reply to the first one, or both as replies to a cover letter),
 though -- that makes commenting on them much easier.
 
 Side note: Documentation/SubmittingPatches doesn't mention that (yet),
 AFAICS.
 
 OK, how about doing this then?
 
 Documentation/SubmittingPatches | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
 
 diff --git a/Documentation/SubmittingPatches 
 b/Documentation/SubmittingPatches
 index 7055576..304b3c0 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -140,7 +140,12 @@ comment on the changes you are submitting.  It is 
 important for
 a developer to be able to quote your changes, using standard
 e-mail tools, so that they may comment on specific portions of
 your code.  For this reason, all patches should be submitted
 -inline.  If your log message (including your name on the
 +inline.  A patch series that consists of N commits is sent as N
 +separate e-mail messages, or a cover letter message (see below) with
 +N separate e-mail messages, each being a response to the cover
 +letter.
 +
 +If your log message (including your name on the
 Signed-off-by line) is not writable in ASCII, make sure that
 you send off a message in the correct encoding.
 
 OK, but the repetition of cover letter and e-mail messages
 irritates me slightly for some reason.  What about the following?
 
 -- >8 --
 Subject: [PATCH] SubmittingPatches: document how to handle multiple patches
 
 Signed-off-by: Rene Scharfe l@web.de
 ---
 Documentation/SubmittingPatches |   11 +--
 1 files changed, 9 insertions(+), 2 deletions(-)
 
 diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
 index 7055576..e6d46ed 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -139,8 +139,15 @@ People on the Git mailing list need to be able to read 
 and
 comment on the changes you are submitting.  It is important for
 a developer to be able to quote your changes, using standard
 e-mail tools, so that they may comment on specific portions of
 -your code.  For this reason, all patches should be submitted
 -inline.  If your log message (including your name on the
 +your code.  For this reason, each patch should be submitted
 +inline in a separate message.
 +
 +Multiple related patches should be grouped into their own e-mail
 +thread to help readers find all parts of the series.  To that end,
 +send them as replies to either an additional cover letter message
 +(see below), the first patch, or the respective preceding patch.
 +
 +If your log message (including your name on the
 Signed-off-by line) is not writable in ASCII, make sure that
 you send off a message in the correct encoding.
 
 -- 
 1.7.8
 
 
That seems clear to me.
At any rate, I’m going to rework the patches based on the collective input and
will submit them again.
Please check my other replies, as they raise some discussion points!

Nick


Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Nick Townsend

On 26 Nov 2013, at 14:38, Heiko Voigt hvo...@hvoigt.net wrote:

 Hi,
 
 I like where this is going.
 
 On Tue, Nov 26, 2013 at 04:17:43PM +0100, René Scharfe wrote:
 Am 26.11.2013 01:04, schrieb Nick Townsend:
 +   strbuf_addstr(&dotgit, work_tree);
 +   strbuf_addch(&dotgit, '/');
 +   if (args->treepath) {
 +           strbuf_addstr(&dotgit, args->treepath);
 +           strbuf_addch(&dotgit, '/');
 +   }
 +   strbuf_add(&dotgit, path_without_prefix, strlen(path_without_prefix)-1);
 +   if (add_submodule_odb(dotgit.buf))
 +           die("Can't add submodule: %s", dotgit.buf);
 
 Hmm, I wonder if we can traverse the tree and load all submodule object
 databases before traversing it again to actually write file contents.
 That would spare the user from getting half of an archive together with
 that error message.
 
 I am not sure whether we should die here. What about submodules that
 have not been initialized and or cloned? I think that is a quite regular
 use case for example for libraries that not everyone needs or big media
 submodules which only the design team uses. How about skipping them (maybe
 issuing a warning) by returning 0 here and proceeding?
 
 Cheers Heiko

I agree that issuing a warning and continuing is best. If the submodule hasn’t
been set up, then we should respect that and keep the current behaviour (just
archive the directory entry).
There is some further debate to be had about the extent to which this should
work with uninitialized submodules, which I’ll discuss in other replies.

Thanks
Nick


Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Nick Townsend

On 26 Nov 2013, at 14:18, Junio C Hamano gits...@pobox.com wrote:

 René Scharfe l@web.de writes:
 
 Thanks for the patches!  Please send only one per message (the second
 one as a reply to the first one, or both as replies to a cover letter),
 though -- that makes commenting on them much easier.
 
 Side note: Documentation/SubmittingPatches doesn't mention that (yet),
 AFAICS.
 
 OK, how about doing this then?
 
 Documentation/SubmittingPatches | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
 
 diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
 index 7055576..304b3c0 100644
 --- a/Documentation/SubmittingPatches
 +++ b/Documentation/SubmittingPatches
 @@ -140,7 +140,12 @@ comment on the changes you are submitting.  It is 
 important for
 a developer to be able to quote your changes, using standard
 e-mail tools, so that they may comment on specific portions of
 your code.  For this reason, all patches should be submitted
 -inline.  If your log message (including your name on the
 +inline.  A patch series that consists of N commits is sent as N
 +separate e-mail messages, or a cover letter message (see below) with
 +N separate e-mail messages, each being a response to the cover
 +letter.
 +
 +If your log message (including your name on the
 Signed-off-by line) is not writable in ASCII, make sure that
 you send off a message in the correct encoding.
 
 
 The feature is disabled for remote repositories as
 the git_work_tree fails. This is a possible future
 enhancement.
 
 Hmm, curious.  Why does it fail?  I guess that happens with bare
 repositories, only, right?  (Which are the most likely kind of remote
 repos to encounter, of course.)
 
 Yeah, I do not think of a reason why it should fail in a bare
 repository, either. git archive is about writing out the contents
 of an already recorded tree, so there shouldn't be a reason to even
 call get_git_work_tree() in the first place.
 
See below for a discussion of why I use the .git file in the work tree to
load the objects for the submodule. I also thought it should work in a
remote repository, but I ran it on a properly initialized remote repository and
it failed. Since I didn’t need it for my immediate use case, I just decided to
disable it with an error. I can look into this further, but we must decide
about the question below first…

 Even if the code is run inside a repository with a working tree,
 when producing a tarball out of an ancient commit that had a
 submodule not at its current location, --recurse-submodules option
 should do the right thing, so asking for working tree location of
 that submodule to find its repository is wrong, I think.  It may
 happen to find one if the archived revision is close enough to what
 is currently checked out, but that may not necessarily be the case.
 
 At that point when the code discovers an S_ISGITLINK entry, it
 should have both a pathname to the submodule relative to the
 toplevel and the commit object name bound to that submodule
 location.  What it should do, when it does not find the repository
 at the given path (maybe because there is no working tree, or the
 submodule directory has moved over time) is roughly:
 
 - Read from .gitmodules at the top-level from the tree it is
   creating the tarball out of;
 
 - Find submodule.$name.path entry that records that path to the
   submodule; and then
 
 - Using that $name, find the stashed-away location of the submodule
   repository in $GIT_DIR/modules/$name.
 
 or something like that.
 
 This is a related tangent, but when used in a repository that people
 often use as their remote, the repository discovery may have to
 interact with the relative URL.  People often ship .gitmodules with
 
   [submodule "bar"]
       URL = ../bar.git
       path = barDir
 
 for a top-level project foo that can be cloned thusly:
 
   git clone git://site.xz/foo.git
 
 and host bar.git to be clonable with
 
   git clone git://site.xz/bar.git barDir/
 
 inside the working tree of the foo project.  In such a case, when
 archive --recurse-submodules is running, it would find the
 repository for the bar submodule at ../bar.git, I would think.
 
 So this part needs a bit more thought, I am afraid.

I see that there is a lot of potential complexity around setting up a submodule:
* The .gitmodules file can be dirty (easy to flag, but should we allow archive
  to proceed?)
* Users can mess with settings both prior to git submodule init and before git
  submodule update.
* What if it’s a raw clone and the user manually changes things between init
  and update?
* I’m not a git-internals expert, but looking through the code I see that you
  can add additional object directories and change paths as you show above.

For those reasons I deliberately decided not to reproduce the above logic all
by myself.
On the other hand, what *did* seem clear to me is that once you have the .git file
then you know you’ve got all that 

Re: [PATCH] submodule recursion in git-archive

2013-11-26 Thread Nick Townsend
On 26 Nov 2013, at 07:17, René Scharfe l@web.de wrote:

 Am 26.11.2013 01:04, schrieb Nick Townsend:
 My first git patch - so shout out if I’ve got the etiquette wrong! Or
 of course if I’ve missed something.
 
 Thanks for the patches!  Please send only one per message (the second
 one as a reply to the first one, or both as replies to a cover letter),
 though -- that makes commenting on them much easier.
 
 Side note: Documentation/SubmittingPatches doesn't mention that (yet),
 AFAICS.
 
 Subject: [PATCH 1/2] submodule: add_submodule_odb() usability
 
 Although add_submodule_odb() is documented as being
 externally usable, it is declared static and also
 has incorrect documentation.
 
 This commit fixes those and makes no changes to
 existing code using them. All tests still pass.
 
 Sign-off missing (see Documentation/SubmittingPatches).
 
 ---
 Documentation/technical/api-ref-iteration.txt | 4 ++--
 submodule.c   | 2 +-
 submodule.h   | 1 +
 3 files changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/Documentation/technical/api-ref-iteration.txt 
 b/Documentation/technical/api-ref-iteration.txt
 index aa1c50f..cbee624 100644
 --- a/Documentation/technical/api-ref-iteration.txt
 +++ b/Documentation/technical/api-ref-iteration.txt
 @@ -50,10 +50,10 @@ submodules object database. You can do this by a 
 code-snippet like
 this:
 
  const char *path = "path/to/submodule"
 -if (!add_submodule_odb(path))
 +if (add_submodule_odb(path))
          die("Error submodule '%s' not populated.", path);
 
 -`add_submodule_odb()` will return an non-zero value on success. If you
 +`add_submodule_odb()` will return a zero value on success. If you
 
 return zero on success instead?

I like the brevity of your suggestion. Again, I just used what was there…

 
 do not do this you will get an error for each ref that it does not point
 to a valid object.
 
 diff --git a/submodule.c b/submodule.c
 index 1905d75..1ea46be 100644
 --- a/submodule.c
 +++ b/submodule.c
 @@ -143,7 +143,7 @@ void stage_updated_gitmodules(void)
  die(_("staging updated .gitmodules failed"));
 }
 
 -static int add_submodule_odb(const char *path)
 +int add_submodule_odb(const char *path)
 {
  struct strbuf objects_directory = STRBUF_INIT;
  struct alternate_object_database *alt_odb;
 diff --git a/submodule.h b/submodule.h
 index 7beec48..3e3cdca 100644
 --- a/submodule.h
 +++ b/submodule.h
 @@ -41,5 +41,6 @@ int find_unpushed_submodules(unsigned char new_sha1[20], 
 const char *remotes_nam
  struct string_list *needs_pushing);
 int push_unpushed_submodules(unsigned char new_sha1[20], const char 
 *remotes_name);
 void connect_work_tree_and_git_dir(const char *work_tree, const char 
 *git_dir);
 +int add_submodule_odb(const char *path);
 
 #endif
 
 Subject: [PATCH 2/2] archive: allow submodule recursion on git-archive
 
 When using git-archive to produce a dump of a
 repository, the existing code does not recurse
 into a submodule when it encounters it in the tree
 traversal. These changes add a command line flag
 that permits this.
 
 Note that the submodules must be updated in the
 repository, otherwise this cannot take place.
 
 The feature is disabled for remote repositories as
 the git_work_tree fails. This is a possible future
 enhancement.
 
 Hmm, curious.  Why does it fail?  I guess that happens with bare
 repositories, only, right?  (Which are the most likely kind of remote
 repos to encounter, of course.)

I’m not sure why it failed - I didn’t think it should - but it did.
See discussion in other email.

 
 Two additional fields are added to archiver_args:
  * recurse  - a boolean indicator
  * treepath - the path part of the tree-ish
   eg. the 'www' in HEAD:www
 
 The latter is used within the archive writer to
 determine the correct path for the submodule .git
 file.
 
 Signed-off-by: Nick Townsend nick.towns...@mac.com
 ---
 Documentation/git-archive.txt |  9 +
 archive.c | 38 --
 archive.h |  2 ++
 3 files changed, 47 insertions(+), 2 deletions(-)
 
 diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
 index b97aaab..b4df735 100644
 --- a/Documentation/git-archive.txt
 +++ b/Documentation/git-archive.txt
 @@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
[-o <file> | --output=<file>] [--worktree-attributes]
 +  [--recursive|--recurse-submodules]
 
 I'd expect git archive --recurse to add subdirectories and their
 contents, which it does right now, and --no-recurse to only archive the
 specified objects, which is not implemented.  IAW: I wouldn't normally
 associate an option with that name with submodules.  Would
 --recurse-submodules alone suffice?
 
 Side note: With only one of the options defined you could shorten them
 on the command line 

[PATCH] submodule recursion in git-archive

2013-11-25 Thread Nick Townsend
All,
My first git patch - so shout out if I’ve got the etiquette wrong! Or of course
if I’ve missed something.
I googled around looking for solutions to my problem but just came up with a
few shell scripts that didn’t quite get the functionality I needed.
The first patch fixes some typos that crept into the existing doc and
declarations. It is required for the second, which actually implements the
changes.

All comments gratefully received!

Regards
Nick Townsend

Subject: [PATCH 1/2] submodule: add_submodule_odb() usability

Although add_submodule_odb() is documented as being
externally usable, it is declared static and also
has incorrect documentation.

This commit fixes those and makes no changes to
existing code using them. All tests still pass.
---
 Documentation/technical/api-ref-iteration.txt | 4 ++--
 submodule.c   | 2 +-
 submodule.h   | 1 +
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt
index aa1c50f..cbee624 100644
--- a/Documentation/technical/api-ref-iteration.txt
+++ b/Documentation/technical/api-ref-iteration.txt
@@ -50,10 +50,10 @@ submodules object database. You can do this by a code-snippet like
 this:
 
 	const char *path = "path/to/submodule"
-	if (!add_submodule_odb(path))
+	if (add_submodule_odb(path))
 		die("Error submodule '%s' not populated.", path);
 
-`add_submodule_odb()` will return an non-zero value on success. If you
+`add_submodule_odb()` will return a zero value on success. If you
 do not do this you will get an error for each ref that it does not point
 to a valid object.
 
diff --git a/submodule.c b/submodule.c
index 1905d75..1ea46be 100644
--- a/submodule.c
+++ b/submodule.c
@@ -143,7 +143,7 @@ void stage_updated_gitmodules(void)
 		die(_("staging updated .gitmodules failed"));
 }
 
-static int add_submodule_odb(const char *path)
+int add_submodule_odb(const char *path)
 {
struct strbuf objects_directory = STRBUF_INIT;
struct alternate_object_database *alt_odb;
diff --git a/submodule.h b/submodule.h
index 7beec48..3e3cdca 100644
--- a/submodule.h
+++ b/submodule.h
@@ -41,5 +41,6 @@ int find_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_nam
 		struct string_list *needs_pushing);
 int push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name);
 void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
+int add_submodule_odb(const char *path);
 
 #endif
-- 
1.8.3.4 (Apple Git-47)

Subject: [PATCH 2/2] archive: allow submodule recursion on git-archive

When using git-archive to produce a dump of a
repository, the existing code does not recurse
into a submodule when it encounters it in the tree
traversal. These changes add a command line flag
that permits this.

Note that the submodules must be updated in the
repository, otherwise this cannot take place.

The feature is disabled for remote repositories as
the git_work_tree fails. This is a possible future
enhancement.

Two additional fields are added to archiver_args:
  * recurse  - a boolean indicator
  * treepath - the path part of the tree-ish
   eg. the 'www' in HEAD:www

The latter is used within the archive writer to
determine the correct path for the submodule .git
file.

Signed-off-by: Nick Townsend nick.towns...@mac.com
---
 Documentation/git-archive.txt |  9 +
 archive.c | 38 --
 archive.h |  2 ++
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index b97aaab..b4df735 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git archive' [--format=<fmt>] [--list] [--prefix=<prefix>/] [<extra>]
 	      [-o <file> | --output=<file>] [--worktree-attributes]
+	      [--recursive|--recurse-submodules]
 	      [--remote=<repo> [--exec=<git-upload-archive>]] <tree-ish>
 	      [<path>...]
 
@@ -51,6 +52,14 @@ OPTIONS
 --prefix=<prefix>/::
 	Prepend <prefix>/ to each filename in the archive.
 
+--recursive::
+--recurse-submodules::
+	Archive entries in submodules. Errors occur if the submodules
+	have not been initialized and updated.
+	Run `git submodule update --init --recursive` immediately after
+	the clone is finished to avoid this.
+	This option is not available with remote repositories.
+
 -o <file>::
 --output=<file>::
 	Write the archive to <file> instead of stdout.
diff --git a/archive.c b/archive.c
index 346f3b2..f6313c9 100644
--- a/archive.c
+++ b/archive.c
@@ -5,6 +5,7 @@
 #include "archive.h"
 #include "parse-options.h"
 #include "unpack-trees.h"
+#include "submodule.h"
 
 static char const * const archive_usage[] = {
N_(git archive