Re: send-pack question.

2005-07-31 Thread Linus Torvalds


On Sat, 30 Jul 2005, Junio C Hamano wrote:
 
  * Right now, send-pack --all into an empty repository does
not do anything, but send-pack --all master into an empty
repository pushes all local heads.  This is because we do not
check send_all when deciding if we want to call try_match
on local references.  I am assuming this is an oversight; am
I correct?  If so, does the attached patch look OK?

Yeah, that sounds like me just not having taken my 'meds. The patch looks 
fine.

  * It appears to me that you can say "send-pack net", and
depending on how the remote lists its refs, you can end up
updating their refs/heads/net or refs/tags/net.

Yeh. I was wanting to sort the refs to make everything be totally 
repeatable, but that was more of an urge than a real plan.

 More
confusingly, you could say "send-pack net net" to update
both.  More realistically, you could get confused with a
remote that has refs/heads/jgarzik/net and
refs/heads/dsmiller/net in this way.  I think it should
detect, stop and warn about the ambiguity and require the
user to be more explicit.  Am I reading the current code
correctly?

Yes, warning on ambiguity sounds like a sound plan, and then you don't 
need to sort.

You also probably want to come up with a syntax for saying "xyz is the local
name, abc is the remote name". That's needed for both the pulling and the 
pushing side, but I didn't ever do it. 

 I've always _hated_ the interface to path_match() which
 pretends to be just a boolean function but actually has a
 grave side effect, by the way.

It's "interesting but useful". But I agree, we've had one bug already due
to the interesting part.

If you do the ambiguity thing, you might mark them used some separate way, 
and maybe avoid that side effect (or rather - the side effect would still 
exist, but instead of removing the entry, it would just mark it as 
"seen", ie make it less drastic).

Linus
-
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] Add git-send-email-script - tool to send emails from git-format-patch-script

2005-07-31 Thread Ryan Anderson
This is based off of GregKH's script, send-lots-of-email.pl, and strives to do
all the nice things a good subsystem maintainer does when forwarding a patch or
50 upstream:

All the prior handlers of the patch, as determined by the
Signed-off-by: lines, and/or the author of the commit, are cc:ed on the
email.

All emails are sent as a reply to the previous email, making it easy to
skip a collection of emails that are uninteresting.

Signed-off-by: Ryan Anderson [EMAIL PROTECTED]
---

 Makefile  |2 
 git-send-email-script |  265 +
 2 files changed, 266 insertions(+), 1 deletions(-)
 create mode 100755 git-send-email-script

55d4b5b7a11448d60eb00b5a7081954663842b06
diff --git a/Makefile b/Makefile
--- a/Makefile
+++ b/Makefile
@@ -62,7 +62,7 @@ SCRIPTS=git git-apply-patch-script git-m
git-format-patch-script git-sh-setup-script git-push-script \
git-branch-script git-parse-remote git-verify-tag-script \
git-ls-remote-script git-clone-dumb-http git-rename-script \
-   git-request-pull-script
+   git-request-pull-script git-send-email-script
 
 PROG=   git-update-cache git-diff-files git-init-db git-write-tree \
git-read-tree git-commit-tree git-cat-file git-fsck-cache \
diff --git a/git-send-email-script b/git-send-email-script
new file mode 100755
--- /dev/null
+++ b/git-send-email-script
@@ -0,0 +1,265 @@
+#!/usr/bin/perl -w
+# horrible hack of a script to send off a large number of email messages, one after
+# each other, all chained together.  This is useful for large numbers of patches.
+#
+# Use at your own risk
+#
+# greg kroah-hartman Jan 8, 2002
+# [EMAIL PROTECTED]
+#
+# GPL v2 (See COPYING)
+# 
+# Ported to support git mbox format files by Ryan Anderson [EMAIL PROTECTED]
+#
+# Sends emails to the email listed on the command line.
+# 
+# updated to give a valid subject and CC the owner of the patch - Jan 2005
+# first line of the message is who to CC, 
+# and second line is the subject of the message.
+# 
+
+use strict;
+use warnings;
+use Term::ReadLine;
+use Mail::Sendmail;
+use Getopt::Long;
+use Data::Dumper;
+use Email::Valid;
+
+# Variables we fill in automatically, or via prompting:
+my (@to,@cc,$initial_reply_to,$initial_subject,@files,$from);
+
+# Example of them
+# modify these options each time you run the script
+#$to = '[EMAIL PROTECTED],git@vger.kernel.org';
+#$initial_reply_to = ''; #[EMAIL PROTECTED]';
+#$initial_subject = '[PATCH] Deb package build fixes';
+#@files = (qw(
+#0001-Make-debian-rules-executable-and-correct-the-spelling-of-rsync-in.txt
+#0002-Debian-packages-should-include-the-binaries.txt
+#0003-The-deb-package-building-needs-these-two-new-files-to-work-correctly.txt
+#));
+
+# change this to your email address.
+#$from = 'Ryan Anderson [EMAIL PROTECTED]';
+
+my $term = new Term::ReadLine 'git-send-email';
+
+# Begin by accumulating all the variables (defined above), that we will end up
+# needing, first, from the command line:
+
+my $rc = GetOptions("from=s" => \$from,
+		    "in-reply-to=s" => \$initial_reply_to,
+		    "subject=s" => \$initial_subject,
+		    "to=s" => \@to,
+);
+
+# Now, let's fill any that aren't set in with defaults:
+
+open(GITVAR, "-|", "git-var", "-l")
+	or die "Failed to open pipe from git-var: $!";
+
+my ($author,$committer);
+while(<GITVAR>) {
+   chomp;
+   my ($var,$data) = split /=/,$_,2;
+   my @fields = split /\s+/, $data;
+
+   my $ident = join(" ", @fields[0...(@fields-3)]);
+
+   if ($var eq 'GIT_AUTHOR_IDENT') {
+   $author = $ident;
+   } elsif ($var eq 'GIT_COMMITTER_IDENT') {
+   $committer = $ident;
+   }
+}
+close(GITVAR);
+
+
+if (!defined $from) {
+	$from = $author || $committer;
+	1 while (!defined ($_ =
+		$term->readline("Who should the emails appear to be from? ",
+			$from)));
+	$from = $_;
+	print "Emails will be sent from: ", $from, "\n";
+}
+
+if (!@to) {
+	1 while (!defined ($_ =
+		$term->readline("Who should the emails be sent to? ",
+			"")));
+	my $to = $_;
+	push @to, split /,/, $to;
+}
+
+if (!defined $initial_subject) {
+	1 while (!defined ($_ =
+		$term->readline("What subject should the emails start with? ",
+			$initial_subject)));
+	$initial_subject = $_;
+}
+
+if (!defined $initial_reply_to) {
+	1 while (!defined ($_ =
+		$term->readline("Message-ID to be used as In-Reply-To? ",
+			$initial_reply_to)));
+	$initial_reply_to = $_;
+}
+
+# Now that all the defaults are set, process the rest of the command line
+# arguments and collect up the files that need to be processed.
+for my $f (@ARGV) {
+   if (-d $f) {
+   opendir(DH,$f)
+   or die Failed 

[PATCH 2/3] Add documentation for git-send-email-script

2005-07-31 Thread Ryan Anderson
Signed-off-by: Ryan Anderson [EMAIL PROTECTED]
---

 Documentation/git-send-email-script.txt |   61 +++
 1 files changed, 61 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/git-send-email-script.txt

799a6320d3b07347869093beec303afbc005cf26
diff --git a/Documentation/git-send-email-script.txt 
b/Documentation/git-send-email-script.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/git-send-email-script.txt
@@ -0,0 +1,61 @@
+git-send-email-script(1)
+========================
+v0.1, July 2005
+
+NAME
+----
+git-send-email-script - Send a collection of patches as emails
+
+
+SYNOPSIS
+--------
+'git-send-email-script' [options] file|directory [... file|directory]
+
+
+
+DESCRIPTION
+-----------
+Takes the patches given on the command line and emails them out.
+
+The header of the email is configurable by command line options.  If not
+specified on the command line, the user will be prompted with a ReadLine
+enabled interface to provide the necessary information.
+
+The options available are:
+
+  --to
+   Specify the primary recipient of the emails generated.
+   Generally, this will be the upstream maintainer of the
+   project involved.
+
+   --from
+   Specify the sender of the emails.  This will default to
+   the value GIT_COMMITTER_IDENT, as returned by git-var -l.
+   The user will still be prompted to confirm this entry.
+
+   --subject
+   Specify the initial subject of the email thread.
+
+   --in-reply-to
+   Specify the contents of the first In-Reply-To header.
+   Subsequent emails will refer to the previous email 
+   instead of this.
+   When overriding on the command line, it may be necessary
+   to set this to a space.  For example
+   --in-reply-to=" "
+
+Author
+------
+Written by Ryan Anderson [EMAIL PROTECTED]
+
+git-send-email-script is originally based upon
+send_lots_of_email.pl by Greg Kroah-Hartman.
+
+Documentation
+-------------
+Documentation by Ryan Anderson
+
+GIT
+---
+Part of the link:git.html[git] suite
+




Re: [PATCH 1/3] Add git-send-email-script - tool to send emails from git-format-patch-script

2005-07-31 Thread Matthias Urlichs
Hi, Ryan Anderson wrote:

 And yes, I did generate this thread with this script - so I have proof
 that it works nicely.

It might make sense to create a Patch 0/N with a short explanation, and
have the actual patches be replies to that -- or to patch 1/N if that's
not necessary.

As it is, patch N hangs off patch N-1 in my email threading view, which
gets slightly cumbersome if N > 10.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  [EMAIL PROTECTED]
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Nothing makes a person more productive than the last minute.




Re: Shipping gitk as part of core git.

2005-07-31 Thread Paul Mackerras
Junio C Hamano writes:

 It appears that gitk gets wider test coverage only after it is
 pulled into git.git repository.  I think it would be a good idea
 for me to pull from you often.

Yes, I agree.  I'm happy to send you an email when I have committed
changes to gitk if that will help.

 Recently there was a discussion with binary packaging folks.
 While I do not mind, and actually I would prefer, shipping gitk
 as part of the core GIT, I have never heard about your
 preference.  As long as gitk is just a single file (or even a
 handful files in the future) project that does not have a
 filename that overlaps with core GIT, I can continue pulling
 from you and I think the binary packaging folks can produce
 separate git-core and gitk package out of git.git tree without
 problems.  However, once you start wanting to have your own
 Makefile and maybe debian/rules file for packaging, for example,
 I suspect the way currently things are set up would break
 miserably.  It's all Linus' fault to have merged with your tree
 in the first place ;-).

He did ask me first, and I said he could :).  It makes things easier
for me, having gitk in the core git, because it means that I don't
have to worry about making a proper package out of it.  I don't see
any reason why gitk would grow to be more than just the script.

I am also thinking of doing a gitool, somewhat like bk citool, to
make it easier to create commits.  I guess we can decide later whether
to make it part of the core git, although it seems more like porcelain
than gitk.

 Anyhow, I have one bug to report.  I selected one rev, and then
 said diff this - selected from right-click menu on an
 adjacent one, and I got this:

Thanks for the patch.  I have committed that fix plus fixes for some
other bugs that people have reported, and pushed it to
master.kernel.org.  Could you do another pull please?

Regards,
Paul.


Re: Terminology

2005-07-31 Thread Junio C Hamano
Johannes Schindelin [EMAIL PROTECTED] writes:

 Maybe we should decide on a common terminology before kicking out 1.0, and
 look through all files in Documentation/ to have a consistent vocabulary.
 And poor me does not get confused no more.

Glad to see you started the discussion on this one.  I have a
slight worry and suspicion that this might open a can of worms,
but I agree we need to get this done.  We probably would end up
splitting the Terminology section in Documentation/git.txt into a
separate Glossary document.

Care to volunteer drafting a strawman, listing the concepts we
need terms for, marking the ones we seem to use the same word
for?   You do not have to suggest which candidate term to use
for all of them.  Something along these lines...

 - The unit of storage in GIT is called "object"; no other word
   is used and the word "object" is used only for this purpose
   so this one is OK.
  
 - A 20-byte SHA1 to uniquely identify objects; README and
   early Linus messages call this "object name", as does the
   tutorial.  Many places say "object SHA1" or just "SHA1".

 - An object database stores a set of objects, and an
   individual object can be retrieved by giving it its object
   name.

 - Storing a regular file or a symlink in the object database
   results in a blob object being created.  You cannot directly
   store a filesystem directory, but a collection of blob objects
   and other tree objects can be recorded as a tree object
   which corresponds to this notion.

 - $GIT_INDEX_FILE is the "index file", which is a collection of
   "cache entries".  The former is sometimes called "cache
   file", the latter just "cache".

 - the directory which corresponds to the top of the hierarchy
   described in the index file; I've seen words like "working
   tree", "working directory", "work tree" used.

 - When the stat information a cache entry records matches what
   is in the work tree, the entry is called "clean" or
   "up-to-date".  The opposite is "dirty" or "not up-to-date".

 - An index file can be in "merged" or "unmerged" state.  The
   former is when it does not have anything but stage 0 entries,
   the latter otherwise.

 - A merged index file can be written as a tree object, which
   is technically a set of interconnected tree objects but we
   equate it with the toplevel tree object with this set.

 - A tree object can be recorded as a part of a commit
   object.  The tree object is said to be associated with the
   commit object.

 - A tag object can be recorded as a pointer to another object
   of any type. The act of following the pointer a tag object
   holds (this can go recursively) until we get to a non-tag
   object is sometimes called "resolving" the tag.

 - The following objects are collectively called "tree-ish": a
   tree object, a commit object, a tag object that resolves to
   either a commit or a tree object, and can be given to
   commands that expect to work on a tree object.

 - The files under $GIT_DIR/refs record object names, and are
   called "refs".  Those under refs/heads/ are called "heads",
   those under refs/tags/ "tags".  Typically, they are either
   object names of commit objects or tag objects that resolve
   to commit objects, but a tag can point at any object.

 - A head is always an object name of a commit, and marks the
   latest commit in one line of development.  A line of
   development is often called a "branch".  We sometimes use the
   word "branch head" to stress the fact that we are talking
   about a single commit that is the latest one in a branch.

 - Combining the states from more than one line of development
   is called "merging" and is typically done between two branch
   heads.  This is called "resolving" in the tutorial and there
   is a git-resolve-script command for it.

 - A set of refs with the set of objects reachable from them
   constitute a repository.  Although currently there is no
   provision for a repository to say that its objects are stored
   in this and that object database, multiple repositories can
   share the same object database, and there is no conceptual
   requirement that a repository retrieve its objects from a
   single object database.

 - The act of finding out the object names recorded in the refs
   of a different repository, optionally updating local
   refs with their values, and retrieving the objects
   reachable from them is called "fetching".  Fetching immediately
   followed by merging is called "pulling".

 - The act of updating refs in a different repository with new
   values and populating the object database(s) associated with
   the repository is called "pushing".

 - Currently refs/heads records branch heads of both locally
   created branches and branches fetched from other
   repositories.

 - Currently, fetching always happens against a single branch
   head on a remote repository, and (a remote repository, name
   of the branch) is stored in $GIT_DIR/branches/ as a
   short-hand mechanism.  A file in this directory identifies
   a remote repository by its URL, 

Re: [PATCH 1/3] Add git-send-email-script - tool to send emails from git-format-patch-script

2005-07-31 Thread Junio C Hamano
Oh, another thing.  Could you refrain from doing
quoted-printable when possible?  Thanks.



[PATCH 0/2] Support pack files in http-pull

2005-07-31 Thread barkalow
This series adds support for downloading pack files when appropriate in 
http-pull. When it finds that a needed object is not available, it 
downloads info/packs (into memory), identifies any pack files it doesn't 
have from there, downloads indices of any of these that it doesn't have, 
and downloads the pack containing the object. If other packs are also 
needed, it downloads them when it reaches them.


-Daniel
*This .sig left intentionally blank*


Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Linus Torvalds


On Sun, 31 Jul 2005, Josef Weidendorfer wrote:

 Added hook in git-receive-pack
 
 After successful update of a ref,
 
  $GIT_DIR/hooks/update refname old-sha1 new-sha2
 
 is called if present. This allows e.g. sending of a mail
 with pushed commits on the remote repository.
 Documentation update with example hook included.

This looks sane. However, I also get the strong feeling that
git-update-server-info should be run as part of a hook and not be built 
into receive-pack..

Personally, I simply don't want to update any dumb server info stuff for 
my own local repositories - it's not like I'm actually serving those out 
anyway.

Linus


Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Junio C Hamano
Josef Weidendorfer [EMAIL PROTECTED] writes:

 +It is assured that sha1-old is an ancestor of sha1-new (otherwise,
 +the update would have not been allowed). refname is relative to
 +$GIT_DIR; e.g. for the master head this is refs/heads/master.

I think this description is inaccurate; the send-pack can be run
with the --force flag and it is my understanding that receiver
would happily rewind the branch.  One possibility, if we wanted
to enforce it on the receiver end, would be to add another hook
that is called before the rename happens and tell the
receive-pack to refuse that update, but that should be done with
a separate patch, I suppose.

 +Using this hook, it is easy to generate mails on updates to
 +the local repository. This example script sends a mail with
 +the commits pushed to the repository:
 +
 +   #!/bin/sh
 +   git-rev-list --pretty $3 ^$2 |
 +mail -r $USER -s "New commits on $1" [EMAIL PROTECTED]

What is the environment the hook runs in?  For example, who
defines $USER used here?

We might want to describe the environment a bit more tightly
than the current patch does.  This includes not just the
environment variables, but $cwd and the set of open file
descriptors among other things.

I am not saying this from the security standpoint (the fact that
you can invoke receive-pack and that you can write into the
update hooks means you already have control over that
repository), but to help hook writers to avoid making mistakes.
For example, I offhand cannot tell what happens if the hook
tries to read from its standard input.  Also what happens if the
hook does not return but sleeps forever in a loop?  Do we want
to somehow time it out?  I think "It is the hook's responsibility to
time itself out" is an acceptable answer here, but if that is
the case it had better be documented.

 +static void updatehook(const char *name, unsigned char *old_sha1, unsigned char *new_sha1)
 +{
 +   if (access(update_hook, X_OK) < 0) return;
 +   fprintf(stderr, "executing update hook for %s\n", name);
 +...
 +}

I think I've seen this fork -- exec -- for loop with waitpid
pattern repeated a number of times in the code.  Would it be
feasible to make them into a single library-ish function and
call it from here and other existing places?

Another thing you may want to consider is to do this hook
processing before and/or after processing all the refs.  A hook
might want to know what the entire set of refs are that are
being updated, and may not have enough information if it is
invoked once per ref.

Thanks for the patch; I agree with what the patch tries to
achieve in general.

-jc



Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

 This looks sane. However, I also get the strong feeling that
 git-update-server-info should be run as part of a hook and not be built 
 into receive-pack..

 Personally, I simply don't want to update any dumb server info stuff for 
 my own local repositories - it's not like I'm actually serving those out 
 anyway.

But you are.  I can run this just fine:

 $ git clone http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git/ linus

I agree in principle that you should be able to disable the call
to update_server_info() from there, but on the other hand once
we start doing it, we need to explain people which repo is http
capable and which repo is not and why.

I was actually thinking about a call to git-update-server-info
at the end of git-repack-script.  Again, great minds think the
opposite way sometimes ;-).








Re: Terminology

2005-07-31 Thread Johannes Schindelin
Hi,

I tried to avoid the work. But I'll do it.

Ciao,
Dscho




Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Linus Torvalds


On Sun, 31 Jul 2005, Junio C Hamano wrote:
 
 But you are.  I can run this just fine:

No I'm not. Try all the machines behind my firewall.

kernel.org is just the place I put things to when I publish them. It 
doesn't have any of my working directories on it.

Linus


Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Josef Weidendorfer
On Sunday 31 July 2005 22:15, Junio C Hamano wrote:
 Josef Weidendorfer [EMAIL PROTECTED] writes:
  +It is assured that sha1-old is an ancestor of sha1-new (otherwise,
  +the update would have not been allowed). refname is relative to
  +$GIT_DIR; e.g. for the master head this is refs/heads/master.

 I think this description is inaccurate;

Thanks for the constructive comments; that patch was only a draft, and 
tailored for my needs. I thought it would be better to provide a patch than 
to request a feature.

I am trying to convert a CVS project with a few developers to GIT; the idea is 
that each developer has his public branch in *the* central repository, and 
(s)he is allowed to merge to master him/herself; when pushing new features 
into the branches or merging to master, a mail should be sent out to a 
mailing list.

 the send-pack can be run 
 with the --force flag and it is my understanding that receiver
 would happily rewind the branch.

I didn't know this...

 One possibility, if we wanted 
 to enforce it on the receiver end,

I actually thought that the ancestor relationship is always given; perhaps we 
don't need to enforce this; but before sending out a mail, the hook script 
probably would like to check if there is any ancestor relationship; this 
could be a potentially long-running task, couldn't it?

 would be to add another hook 
 that is called before the rename happens and tell the
 receive-pack to refuse that update, but that should be done with
 a separate patch, I suppose.

Sorry, I do not understand this.

  +Using this hook, it is easy to generate mails on updates to
  +the local repository. This example script sends a mail with
  +the commits pushed to the repository:
  +
  +   #!/bin/sh
  +   git-rev-list --pretty $3 ^$2 |
  +mail -r $USER -s New commits on $1 [EMAIL PROTECTED]

 What is the environment the hook runs in?  For example, who
 defines $USER used here?

Good question. I thought it is UNIX/POSIX behavior to set this environment in 
shells (same as "id -u -n"). At least, ssh sets it to the user logging 
in (see man ssh).
And I supposed git-receive-pack to be called in a user's SSH environment.
Hmmm... could this be called via CGI script by a web server?

You are right: we should be careful here. Is there any other hook mechanism in 
GIT at the moment? Originally, I thought this should be a task for a 
porcelain, but this thing is buried too deep in GIT itself...

All the issues about documenting the environment of the hooks, time out 
behavior and so on, are general issues for every kind of hook.

 I am not saying this from the security standpoint (the fact that
 you can invoke receive-pack and that you can write into the
 update hooks means you already have control over that
 repository), but to help hook writers to avoid making mistakes.
 For example, I offhand cannot tell what happens if the hook
 tries to read from its standard input.  Also what happens if the
 hook does not return but sleeps forever in a loop?  Do we want
 to somehow time it out?  I think "It is the hook's responsibility to
 time itself out" is an acceptable answer here, but if that is 
 the case it had better be documented.

I think that a time out is not needed here: as the hook is called synchronously, 
git-receive-pack won't return until the hook has terminated. And that is 
visible to the user, i.e. he would see that there is something wrong.

 I think I've seen this fork -- exec -- for loop with waitpid
 pattern repeated number of times in the code.

This is no coincidence ;-) I copied it from inside the same file.
But the behavior differs: if the hook goes wrong, I do not want 
git-receive-pack to die.

 Would it be 
 feasible to make them into a single library-ish function and
 call it from here and other existing places?

Probably.

 Another thing you may want to consider is to do this hook
 processing before and/or after processing all the refs.  A hook
 might want to know what the entire set of refs are that are
 being updated, and may not have enough information if it is
 invoked once per ref.

Do you have a use case? At least, it would make things more complex.

Josef


Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

 No I'm not. Try all the machines behind my firewall.

Ah, that's true.  Do you push into them?

Let's yank out the update_server_info() call when Josef's patch
can handle a single hook call at the end of the run, in addition
to one call per each ref getting updated.




Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Linus Torvalds


On Sun, 31 Jul 2005, Junio C Hamano wrote:
 Linus Torvalds [EMAIL PROTECTED] writes:
 
  No I'm not. Try all the machines behind my firewall.
 
 Ah, that's true.  Do you push into them?

Yup, I do. I have this thing that I don't do backups, but I end up having 
redundancy instead, so I keep archives on my own machines and inside the 
private osdl network, for example.

Also, I suspect that anybody who uses the CVS model with git - ie a
central repository - is not likely to export that central repository anyway:
it's the crown jewels, after all. Open source may not have that
mindset, but I'm thinking of how I was forced to use CVS at Transmeta, for
example:  the machine that had the CVS repo was certainly supposed to be
very private indeed.

In the central repo model you have another issue - you have potentially
parallel pushes to different branches with no locking what-so-ever (and
that's definitely _supposed_ to work), and I have this suspicion that the 
"update for dumb servers" code isn't really safe in that setting anyway. I 
haven't checked.

Linus


Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Johannes Schindelin
Hi,

On Sun, 31 Jul 2005, Junio C Hamano wrote:

 Let's yank out the update_server_info() call when Josef's patch
 can handle a single hook call at the end of the run, in addition
 to one call per each ref getting updated.

How about executing update_server_info() if no hook was found? That way,
it can be turned off by an empty hook, but is enabled in standard
settings.

Ciao,
Dscho



Re: [PATCH 1/2] Functions for managing the set of packs the library is using

2005-07-31 Thread Junio C Hamano
Daniel, I would really have liked to merge this immediately, but
somehow the patch is whitespace damaged.  Depressingly enough,
almost all patches I got from various people today had different
whitespace damages, and I started to suspect if there is
something wrong on my end, but it does not appear to be the
case.  The copies found in the usual archive I check show the
same problems as the copies I got directly in my mailbox have.

Could you resend these two if it is not too much trouble for
you?  

Thanks for fixing the curl_easy_setopt() screwup.  I should have
been more careful.  I've already hand-merged it and will be
pushing it out shortly, so there is no need to resend that one.



Re: [PATCH 1/3] Add git-send-email-script - tool to send emails from git-format-patch-script

2005-07-31 Thread Ryan Anderson
On Sun, Jul 31, 2005 at 02:45:29AM -0700, Junio C Hamano wrote:
 Ryan Anderson [EMAIL PROTECTED] writes:
 
  All emails are sent as a reply to the previous email, making it easy to
  skip a collection of emails that are uninteresting.
 
 I understand why _some_ people consider this preferable, but
 wonder if this should have a knob to be tweaked.

Hmm, fair enough.

I'll send a few more patches in a minute that deal with the things in
this email but for now:

--chain-reply-to (or --no-chain-reply-to)

Will toggle between these two behaviors.  (This will not be
prompted for by the ReadLine interface, btw.)

  +# horrible hack of a script to send off a large number of email messages, 
  one after
  +# each other, all chained together.  This is useful for large numbers of 
  patches.
  +#
  +# Use at your own risk
 
 Well, if it is "Use at your own risk" maybe it should stay
 outside the official distribution for a while until it gets
 safer ;-).

Heh.  I missed some comments that I meant to clean up that were in
Greg's original script.  One of the patches will clean up the comments.

  +   my @fields = split /\s+/, $data;
  +   my $ident = join(" ", @fields[0...(@fields-3)]);
 
 Wouldn't s/.*// be easier than splitting and joining?

Most of GIT_COMMITTER_IDENT (and GIT_AUTHOR_IDENT) is user controllable,
except for the date section of it.  I know we delimit that with
spaces, so the above is guaranteed to work unless we change the format
that git-var returns.

If I hope that nobody has done something like:
GIT_AUTHOR=Ryan  Anderson
GIT_AUTHOR_EMAIL=[EMAIL PROTECTED]

I get more confusing results.  (I suddenly have to think about what that
regular expression does in this case - and I'm pretty sure that the one
you gave would do bad things.)

Probably the best fix for this would be to take libgit.a, make a shared
library out of it, and then interface the Perl scripts directly with it
via a .xs module.  I was thinking that I'd rather have direct access to
the git_ident* functions than calling out to git-var, anyway.  Consider
that a plan for a revamp after the core seems to have settled down a bit
more.

  +if (!defined $from) {
  +   $from = $author || $committer;
  +   1 while (!defined ($_ = $term->readline("Who should the emails appear 
  to be from? ", 
  +   $from)));
 
 Judging from your past patches, you seem to really like
 statement modifiers[*].  While they _are_ valid Perl constructs,
 it is extremely hard to read when used outside very simple
 idiomatic use.  Please consider rewriting the above and the like
 using compound statements[*] (I am using these terms according
 to the definition in perlsyn.pod).  Remember, there are people
 Perl is not their native language, but are intelligent enough to
 be of great help fixing problems in programs you write in Perl.
 To most of them, compound statements are more familiar, so try
 to be gentle to them.

I copied this from another program of mine, and I'm *sure* I copied the
style directly from a ReadLine example.  But, I can't find a current
example that says this is good, so, I'll fix this, too.  It is rather
ugly.  (The other uses of this style are... less bad, IMO, than this
abuse here.)


 
  +   opendir(DH,$f)
  +   or die "Failed to opendir $f: $!";
  +   push @files, map { +$f . "/" . $_ } grep !/^\.{1,2}$/,
  +   sort readdir(DH);
 
 Maybe skip potential subdirs while you are at it, something like this?
 
 push @files, sort grep { -f $_ } map { "$f/$_" } readdir(DH)

Good point.  On one hand I'd say, "Let it break for people who do
strange things like that," but I'll make it safer anyway.

(Someone is going to reply and ask for it to recurse into subdirectories
now.  Maybe Andrew Morton would find that useful with his rather massive
collection of patches in -mm kernels.  But that's a feature for next
week.)

  +   my $pseudo_rand = int (rand(4200));
  +   $message_id = [EMAIL PROTECTED];
  +   print "new message id = $message_id\n";
 
 I doubt this hardcoded foobar.com is a good idea.  Did you mean
 to print it, by the way?

I'll convert this to something that is based off the $from address
instead.  It's probably better that way, anyway.

  +   $to{lc(Email::Valid->address($_))}++ for (@to);
  +   my $to = join(",", keys %to);
 
 Is this the culprit that produced this mechanical-looking line?
 
 To: [EMAIL PROTECTED],git@vger.kernel.org

No, that line was exactly what I put into the readline entry.

 Interestingly enough, you do not seem to do it for the From:
 line.
 
 From: Ryan Anderson [EMAIL PROTECTED]
 
 Also you seem to be losing the ordering in @to and @cc by the
 use of uniquefying keys %to and keys %cc.  I can not offhand
 tell if it matters, but you probably would care, at least for
 the primary recipients listed in @to array.

Well, it was kind of annoying to see the same email address appear 2-3
times in the email, 

Re: [PATCH] Added hook in git-receive-pack

2005-07-31 Thread Junio C Hamano
Linus Torvalds [EMAIL PROTECTED] writes:

 In the central repo model you have another issue - you have potentially
 parallel pushes to different branches with no locking whatsoever (and
 that's definitely _supposed_ to work), and I have this suspicion that the 
 update for dumb servers code isn't really safe in that setting anyway. I 
 haven't checked.

You are absolutely right.  It should grab some sort of lock
while it does its thing (would fcntl(F_GETLK) be acceptable to
networked filesystem folks?).

I have one question regarding the hooks.  We seem to prefer
avoiding system(3) and rolling our own.  Is there a particular
reason, other than bypassing the need to quote parameters for the shell?



[PATCH 2/2] Support downloading packs by HTTP (whitespace fixed)

2005-07-31 Thread barkalow
This adds support to http-pull for finding the list of pack files
available on the server, downloading the index files for those pack
files, and downloading pack files when they contain needed objects not
available individually. It retains the index files even if the pack
files were not needed, but downloads the list of pack files once per
run if an object is not found separately.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 http-pull.c |  181 +--
 1 files changed, 175 insertions(+), 6 deletions(-)

dff0b76c4a2efbb8407778a1da6dc2ea2ca1458f
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -33,7 +33,8 @@ struct buffer
 };
 
 static size_t fwrite_buffer(void *ptr, size_t eltsize, size_t nmemb,
-struct buffer *buffer) {
+struct buffer *buffer)
+{
 size_t size = eltsize * nmemb;
 if (size > buffer->size - buffer->posn)
 size = buffer->size - buffer->posn;
@@ -42,8 +43,9 @@ static size_t fwrite_buffer(void *ptr, s
 return size;
 }
 
-static size_t fwrite_sha1_file(void *ptr, size_t eltsize, size_t nmemb, 
-  void *data) {
+static size_t fwrite_sha1_file(void *ptr, size_t eltsize, size_t nmemb,
+  void *data)
+{
unsigned char expn[4096];
size_t size = eltsize * nmemb;
int posn = 0;
@@ -65,6 +67,168 @@ static size_t fwrite_sha1_file(void *ptr
return size;
 }
 
+static int got_indices = 0;
+
+static struct packed_git *packs = NULL;
+
+static int fetch_index(unsigned char *sha1)
+{
+   char *filename;
+   char *url;
+
+   FILE *indexfile;
+
+   if (has_pack_index(sha1))
+   return 0;
+
+   if (get_verbosely)
+   fprintf(stderr, "Getting index for pack %s\n",
+   sha1_to_hex(sha1));
+   
+   url = xmalloc(strlen(base) + 64);
+   sprintf(url, "%s/objects/pack/pack-%s.idx",
+   base, sha1_to_hex(sha1));
+   
+   filename = sha1_pack_index_name(sha1);
+   indexfile = fopen(filename, "w");
+   if (!indexfile)
+   return error("Unable to open local file %s for pack index",
+filename);
+
+   curl_easy_setopt(curl, CURLOPT_FILE, indexfile);
+   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite);
+   curl_easy_setopt(curl, CURLOPT_URL, url);
+   
+   if (curl_easy_perform(curl)) {
+   fclose(indexfile);
+   return error("Unable to get pack index %s", url);
+   }
+
+   fclose(indexfile);
+   return 0;
+}
+
+static int setup_index(unsigned char *sha1)
+{
+   struct packed_git *new_pack;
+   if (has_pack_file(sha1))
+   return 0; // don't list this as something we can get
+
+   if (fetch_index(sha1))
+   return -1;
+
+   new_pack = parse_pack_index(sha1);
+   new_pack->next = packs;
+   packs = new_pack;
+   return 0;
+}
+
+static int fetch_indices(void)
+{
+   unsigned char sha1[20];
+   char *url;
+   struct buffer buffer;
+   char *data;
+   int i = 0;
+
+   if (got_indices)
+   return 0;
+
+   data = xmalloc(4096);
+   buffer.size = 4096;
+   buffer.posn = 0;
+   buffer.buffer = data;
+
+   if (get_verbosely)
+   fprintf(stderr, "Getting pack list\n");
+   
+   url = xmalloc(strlen(base) + 21);
+   sprintf(url, "%s/objects/info/packs", base);
+
+   curl_easy_setopt(curl, CURLOPT_FILE, &buffer);
+   curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
+   curl_easy_setopt(curl, CURLOPT_URL, url);
+   
+   if (curl_easy_perform(curl)) {
+   return error("Unable to get pack index %s", url);
+   }
+
+   do {
+   switch (data[i]) {
+   case 'P':
+   i++;
+   if (i + 52 < buffer.posn &&
+   !strncmp(data + i, " pack-", 6) &&
+   !strncmp(data + i + 46, ".pack\n", 6)) {
+   get_sha1_hex(data + i + 6, sha1);
+   setup_index(sha1);
+   i += 51;
+   break;
+   }
+   default:
+   while (data[i] != '\n')
+   i++;
+   }
+   i++;
+   } while (i < buffer.posn);
+
+   got_indices = 1;
+   return 0;
+}
+
+static int fetch_pack(unsigned char *sha1)
+{
+   char *url;
+   struct packed_git *target;
+   struct packed_git **lst;
+   FILE *packfile;
+   char *filename;
+
+   if (fetch_indices())
+   return -1;
+   target = find_sha1_pack(sha1, packs);
+   if (!target)
+   return error("Couldn't get %s: not separate or in any pack",
+

[PATCH 1/2] Functions for managing the set of packs the library is using (whitespace fixed)

2005-07-31 Thread barkalow
This adds support for reading an uninstalled index, and installing a
pack file that was added while the program was running, as well as
functions for determining where to put the file.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 cache.h |   13 ++
 sha1_file.c |  123 +++
 2 files changed, 135 insertions(+), 1 deletions(-)

20fcc8f66a6780cf9bbd2fc2ba3b918c33696a67
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -172,6 +172,8 @@ extern void rollback_index_file(struct c
 extern char *mkpath(const char *fmt, ...);
 extern char *git_path(const char *fmt, ...);
 extern char *sha1_file_name(const unsigned char *sha1);
+extern char *sha1_pack_name(const unsigned char *sha1);
+extern char *sha1_pack_index_name(const unsigned char *sha1);
 
 int safe_create_leading_directories(char *path);
 
@@ -200,6 +202,9 @@ extern int write_sha1_to_fd(int fd, cons
 extern int has_sha1_pack(const unsigned char *sha1);
 extern int has_sha1_file(const unsigned char *sha1);
 
+extern int has_pack_file(const unsigned char *sha1);
+extern int has_pack_index(const unsigned char *sha1);
+
 /* Convert to/from hex/sha1 representation */
 extern int get_sha1(const char *str, unsigned char *sha1);
 extern int get_sha1_hex(const char *hex, unsigned char *sha1);
@@ -276,6 +281,7 @@ extern struct packed_git {
void *pack_base;
unsigned int pack_last_used;
unsigned int pack_use_cnt;
+   unsigned char sha1[20];
char pack_name[0]; /* something like .git/objects/pack/x.pack */
 } *packed_git;
 
@@ -298,7 +304,14 @@ extern int path_match(const char *path, 
 extern int get_ack(int fd, unsigned char *result_sha1);
 extern struct ref **get_remote_heads(int in, struct ref **list, int nr_match, 
char **match);
 
+extern struct packed_git *parse_pack_index(unsigned char *sha1);
+
 extern void prepare_packed_git(void);
+extern void install_packed_git(struct packed_git *pack);
+
+extern struct packed_git *find_sha1_pack(const unsigned char *sha1, 
+struct packed_git *packs);
+
 extern int use_packed_git(struct packed_git *);
 extern void unuse_packed_git(struct packed_git *);
 extern struct packed_git *add_packed_git(char *, int);
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -200,6 +200,56 @@ char *sha1_file_name(const unsigned char
return base;
 }
 
+char *sha1_pack_name(const unsigned char *sha1)
+{
+   static const char hex[] = "0123456789abcdef";
+   static char *name, *base, *buf;
+   int i;
+
+   if (!base) {
+   const char *sha1_file_directory = get_object_directory();
+   int len = strlen(sha1_file_directory);
+   base = xmalloc(len + 60);
+   sprintf(base, "%s/pack/pack-1234567890123456789012345678901234567890.pack", sha1_file_directory);
+   name = base + len + 11;
+   }
+
+   buf = name;
+
+   for (i = 0; i < 20; i++) {
+   unsigned int val = *sha1++;
+   *buf++ = hex[val >> 4];
+   *buf++ = hex[val & 0xf];
+   }
+   
+   return base;
+}
+
+char *sha1_pack_index_name(const unsigned char *sha1)
+{
+   static const char hex[] = "0123456789abcdef";
+   static char *name, *base, *buf;
+   int i;
+
+   if (!base) {
+   const char *sha1_file_directory = get_object_directory();
+   int len = strlen(sha1_file_directory);
+   base = xmalloc(len + 60);
+   sprintf(base, "%s/pack/pack-1234567890123456789012345678901234567890.idx", sha1_file_directory);
+   name = base + len + 11;
+   }
+
+   buf = name;
+
+   for (i = 0; i < 20; i++) {
+   unsigned int val = *sha1++;
+   *buf++ = hex[val >> 4];
+   *buf++ = hex[val & 0xf];
+   }
+   
+   return base;
+}
+
 struct alternate_object_database *alt_odb;
 
 /*
@@ -360,6 +410,14 @@ void unuse_packed_git(struct packed_git 
 
 int use_packed_git(struct packed_git *p)
 {
+   if (!p->pack_size) {
+   struct stat st;
+   // We created the struct before we had the pack
+   stat(p->pack_name, &st);
+   if (!S_ISREG(st.st_mode))
+   die("packfile %s not a regular file", p->pack_name);
+   p->pack_size = st.st_size;
+   }
 if (!p->pack_base) {
int fd;
struct stat st;
@@ -387,8 +445,10 @@ int use_packed_git(struct packed_git *p)
 * this is cheap.
 */
 if (memcmp((char*)(p->index_base) + p->index_size - 40,
-  p->pack_base + p->pack_size - 20, 20))
+  p->pack_base + p->pack_size - 20, 20)) {
+ 
 die("packfile %s does not match index.", p->pack_name);
+   }
}
 p->pack_last_used = 

[RFC PATCH 0/2] Parallel pull core

2005-07-31 Thread barkalow
This series makes the core of the pull programs parallel. It should not 
actually make any difference yet. It arranges to call prefetch() for each 
object as soon as it is determined to be needed, and then call fetch() on 
each object once there is nothing left to prefetch. By implementing 
prefetch(), an implementation can make additional requests while waiting 
for the data from the earlier ones to come in. Additionally, fetch() will 
be called in the same order that prefetch() was called, so the 
implementation can just make a series of requests and get responses.

If anyone else is also interested in working on this, it could go into 
-pu; I've tested it reasonably well, and I'm pretty sure that it doesn't 
have any effect until the implementations are changed to have prefetch() 
do something. I'm working on support for it in ssh-pull, and haven't 
started looking at http-pull support.

 1: Adds support to the struct object code to produce struct objects when 
the type is unknown and the content is unavailable; this allocates 
memory for the union of the supported types, so it is slightly less 
efficient, but allows the pull code to track objects it doesn't know 
anything about (such as the targets of tags).
 2: Parallelizes the pull algorithm.

-Daniel


[PATCH 1/2] Support for making struct objects for absent objects of unknown type

2005-07-31 Thread barkalow
This adds support for calling lookup_object_type() with NULL for the type, 
which will cause it to allocate enough memory for the largest type. This 
allows struct object_lists for objects that need to be fetched to find out 
what they are.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 object.c |   45 -
 object.h |8 
 tree.h   |1 +
 3 files changed, 53 insertions(+), 1 deletions(-)

6ed7f76e658c02cb2539a52813fe20d3fd9aa250
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -99,7 +99,9 @@ void mark_reachable(struct object *obj, 
 
 struct object *lookup_object_type(const unsigned char *sha1, const char *type)
 {
-   if (!strcmp(type, blob_type)) {
+   if (!type) {
+   return lookup_unknown_object(sha1);
+   } else if (!strcmp(type, blob_type)) {
 return &lookup_blob(sha1)->object;
 } else if (!strcmp(type, tree_type)) {
 return &lookup_tree(sha1)->object;
@@ -113,6 +115,27 @@ struct object *lookup_object_type(const 
}
 }
 
+union any_object {
+   struct object object;
+   struct commit commit;
+   struct tree tree;
+   struct blob blob;
+   struct tag tag;
+};
+
+struct object *lookup_unknown_object(const unsigned char *sha1)
+{
+   struct object *obj = lookup_object(sha1);
+   if (!obj) {
+   union any_object *ret = xmalloc(sizeof(*ret));
+   memset(ret, 0, sizeof(*ret));
+   created_object(sha1, &ret->object);
+   ret->object.type = NULL;
+   return &ret->object;
+   }
+   return obj;
+}
+
 struct object *parse_object(const unsigned char *sha1)
 {
unsigned long size;
@@ -150,3 +173,23 @@ struct object *parse_object(const unsign
}
return NULL;
 }
+
+struct object_list *object_list_insert(struct object *item,
+  struct object_list **list_p)
+{
+   struct object_list *new_list = xmalloc(sizeof(struct object_list));
+   new_list->item = item;
+   new_list->next = *list_p;
+   *list_p = new_list;
+   return new_list;
+}
+
+unsigned object_list_length(struct object_list *list)
+{
+   unsigned ret = 0;
+   while (list) {
+   list = list->next;
+   ret++;
+   }
+   return ret;
+}
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -31,8 +31,16 @@ void created_object(const unsigned char 
 /** Returns the object, having parsed it to find out what it is. **/
 struct object *parse_object(const unsigned char *sha1);
 
+/** Returns the object, with potentially excess memory allocated. **/
+struct object *lookup_unknown_object(const unsigned char *sha1);
+
 void add_ref(struct object *refer, struct object *target);
 
 void mark_reachable(struct object *obj, unsigned int mask);
 
+struct object_list *object_list_insert(struct object *item, 
+  struct object_list **list_p);
+
+unsigned object_list_length(struct object_list *list);
+
 #endif /* OBJECT_H */
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -14,6 +14,7 @@ struct tree_entry_list {
unsigned int mode;
char *name;
union {
+   struct object *any;
struct tree *tree;
struct blob *blob;
} item;



[PATCH 2/2] Parallelize pull algorithm (no change to behavior).

2005-07-31 Thread barkalow
This patch makes the core pull algorithm request all of the objects it can 
(with prefetch()) before actually reading them (with fetch()). Future 
patches (in a later series) will make use of this behavior to have 
multiple requests in flight at the same time, reducing the need for 
round-trips.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 http-pull.c  |4 ++
 local-pull.c |4 ++
 pull.c   |  128 ++
 pull.h   |7 +++
 ssh-pull.c   |4 ++
 5 files changed, 93 insertions(+), 54 deletions(-)

02aae5a570341c5b86b914d59732009f015800d8
diff --git a/http-pull.c b/http-pull.c
--- a/http-pull.c
+++ b/http-pull.c
@@ -68,6 +68,10 @@ static size_t fwrite_sha1_file(void *ptr
return size;
 }
 
+void prefetch(unsigned char *sha1)
+{
+}
+
 int fetch(unsigned char *sha1)
 {
char *hex = sha1_to_hex(sha1);
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -11,6 +11,10 @@ static int use_filecopy = 1;
 
 static char *path; /* Remote git repository */
 
+void prefetch(unsigned char *sha1)
+{
+}
+
 int fetch(unsigned char *sha1)
 {
static int object_name_start = -1;
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -17,11 +17,8 @@ int get_all = 0;
 int get_verbosely = 0;
 static unsigned char current_commit_sha1[20];
 
-static const char commitS[] = "commit";
-static const char treeS[] = "tree";
-static const char blobS[] = "blob";
-
-void pull_say(const char *fmt, const char *hex) {
+void pull_say(const char *fmt, const char *hex) 
+{
if (get_verbosely)
fprintf(stderr, fmt, hex);
 }
@@ -48,93 +45,114 @@ static int make_sure_we_have_it(const ch
return status;
 }
 
-static int process_unknown(unsigned char *sha1);
+static int process(unsigned char *sha1, const char *type);
 
-static int process_tree(unsigned char *sha1)
+static int process_tree(struct tree *tree)
 {
-   struct tree *tree = lookup_tree(sha1);
struct tree_entry_list *entries;
 
if (parse_tree(tree))
return -1;
 
 for (entries = tree->entries; entries; entries = entries->next) {
-   const char *what = entries->directory ? treeS : blobS;
-   if (make_sure_we_have_it(what, entries->item.tree->object.sha1))
+   if (process(entries->item.any->sha1,
+   entries->directory ? tree_type : blob_type))
 return -1;
-   if (entries->directory) {
-   if (process_tree(entries->item.tree->object.sha1))
-   return -1;
-   }
}
return 0;
 }
 
-static int process_commit(unsigned char *sha1)
+static int process_commit(struct commit *commit)
 {
-   struct commit *obj = lookup_commit(sha1);
-
-   if (make_sure_we_have_it(commitS, sha1))
+   if (parse_commit(commit))
return -1;
 
-   if (parse_commit(obj))
-   return -1;
+   memcpy(current_commit_sha1, commit->object.sha1, 20);
 
if (get_tree) {
-   if (make_sure_we_have_it(treeS, obj->tree->object.sha1))
-   return -1;
-   if (process_tree(obj->tree->object.sha1))
+   if (process(commit->tree->object.sha1, tree_type))
return -1;
if (!get_all)
get_tree = 0;
}
if (get_history) {
-   struct commit_list *parents = obj->parents;
+   struct commit_list *parents = commit->parents;
 for (; parents; parents = parents->next) {
 if (has_sha1_file(parents->item->object.sha1))
continue;
-   if (make_sure_we_have_it(NULL,
-parents->item->object.sha1)) {
-   /* The server might not have it, and
-* we don't mind. 
-*/
-   continue;
-   }
-   if (process_commit(parents->item->object.sha1))
+   if (process(parents->item->object.sha1,
+   commit_type))
return -1;
-   memcpy(current_commit_sha1, sha1, 20);
}
}
return 0;
 }
 
-static int process_tag(unsigned char *sha1)
+static int process_tag(struct tag *tag)
 {
-   struct tag *obj = lookup_tag(sha1);
-
-   if (parse_tag(obj))
+   if (parse_tag(tag))
return -1;
-   return process_unknown(obj->tagged->sha1);
+   return process(tag->tagged->sha1, NULL);
 }
 
-static int process_unknown(unsigned char *sha1)
+static struct object_list *process_queue = NULL;
+static struct object_list **process_queue_end = &process_queue;
+
+static int process(unsigned char *sha1, const char *type)
 {

cg-clone failing to get cogito latest tree.

2005-07-31 Thread Martin Langhoff
On a new machine, trying to bootstrap into the latest cogito, I download
and make cogito 0.12.1, and then...

$ cg-clone http://www.kernel.org/pub/scm/cogito/cogito.git cogito
defaulting to local storage area
14:48:53 URL:http://www.kernel.org/pub/scm/cogito/cogito.git/refs/heads/master
[41/41] - refs/heads/origin [1]
progress: 34 objects, 45126 bytes
error: File d2072194059c65f92487c84c53b9f6b5da780d14
(http://www.kernel.org/pub/scm/cogito/cogito.git/objects/d2/072194059c65f92487c84c53b9f6b5da780d14)
corrupt

Cannot obtain needed blob d2072194059c65f92487c84c53b9f6b5da780d14
while processing commit .
cg-pull: objects pull failed
cg-init: pull failed

any hints? I have a similar problem fetching git with cg-clone: 

$ cg-clone http://www.kernel.org/pub/scm/git/git.git git
defaulting to local storage area
14:53:44 URL:http://www.kernel.org/pub/scm/git/git.git/refs/heads/master
[41/41] - refs/heads/origin [1]
progress: 2 objects, 4666 bytes
error: File 6ff87c4664981e4397625791c8ea3bbb5f2279a3
(http://www.kernel.org/pub/scm/git/git.git/objects/6f/f87c4664981e4397625791c8ea3bbb5f2279a3)
corrupt

Cannot obtain needed blob 6ff87c4664981e4397625791c8ea3bbb5f2279a3
while processing commit .
cg-pull: objects pull failed
cg-init: pull failed

Probably doing something hopelessly wrong...


martin