Re: [RFC/PATCH 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Duy Nguyen
On Thu, Jul 11, 2013 at 12:50 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 +   maybe_flush_or_die(stdout, contact to stdout);

 On error this function will print

 write failure on 'contact to stdout'

 maybe maybe_flush_or_die(stdout, write contact to stdout) or
 something? From i18n point of view, maybe_flush_or_die should not
 compose a sentence this way. Let the second argument be a complete
 sentence so that translators have more freedom. But that's a different
 issue.

 Indeed, it's not ideal. I chose contact to stdout for consistency
 with other callers, not because of a fondness for it.

   % git grep maybe_flush_or_die
   builtin/blame.c: maybe_flush_or_die(stdout, stdout);
   builtin/check-attr.c: maybe_flush_or_die(stdout, attribute to stdout);
   builtin/check-attr.c: maybe_flush_or_die(stdout, attribute to stdout);
   builtin/check-ignore.c: maybe_flush_or_die(stdout, check-ignore to 
 stdout);
   builtin/check-ignore.c: maybe_flush_or_die(stdout, ignore to stdout);
   builtin/hash-object.c: maybe_flush_or_die(stdout, hash to stdout);
   builtin/rev-list.c: maybe_flush_or_die(stdout, stdout);
   log-tree.c: maybe_flush_or_die(stdout, stdout);

 They seem pretty evenly split between just stdout and foo to
 stdout. (I actually prefer plain stdout and will happily change it
 to that.)

Then probably stick to the convention. If this i18n thing is fixed, it
has to be fixed everywhere anyway.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Antoine Pelisse
On Wed, Jul 10, 2013 at 9:03 PM, Eric Sunshine sunsh...@sunshineco.com wrote:
 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 +   const char *name, *mail;
 +   size_t namelen, maillen;
 +   struct ident_split ident;
 +   char term = null_out ? '\0' : '\n';
 +
 +   if (split_ident_line(ident, contact, strlen(contact)))
 +   die(_(unable to parse contact: %s), contact);
 +
 +   name = ident.name_begin;
 +   namelen = ident.name_end - ident.name_begin;
 +   mail = ident.mail_begin;
 +   maillen = ident.mail_end - ident.mail_begin;
 +
 +   map_user(mailmap, mail, maillen, name, namelen);

Would it be useful to check the return value of this function, to
display a message when the name can't mapped ?

 +   if (namelen)
 +   printf(%.*s %.*s%c,
 +   (int)namelen, name, (int)maillen, mail, term);
 +   else
 +   printf(%.*s%c, (int)maillen, mail, term);
 +}
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Priming git clone with a local repo?

2013-07-11 Thread Andreas Krey
Hi, dear listers,

I'm wondering if there is (or will be) a way of doing almost

  git clone --reference localrepo host:canonrep

Basically, I don't want the implications of --reference but still the
performance advantages of reusing local objects/pack files. I probably
have to go and first do a

  git clone -s -n localrepo canonrepo

and then go into canonrepo, fix up the remote, fetch, and properly
reset the local branch. Is there an easier way of doing a
--reference-but-hardlink-objects-instead-of-setting-alternate
within git? Or implementing it?

Andreas

-- 
Totally trivial. Famous last words.
From: Linus Torvalds torvalds@*.org
Date: Fri, 22 Jan 2010 07:29:21 -0800
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Eric Sunshine
On Thu, Jul 11, 2013 at 2:45 AM, Antoine Pelisse apeli...@gmail.com wrote:
 On Wed, Jul 10, 2013 at 9:03 PM, Eric Sunshine sunsh...@sunshineco.com 
 wrote:
 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 +   const char *name, *mail;
 +   size_t namelen, maillen;
 +   struct ident_split ident;
 +   char term = null_out ? '\0' : '\n';
 +
 +   if (split_ident_line(ident, contact, strlen(contact)))
 +   die(_(unable to parse contact: %s), contact);
 +
 +   name = ident.name_begin;
 +   namelen = ident.name_end - ident.name_begin;
 +   mail = ident.mail_begin;
 +   maillen = ident.mail_end - ident.mail_begin;
 +
 +   map_user(mailmap, mail, maillen, name, namelen);

 Would it be useful to check the return value of this function, to
 display a message when the name can't mapped ?

I thought about it but decided against the added complexity (at least
initially) for a few reasons.

There are only two callers in the existing code which even look at the
return value:

  % git grep 'map_user('
  builtin/blame.c: map_user(mailmap, mailbuf, maillen,
  builtin/shortlog.c: map_user(log-mailmap, mailbuf, maillen,
namebuf, namelen);
  pretty.c: map_user(pp-mailmap, mailbuf, maillen, namebuf, namelen);
  pretty.c: return mail_map-nr  map_user(mail_map, email,
email_len, name, name_len);
  revision.c: if (map_user(mailmap, mail, maillen, name, namelen)) {

Of those, mailmap_name() in pretty.c merely passes the return value
along to its callers, but its callers don't care about it:

  % git grep 'mailmap_name('
  pretty.c: mailmap_name(mail, maillen, name, namelen);

commit_rewrite_person() in revision.c uses the return value apparently
only as an optimization to decide if it should go through the effort
of replacing the old person with the re-mapped person, and then
passes the return value along to its callers, none of which care:

  % git grep 'commit_rewrite_person('
  revision.c: commit_rewrite_person(buf, \nauthor , opt-mailmap);
  revision.c: commit_rewrite_person(buf, \ncommitter , opt-mailmap);

The sort of optimization taken by commit_rewrite_person() is
effectively lost to script and porcelain clients of check-mailmap
since the cost of executing the command probably swamps any savings
the client might otherwise achieve by knowing whether the person was
remapped. Also, modulo possible whitespace changes, a client can
compare what it passed in to what is received back to determine if
remapping occurred.

As plumbing, the expectation is that most clients will pass a value in
and use the output without further interpretation. If check-mailmap
unconditionally adds some mapped or not mapped indicator to each
returned value, then that places extra burden on all clients by
forcing them to parse the result. Alternately, if the indicator is
controlled by a command-line option, then (at the least) that
increases documentation costs.

For people using check-mailmap as a .mailmap debugging aid, such an
indicator might indeed be useful, however, it seems prudent to keep
things simple initially by omitting the bells and whistles until
practical experience shows that more features (and complexity) are
warranted.

 +   if (namelen)
 +   printf(%.*s %.*s%c,
 +   (int)namelen, name, (int)maillen, mail, term);
 +   else
 +   printf(%.*s%c, (int)maillen, mail, term);
 +}
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/22] documentation: add documentation of the index-v5 file format

2013-07-11 Thread Duy Nguyen
On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +== File entry (fileentries)
 +
 +  File entries are sorted in ascending order on the name field, after the
 +  respective offset given by the directory entries. All file names are
 +  prefix compressed, meaning the file name is relative to the directory.
 +
 +  filename (variable length, nul terminated). The exact encoding is
 +undefined, but the filename cannot contain a NUL byte (iow, the same
 +encoding as a UNIX pathname).
 +
 +  flags (16-bits): 'flags' field split into (high to low bits)
 +
 +assumevalid (1-bit): assume-valid flag
 +
 +intenttoadd (1-bit): intent-to-add flag, used by git add -N.
 +  Extended flag in index v3.
 +
 +stage (2-bit): stage of the file during merge
 +
 +skipworktree (1-bit): skip-worktree flag, used by sparse checkout.
 +  Extended flag in index v3.
 +
 +smudged (1-bit): indicates if the file is racily smudged.
 +
 +10-bit unused, must be zero [6]
 +
 +  mode (16-bits): file mode, split into (high to low bits)
 +
 +objtype (4-bits): object type
 +  valid values in binary are 1000 (regular file), 1010 (symbolic
 +  link) and 1110 (gitlink)
 +
 +3-bit unused
 +
 +permission (9-bits): unix permission. Only 0755 and 0644 are valid
 +  for regular files. Symbolic links and gitlinks have value 0 in
 +  this field.
 +
 +  mtimes (32-bits): mtime seconds, the last time a file's data changed
 +this is stat(2) data
 +
 +  mtimens (32-bits): mtime nanosecond fractions
 +this is stat(2) data
 +
 +  file size (32-bits): The on-disk size, trucated to 32-bit.
 +this is stat(2) data
 +
 +  statcrc (32-bits): crc32 checksum over ctime seconds, ctime
 +nanoseconds, ino, dev, uid, gid (All stat(2) data
 +except mtime and file size). If the statcrc is 0 it will
 +be ignored. [7]
 +
 +  objhash (160-bits): SHA-1 for the represented object
 +
 +  entrycrc (32-bits): crc32 checksum for the file entry. The crc code
 +includes the offset to the offset to the file, relative to the
 +beginning of the file.

Question about the possibility of updating index file directly. If git
updates a few fields of an entry (but not entrycrc yet) and crashes,
the entry would become corrupt because its entrycrc does not match the
content. What do we do? Do we need to save a copy of the entry
somewhere in the index file (maybe in the conflict data section), so
that the reader can recover the index? Losing the index because of
bugs is big deal in my opinion. pre-v5 never faces this because we
keep the original copy til the end.

Maybe entrycrc should not cover stat fields and statcrc. It would make
refreshing safer. If the above happens during refresh, only statcrc is
corrupt and we can just refresh the entry. entrycrc still says the
other fields are good (and they are).
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


subtree merge strategy - merge upstream branches that share history

2013-07-11 Thread Volkmar Glauche

Dear list,

I would like to set up a repository that has multiple branches from a  
upstream project merged at different paths. Also, I would like to be  
able to access upstream history in my project (it is interpreted code,  
and I would like to git-blame to display a mix of local and upstream  
history line-by-line in the code).
I have documented the steps to demonstrate my desired repository  
layout in the attached shell script snippet. It is self-contained and  
sets up two repositories upstream and main-repo in /tmp. It then reads  
two branches featureA and featureB from upstream into separate paths  
in main-repo.
Then, the script modifies contents in the upstream repository. There  
are modifications that apply to both featureA and featureB as well as  
modifications to one branch only.
Up to this point, things seemed to work well using either git-subtree  
or a combination of git-read-tree and subtree merge strategy.
Finally, the changes to upstream are to be pulled into the  
corresponding paths in main-repo. Here, neither git-subtree nor git  
pull with subtree merge do what I want. The changes that apply to one  
of the features only are applied correctly. However, changes from the  
common history of featureA and featureB are only applied to the path  
that is updated first.
Apparently, git records that the common change has been applied and  
doesn't apply it a second time (which is correct for ~99.999% of  
all use cases). In this special case however, git does not take into  
account that it applied the common change to a different path.
I found that git pull --squash would apply the right set of updates to  
both featureA and featureB in main-repo (i.e. in this case, the common  
change would be applied on each path). However, this would leave me  
unlinked to upstream history.

Now:
1) Am I doing something completely wrong, am I missing some important detail?
2) Am I asking for something impossible?
3) Is it expected behaviour, that --squash adds a different set of  
text changes than a pull without squash?


Best,

Volkmar




--
Freiburg Brain Imaging
http://fbi.uniklinik-freiburg.de/
Tel. +761 270-54783
Fax. +761 270-54819



upstream-example.sh
Description: application/shellscript


Re: [PATCH] git clone depth of 0 not possible.

2013-07-11 Thread Matthijs Kooijman
Hi Junio,

 While implementing the above, I noticed my fix now introduced an
 off-by-one error the other way. When investigating, I found this commit:
 
   commit 682c7d2f1a2d1a5443777237450505738af2ff1a
   Author: Nguyễn Thái Ngọc Duy pclo...@gmail.com
   Date:   Fri Jan 11 16:05:47 2013 +0700
 
   upload-pack: fix off-by-one depth calculation in shallow clone
 
   get_shallow_commits() is used to determine the cut points at a given
   depth (i.e. the number of commits in a chain that the user likes to
   get). However we count current depth up to the commit commit but 
 we
   do the cutting at its parents (i.e. current depth + 1). This makes
   upload-pack always return one commit more than requested. This patch
   fixes it.
 
   Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
   Signed-off-by: Junio C Hamano gits...@pobox.com
 
 Which actually seems to fix the off-by-one bug that is described in this
 thread, but without going through the hoops of preserving current
 behaviour for older git versions (that is, it makes behaviour dependent
 on server version instead of client version).
 
 Does this mean the discussion in this thread is meaningless, or is that
 commit not intended to be the final fix?
Looking more closely, I also see that the above change is already
released in 1.8.2 versions. Given that, I don't think it makes sense to
to still try to provide this capability to get backward compatible
behaviour, since this would cause a off-by-one error the other way when
talking to 1.8.2.x servers...

However, since I pretty much finished the code for this, I'll send over
the patches and let you decide wether to include them or not. If you
want to include them but they need to be changed in some way, just let
me know.

The first patch of the series should be merged regardless.

Gr.

Matthijs
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/10] pack-revindex: radix-sort the revindex

2013-07-11 Thread Jeff King
On Wed, Jul 10, 2013 at 06:47:49PM +0530, Ramkumar Ramachandra wrote:

  For a 64-bit off_t, using 16-bit digits gives us k=4.
 
 Wait, isn't off_t a signed data type?  Did you account for that in
 your algorithm?

It is signed, but the values we are storing in the revindex are all
positive file offsets. Right-shifting a positive signed type is
explicitly allowed in C.

  -static int cmp_offset(const void *a_, const void *b_)
  +/*
  + * This is a least-significant-digit radix sort.
  + */
 
 Any particular reason for choosing LSD, and not MSD?

Simplicity. An MSD implementation should have the same algorithmic
complexity and in theory, one can do MSD in-place. I'm happy enough with
the speedup here, but if you want to take a stab at beating my times
with MSD, please feel free.

The other usual downside of MSD is that it is typically not stable,
but we don't care about that here. We know that our sort keys are
unique.

  +#define DIGIT_SIZE (16)
  +#define BUCKETS (1  DIGIT_SIZE)
 
 Okay, NUMBER_OF_BUCKETS = 2^RADIX, and you choose a hex radix.  Is
 off_t guaranteed to be fixed-length though?  I thought only the ones
 in stdint.h were guaranteed to be fixed-length?

I'm not sure what you mean by fixed-length. If you mean does it have the
same size on every platform, then no. It will typically be 32-bit on
platforms without largefile support, and 64-bit elsewhere. But it
shouldn't matter. We'll first sort the entries by the lower 16 bits, and
then if we have more bits, by the next 16 bits, and so on. We quit when
the maximum value to sort (which we know ahead of time from the size of
the packfile) is smaller than the 16-bits we are on. So we don't need to
know the exact size of off_t, only the maximum value in our list (which
must, by definition, be smaller than what can be represented by off_t).

  +   /*
  +* We want to know the bucket that a[i] will go into when we are 
  using
  +* the digit that is N bits from the (least significant) end.
  +*/
  +#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset  (bits))  (BUCKETS-1))
 
 Ouch!  This is unreadable.  Just write an inline function instead?  A
 % would've been easier on the eyes, but you chose base-16.

I specifically avoided an inline function because they are subject to
compiler settings. This isn't just it would be a bit faster if this got
inlined, and OK otherwise but this would be horribly slow if not
inlined.

I'm also not sure that

  static inline unsigned bucket_for(const struct revindex *a,
unsigned i,
unsigned bits)
  {
  return a[i].offset  bits  (BUCKETS-1);
  }

is actually any more readable.

I'm not sure what you mean by base-16. No matter the radix digit size,
as long as it is an integral number of bits, we can mask it off, which
is more efficient than modulo. A good compiler should see that it
is a constant and convert it to a bit-mask, but I'm not sure I agree
that modular arithmetic is more readable. This is fundamentally a
bit-twiddling operation, as we are shifting and masking.

I tried to explain it in the comment; suggestions to improve that are
welcome.

  +   /*
  +* We need O(n) temporary storage, so we sort back and forth between
  +* the real array and our tmp storage. To keep them straight, we 
  always
  +* sort from a into buckets in b.
  +*/
  +   struct revindex_entry *tmp = xcalloc(n, sizeof(*tmp));
 
 Shouldn't this be sizeof (struct revindex_entry), since tmp hasn't
 been declared yet?

No, the variable is declared (but uninitialized) in its initializer.
Despite its syntax, sizeof() is not a function and does not care about
the state of the variable, only its type.

 Also, s/n/revindex_nr/, and something more appropriate for tmp?

What name would you suggest be be more appropriate for tmp?

  +   int bits = 0;
  +   unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
 
 sizeof(unsigned int), for clarity, if not anything else.

I disagree; in general, I prefer using sizeof(*var) rather than
sizeof(type), because it avoids repeating ourselves, and there is no
compile-time check that you have gotten it right.

In the initializer it is less important, because the type is right
there. But when you are later doing:

  memset(pos, 0, BUCKETS * sizeof(*pos));

this is much more robust. If somebody changes the type of pos, the
memset line does not need changed. If you used sizeof(unsigned), then
the code is now buggy (and the compiler cannot notice).

 You picked malloc over calloc here, because you didn't want to incur
 the extra cost of zero-initializing the memory?

Yes. We have to zero-initialize in each loop, so there is no point
spending the extra effort on calloc.

We could also xcalloc inside each loop iteration, but since we need the
same-size allocation each time, I hoisted the malloc out of the loop.

 Also, pos is the actual buckets array, I presume (hence unsigned,

Re: [PATCH 10/10] pack-revindex: radix-sort the revindex

2013-07-11 Thread Jeff King
On Wed, Jul 10, 2013 at 10:10:16AM -0700, Brandon Casey wrote:

  On the linux.git repo, with about 3M objects to sort, this
  yields a 400% speedup. Here are the best-of-five numbers for
  running echo HEAD | git cat-file --batch-disk-size, which
  is dominated by time spent building the pack revindex:
 
before after
real0m0.834s   0m0.204s
user0m0.788s   0m0.164s
sys 0m0.040s   0m0.036s
 
  On a smaller repo, the radix sort would not be
  as impressive (and could even be worse), as we are trading
  the log(n) factor for the k=4 of the radix sort. However,
  even on git.git, with 173K objects, it shows some
  improvement:
 
before after
real0m0.046s   0m0.017s
user0m0.036s   0m0.012s
sys 0m0.008s   0m0.000s
 
 k should only be 2 for git.git.  I haven't packed in a while, but I
 think it should all fit within 4G.  :)  The pathological case would be
 a pack file with very few very very large objects, large enough to
 push the pack size over the 2^48 threshold so we'd have to do all four
 radixes.

Yeah, even linux.git fits into k=2. And that does more or less explain
the numbers in both cases.

For git.git, With 173K objects, log(n) is ~18, so regular sort is 18n.
With a radix sort of k=2, which has a constant factor of 2 (you can see
by looking at the code that we go through the list twice per radix), we
have 4n. So there should be a 4.5x speedup. We don't quite get that,
which is probably due to the extra bookkeeping on the buckets.

For linux.git, with 3M objects, log(n) is ~22, so the speedup we hope
for is 5.5x. We end up with 4x.

 It's probably worth mentioning here and/or in the code that k is
 dependent on the pack file size and that we can jump out early for
 small pack files.  That's my favorite part of this code by the way. :)

Yeah, I agree it is probably worth mentioning along with the numbers; it
is where half of our speedup is coming from. I think the max  bits
loop condition deserves to be commented, too. I'll add that.

Also note that my commit message still refers to --batch-disk-size
which does not exist anymore. :) I didn't update the timings in the
commit message for my re-roll, but I did confirm that they are the same.

  +   /*
  +* We need O(n) temporary storage, so we sort back and forth between
  +* the real array and our tmp storage. To keep them straight, we 
  always
  +* sort from a into buckets in b.
  +*/
  +   struct revindex_entry *tmp = xcalloc(n, sizeof(*tmp));
 
 Didn't notice it the first time I read this, but do we really need
 calloc here?  Or will malloc do?

No, a malloc should be fine. I doubt it matters much, but there's no
reason not to go the cheap route.

  +   struct revindex_entry *a = entries, *b = tmp;
  +   int bits = 0;
  +   unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
  +
  +   while (max  bits) {
  +   struct revindex_entry *swap;
  +   int i;
 
 You forgot to make i unsigned.  See below too...

Oops. Thanks for catching.

  +   /*
  +* Now we can drop the elements into their correct buckets 
  (in
  +* our temporary array).  We iterate the pos counter 
  backwards
  +* to avoid using an extra index to count up. And since we 
  are
  +* going backwards there, we must also go backwards through 
  the
  +* array itself, to keep the sort stable.
  +*/
  +   for (i = n - 1; i = 0; i--)
  +   b[--pos[BUCKET_FOR(a, i, bits)]] = a[i];
 
 ...which is why the above loop still works.

Since we are iterating by ones, I guess I can just compare to UINT_MAX.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 06/10] cat-file: add --batch-check=format

2013-07-11 Thread Jeff King
On Wed, Jul 10, 2013 at 08:21:15PM +0530, Ramkumar Ramachandra wrote:

 Jeff King wrote:
  +If `--batch` or `--batch-check` is given, `cat-file` will read objects
  +from stdin, one per line, and print information about them.
  +
  +You can specify the information shown for each object by using a custom
  +`format`. The `format` is copied literally to stdout for each
  +object, with placeholders of the form `%(atom)` expanded, followed by a
  +newline. The available atoms are:
  +
  +If no format is specified, the default format is `%(objectname)
  +%(objecttype) %(objectsize)`.
  +
  +If `--batch` is specified, the object information is followed by the
  +object contents (consisting of `%(objectsize)` bytes), followed by a
  +newline.
 
 I find this slightly hideous, and would have expected an
 %(objectcontents) or similar.

I looked into doing that, but it makes the code significantly more
complicated, assuming you do not want to copy the full object contents
in memory. You cannot use strbuf_expand, and you need to worry about
buffering/flushing more (you do not want to write() each individual
item, but if you are using printf(), you need to flush before using the
unbuffered streaming interface).

My thinking was to leave it until somebody actually wants it, at which
point they can do the necessary refactoring (and hopefully this would be
part of unifying it with other format-parsers).

If we were designing from scratch and this was the difference between
having --batch-check and --batch, or having a single --batch, I'd
care more about doing %(objectcontents) right away. But because we must
support the historical --batch/--batch-check distinction anyway, I don't
think this is any worse.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] upload-pack: Remove a piece of dead code

2013-07-11 Thread Matthijs Kooijman
Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
clone) introduced a new check in get_shallow_commits to decide when to
stop traversing the history and mark the current commit as a shallow
root.

With this new check in place, the old check can no longer be true, since
the first check always fires first. This commit removes that check,
making the code a bit more simple again.

Signed-off-by: Matthijs Kooijman matth...@stdin.nl
---
 shallow.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/shallow.c b/shallow.c
index cbe2526..8a9c96d 100644
--- a/shallow.c
+++ b/shallow.c
@@ -110,17 +110,12 @@ struct commit_list *get_shallow_commits(struct 
object_array *heads, int depth,
continue;
*pointer = cur_depth;
}
-   if (cur_depth  depth) {
-   if (p-next)
-   add_object_array(p-item-object,
-   NULL, stack);
-   else {
-   commit = p-item;
-   cur_depth = *(int *)commit-util;
-   }
-   } else {
-   commit_list_insert(p-item, result);
-   p-item-object.flags |= shallow_flag;
+   if (p-next)
+   add_object_array(p-item-object,
+   NULL, stack);
+   else {
+   commit = p-item;
+   cur_depth = *(int *)commit-util;
}
}
}
-- 
1.8.3.rc1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] upload-pack: Introduce new fixed-off-by-one-depth server feature

2013-07-11 Thread Matthijs Kooijman
Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
clone) changed the meaning of the fetch depth sent over the wire to mean
the total number of commits to return, instead of the number of commits
beyond the first. However, when this change is deployed on some servers
but not others, this can cause a client to behave differently based on
the server version, which is unexpected.

To prevent this, the new, fixed, depth behaviour is advertised as a server
feature and the old behaviour is restored when the feature is not
requested by the client.

Signed-off-by: Matthijs Kooijman matth...@stdin.nl
---
 upload-pack.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 127e59a..59f43d1 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -46,6 +46,7 @@ static unsigned int timeout;
 static int use_sideband;
 static int advertise_refs;
 static int stateless_rpc;
+static int fixed_depth;
 
 static void reset_timeout(void)
 {
@@ -633,6 +634,8 @@ static void receive_needs(void)
no_progress = 1;
if (parse_feature_request(features, include-tag))
use_include_tag = 1;
+   if (parse_feature_request(features, fixed-off-by-one-depth))
+   fixed_depth = 1;
 
o = parse_object(sha1_buf);
if (!o)
@@ -669,10 +672,14 @@ static void receive_needs(void)
struct object *object = 
shallows.objects[i].item;
object-flags |= NOT_SHALLOW;
}
-   else
+   else {
+   /* Emulate off-by-one bug in older versions */
+   if (!fixed_depth)
+   depth++;
backup = result =
get_shallow_commits(want_obj, depth,
SHALLOW, NOT_SHALLOW);
+   }
while (result) {
struct object *object = result-item-object;
if (!(object-flags  (CLIENT_SHALLOW|NOT_SHALLOW))) {
@@ -738,7 +745,7 @@ static int send_ref(const char *refname, const unsigned 
char *sha1, int flag, vo
 {
static const char *capabilities = multi_ack thin-pack side-band
 side-band-64k ofs-delta shallow no-progress
-include-tag multi_ack_detailed;
+include-tag multi_ack_detailed fixed-off-by-one-depth;
const char *refname_nons = strip_namespace(refname);
unsigned char peeled[20];
 
-- 
1.8.3.rc1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] fetch-pack: Request fixed-off-by-one-depth when available

2013-07-11 Thread Matthijs Kooijman
This server feature changes the meaning of the fetch depth, allowing
fetching only a single revision instead of at least two as before. To
make sure the behaviour only depends on the client version, the depth
value sent over the wire is corrected depending on wether the server has
the fix.

There is one corner case: A server without the fix cannot send less than
2 commmits, so when --depth=1 is specified a warning is shown and 2
commits are fetched instead of 1.

Signed-off-by: Matthijs Kooijman matth...@stdin.nl
---
 fetch-pack.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index abe5ffb..799b2c1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -39,6 +39,7 @@ static int marked;
 
 static struct commit_list *rev_list;
 static int non_common_revs, multi_ack, use_sideband, allow_tip_sha1_in_want;
+static int fixed_depth;
 
 static void rev_list_push(struct commit *commit, int mark)
 {
@@ -327,6 +328,7 @@ static int find_common(struct fetch_pack_args *args,
if (prefer_ofs_delta)   strbuf_addstr(c,  ofs-delta);
if (agent_supported)strbuf_addf(c,  agent=%s,

git_user_agent_sanitized());
+   if (fixed_depth)strbuf_addstr(c,  
fixed-off-by-one-depth);
packet_buf_write(req_buf, want %s%s\n, remote_hex, 
c.buf);
strbuf_release(c);
} else
@@ -342,8 +344,23 @@ static int find_common(struct fetch_pack_args *args,
 
if (is_repository_shallow())
write_shallow_commits(req_buf, 1);
-   if (args-depth  0)
-   packet_buf_write(req_buf, deepen %d, args-depth);
+   if (args-depth  0) {
+   if (!fixed_depth  args-depth == 1)
+   warning(Server does not support depth=1, using depth=2 
instead);
+   if (!fixed_depth  args-depth  1) {
+   /* Old server that interprets deepen 1 as
+  give me tip + 1 extra commit */
+   packet_buf_write(req_buf, deepen %d, args-depth - 
1);
+   } else if (!fixed_depth  args-depth == 1) {
+   /* Old servers cannot handle depth=1 (deepen=0
+  means don't change depth / full depth). */
+   packet_buf_write(req_buf, deepen 1);
+   } else {
+   /* New server, send depth as-is */
+   packet_buf_write(req_buf, deepen %d, args-depth);
+   }
+   }
+
packet_buf_flush(req_buf);
state_len = req_buf.len;
 
@@ -874,6 +891,11 @@ static struct ref *do_fetch_pack(struct fetch_pack_args 
*args,
fprintf(stderr, Server supports ofs-delta\n);
} else
prefer_ofs_delta = 0;
+   if (server_supports(fixed-off-by-one-depth)) {
+   if (args-verbose)
+   fprintf(stderr, Server has fixed meaning of depth 
value\n);
+   fixed_depth = 1;
+   }
 
if ((agent_feature = server_feature_value(agent, agent_len))) {
agent_supported = 1;
-- 
1.8.3.rc1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5.5/22] Add documentation for the index api

2013-07-11 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Wed, Jul 10, 2013 at 3:10 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 If you happen to know that certain entries match the given pathspec,
 you could help the caller avoid match_pathspec'ing again by set a bit
 in ce_flags.

 I currently don't know which entries do match the pathspec from just
 reading the index file, additional calls would be needed.  I don't think
 that would be worth the overhead.

 Yeah I now see that you select what to load in v5 with the adjusted
 pathspec, not the input pathspec. Originally I thought you match the
 input pathspec against every file entry in the index :P Your adjusted
 pathspec looks like what common_prefix is for. It's cheaper than
 creating adjusted_pathspec from match_pathspec and reduces loading in
 major cases, where glob is not used.

 Still, creating an adjusted pathspec this way looks iffy. You need to
 understand pathspec in order to strip the filename part out to match
 the directory match only. An alternative is use
 tree_entry_interesting. It goes along well with tree traversal and can
 be used to match directories with original pathspec. Once you see it
 matches an entry in a directory, you could skip matching the rest of
 the files and load the whole directory. read_index_filtered_v5 and
 read_entries may need some tweaking though. I'll try it and post a
 patch later if I succeed.

Hrm, I played around a bit with this idea, but I couldn't figure out how
to make it work.  For it to work we would still have to load some
entries in a directory at least?  Or is there a way to match the
directories, which I just haven't figured out yet?

 To know which entry exists in the index and which is
 new, use another flag. Most reader code won't change if we do it this
 way, all match_pathspec() remain where they are.

 Hrm you mean to know which cache entries are added (or changed) in the
 in-memory index and will have to be written later?  I'm not sure I
 understand correctly what you mean here.

 Oh.. The to know.. sentence was nonsense. We probably don't need to
 know. We may track changed entries for partial writing, but let's
 leave that out for now.

Ok, makes sense.

 +`index_change_filter_opts(opts)`::
 +   This function again has a slightly different functionality for
 +   index-v2 and index-v5.
 +
 +   For index-v2 it simply changes the filter_opts, so
 +   for_each_index_entry uses the changed index_opts, to iterate
 +   over a different set of cache entries.
 +
 +   For index-v5 it refreshes the index if the filter_opts have
 +   changed and sets the new filter_opts in the index state, again
 +   to iterate over a different set of cache entries as with
 +   index-v2.
 +
 +   This has some optimization potential, in the case that the
 +   opts get stricter (less of the index should be read) it
 +   doesn't have to reload anything, but currently does.

 The only use case I see so far is converting a partial index_state
 back to a full one. Apart from doing so in order to write the new
 index, I think some operation (like rename tracking in diff or
 unpack-trees) may expect full index. I think we should support that. I
 doubt we need to change pathspec to something different than the one
 we used to load the index. When a user passes a pathspec to a command,
 the user expects the command to operate on that set only, not outside.

 One application was in ls-files, where we strip the trailing slash from
 the pathspecs for submodules.  But when we let the caller filter the
 rest out it's not needed anymore.  We load all entries without the
 trailing slash anyway.

 That submodule trailing slash stripping code will be moved away soon
 (I've been working on it for some time now). There's similar code in
 pathspec.c. I hope by the time this series becomes a candidate for
 'next', those pathspec manipulation is already gone. For
 strip_trailing_slash_from_submodules, peeking in index file for a few
 entries is probably ok. For check_path_for_gitlink, full index is
 loaded until we figure out a clever way.

Ah great, for now I'll just not use the for_each_index_entry function in
ls-files, and then change the code later once the stripping code is
moved away.

 Some thoughts about the writing api.

 In think we should avoid automatically converting partial index into a
 full one before writing. Push that back to the caller and die() when
 asked to update partial index. They know at what point the index may
 be updated and even what part of it may be updated. I think all
 commands fall into two categories, tree-wide updates (merge,
 checkout...) and limited by the user-given pathspec. what part to be
 updated is not so hard to determine.

 Hrm this is only true if index entries are added or removed, not if they
 are only changed.  If they are only changed we can write a partially
 read index once we have partial writing.

 Yep. We can detect if changes are updates only, no 

Re: [PATCH 13/22] documentation: add documentation of the index-v5 file format

2013-07-11 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +== File entry (fileentries)
 +
 +  File entries are sorted in ascending order on the name field, after the
 +  respective offset given by the directory entries. All file names are
 +  prefix compressed, meaning the file name is relative to the directory.
 +
 +  filename (variable length, nul terminated). The exact encoding is
 +undefined, but the filename cannot contain a NUL byte (iow, the same
 +encoding as a UNIX pathname).
 +
 +  flags (16-bits): 'flags' field split into (high to low bits)
 +
 +assumevalid (1-bit): assume-valid flag
 +
 +intenttoadd (1-bit): intent-to-add flag, used by git add -N.
 +  Extended flag in index v3.
 +
 +stage (2-bit): stage of the file during merge
 +
 +skipworktree (1-bit): skip-worktree flag, used by sparse checkout.
 +  Extended flag in index v3.
 +
 +smudged (1-bit): indicates if the file is racily smudged.
 +
 +10-bit unused, must be zero [6]
 +
 +  mode (16-bits): file mode, split into (high to low bits)
 +
 +objtype (4-bits): object type
 +  valid values in binary are 1000 (regular file), 1010 (symbolic
 +  link) and 1110 (gitlink)
 +
 +3-bit unused
 +
 +permission (9-bits): unix permission. Only 0755 and 0644 are valid
 +  for regular files. Symbolic links and gitlinks have value 0 in
 +  this field.
 +
 +  mtimes (32-bits): mtime seconds, the last time a file's data changed
 +this is stat(2) data
 +
 +  mtimens (32-bits): mtime nanosecond fractions
 +this is stat(2) data
 +
 +  file size (32-bits): The on-disk size, trucated to 32-bit.
 +this is stat(2) data
 +
 +  statcrc (32-bits): crc32 checksum over ctime seconds, ctime
 +nanoseconds, ino, dev, uid, gid (All stat(2) data
 +except mtime and file size). If the statcrc is 0 it will
 +be ignored. [7]
 +
 +  objhash (160-bits): SHA-1 for the represented object
 +
 +  entrycrc (32-bits): crc32 checksum for the file entry. The crc code
 +includes the offset to the offset to the file, relative to the
 +beginning of the file.

 Question about the possibility of updating index file directly. If git
 updates a few fields of an entry (but not entrycrc yet) and crashes,
 the entry would become corrupt because its entrycrc does not match the
 content. What do we do? Do we need to save a copy of the entry
 somewhere in the index file (maybe in the conflict data section), so
 that the reader can recover the index? Losing the index because of
 bugs is big deal in my opinion. pre-v5 never faces this because we
 keep the original copy til the end.

 Maybe entrycrc should not cover stat fields and statcrc. It would make
 refreshing safer. If the above happens during refresh, only statcrc is
 corrupt and we can just refresh the entry. entrycrc still says the
 other fields are good (and they are).

The original idea was to change the lock-file for partial writing to
make it work for this case.  The exact structure of the file still has
to be defined, but generally it would be done in the following steps:

  1. Write the changed entry to the lock-file
  2. Change the entry in the index
  3. If we succeed delete the lock-file (commit the transaction)

If git crashes, and leaves the index corrupted, we can recover the
information from the lock-file and write the new information to the
index file and then delete the lock-file.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/22] documentation: add documentation of the index-v5 file format

2013-07-11 Thread Duy Nguyen
On Thu, Jul 11, 2013 at 6:39 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Question about the possibility of updating index file directly. If git
 updates a few fields of an entry (but not entrycrc yet) and crashes,
 the entry would become corrupt because its entrycrc does not match the
 content. What do we do? Do we need to save a copy of the entry
 somewhere in the index file (maybe in the conflict data section), so
 that the reader can recover the index? Losing the index because of
 bugs is big deal in my opinion. pre-v5 never faces this because we
 keep the original copy til the end.

 Maybe entrycrc should not cover stat fields and statcrc. It would make
 refreshing safer. If the above happens during refresh, only statcrc is
 corrupt and we can just refresh the entry. entrycrc still says the
 other fields are good (and they are).

 The original idea was to change the lock-file for partial writing to
 make it work for this case.  The exact structure of the file still has
 to be defined, but generally it would be done in the following steps:

   1. Write the changed entry to the lock-file
   2. Change the entry in the index
   3. If we succeed delete the lock-file (commit the transaction)

 If git crashes, and leaves the index corrupted, we can recover the
 information from the lock-file and write the new information to the
 index file and then delete the lock-file.

Ah makes sense. Still concerned about refreshing though. Updated files
are usually few while refreshed files could be a lot more, increasing
the cost at #1.
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] upload-pack: Remove a piece of dead code

2013-07-11 Thread Duy Nguyen
On Thu, Jul 11, 2013 at 6:25 PM, Matthijs Kooijman matth...@stdin.nl wrote:
 Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
 clone) introduced a new check in get_shallow_commits to decide when to
 stop traversing the history and mark the current commit as a shallow
 root.

 With this new check in place, the old check can no longer be true, since
 the first check always fires first. This commit removes that check,
 making the code a bit more simple again.

True. Ack-by: me.

 Signed-off-by: Matthijs Kooijman matth...@stdin.nl
 ---
  shallow.c | 17 ++---
  1 file changed, 6 insertions(+), 11 deletions(-)

 diff --git a/shallow.c b/shallow.c
 index cbe2526..8a9c96d 100644
 --- a/shallow.c
 +++ b/shallow.c
 @@ -110,17 +110,12 @@ struct commit_list *get_shallow_commits(struct 
 object_array *heads, int depth,
 continue;
 *pointer = cur_depth;
 }
 -   if (cur_depth  depth) {
 -   if (p-next)
 -   add_object_array(p-item-object,
 -   NULL, stack);
 -   else {
 -   commit = p-item;
 -   cur_depth = *(int *)commit-util;
 -   }
 -   } else {
 -   commit_list_insert(p-item, result);
 -   p-item-object.flags |= shallow_flag;
 +   if (p-next)
 +   add_object_array(p-item-object,
 +   NULL, stack);
 +   else {
 +   commit = p-item;
 +   cur_depth = *(int *)commit-util;
 }
 }
 }
 --
 1.8.3.rc1

--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv3 10/10] pack-revindex: radix-sort the revindex

2013-07-11 Thread Jeff King
  Here's an update of the radix-sort patch. It fixes the unsigned issue
  Brandon pointed out, along with a few other comment/naming/style fixes.
  I also updated the commit message with more explanation of the
  timings.
  
  The interdiff is:
  
  diff --git a/pack-revindex.c b/pack-revindex.c
  index 9365bc2..b4d2b35 100644
  --- a/pack-revindex.c
  +++ b/pack-revindex.c
  @@ -61,6 +61,10 @@ static void init_pack_revindex(void)
   
   /*
* This is a least-significant-digit radix sort.
  + *
  + * It sorts each of the n items in entries by its offset field. The max
  + * parameter must be at least as large as the largest offset in the array,
  + * and lets us quit the sort early.
*/
   static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t 
max)
   {
  @@ -78,18 +82,25 @@ static void sort_revindex(struct revindex_entry *entries, 
unsigned n, off_t max)
   #define BUCKET_FOR(a, i, bits) (((a)[(i)].offset  (bits))  (BUCKETS-1))
   
/*
  -  * We need O(n) temporary storage, so we sort back and forth between
  -  * the real array and our tmp storage. To keep them straight, we always
  -  * sort from a into buckets in b.
  +  * We need O(n) temporary storage. Rather than do an extra copy of the
  +  * partial results into entries, we sort back and forth between the
  +  * real array and temporary storage. In each iteration of the loop, we
  +  * keep track of them with alias pointers, always sorting from from
  +  * to to.
 */
  - struct revindex_entry *tmp = xcalloc(n, sizeof(*tmp));
  - struct revindex_entry *a = entries, *b = tmp;
  - int bits = 0;
  + struct revindex_entry *tmp = xmalloc(n * sizeof(*tmp));
  + struct revindex_entry *from = entries, *to = tmp;
  + int bits;
unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));
   
  - while (max  bits) {
  + /*
  +  * If (max  bits) is zero, then we know that the radix digit we are
  +  * on (and any higher) will be zero for all entries, and our loop will
  +  * be a no-op, as everybody lands in the same zero-th bucket.
  +  */
  + for (bits = 0; max  bits; bits += DIGIT_SIZE) {
struct revindex_entry *swap;
  - int i;
  + unsigned i;
   
memset(pos, 0, BUCKETS * sizeof(*pos));
   
  @@ -102,7 +113,7 @@ static void sort_revindex(struct revindex_entry *entries, 
unsigned n, off_t max)
 * previous bucket to get the true index.
 */
for (i = 0; i  n; i++)
  - pos[BUCKET_FOR(a, i, bits)]++;
  + pos[BUCKET_FOR(from, i, bits)]++;
for (i = 1; i  BUCKETS; i++)
pos[i] += pos[i-1];
   
  @@ -112,32 +123,37 @@ static void sort_revindex(struct revindex_entry 
*entries, unsigned n, off_t max)
 * to avoid using an extra index to count up. And since we are
 * going backwards there, we must also go backwards through the
 * array itself, to keep the sort stable.
  +  *
  +  * Note that we use an unsigned iterator to make sure we can
  +  * handle 2^32-1 objects, even on a 32-bit system. But this
  +  * means we cannot use the more obvious i = 0 loop condition
  +  * for counting backwards, and must instead check for
  +  * wrap-around with UINT_MAX.
 */
  - for (i = n - 1; i = 0; i--)
  - b[--pos[BUCKET_FOR(a, i, bits)]] = a[i];
  + for (i = n - 1; i != UINT_MAX; i--)
  + to[--pos[BUCKET_FOR(from, i, bits)]] = from[i];
   
/*
  -  * Now b contains the most sorted list, so we swap a and
  -  * b for the next iteration.
  +  * Now to contains the most sorted list, so we swap from and
  +  * to for the next iteration.
 */
  - swap = a;
  - a = b;
  - b = swap;
  -
  - /* And bump our bits for the next round. */
  - bits += DIGIT_SIZE;
  + swap = from;
  + from = to;
  + to = swap;
}
   
/*
 * If we ended with our data in the original array, great. If not,
 * we have to move it back from the temporary storage.
 */
  - if (a != entries)
  + if (from != entries)
memcpy(entries, tmp, n * sizeof(*entries));
free(tmp);
free(pos);
   
   #undef BUCKET_FOR
  +#undef BUCKETS
  +#undef DIGIT_SIZE
   }
   
   /*

-- 8 --
Subject: [PATCH] pack-revindex: radix-sort the revindex

The pack revindex stores the offsets of the objects in the
pack in sorted order, allowing us to easily find the on-disk
size of each object. To compute it, we populate an array
with the offsets from the sha1-sorted idx file, and then use
qsort to order it by 

Re: [PATCH 13/22] documentation: add documentation of the index-v5 file format

2013-07-11 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Thu, Jul 11, 2013 at 6:39 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Question about the possibility of updating index file directly. If git
 updates a few fields of an entry (but not entrycrc yet) and crashes,
 the entry would become corrupt because its entrycrc does not match the
 content. What do we do? Do we need to save a copy of the entry
 somewhere in the index file (maybe in the conflict data section), so
 that the reader can recover the index? Losing the index because of
 bugs is big deal in my opinion. pre-v5 never faces this because we
 keep the original copy til the end.

 Maybe entrycrc should not cover stat fields and statcrc. It would make
 refreshing safer. If the above happens during refresh, only statcrc is
 corrupt and we can just refresh the entry. entrycrc still says the
 other fields are good (and they are).

 The original idea was to change the lock-file for partial writing to
 make it work for this case.  The exact structure of the file still has
 to be defined, but generally it would be done in the following steps:

   1. Write the changed entry to the lock-file
   2. Change the entry in the index
   3. If we succeed delete the lock-file (commit the transaction)

 If git crashes, and leaves the index corrupted, we can recover the
 information from the lock-file and write the new information to the
 index file and then delete the lock-file.

 Ah makes sense. Still concerned about refreshing though. Updated files
 are usually few while refreshed files could be a lot more, increasing
 the cost at #1.

Any idea how common refreshing a big part of the cache is?  If it's not
to common, I'd prefer to leave the stat data and stat crc in the
entrycrc, as we can inform the user if something is wrong with the
index, be it from git failing, or from disk corruption.

On the other hand if refresh_cache is relatively common and usually
changes a big part of the index we should leave them out, as git can
still run correctly with incorrect stat data, but takes a little longer,
because it may have to check the file contents.  That will be trade-off
to make here.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5.5/22] Add documentation for the index api

2013-07-11 Thread Duy Nguyen
On Thu, Jul 11, 2013 at 6:42 PM, Duy Nguyen pclo...@gmail.com wrote:
 On Thu, Jul 11, 2013 at 6:30 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Duy Nguyen pclo...@gmail.com writes:
 Hrm, I played around a bit with this idea, but I couldn't figure out how
 to make it work.  For it to work we would still have to load some
 entries in a directory at least?  Or is there a way to match the
 directories, which I just haven't figured out yet?

 Yes you have to load some entries first. Even if a directory does not
 match, we only know until at least the first file in the directory. OK
 there might be problems because tree_entry_interesting expects all
 entries in a directory to be memcmp sorted, without trailing slash for
 subdirectories. I need to check again if v5 sort order is compatible..

Not gonna work (at least not simple) because we have to mix
directories and files again. The way directory entries are ordered
makes it hard (or less efficient) to get the list of immediate subdirs
of a dir. I think I understand now why you need adjusted_pathspec..
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/22] documentation: add documentation of the index-v5 file format

2013-07-11 Thread Duy Nguyen
On Thu, Jul 11, 2013 at 7:26 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Duy Nguyen pclo...@gmail.com writes:

 On Thu, Jul 11, 2013 at 6:39 PM, Thomas Gummerer t.gumme...@gmail.com 
 wrote:
 Question about the possibility of updating index file directly. If git
 updates a few fields of an entry (but not entrycrc yet) and crashes,
 the entry would become corrupt because its entrycrc does not match the
 content. What do we do? Do we need to save a copy of the entry
 somewhere in the index file (maybe in the conflict data section), so
 that the reader can recover the index? Losing the index because of
 bugs is big deal in my opinion. pre-v5 never faces this because we
 keep the original copy til the end.

 Maybe entrycrc should not cover stat fields and statcrc. It would make
 refreshing safer. If the above happens during refresh, only statcrc is
 corrupt and we can just refresh the entry. entrycrc still says the
 other fields are good (and they are).

 The original idea was to change the lock-file for partial writing to
 make it work for this case.  The exact structure of the file still has
 to be defined, but generally it would be done in the following steps:

   1. Write the changed entry to the lock-file
   2. Change the entry in the index
   3. If we succeed delete the lock-file (commit the transaction)

 If git crashes, and leaves the index corrupted, we can recover the
 information from the lock-file and write the new information to the
 index file and then delete the lock-file.

 Ah makes sense. Still concerned about refreshing though. Updated files
 are usually few while refreshed files could be a lot more, increasing
 the cost at #1.

 Any idea how common refreshing a big part of the cache is?

No, probably not common. Anyone who does find|xargs touch deserves
to be punished. Files can be edited, then reverted by an editor, but
there should not be many of those. The only sensible case is git
checkout path with lots of modified files. But that can't happen
often.

 If it's not to common, I'd prefer to leave the stat data and stat crc in the
 entrycrc, as we can inform the user if something is wrong with the
 index, be it from git failing, or from disk corruption.

 On the other hand if refresh_cache is relatively common and usually
 changes a big part of the index we should leave them out, as git can
 still run correctly with incorrect stat data, but takes a little longer,
 because it may have to check the file contents.  That will be trade-off
 to make here.



--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0008 hang on streaming test (OS X)

2013-07-11 Thread Jeff King
On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote:

 The newest test in t0008 streaming support for --stdin, seems to
 hang sporadically on my MacBook Pro (running 10.8.4).  The hang seems
 to be related to running it in parallel with other tests, as I can
 only reliably cause it by running with prove  and -j 3.  However, once
 that has hung I am able to semi-reliably have it occur by running the
 test separately (with the test hung in the background, using separate
 trash directories via the --root option).

I can't replicate the hang here (on Linux) doing:

  for i in `seq 1 30`; do
  ./t0008-ignores.sh --root=/tmp/foo/$i 
  done

Do you know which test it is hanging on? You mentioned that you can
replicate it outside of prove; what does running with -v say?

The last test in t0008, with the fifos, would make me the most
suspicious. The way we do it _should_ be fine, but I'm wondering if the
shell is blocking in exec here:

  mkfifo in out 
  (git check-ignore -n -v --stdin in out ) 
  exec 9in 

That is, if the fifo is not opened for some reason by the backgrounded
process (there's a race, of course, but the outer shell should just
block until the sub-shell actually opens it). I wonder if the
descriptor-opening behavior of:

  cmd in out 

is different between shells (that is, if it backgrounds the opening of
in and out on some shells, but not on others). But then I would expect
it to fail consistently.

Just for fun, does switching the middle line there to:

  (sh -c git check-ignore -n -v --stdin in out ) 

have any effect?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Eric Sunshine
Introduce command check-mailmap, similar to check-attr and check-ignore,
which allows direct testing of .mailmap configuration.

As plumbing accessible to scripts and other porcelain, check-mailmap
publishes the stable, well-tested .mailmap functionality employed by
built-in Git commands.  Consequently, script authors need not
re-implement .mailmap functionality manually, thus avoiding potential
quirks and behavioral differences.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 .gitignore |  1 +
 Documentation/git-check-mailmap.txt| 55 +++
 Makefile   |  1 +
 builtin.h  |  1 +
 builtin/check-mailmap.c| 69 ++
 command-list.txt   |  1 +
 contrib/completion/git-completion.bash |  1 +
 git.c  |  1 +
 8 files changed, 130 insertions(+)
 create mode 100644 Documentation/git-check-mailmap.txt
 create mode 100644 builtin/check-mailmap.c

diff --git a/.gitignore b/.gitignore
index efa8db0..6b1fd1b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -23,6 +23,7 @@
 /git-cat-file
 /git-check-attr
 /git-check-ignore
+/git-check-mailmap
 /git-check-ref-format
 /git-checkout
 /git-checkout-index
diff --git a/Documentation/git-check-mailmap.txt 
b/Documentation/git-check-mailmap.txt
new file mode 100644
index 000..3040bbc
--- /dev/null
+++ b/Documentation/git-check-mailmap.txt
@@ -0,0 +1,55 @@
+git-check-mailmap(1)
+
+
+NAME
+
+git-check-mailmap - Show canonical mappings of names and email addresses
+
+
+SYNOPSIS
+
+[verse]
+'git check-mailmap' [options] contact...
+
+
+DESCRIPTION
+---
+
+For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on
+the command-line or standard input (when using `--stdin`), prints a line with
+the canonical contact information for that person according to the 'mailmap'
+(see Mapping Authors below). If no mapping exists for the person, echoes the
+provided contact information.
+
+
+OPTIONS
+---
+--stdin::
+   Read contacts, one per line, from the standard input after exhausting
+   contacts provided on the command-line.
+
+-z::
+--null::
+   Terminate each printed contact line with a null character `\0` rather
+   than a newline.
+
+
+OUTPUT
+--
+
+For each contact, a single line is printed of the form ``Name
+$$email@address$$'' if the name is provided or known to the 'mailmap'; or of
+the form ``$$email@address$$'' if no name is provided or known. Each line is
+terminated by a newline, or by a null character `\0` when `-z` or `--null` is
+specified.
+
+
+MAPPING AUTHORS
+---
+
+include::mailmap.txt[]
+
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 0600eb4..ef442eb 100644
--- a/Makefile
+++ b/Makefile
@@ -912,6 +912,7 @@ BUILTIN_OBJS += builtin/bundle.o
 BUILTIN_OBJS += builtin/cat-file.o
 BUILTIN_OBJS += builtin/check-attr.o
 BUILTIN_OBJS += builtin/check-ignore.o
+BUILTIN_OBJS += builtin/check-mailmap.o
 BUILTIN_OBJS += builtin/check-ref-format.o
 BUILTIN_OBJS += builtin/checkout-index.o
 BUILTIN_OBJS += builtin/checkout.o
diff --git a/builtin.h b/builtin.h
index 1ed8edb..8afa2de 100644
--- a/builtin.h
+++ b/builtin.h
@@ -40,6 +40,7 @@ extern int cmd_checkout(int argc, const char **argv, const 
char *prefix);
 extern int cmd_checkout_index(int argc, const char **argv, const char *prefix);
 extern int cmd_check_attr(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ignore(int argc, const char **argv, const char *prefix);
+extern int cmd_check_mailmap(int argc, const char **argv, const char *prefix);
 extern int cmd_check_ref_format(int argc, const char **argv, const char 
*prefix);
 extern int cmd_cherry(int argc, const char **argv, const char *prefix);
 extern int cmd_cherry_pick(int argc, const char **argv, const char *prefix);
diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
new file mode 100644
index 000..c36a61c
--- /dev/null
+++ b/builtin/check-mailmap.c
@@ -0,0 +1,69 @@
+#include builtin.h
+#include mailmap.h
+#include parse-options.h
+#include string-list.h
+
+static int use_stdin;
+static int null_out;
+static const char * const check_mailmap_usage[] = {
+N_(git check-mailmap [options] contact...),
+NULL
+};
+
+static const struct option check_mailmap_options[] = {
+   OPT_BOOL(0, stdin, use_stdin, N_(also read contacts from stdin)),
+   OPT_BOOL('z', null, null_out, N_(null-terminate output lines)),
+   OPT_END()
+};
+
+static void check_mailmap(struct string_list *mailmap, const char *contact)
+{
+   const char *name, *mail;
+   size_t namelen, maillen;
+   struct ident_split ident;
+   char term = null_out ? '\0' : '\n';
+
+   if (split_ident_line(ident, contact, strlen(contact)))
+   die(_(unable to parse contact: %s), contact);
+
+   name = 

[PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Eric Sunshine
With the introduction of check-mailmap, it is now possible to check
.mailmap functionality directly rather than indirectly as a side-effect
of other commands (such as git-shortlog), therefore, do so.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 t/t4203-mailmap.sh | 133 ++---
 1 file changed, 45 insertions(+), 88 deletions(-)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 8645492..48a000b 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -74,128 +74,96 @@ test_expect_success 'check-mailmap bogus contact' '
 '
 
 cat expect \EOF
-A U Thor (1):
-  initial
-
-nick1 (1):
-  second
-
+A U Thor aut...@example.com
+nick1 b...@company.xx
 EOF
 
 test_expect_success 'No mailmap' '
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-Repo Guy (1):
-  initial
-
-nick1 (1):
-  second
-
+Repo Guy aut...@example.com
+nick1 b...@company.xx
 EOF
 
 test_expect_success 'default .mailmap' '
echo Repo Guy aut...@example.com  .mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 # Using a mailmap file in a subdirectory of the repo here, but
 # could just as well have been a file outside of the repository
 cat expect \EOF
-Internal Guy (1):
-  second
-
-Repo Guy (1):
-  initial
-
+Repo Guy aut...@example.com
+Internal Guy b...@company.xx
 EOF
 test_expect_success 'mailmap.file set' '
mkdir -p internal_mailmap 
echo Internal Guy b...@company.xx  internal_mailmap/.mailmap 
git config mailmap.file internal_mailmap/.mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-External Guy (1):
-  initial
-
-Internal Guy (1):
-  second
-
+External Guy aut...@example.com
+Internal Guy b...@company.xx
 EOF
 test_expect_success 'mailmap.file override' '
echo External Guy aut...@example.com  internal_mailmap/.mailmap 
git config mailmap.file internal_mailmap/.mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-Repo Guy (1):
-  initial
-
-nick1 (1):
-  second
-
+Repo Guy aut...@example.com
+nick1 b...@company.xx
 EOF
 
 test_expect_success 'mailmap.file non-existent' '
rm internal_mailmap/.mailmap 
rmdir internal_mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-Internal Guy (1):
-  second
-
-Repo Guy (1):
-  initial
-
+Repo Guy aut...@example.com
+Internal Guy b...@company.xy
 EOF
 
 test_expect_success 'name entry after email entry' '
mkdir -p internal_mailmap 
echo b...@company.xy b...@company.xx internal_mailmap/.mailmap 
echo Internal Guy b...@company.xx internal_mailmap/.mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-Internal Guy (1):
-  second
-
-Repo Guy (1):
-  initial
-
+Repo Guy aut...@example.com
+Internal Guy b...@company.xy
 EOF
 
 test_expect_success 'name entry after email entry, case-insensitive' '
mkdir -p internal_mailmap 
echo b...@company.xy b...@company.xx internal_mailmap/.mailmap 
echo Internal Guy b...@company.xx internal_mailmap/.mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
 cat expect \EOF
-A U Thor (1):
-  initial
-
-nick1 (1):
-  second
-
+A U Thor aut...@example.com
+nick1 b...@company.xx
 EOF
 test_expect_success 'No mailmap files, but configured' '
rm -f .mailmap internal_mailmap/.mailmap 
-   git shortlog HEAD actual 
+   git check-mailmap --stdin contacts actual 
test_cmp expect actual
 '
 
@@ -217,54 +185,43 @@ test_expect_success 'setup mailmap blob tests' '
 
 test_expect_success 'mailmap.blob set' '
cat expect -\EOF 
-   Blob Guy (1):
- second
-
-   Repo Guy (1):
- initial
-
+   Repo Guy aut...@example.com
+   Blob Guy b...@company.xx
EOF
-   git -c mailmap.blob=map:just-bugs shortlog HEAD actual 
+   git -c mailmap.blob=map:just-bugs check-mailmap --stdin \
+   contacts actual 
test_cmp expect actual
 '
 
 test_expect_success 'mailmap.blob overrides .mailmap' '
cat expect -\EOF 
-   Blob Guy (2):
- initial
- second
-
+   Blob Guy aut...@example.com
+   Blob Guy b...@company.xx
EOF
-   git -c mailmap.blob=map:both shortlog HEAD actual 
+   git -c mailmap.blob=map:both check-mailmap --stdin \
+   contacts  actual 
test_cmp expect actual
 '
 

[PATCH v2 2/4] t4203: test check-mailmap command invocation

2013-07-11 Thread Eric Sunshine
Test the command-line interface of check-mailmap. (Actual .mailmap
functionality is already covered by existing tests.)

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 t/t4203-mailmap.sh | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 842b754..8645492 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -13,6 +13,11 @@ fuzz_blame () {
 }
 
 test_expect_success setup '
+   cat contacts -\EOF 
+   A U Thor aut...@example.com
+   nick1 b...@company.xx
+   EOF
+
echo one one 
git add one 
test_tick 
@@ -23,6 +28,51 @@ test_expect_success setup '
git commit --author nick1 b...@company.xx -m second
 '
 
+test_expect_success 'check-mailmap no arguments' '
+   test_must_fail git check-mailmap
+'
+
+test_expect_success 'check-mailmap arguments' '
+   cat expect -\EOF 
+   A U Thor aut...@example.com
+   nick1 b...@company.xx
+   EOF
+   git check-mailmap \
+   A U Thor aut...@example.com \
+   nick1 b...@company.xx actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap --stdin' '
+   cat expect -\EOF 
+   A U Thor aut...@example.com
+   nick1 b...@company.xx
+   EOF
+   git check-mailmap --stdin contacts actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap --stdin arguments' '
+   cat expect -\EOF 
+   Internal Guy b...@company.xy
+   EOF
+   cat contacts expect 
+   git check-mailmap --stdin Internal Guy b...@company.xy \
+   contacts actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap -z' '
+   printf A U Thor aut...@example.com\0 expect 
+   printf nick1 b...@company.xx\0 expect 
+   git check-mailmap -z --stdin contacts actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'check-mailmap bogus contact' '
+   test_must_fail git check-mailmap bogus
+'
+
 cat expect \EOF
 A U Thor (1):
   initial
-- 
1.8.3.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] t4203: consolidate test-repository setup

2013-07-11 Thread Eric Sunshine
The simple two-commit test-repository created by 'setup' is no longer
needed by any of the tests retrofitted to use check-mailmap. Subsequent,
more complex tests of git-shortlog, git-log, and git-blame functionality
expand the repository by adding five commits. Consolidate the creation
of this seven-commit repository into a single setup function.

Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
---
 t/t4203-mailmap.sh | 53 +++--
 1 file changed, 27 insertions(+), 26 deletions(-)

diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 48a000b..9c3c83a 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -22,10 +22,36 @@ test_expect_success setup '
git add one 
test_tick 
git commit -m initial 
+
echo two one 
git add one 
test_tick 
-   git commit --author nick1 b...@company.xx -m second
+   git commit --author nick1 b...@company.xx -m second 
+
+   echo three one 
+   git add one 
+   test_tick 
+   git commit --author nick2 b...@company.xx -m third 
+
+   echo four one 
+   git add one 
+   test_tick 
+   git commit --author nick2 ni...@company.xx -m fourth 
+
+   echo five one 
+   git add one 
+   test_tick 
+   git commit --author santa m...@company.xx -m fifth 
+
+   echo six one 
+   git add one 
+   test_tick 
+   git commit --author claus m...@company.xx -m sixth 
+
+   echo seven one 
+   git add one 
+   test_tick 
+   git commit --author CTO c...@coompany.xx -m seventh
 '
 
 test_expect_success 'check-mailmap no arguments' '
@@ -276,31 +302,6 @@ Some Dude s...@dude.xx (1):
 EOF
 
 test_expect_success 'Shortlog output (complex mapping)' '
-   echo three one 
-   git add one 
-   test_tick 
-   git commit --author nick2 b...@company.xx -m third 
-
-   echo four one 
-   git add one 
-   test_tick 
-   git commit --author nick2 ni...@company.xx -m fourth 
-
-   echo five one 
-   git add one 
-   test_tick 
-   git commit --author santa m...@company.xx -m fifth 
-
-   echo six one 
-   git add one 
-   test_tick 
-   git commit --author claus m...@company.xx -m sixth 
-
-   echo seven one 
-   git add one 
-   test_tick 
-   git commit --author CTO c...@coompany.xx -m seventh 
-
mkdir -p internal_mailmap 
echo Committed commit...@example.com  internal_mailmap/.mailmap 
echo c...@company.xx   c...@coompany.xx  
internal_mailmap/.mailmap 
-- 
1.8.3.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/4] add git-check-mailmap command

2013-07-11 Thread Eric Sunshine
This is a re-roll of [1] which introduces git-check-mailmap. The
primary motivation for this command is to expose git's stable,
well-tested C-implementation of .mailmap functionality to scripts and
porcelains, thus relieving them of the need to reimplement support
themselves. The git-contacts [2] script (currently at es/contacts in
'pu') would be one such client. v2 removes the RFC status and addresses
comments from reviewers [1].

Changes since v1:

* Add Documentation/git-check-mailmap.txt.

* Add --null alias for -z.

* Use OPT_BOOL rather than deprecated OPT_BOOLEAN.

* Simplify code which outputs normalized contact.

* Settle on stdout as argument to maybe_flush_or_die().

* Eliminate diff noise from patch 4/4.

[1]: http://thread.gmane.org/gmane.comp.version-control.git/230068/
[2]: http://thread.gmane.org/gmane.comp.version-control.git/229533/

Eric Sunshine (4):
  builtin: add git-check-mailmap command
  t4203: test check-mailmap command invocation
  t4203: test mailmap functionality directly rather than indirectly
  t4203: consolidate test-repository setup

 .gitignore |   1 +
 Documentation/git-check-mailmap.txt|  55 
 Makefile   |   1 +
 builtin.h  |   1 +
 builtin/check-mailmap.c|  69 ++
 command-list.txt   |   1 +
 contrib/completion/git-completion.bash |   1 +
 git.c  |   1 +
 t/t4203-mailmap.sh | 234 +
 9 files changed, 251 insertions(+), 113 deletions(-)
 create mode 100644 Documentation/git-check-mailmap.txt
 create mode 100644 builtin/check-mailmap.c

-- 
1.8.3.2

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug: Failure if stdin is closed.

2013-07-11 Thread Dale R. Worley
(The original problem and the discussion that ensued is on the
git-users mailing list:
https://groups.google.com/forum/#!topic/git-users/lNQ7Cn35EqA)

git commit (and probably other operations) fail if standard input
(fd 0) is closed when git starts.  A simple test case follows.  (The
execution is version 1.7.7.6, but the code listed below is from a very
recent commit.)


$ git --version
git version 1.7.7.6
$ git status
# On branch master
nothing to commit (working directory clean)
$ echo This is a test 
$ git status
# On branch master
# Untracked files:
#   (use git add file... to include in what will be committed)
#
#   
nothing added to commit but untracked files present (use git add to track)
$ git add 
$ # The notation 0- means close standard input (fd 0) in the process that
$ # executes this command.  It may be specific to the bash shell.
$ git commit -m  0-
error: unable to create temporary sha1 filename : No such file or directory

error: Error building trees
$ git status
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   
#
$ git commit -m 
[master 54c146c] 
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
nothing to commit (working directory clean)
$ 


The fundamental error is that git_mkstemps_mode() in wrapper.c assumes
that if open() is successful, its return value must be 0.  That is
not true, because if fd 0 is closed, then open() can successfully
return 0.  I have not tested it, but it looks like the fix is:

 int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 {
 static const char letters[] =
 abcdefghijklmnopqrstuvwxyz
 ABCDEFGHIJKLMNOPQRSTUVWXYZ
 0123456789;
 static const int num_letters = 62;
 uint64_t value;
 struct timeval tv;
 char *template;
 size_t len;
 int fd, count;

 len = strlen(pattern);

 if (len  6 + suffix_len) {
 errno = EINVAL;
 return -1;
 }

 if (strncmp(pattern[len - 6 - suffix_len], XX, 6)) {
 errno = EINVAL;
 return -1;
 }

 /*
  * Replace pattern's XX characters with randomness.
  * Try TMP_MAX different filenames.
  */
 gettimeofday(tv, NULL);
 value = ((size_t)(tv.tv_usec  16)) ^ tv.tv_sec ^ getpid();
 template = pattern[len - 6 - suffix_len];
 for (count = 0; count  TMP_MAX; ++count) {
 uint64_t v = value;
 /* Fill in the random bits. */
 template[0] = letters[v % num_letters]; v /= num_letters;
 template[1] = letters[v % num_letters]; v /= num_letters;
 template[2] = letters[v % num_letters]; v /= num_letters;
 template[3] = letters[v % num_letters]; v /= num_letters;
 template[4] = letters[v % num_letters]; v /= num_letters;
 template[5] = letters[v % num_letters]; v /= num_letters;

 fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-if (fd  0)
+if (fd = 0)
 return fd;
 /*
  * Fatal error (EPERM, ENOSPC etc).
  * It doesn't make sense to loop.
  */
 if (errno != EEXIST)
 break;
 /*
  * This is a random value.  It is only necessary that
  * the next TMP_MAX values generated by adding  to
  * VALUE are different with (module 2^32).
  */
 value += ;
 }
 /* We return the null string if we can't find a unique file name.  */
 pattern[0] = '\0';
 return -1;
 }


However, when looking at the code, I noticed that few of the functions
have comments describing what they do, and none describe their input
and output values.  In particular, there are no comments specifying
what the error return values are.  This is appalling for a supposedly
professional-quality project!

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] show-ref: make --head always show the HEAD ref

2013-07-11 Thread Junio C Hamano
Doug Bell madcity...@gmail.com writes:

 The docs seem to say that doing

   git show-ref --head --tags

 would show both the HEAD ref and all the tag refs. However, doing
 both --head and either of --tags or --heads would filter out the HEAD
 ref.

 Signed-off-by: Doug Bell madcity...@gmail.com
 ---

I think this patch fell through the cracks, and looking at it, I am
somewhat torn.

The command help for --head says show the HEAD reference, which
may mean:

 (1) in addition to everything else the command would do if there
 weren't this option specified, show HEAD;

 (2) show the HEAD and nothing else; or

 (3) add HEAD to the candidates to be shown, but apply the usual
 filtering rule based on --heads, --tags and/or pattern
 arguments.

While the last interpretation is what we have used since the
beginning of the command at 358ddb62 (Add git show-ref builtin
command, 2006-09-15), I tend to agree with you that the first
interpretation may make more sense, at least from the end user's
point of view.

But at a deeper level, it makes the command somewhat inconsistent.

What happens in the command is

 - We iterate over candidates to be shown, which is usually
   everything under refs/, but with --head, HEAD is added to
   this set.  For each of these candidates:

   - if one or more pattern parameters are given, reject the
 candidate ref if it does not tail-match with any of the
 patterns;

   - if either --heads or --tags is given, among the ones that
 pass pattern filter, check if they:

 - begin with refs/heads/ (if --heads is given); or
 - begin with refs/tags/ (if --tags is given).

 and reject those that don't.

   - show it if it is still surviving after these two tests.

And taht is why git show-ref --tags master v1.3.0 shows only the
v1.3.0 tag without showing the master branch, and giving --heads
instead of --tags yields only the master branch without the tag.

The semantics your patch wants, by changing the definition of
--head from (3) to (1), is:

 - If --head is given, show HEAD no matter what.

 - Iterate over everything under refs/, and for each of them, do the
   same filter-and-show as we currently do (see above).

While I think the new semantics is also understandable as the
current one, and personally I think it is a better behaviour than
the current one, it will require an update to the document to
highlight that --head is special-cased in a big way, to bypass all
the filtering that is applied to normal refs.

A few additional observations (these are not complaints to this
patch and please do not take them as such):

 - The command help says (can be combined with heads) for --tags
   and vice versa, but does not mention their interaction with
   --head.  This is because we take interpretation (3) above and
   do not treat --head as a mechanism to add to pattern
   parameter like these two.

 - The command help for --heads and --tags says only show
   heads/tags, which technically does not contradict with can be
   combined with above, but a logical consequence of combining
   ought to be showing nothing, as a ref cannot be a head (an old
   nomenclature for a branch) and a tag at the same time.  

I think we should find a word better than only to use here, but I
am not sure what would be a good phrase to use.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] upload-pack: Remove a piece of dead code

2013-07-11 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Thu, Jul 11, 2013 at 6:25 PM, Matthijs Kooijman matth...@stdin.nl wrote:
 Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
 clone) introduced a new check in get_shallow_commits to decide when to
 stop traversing the history and mark the current commit as a shallow
 root.

 With this new check in place, the old check can no longer be true, since
 the first check always fires first. This commit removes that check,
 making the code a bit more simple again.

 True. Ack-by: me.

 Signed-off-by: Matthijs Kooijman matth...@stdin.nl

Yeah, thanks both.  I tend to agree that 2 and 3 are the right
change that came too late after the ship sailed X-(.

 ---
  shallow.c | 17 ++---
  1 file changed, 6 insertions(+), 11 deletions(-)

 diff --git a/shallow.c b/shallow.c
 index cbe2526..8a9c96d 100644
 --- a/shallow.c
 +++ b/shallow.c
 @@ -110,17 +110,12 @@ struct commit_list *get_shallow_commits(struct 
 object_array *heads, int depth,
 continue;
 *pointer = cur_depth;
 }
 -   if (cur_depth  depth) {
 -   if (p-next)
 -   add_object_array(p-item-object,
 -   NULL, stack);
 -   else {
 -   commit = p-item;
 -   cur_depth = *(int *)commit-util;
 -   }
 -   } else {
 -   commit_list_insert(p-item, result);
 -   p-item-object.flags |= shallow_flag;
 +   if (p-next)
 +   add_object_array(p-item-object,
 +   NULL, stack);
 +   else {
 +   commit = p-item;
 +   cur_depth = *(int *)commit-util;
 }
 }
 }
 --
 1.8.3.rc1

 --
 Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Priming git clone with a local repo?

2013-07-11 Thread Junio C Hamano
Andreas Krey a.k...@gmx.de writes:

 I'm wondering if there is (or will be) a way of doing almost

   git clone --reference localrepo host:canonrep

 Basically, I don't want the implications of --reference but still the
 performance advantages of reusing local objects/pack files.

I think the standard operating procedure for that is to still clone
with the --reference option first, and then do repack -a -d in
the resulting clone (note the lack of -l option in repack).

After that repack, you should be able to discard the alternates
file.

$ git clone --reference ../there otherhost:repo ours.git
$ cd ours.git
$ git repack -a -d ;# no -l!!!
$ mv .git/objects/info/alternates .git/objects/info/alter-nates
$ git fsck
$ rm .git/objects/info/alter-nates
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0008 hang on streaming test (OS X)

2013-07-11 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote:

 The newest test in t0008 streaming support for --stdin, seems to
 hang sporadically on my MacBook Pro (running 10.8.4).  The hang seems
 to be related to running it in parallel with other tests, as I can
 only reliably cause it by running with prove  and -j 3.  However, once
 that has hung I am able to semi-reliably have it occur by running the
 test separately (with the test hung in the background, using separate
 trash directories via the --root option).

 I can't replicate the hang here (on Linux) doing:

   for i in `seq 1 30`; do
   ./t0008-ignores.sh --root=/tmp/foo/$i 
   done

It seems to hang on me with bash, but not dash, here.

 Do you know which test it is hanging on? You mentioned that you can
 replicate it outside of prove; what does running with -v say?

 The last test in t0008, with the fifos, would make me the most
 suspicious. The way we do it _should_ be fine, but I'm wondering if the
 shell is blocking in exec here:

   mkfifo in out 
   (git check-ignore -n -v --stdin in out ) 
   exec 9in 

 That is, if the fifo is not opened for some reason by the backgrounded
 process (there's a race, of course, but the outer shell should just
 block until the sub-shell actually opens it). I wonder if the
 descriptor-opening behavior of:

   cmd in out 

 is different between shells (that is, if it backgrounds the opening of
 in and out on some shells, but not on others). But then I would expect
 it to fail consistently.

 Just for fun, does switching the middle line there to:

   (sh -c git check-ignore -n -v --stdin in out ) 

 have any effect?

 -Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0008 hang on streaming test (OS X)

2013-07-11 Thread Brian Gernhardt

On Jul 11, 2013, at 9:34 AM, Jeff King p...@peff.net wrote:

 On Wed, Jul 10, 2013 at 12:36:40PM -0400, Brian Gernhardt wrote:
 
 The newest test in t0008 streaming support for --stdin,

 Experimentation has led me to find that it is hanging when trying to read 
 the 2nd response from check-ignore.


 Do you know which test it is hanging on? You mentioned that you can
 replicate it outside of prove; what does running with -v say?
 
 The last test in t0008, with the fifos, would make me the most
 suspicious.

The 2nd `read response out` line is where it was hanging, based on a variety 
of echos added to the test.

~~ Brian

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0008 hang on streaming test (OS X)

2013-07-11 Thread Brian Gernhardt

On Jul 10, 2013, at 4:35 PM, Antoine Pelisse apeli...@gmail.com wrote:

 On Wed, Jul 10, 2013 at 6:36 PM, Brian Gernhardt
 br...@gernhardtsoftware.com wrote:
 I am somewhat stuck on how to fix it.  Any ideas?
 
 I don't have anything to reproduce here, but usually I start
 investigating this kind of problems by attaching the hung process with
 gdb to see the current state (if it's stuck in a specific state), or
 to investigate the end-less loop.
 That usually help finding a good starting point.

Unfortunately, the hung process is /bin/sh (aka bash).  Using the Sample 
function of Activity Monitor gave me that 100% of the time was spent in libc's 
_open...  Which enlightens me not at all.

~~ Brian Gernhardt

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] cat-file: add --batch-disk-sizes option

2013-07-11 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I started on this, and it turned out not to really be any simpler
 So I went ahead with the full formats for my re-roll. It turned out
 pretty reasonable, I think.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC] blame: accept multiple -L ranges

2013-07-11 Thread Junio C Hamano
Thomas Rast tr...@inf.ethz.ch writes:

 But still, log -L should then be changed to match this behavior (for all
 args affecting a single file).  Currently it always does the scan for
 the start of the range from line 1 of the file.

Thanks, I think that makes sense.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2013, #03; Tue, 9)

2013-07-11 Thread Junio C Hamano
Kyle McKay mack...@gmail.com writes:

 On Jul 9, 2013, at 16:09, Junio C Hamano wrote:
 * km/svn-1.8-serf-only (2013-07-07) 2 commits
 - git-svn: allow git-svn fetching to work using serf
 - Git.pm: add new temp_is_locked function

 Comments?


 Since neither David nor Jonathan have piped in here (they were the two
 primarily involved in the discussion).

 On Jul 8, 2013, at 09:22, Junio C Hamano wrote:
 Kyle J. McKay mack...@gmail.com writes:

 From: Kyle J. McKay mack...@gmail.com
 Subject: Re: [PATCH v3 0/2] allow git-svn fetching to work using serf

 This patch allows git-svn to fetch successfully using the
 serf library when given an https?: url to fetch from.

 [...]

 Thanks; I've queued this version to 'pu' at least tentatively.

 Is everybody who discussed the issue happy with the direction of
 this patch?

 I will add that David previously indicated this patch works for him:

 On Jul 6, 2013, at 00:17, David Rothenberger wrote:
 On 7/5/2013 8:41 PM, Kyle McKay wrote:
 This patch allows git-svn to fetch successfully using the
 serf library when given an https?: url to fetch from.

 Thanks, Kyle. I confirm this is working for my problem cases as
 well.

 Subversion 1.8.0 was released less than a month ago on 2013-06-18 so
 there probably aren't too many git-svn users affected by this just
 yet.

Alright.  Let's advance it to 'next' and see what happens.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jul 2013, #03; Tue, 9)

2013-07-11 Thread Junio C Hamano
Matthieu Moy matthieu@imag.fr writes:

 Junio C Hamano gits...@pobox.com writes:

 * bp/mediawiki-preview (2013-07-08) 7 commits
  - git-remote-mediawiki: add preview subcommand into git mw
  - git-remote-mediawiki: add git-mw command
  - git-remote-mediawiki: factoring code between git-remote-mediawiki and 
 Git::Mediawiki
  - git-remote-mediawiki: update tests to run with the new bin-wrapper
  - git-remote-mediawiki: add a git bin-wrapper for developement
  - wrap-for-bin: make bin-wrappers chainable
  - git-remote-mediawiki: introduction of Git::Mediawiki.pm

  Looks like this is in a fairly good shape?

 Yes it is. I think all remarks have been taken into account.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] config: add support for http.url.* settings

2013-07-11 Thread Kyle J. McKay
From: Kyle J. McKay mack...@gmail.com

The url value is considered a match to a url if the url value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So https://example.com/test; will
match https://example.com/test; and https://example.com/test/too;
but not https://example.com/testextra;.

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
useragent = other-agent
noEPSV = true
[http https://example.com;]
useragent = example-agent
sslVerify = false
[http https://example.com/path;]
useragent = path-agent

The https://other.example.com/; url will have useragent
other-agent and sslVerify will be on.

The https://example.com/; url will have useragent
example-agent and sslVerify will be off.

The https://example.com/path/sub; url will have useragent
path-agent and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay mack...@gmail.com
---

The credentials configuration values already support url-specific
configuration items in the form credential.url.*.  This patch
adds similar support for http configuration values.

 Documentation/config.txt |  11 
 http.c   | 140 +--
 2 files changed, 134 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b4d4887..3731a3a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1531,6 +1531,17 @@ http.useragent::
of common USER_AGENT strings (but not including those like git/1.7.1).
Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.url.*::
+   Any of the http.* options above can be applied selectively to some urls.
+   For example http.https://example.com.useragent; would set the user
+   agent only for https connections to example.com.  The url value
+   matches a url if it is an exact match or a prefix of the url matching
+   at a '/' boundary.  Longer url matches take precedence over shorter
+   ones with the environment variable settings taking precedence over all.
+   Note that url must match the url exactly (other than possibly being a
+   prefix).  This means any user, password and/or port setting that appears
+   in a url must also appear in url to have a successful match.
+
 i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 2d086ae..68dc789 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+   opt_post_buffer = 0,
+   opt_min_sessions,
+#ifdef USE_CURL_MULTI
+   opt_max_requests,
+#endif
+   opt_ssl_verify,
+   opt_ssl_try,
+   opt_ssl_cert,
+#if LIBCURL_VERSION_NUM = 0x070903
+   opt_ssl_key,
+#endif
+#if LIBCURL_VERSION_NUM = 0x070908
+   opt_ssl_capath,
+#endif
+   opt_ssl_cainfo,
+   opt_low_speed,
+   opt_low_time,
+   opt_no_epsv,
+   opt_http_proxy,
+   opt_cookie_file,
+   opt_user_agent,
+   opt_passwd_req,
+   opt_max
+};
+
+static int http_option_maxlen[opt_max];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,34 +169,99 @@ static void process_curl_messages(void)
 }
 #endif
 
+static int http_options_url_match_prefix(const char *url, const char 
*url_prefix)
+{
+   /*
+* url_prefix matches url if url_prefix is an exact match for url or it
+* is a prefix of url and the match ends on a path component boundary.
+* url_prefix is considered to have an implicit '/' on the end for
+* matching purposes if it does not already and it is shorter than url.
+* the return value is the length of the match in characters (excluding
+* any final '/') or 0 for no match.
+*/
+   size_t url_len, url_prefix_len = strlen(url_prefix);
+   if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+   return 0;
+   url_len = strlen(url);
+   if (url_len == url_prefix_len || url[url_prefix_len - 1] == '/' || 
url[url_prefix_len] == '/')
+   return url[url_prefix_len - 1] == '/' ? url_prefix_len - 1 : 
url_prefix_len;
+   return 0;
+}
+
 static int http_options(const char *var, const char *value, void *cb)
 {
-   if (!strcmp(http.sslverify, var)) {
+/*
+ * Macro to ignore matches with a match length less than previously seen
+ * for the same option type and to remember the largest match length seen so
+ * far for each option type
+ */
+#define CHECK_MATCHLEN(opt) \
+   if (http_option_maxlen[opt]  matchlen) return 0; \
+   else http_option_maxlen[opt] = 

Re: Bug: Failure if stdin is closed.

2013-07-11 Thread Dale Worley
 From: Thomas Rast tr...@inf.ethz.ch

 May I ask why you need this, and to what extent this problem cannot be
 solved by instead redirecting from/to /dev/null?

The situation in which the problem arose is described here:

 wor...@alum.mit.edu (Dale R. Worley) writes:
 
  (The original problem and the discussion that ensued is on the
  git-users mailing list:
  https://groups.google.com/forum/#!topic/git-users/lNQ7Cn35EqA)

It involves invoking git from Apache.  I haven't read up on the
details because I didn't need to in order to debug the problem.

 Closing 2 usually has even funkier consequences if the program proceeds
 to open some other file, it ends up as fd 2, and it then dies with an
 error.  In that sense it might be saner to simply die whenever open()
 gives an FD in the 0..2 range (and we weren't explicitly trying to
 reopen one of them).

True...  But fd 2 may be needed under many unpredictable
circumstances.  In regard to fd 0, we can predict that standard input
(per se) will not be needed, so it's anti-robust to require that it be
open for the code to function at all.

 However, does it fully fix the issue you describe?  What if you then run
 'git checkout -F -' to read the message from stdin?

Obviously, if the git command explicitly requires use of
standard-input, then standard-input needs to be open.

  However, when looking at the code, I noticed that few of the functions
  have comments describing what they do, and none describe their input
  and output values.  In particular, there are no comments specifying
  what the error return values are.  This is appalling for a supposedly
  professional-quality project!
 
 You are touching on a sore point of the git code base.  Some
 contributors have made a point of adding comments where appropriate, so
 we're improving, but round tuits are in short supply as always.  If you
 can supply such tuits, they would be appreciated.

I will try to put my money where my mouth is.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Corrections to the mailmap file

2013-07-11 Thread Junio C Hamano
Stefan Beller stefanbel...@googlemail.com writes:

 I noticed many duplicates in email addresses but having the same name by
 running:

 # Finding out duplicates by comparing names:
 git shortlog -sne |awk '{ NF--; $1=; print }' |sort |uniq -d

 Most of these entries are most probably the same person, but we cannot be 
 sure, as there might be different persons having the same name, then they
 are only distinguished by the mail address.

 However I suspect most of these to be the same person, having changed 
 mail addresses.

 Here comes an initial batch of corrections to the mailmap file, which
 maps people with email addresses of different capitalization onto
 the same entity.
 (Example n...@mit.edu is the same as n...@mit.edu)

 I intend to contact each of the persons individually and ask whether 
 just their mail address changed, or if they are indeed different persons.

Has anything happen to this topic sice you posted?

I think:

 - .(none) is obviously bad, and we can fix without waiting for
   responses as long as we know the replacement address is the
   address from the list we usually see on the list (3).

 - Domain part is defined to be case insensitive (e.g. @mit.edu vs
   @MIT.EDU), so both forms are equally valid.  The owner of the
   address may have preferences (1 and 4), though.

   For this, we could just declare we downcase the domain part.

 - The local-part, even though RFC 2821 says local-part of a
   mailbox MUST BE treated as case sensitive, is often case
   insensitive, and User.Name@domain and user.name@domain often name
   the same mailbox.  The owner of the address may have preferences
   (5 and 6), though.

   For this, we could just declare we Camel.Case the local part,
   after making sure Camel.Case@domain has been used by the owner of
   the address on this list.

So where does that leave us?

We can apply without waiting:

  1: downcase domain
  3: .(none) and we know pau...@samba.org
  4: downcase domain
  5: We recently saw Dshco calls himself Johannes.Schindelin@domain.
  6: The latest one from Nov 2009 uses Toby.Allsopp@domain.

The only possibly iffy one is Alex Riesen, but raa.lkml@domain
seems to be the one he uses here, so I think 2. is also fine.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 0/1] cygwin: Remove the Win32 l/stat() functions

2013-07-11 Thread Ramsay Jones
Torsten Bögershausen wrote:
 On 30.06.13 19:28, Ramsay Jones wrote:

[ ... ]

 You have just described my second patch! :D
 Unfortunately, I have not had any time to work on the patch this weekend.
 However, despite the patch being a bit rough around the edges, I decided
 to send it out (see below) to get some early feedback.

 Note that it passes the t3210, t3211, t5500, t3200, t3301, t7606 and t1301
 tests, but I have not run the full test suite.

 Comments welcome.

 ATB,
 Ramsay Jones

 -- 8 --
 Subject: [PATCH] cygwin: Add fast_[l]stat() functions

 Signed-off-by: Ramsay Jones ram...@ramsay1.demon.co.uk
 ---
  builtin/apply.c|  6 +++---
  builtin/commit.c   |  2 +-
  builtin/ls-files.c |  2 +-
  builtin/rm.c   |  2 +-
  builtin/update-index.c |  2 +-
  check-racy.c   |  2 +-
  compat/cygwin.c| 12 
  compat/cygwin.h| 10 +++---
  diff-lib.c |  2 +-
  diff.c |  2 +-
  entry.c|  4 ++--
  git-compat-util.h  | 13 +++--
  help.c |  5 +
  path.c |  9 +
  preload-index.c|  2 +-
  read-cache.c   |  6 +++---
  unpack-trees.c |  8 
  17 files changed, 36 insertions(+), 53 deletions(-)


[ ... ]

 I managed to apply your patch on next, and run the test suite before and after
 your patch.
 The performance running git status on a linux kernel is the same as in 
 v1.8.3.
 
 Running test suite did not show new failures.
 The following test cases had failed, and are fixed now:
  t1400-update-ref
  t3210-pack-refs
  t3211-peel-ref
  t3306-notes-prune
  t5304-prune
  t5404-tracking-branches
  t5500-fetch-pack
  t5505-remote
  t5514-fetch-multiple
  t5515-fetch-merge-logic
  t6030-bisect-porcelain
  t9300-fast-import
  t9903-bash-prompt
 
 In short: Thanks, This looks good to me.

Thank you for testing this! It is much appreciated.

I sent a v2 patch to the list last night. It passes the full
test suite for me on cygwin 1.5, although there appears to be
a slight performance problem on MinGW (perhaps!). :(

ATB,
Ramsay Jones



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 08/10] cat-file: split --batch input lines on whitespace

2013-07-11 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, Jul 10, 2013 at 08:59:51PM +0530, Ramkumar Ramachandra wrote:

 Jeff King wrote:
git rev-list --objects HEAD |
git cat-file --batch-check='%(objectsize) %(text)'
 
 If anything, I would have expected %(rest), not %(text).  This atom is
 specific to commands that accept input via stdin (i.e. not log, f-e-r,
 branch, or anything else I can think of).

 I considered %(rest) as well. I don't have a strong preference.

 Also, this makes me wonder if %(field:0), %(field:1), and probably
 %(field:@) are good ideas.  Even if we go down that road, I don't
 think %(rest) is a problem per-se.

 I don't have a use for them, and even if we want to add them later, you
 would still want to support %(rest) for when the user wants to take the
 rest of the line verbatim without caring about field-splitting.

 To be honest, I do not see %(field) as all that useful. If you want to
 go about rearranging or selecting fields, that is what cut (or awk)
 is for.  Having fields means you need to specify field separators, and
 how runs of separators are treated. Other tools already do this.

Very true, and more importantly, you cannot still say my input
object name is at field N, not at the beginning, so that makes it
doubly dubious how %(field:$n) would be any useful.

 So it would (at best) save you from an extra cut invocation, whereas
 %(rest) gets you out of doing something much more difficult. Without it,
 information is lost from your pipeline (so you have to have tee to a
 separate pipeline, and then reassemble the pieces).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Corrections to the mailmap file

2013-07-11 Thread Stefan Beller
On 07/11/2013 07:33 PM, Junio C Hamano wrote:
 Stefan Beller stefanbel...@googlemail.com writes:

 I noticed many duplicates in email addresses but having the same name by
 running:

 # Finding out duplicates by comparing names:
 git shortlog -sne |awk '{ NF--; $1=; print }' |sort |uniq -d

 Most of these entries are most probably the same person, but we cannot be 
 sure, as there might be different persons having the same name, then they
 are only distinguished by the mail address.

 However I suspect most of these to be the same person, having changed 
 mail addresses.

 Here comes an initial batch of corrections to the mailmap file, which
 maps people with email addresses of different capitalization onto
 the same entity.
 (Example n...@mit.edu is the same as n...@mit.edu)

 I intend to contact each of the persons individually and ask whether 
 just their mail address changed, or if they are indeed different persons.
 
 Has anything happen to this topic sice you posted?
 
 I think:
 
  - .(none) is obviously bad, and we can fix without waiting for
responses as long as we know the replacement address is the
address from the list we usually see on the list (3).
 
  - Domain part is defined to be case insensitive (e.g. @mit.edu vs
@MIT.EDU), so both forms are equally valid.  The owner of the
address may have preferences (1 and 4), though.
 
For this, we could just declare we downcase the domain part.
 
  - The local-part, even though RFC 2821 says local-part of a
mailbox MUST BE treated as case sensitive, is often case
insensitive, and User.Name@domain and user.name@domain often name
the same mailbox.  The owner of the address may have preferences
(5 and 6), though.
 
For this, we could just declare we Camel.Case the local part,
after making sure Camel.Case@domain has been used by the owner of
the address on this list.
 
 So where does that leave us?
 
 We can apply without waiting:
 
   1: downcase domain
   3: .(none) and we know pau...@samba.org
   4: downcase domain
   5: We recently saw Dshco calls himself Johannes.Schindelin@domain.
   6: The latest one from Nov 2009 uses Toby.Allsopp@domain.
 
 The only possibly iffy one is Alex Riesen, but raa.lkml@domain
 seems to be the one he uses here, so I think 2. is also fine.
 

To be honest, I did not continue to work on this. I was 
waiting for the first few patches to be reviewed, because 
I was not sure how important you all think this topic 
really is. I know projects, which frown upon such 
beautyfing commits. Hence I was waiting for an answer, 
whether such work is appreciated.

Anyway, last time I tried contributing to git, I was 
told to explicitely do it in the open. I did forget 
to CC the mailing list when asking Alex Riesen 
(2nd patch), whether it's all him.
I do see the benefit of the openess, when discussing code 
or documentation, but I wonder if you'd also like to see these
'Hello, are you the person having email x, y and z?' 
kind of mails  put on the mailing list as well. 
That would be 120 to go, which may be undesired?

So I'll start contacting the other people now.



 




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] Corrections to the mailmap file

2013-07-11 Thread Junio C Hamano
Stefan Beller stefanbel...@googlemail.com writes:

 On 07/11/2013 07:33 PM, Junio C Hamano wrote:
 Stefan Beller stefanbel...@googlemail.com writes:
 ...
 I intend to contact each of the persons individually and ask whether 
 just their mail address changed, or if they are indeed different persons.
 
 Has anything happen to this topic sice you posted?
  ...
 I do see the benefit of the openess, when discussing code 
 or documentation, but I wonder if you'd also like to see these
 'Hello, are you the person having email x, y and z?' 
 kind of mails put on the mailing list as well. 

I read, from your 'I intend to contact ... individiually', that you
would have already sent such private inquiries.

I was reviewing my mailbox for possible fell into cracks topics,
and found that nothing seems to have happened since then. I guessed
that since they you may already have received responses to allow us
to move forward with this series, hence my ping.

I do not see a need for are you still at this address? to be made
in public.  It is fine and probably even preferrable to see a single
patch that corrects multiple .mailmap entries, with a proposed log
message that says I've contacted everybody whose address is
affected by this change and made sure the replacement address is
correct.  We trust you ;-)

Of course, you may want to Bcc: such a patch to them to give them
another chance to double-check as a courtesy.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 +DESCRIPTION
 +---
 +
 +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on
 +the command-line or standard input (when using `--stdin`), prints a line with
 +the canonical contact information for that person according to the 'mailmap'
 +(see Mapping Authors below). If no mapping exists for the person, echoes 
 the
 +provided contact information.

OK.  The last part needed a reading and a half for me to realize the
echoes the provided contact information is the same thing as the
input string is printed as-is, especially because contact
information is not defined anywhere in the previous part of the
DESCRIPTION section, though.  I might be worth starting the
paragraph with:

For each contact information (either in the form of ``Name
user@host'' or ...)

in order to clarify that the two forms of input is what you call
contact information.

 diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
 new file mode 100644
 index 000..c36a61c
 --- /dev/null
 +++ b/builtin/check-mailmap.c
 @@ -0,0 +1,69 @@
 +#include builtin.h
 +#include mailmap.h
 +#include parse-options.h
 +#include string-list.h
 +
 +static int use_stdin;
 +static int null_out;

Is there a reason why the variable name should not match that of the
corresponding variables in check-ignore and check-attr, which you
copied the bulk of the code from?

If there isn't, use null_term_line like they do.

 +static const char * const check_mailmap_usage[] = {
 +N_(git check-mailmap [options] contact...),
 +NULL
 +};

This mis-indentation looks somewhat unfortunate, but they are shared
with other check-blah, and they are meant to contain a rather long
string, so let's keep it like so.

 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 + const char *name, *mail;
 + size_t namelen, maillen;
 + struct ident_split ident;
 + char term = null_out ? '\0' : '\n';

Is there a reason why the variable name term should not match that
of the corresponding variables in check-ignore and check-attr, which
you copied the bulk of the code from?

If there isn't, use line_termination like they do.

 + if (split_ident_line(ident, contact, strlen(contact)))
 + die(_(unable to parse contact: %s), contact);
 +
 + name = ident.name_begin;
 + namelen = ident.name_end - ident.name_begin;
 + mail = ident.mail_begin;
 + maillen = ident.mail_end - ident.mail_begin;
 +
 + map_user(mailmap, mail, maillen, name, namelen);
 +
 + if (namelen)
 + printf(%.*s , (int)namelen, name);
 + printf(%.*s%c, (int)maillen, mail, term);
 +}
 +
 +int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
 +{
 + int i;
 + struct string_list mailmap = STRING_LIST_INIT_NODUP;
 +
 + git_config(git_default_config, NULL);
 + argc = parse_options(argc, argv, prefix, check_mailmap_options,
 + check_mailmap_usage, 0);

It is a bit distracting that the second line that is not indented
enough.  Either (1) with enough HT and with minimum number of SP,
align argc and check_mailmap_usage, or (2) with minimum number
of HT and no SP, push check_mailmap_usage to the right of opening
parenthesis of (argc.  Our code tend to prefer (1).

 + if (argc == 0  !use_stdin)
 + die(_(no contacts specified));
 +
 + read_mailmap(mailmap, NULL);
 +
 + for (i = 0; i  argc; ++i)
 + check_mailmap(mailmap, argv[i]);
 + maybe_flush_or_die(stdout, stdout);
 +
 + if (use_stdin) {
 + struct strbuf buf = STRBUF_INIT;
 + while (strbuf_getline(buf, stdin, '\n') != EOF) {
 + check_mailmap(mailmap, buf.buf);
 + maybe_flush_or_die(stdout, stdout);
 + }
 + strbuf_release(buf);
 + }
 +
 + clear_mailmap(mailmap);
 + return 0;
 +}
 diff --git a/command-list.txt b/command-list.txt
 index bf83303..08b04e2 100644
 --- a/command-list.txt
 +++ b/command-list.txt
 @@ -13,6 +13,7 @@ git-bundle  mainporcelain
  git-cat-fileplumbinginterrogators
  git-check-attr  purehelpers
  git-check-ignorepurehelpers
 +git-check-mailmap   purehelpers
  git-checkoutmainporcelain common
  git-checkout-index  plumbingmanipulators
  git-check-ref-formatpurehelpers
 diff --git a/contrib/completion/git-completion.bash 
 b/contrib/completion/git-completion.bash
 index ebc40d4..c118550 100644
 --- a/contrib/completion/git-completion.bash
 +++ b/contrib/completion/git-completion.bash
 @@ -648,6 +648,7 @@ __git_list_porcelain_commands ()
   cat-file) : plumbing;;
   check-attr)   : plumbing;;
   check-ignore) : plumbing;;
 + check-mailmap): 

Re: [PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 With the introduction of check-mailmap, it is now possible to check
 .mailmap functionality directly rather than indirectly as a side-effect
 of other commands (such as git-shortlog), therefore, do so.

Does this patch mean that we will now ignore future breakages in
shortlog and blame if their mailmap integration becomes buggy?

I am not convinced it is a good idea if that is what is going on.


 Signed-off-by: Eric Sunshine sunsh...@sunshineco.com
 ---
  t/t4203-mailmap.sh | 133 
 ++---
  1 file changed, 45 insertions(+), 88 deletions(-)

 diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
 index 8645492..48a000b 100755
 --- a/t/t4203-mailmap.sh
 +++ b/t/t4203-mailmap.sh
 @@ -74,128 +74,96 @@ test_expect_success 'check-mailmap bogus contact' '
  '
  
  cat expect \EOF
 -A U Thor (1):
 -  initial
 -
 -nick1 (1):
 -  second
 -
 +A U Thor aut...@example.com
 +nick1 b...@company.xx
  EOF
  
  test_expect_success 'No mailmap' '
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -Repo Guy (1):
 -  initial
 -
 -nick1 (1):
 -  second
 -
 +Repo Guy aut...@example.com
 +nick1 b...@company.xx
  EOF
  
  test_expect_success 'default .mailmap' '
   echo Repo Guy aut...@example.com  .mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  # Using a mailmap file in a subdirectory of the repo here, but
  # could just as well have been a file outside of the repository
  cat expect \EOF
 -Internal Guy (1):
 -  second
 -
 -Repo Guy (1):
 -  initial
 -
 +Repo Guy aut...@example.com
 +Internal Guy b...@company.xx
  EOF
  test_expect_success 'mailmap.file set' '
   mkdir -p internal_mailmap 
   echo Internal Guy b...@company.xx  internal_mailmap/.mailmap 
   git config mailmap.file internal_mailmap/.mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -External Guy (1):
 -  initial
 -
 -Internal Guy (1):
 -  second
 -
 +External Guy aut...@example.com
 +Internal Guy b...@company.xx
  EOF
  test_expect_success 'mailmap.file override' '
   echo External Guy aut...@example.com  internal_mailmap/.mailmap 
   git config mailmap.file internal_mailmap/.mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -Repo Guy (1):
 -  initial
 -
 -nick1 (1):
 -  second
 -
 +Repo Guy aut...@example.com
 +nick1 b...@company.xx
  EOF
  
  test_expect_success 'mailmap.file non-existent' '
   rm internal_mailmap/.mailmap 
   rmdir internal_mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -Internal Guy (1):
 -  second
 -
 -Repo Guy (1):
 -  initial
 -
 +Repo Guy aut...@example.com
 +Internal Guy b...@company.xy
  EOF
  
  test_expect_success 'name entry after email entry' '
   mkdir -p internal_mailmap 
   echo b...@company.xy b...@company.xx internal_mailmap/.mailmap 
   echo Internal Guy b...@company.xx internal_mailmap/.mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -Internal Guy (1):
 -  second
 -
 -Repo Guy (1):
 -  initial
 -
 +Repo Guy aut...@example.com
 +Internal Guy b...@company.xy
  EOF
  
  test_expect_success 'name entry after email entry, case-insensitive' '
   mkdir -p internal_mailmap 
   echo b...@company.xy b...@company.xx internal_mailmap/.mailmap 
   echo Internal Guy b...@company.xx internal_mailmap/.mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
  cat expect \EOF
 -A U Thor (1):
 -  initial
 -
 -nick1 (1):
 -  second
 -
 +A U Thor aut...@example.com
 +nick1 b...@company.xx
  EOF
  test_expect_success 'No mailmap files, but configured' '
   rm -f .mailmap internal_mailmap/.mailmap 
 - git shortlog HEAD actual 
 + git check-mailmap --stdin contacts actual 
   test_cmp expect actual
  '
  
 @@ -217,54 +185,43 @@ test_expect_success 'setup mailmap blob tests' '
  
  test_expect_success 'mailmap.blob set' '
   cat expect -\EOF 
 - Blob Guy (1):
 -   second
 -
 - Repo Guy (1):
 -   initial
 -
 + Repo Guy aut...@example.com
 + Blob Guy b...@company.xx
   EOF
 - git -c mailmap.blob=map:just-bugs shortlog HEAD actual 
 + git -c mailmap.blob=map:just-bugs check-mailmap --stdin \
 + contacts actual 
   test_cmp expect actual
  '
  
  test_expect_success 'mailmap.blob overrides .mailmap' '
   cat expect -\EOF 
 - 

Re: [PATCH v2] config: add support for http.url.* settings

2013-07-11 Thread Junio C Hamano
Kyle J. McKay mack...@gmail.com writes:

 Longer matches take precedence over shorter matches with
 environment variable settings taking precedence over all.

OK.

 diff --git a/Documentation/config.txt b/Documentation/config.txt
 index b4d4887..3731a3a 100644
 --- a/Documentation/config.txt
 +++ b/Documentation/config.txt
 @@ -1531,6 +1531,17 @@ http.useragent::
   of common USER_AGENT strings (but not including those like git/1.7.1).
   Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
  
 +http.url.*::
 + Any of the http.* options above can be applied selectively to some urls.
 + For example http.https://example.com.useragent; would set the user
 + agent only for https connections to example.com.  The url value
 + matches a url if it is an exact match or a prefix of the url matching
 + at a '/' boundary.  Longer url matches take precedence over shorter
 + ones with the environment variable settings taking precedence over all.
 + Note that url must match the url exactly (other than possibly being a
 + prefix).  This means any user, password and/or port setting that appears
 + in a url must also appear in url to have a successful match.
 +

OK.

 diff --git a/http.c b/http.c
 index 2d086ae..68dc789 100644
 --- a/http.c
 +++ b/http.c
 @@ -30,6 +30,34 @@ static CURL *curl_default;
  
  char curl_errorstr[CURL_ERROR_SIZE];
  
 +enum http_option_type {
 + opt_post_buffer = 0,
 + opt_min_sessions,
 +#ifdef USE_CURL_MULTI
 + opt_max_requests,
 +#endif
 + opt_ssl_verify,
 + opt_ssl_try,
 + opt_ssl_cert,
 +#if LIBCURL_VERSION_NUM = 0x070903
 + opt_ssl_key,
 +#endif
 +#if LIBCURL_VERSION_NUM = 0x070908
 + opt_ssl_capath,
 +#endif
 + opt_ssl_cainfo,
 + opt_low_speed,
 + opt_low_time,
 + opt_no_epsv,
 + opt_http_proxy,
 + opt_cookie_file,
 + opt_user_agent,
 + opt_passwd_req,
 + opt_max
 +};
 +
 +static int http_option_maxlen[opt_max];

I understand that this is to keep track of the length of the longest
one that has matched (hence the current candidate).  The name maxlen
captures the longest part, but has matched is somehow lost.

Can we have a better name here please, or at least a comment to
clarify what this variable keeps track of.

  static int curl_ssl_verify = -1;
  static int curl_ssl_try;
  static const char *ssl_cert;
 @@ -141,34 +169,99 @@ static void process_curl_messages(void)
  }
  #endif
  
 +static int http_options_url_match_prefix(const char *url, const char 
 *url_prefix)
 +{
 + /*
 +  * url_prefix matches url if url_prefix is an exact match for url or it
 +  * is a prefix of url and the match ends on a path component boundary.
 +  * url_prefix is considered to have an implicit '/' on the end for
 +  * matching purposes if it does not already and it is shorter than url.
 +  * the return value is the length of the match in characters (excluding
 +  * any final '/') or 0 for no match.
 +  */
 + size_t url_len, url_prefix_len = strlen(url_prefix);
 + if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
 + return 0;

Should url=http://git.or.xz/git; match url_prefix=http://git.or.xz/git/;?

 + url_len = strlen(url);
 + if (url_len == url_prefix_len || url[url_prefix_len - 1] == '/' || 
 url[url_prefix_len] == '/')
 + return url[url_prefix_len - 1] == '/' ? url_prefix_len - 1 : 
 url_prefix_len;

Overlong lines that are somewhat hard to read.

 + return 0;
 +}
 +
  static int http_options(const char *var, const char *value, void *cb)
  {
 - if (!strcmp(http.sslverify, var)) {
 +/*
 + * Macro to ignore matches with a match length less than previously seen
 + * for the same option type and to remember the largest match length seen so
 + * far for each option type
 + */
 +#define CHECK_MATCHLEN(opt) \
 + if (http_option_maxlen[opt]  matchlen) return 0; \
 + else http_option_maxlen[opt] = matchlen

Avoid defining a macro _inside_ a function.  Also if you can make it
a static function, that would be much easier to read.

 +
 + const char *url = (const char *)cb;
 + const char *key, *dot;
 + int matchlen = 0;

 + key = skip_prefix(var, http.);
 + if (!key)
 + return git_default_config(var, value, cb);
 +
 + /*
 +  * If the part following the leading http. contains a '.' then check
 +  * to see if the part between http. and the last '.' is a match to
 +  * url.  if it's not then ignore the setting.  Otherwise set key to
 +  * point to the option which is the final part after the final '.' and
 +  * use key in subsequent comparisons to determine the option type.
 +  */
 + dot = strrchr(key, '.');
 + if (dot) {
 + char *config_url = xmemdupz(key, dot - key);
 + matchlen = http_options_url_match_prefix(url, config_url);
 + free(config_url);

Yikes.  http_options_url_match_prefix() 

[PATCHv3 08/10] cat-file: split --batch input lines on whitespace

2013-07-11 Thread Jeff King
On Thu, Jul 11, 2013 at 07:36:53AM -0400, Jeff King wrote:

 On Wed, Jul 10, 2013 at 08:59:51PM +0530, Ramkumar Ramachandra wrote:
 
  Jeff King wrote:
 git rev-list --objects HEAD |
 git cat-file --batch-check='%(objectsize) %(text)'
  
  If anything, I would have expected %(rest), not %(text).  This atom is
  specific to commands that accept input via stdin (i.e. not log, f-e-r,
  branch, or anything else I can think of).
 
 I considered %(rest) as well. I don't have a strong preference.

Here is the patch re-rolled with s/text/rest/.

Junio, that makes this and 10/10 the only v3 updates. Let me know if
it would be simpler to just resend the whole series.

-- 8 --
Subject: [PATCH] cat-file: split --batch input lines on whitespace

If we get an input line to --batch or --batch-check that
looks like HEAD foo bar, we will currently feed the whole
thing to get_sha1(). This means that to use --batch-check
with `rev-list --objects`, one must pre-process the input,
like:

  git rev-list --objects HEAD |
  cut -d' ' -f1 |
  git cat-file --batch-check

Besides being more typing and slightly less efficient to
invoke `cut`, the result loses information: we no longer
know which path each object was found at.

This patch teaches cat-file to split input lines at the
first whitespace. Everything to the left of the whitespace
is considered an object name, and everything to the right is
made available as the %(reset) atom. So you can now do:

  git rev-list --objects HEAD |
  git cat-file --batch-check='%(objectsize) %(rest)'

to collect object sizes at particular paths.

Even if %(rest) is not used, we always do the whitespace
split (which means you can simply eliminate the `cut`
command from the first example above).

This whitespace split is backwards compatible for any
reasonable input. Object names cannot contain spaces, so any
input with spaces would have resulted in a missing line.
The only input hurt is if somebody really expected input of
the form HEAD is a fine-looking ref! to fail; it will now
parse HEAD, and make is a fine-looking ref! available as
%(rest).

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/git-cat-file.txt | 10 --
 builtin/cat-file.c | 20 +++-
 t/t1006-cat-file.sh|  7 +++
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index 06bdc43..68691d4 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -88,8 +88,10 @@ from stdin, one per line, and print information about them.
 If `--batch` or `--batch-check` is given, `cat-file` will read objects
 from stdin, one per line, and print information about them.
 
-Each line is considered as a whole object name, and is parsed as if
-given to linkgit:git-rev-parse[1].
+Each line is split at the first whitespace boundary. All characters
+before that whitespace are considered as a whole object name, and are
+parsed as if given to linkgit:git-rev-parse[1]. Characters after that
+whitespace can be accessed using the `%(rest)` atom (see below).
 
 You can specify the information shown for each object by using a custom
 `format`. The `format` is copied literally to stdout for each
@@ -110,6 +112,10 @@ newline. The available atoms are:
The size, in bytes, that the object takes up on disk. See the
note about on-disk sizes in the `CAVEATS` section below.
 
+`rest`::
+   The text (if any) found after the first run of whitespace on the
+   input line (i.e., the rest of the line).
+
 If no format is specified, the default format is `%(objectname)
 %(objecttype) %(objectsize)`.
 
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 11fa8c0..0e64b41 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -119,6 +119,7 @@ struct expand_data {
enum object_type type;
unsigned long size;
unsigned long disk_size;
+   const char *rest;
 
/*
 * If mark_query is true, we do not expand anything, but rather
@@ -161,6 +162,9 @@ static void expand_atom(struct strbuf *sb, const char 
*atom, int len,
data-info.disk_sizep = data-disk_size;
else
strbuf_addf(sb, %lu, data-disk_size);
+   } else if (is_atom(rest, atom, len)) {
+   if (!data-mark_query  data-rest)
+   strbuf_addstr(sb, data-rest);
} else
die(unknown format element: %.*s, len, atom);
 }
@@ -263,7 +267,21 @@ static int batch_objects(struct batch_options *opt)
data.mark_query = 0;
 
while (strbuf_getline(buf, stdin, '\n') != EOF) {
-   int error = batch_one_object(buf.buf, opt, data);
+   char *p;
+   int error;
+
+   /*
+* Split at first whitespace, tying off the beginning of the
+* string and saving the remainder (or NULL) in data.rest.
+

Re: [PATCH 7/7] push: document --lockref

2013-07-11 Thread Johannes Sixt
Am 10.07.2013 01:08, schrieb Junio C Hamano:
 Junio C Hamano gits...@pobox.com writes:
 
 I _think_ I am OK if we introduced --allow-no-ff that means the
 current --force (i.e. rewinding is OK), that does not defeat the
 --lockref safety.  That is the intended application (you know that
 push does not fast-forward because you rebased, but you also want to
 make sure there is nothing you are losing by enforcing --lockref
 safety).

 If that is what happens, then I think --force that means anything
 goes makes sense.
 
 Or perhaps you were implicitly assuming that --lockref would
 automatically mean I know I am rewinding, so as soon as I say
 --lockref, I mean --allow-no-ff, and I did not realize that.

That's what I mean, sort of. Because of your 4 cases of a ref update, I
do not think that

 3. The update fast-forwards, but the ref to be updated is not at the
expected place; or

is important to consider. The point of --lockref is to avoid data loss,
but if the push is fast-forward, there is no data loss.

 If that is the semantics you are proposing, then I think it makes
 sense to make --force the big red button that lets anything go.
 
 I was considering --lockref to be orthogonal to the traditional
 ff only check, and rejecting a push when the updated ref's current
 value is expected (i.e. --lockref satisfied) but the update does not
 fast-forward, and that was where my resistance to allow --force to
 override --lockref comes from (because otherwise there is no way
 to say I know I want to bypass 'ff-only' check, but instead make
 sure the current value is this).

Again: Why not just define +refspec as the way to achieve this check?

-- Hannes

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 10/10] pack-revindex: radix-sort the revindex

2013-07-11 Thread Brandon Casey
On Thu, Jul 11, 2013 at 5:16 AM, Jeff King p...@peff.net wrote:
   Here's an update of the radix-sort patch. It fixes the unsigned issue
   Brandon pointed out, along with a few other comment/naming/style fixes.
   I also updated the commit message with more explanation of the
   timings.

Very nice.

For what it's worth:

Reviewed-by: Brandon Casey draf...@gmail.com

remainder retained for reference (or whatever Jonathan usually says)

   The interdiff is:

   diff --git a/pack-revindex.c b/pack-revindex.c
   index 9365bc2..b4d2b35 100644
   --- a/pack-revindex.c
   +++ b/pack-revindex.c
   @@ -61,6 +61,10 @@ static void init_pack_revindex(void)

/*
 * This is a least-significant-digit radix sort.
   + *
   + * It sorts each of the n items in entries by its offset field. The 
 max
   + * parameter must be at least as large as the largest offset in the array,
   + * and lets us quit the sort early.
 */
static void sort_revindex(struct revindex_entry *entries, unsigned n, 
 off_t max)
{
   @@ -78,18 +82,25 @@ static void sort_revindex(struct revindex_entry 
 *entries, unsigned n, off_t max)
#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset  (bits))  (BUCKETS-1))

 /*
   -  * We need O(n) temporary storage, so we sort back and forth between
   -  * the real array and our tmp storage. To keep them straight, we 
 always
   -  * sort from a into buckets in b.
   +  * We need O(n) temporary storage. Rather than do an extra copy of the
   +  * partial results into entries, we sort back and forth between the
   +  * real array and temporary storage. In each iteration of the loop, we
   +  * keep track of them with alias pointers, always sorting from from
   +  * to to.
  */
   - struct revindex_entry *tmp = xcalloc(n, sizeof(*tmp));
   - struct revindex_entry *a = entries, *b = tmp;
   - int bits = 0;
   + struct revindex_entry *tmp = xmalloc(n * sizeof(*tmp));
   + struct revindex_entry *from = entries, *to = tmp;
   + int bits;
 unsigned *pos = xmalloc(BUCKETS * sizeof(*pos));

   - while (max  bits) {
   + /*
   +  * If (max  bits) is zero, then we know that the radix digit we are
   +  * on (and any higher) will be zero for all entries, and our loop will
   +  * be a no-op, as everybody lands in the same zero-th bucket.
   +  */
   + for (bits = 0; max  bits; bits += DIGIT_SIZE) {
 struct revindex_entry *swap;
   - int i;
   + unsigned i;

 memset(pos, 0, BUCKETS * sizeof(*pos));

   @@ -102,7 +113,7 @@ static void sort_revindex(struct revindex_entry 
 *entries, unsigned n, off_t max)
  * previous bucket to get the true index.
  */
 for (i = 0; i  n; i++)
   - pos[BUCKET_FOR(a, i, bits)]++;
   + pos[BUCKET_FOR(from, i, bits)]++;
 for (i = 1; i  BUCKETS; i++)
 pos[i] += pos[i-1];

   @@ -112,32 +123,37 @@ static void sort_revindex(struct revindex_entry 
 *entries, unsigned n, off_t max)
  * to avoid using an extra index to count up. And since we are
  * going backwards there, we must also go backwards through 
 the
  * array itself, to keep the sort stable.
   +  *
   +  * Note that we use an unsigned iterator to make sure we can
   +  * handle 2^32-1 objects, even on a 32-bit system. But this
   +  * means we cannot use the more obvious i = 0 loop 
 condition
   +  * for counting backwards, and must instead check for
   +  * wrap-around with UINT_MAX.
  */
   - for (i = n - 1; i = 0; i--)
   - b[--pos[BUCKET_FOR(a, i, bits)]] = a[i];
   + for (i = n - 1; i != UINT_MAX; i--)
   + to[--pos[BUCKET_FOR(from, i, bits)]] = from[i];

 /*
   -  * Now b contains the most sorted list, so we swap a and
   -  * b for the next iteration.
   +  * Now to contains the most sorted list, so we swap from 
 and
   +  * to for the next iteration.
  */
   - swap = a;
   - a = b;
   - b = swap;
   -
   - /* And bump our bits for the next round. */
   - bits += DIGIT_SIZE;
   + swap = from;
   + from = to;
   + to = swap;
 }

 /*
  * If we ended with our data in the original array, great. If not,
  * we have to move it back from the temporary storage.
  */
   - if (a != entries)
   + if (from != entries)
 memcpy(entries, tmp, n * sizeof(*entries));
 free(tmp);
 free(pos);

#undef BUCKET_FOR
   +#undef BUCKETS
   +#undef DIGIT_SIZE
}

/*

 -- 8 --
 Subject: [PATCH] 

Re: [PATCH v2] config: add support for http.url.* settings

2013-07-11 Thread Kyle J. McKay

On Jul 11, 2013, at 12:26, Junio C Hamano wrote:

Kyle J. McKay mack...@gmail.com writes:


+static int http_option_maxlen[opt_max];


I understand that this is to keep track of the length of the longest
one that has matched (hence the current candidate).  The name maxlen
captures the longest part, but has matched is somehow lost.

Can we have a better name here please, or at least a comment to
clarify what this variable keeps track of.


Will do.  How about http_option_max_matched_len?

+static int http_options_url_match_prefix(const char *url, const  
char *url_prefix)

+{
+   /*
+	 * url_prefix matches url if url_prefix is an exact match for url  
or it
+	 * is a prefix of url and the match ends on a path component  
boundary.

+* url_prefix is considered to have an implicit '/' on the end for
+	 * matching purposes if it does not already and it is shorter  
than url.
+	 * the return value is the length of the match in characters  
(excluding

+* any final '/') or 0 for no match.
+*/
+   size_t url_len, url_prefix_len = strlen(url_prefix);
+   if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+   return 0;


Should url=http://git.or.xz/git; match url_prefix=http://git.or.xz/git/ 
?


My initial thought was no.  But your example is persuasive and  
probably should match.  Fix forthcoming.



+   url_len = strlen(url);
+	if (url_len == url_prefix_len || url[url_prefix_len - 1] == '/'  
|| url[url_prefix_len] == '/')
+		return url[url_prefix_len - 1] == '/' ? url_prefix_len - 1 :  
url_prefix_len;


Overlong lines that are somewhat hard to read.


OK.


static int http_options(const char *var, const char *value, void *cb)
{
-   if (!strcmp(http.sslverify, var)) {
+/*
+ * Macro to ignore matches with a match length less than  
previously seen
+ * for the same option type and to remember the largest match  
length seen so

+ * far for each option type
+ */
+#define CHECK_MATCHLEN(opt) \
+   if (http_option_maxlen[opt]  matchlen) return 0; \
+   else http_option_maxlen[opt] = matchlen


Avoid defining a macro _inside_ a function.  Also if you can make it
a static function, that would be much easier to read.


Was modeled after the credential_match() function from credential.c  
that defines a CHECK macro inside it and then undefines it at the end  
of the function.  I will change it to a static function.



+   if (dot) {
+   char *config_url = xmemdupz(key, dot - key);
+   matchlen = http_options_url_match_prefix(url, config_url);
+   free(config_url);


Yikes.  http_options_url_match_prefix() could take a counted string
to config_url to avoid this repeated allocation and deallocation,
no?


Again, modeled after the credential_config_callback() function from  
credential.c that does this exact thing.


I will update http_options_url_match_prefix to take a size_t.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3] config: add support for http.url.* settings

2013-07-11 Thread Kyle J. McKay
The url value is considered a match to a url if the url value
is either an exact match or a prefix of the url which ends on a
path component boundary ('/').  So https://example.com/test; will
match https://example.com/test; and https://example.com/test/too;
but not https://example.com/testextra;.

Longer matches take precedence over shorter matches with
environment variable settings taking precedence over all.

With this configuration:

[http]
useragent = other-agent
noEPSV = true
[http https://example.com;]
useragent = example-agent
sslVerify = false
[http https://example.com/path;]
useragent = path-agent

The https://other.example.com/; url will have useragent
other-agent and sslVerify will be on.

The https://example.com/; url will have useragent
example-agent and sslVerify will be off.

The https://example.com/path/sub; url will have useragent
path-agent and sslVerify will be off.

All three of the examples will have noEPSV enabled.

Signed-off-by: Kyle J. McKay mack...@gmail.com
---

The credentials configuration values already support url-specific
configuration items in the form credential.url.*.  This patch
adds similar support for http configuration values.

This version of the patch addresses some initial feedback.

 Documentation/config.txt |  11 +++
 http.c   | 169 ++-
 2 files changed, 163 insertions(+), 17 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b4d4887..3731a3a 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1531,6 +1531,17 @@ http.useragent::
of common USER_AGENT strings (but not including those like git/1.7.1).
Can be overridden by the 'GIT_HTTP_USER_AGENT' environment variable.
 
+http.url.*::
+   Any of the http.* options above can be applied selectively to some urls.
+   For example http.https://example.com.useragent; would set the user
+   agent only for https connections to example.com.  The url value
+   matches a url if it is an exact match or a prefix of the url matching
+   at a '/' boundary.  Longer url matches take precedence over shorter
+   ones with the environment variable settings taking precedence over all.
+   Note that url must match the url exactly (other than possibly being a
+   prefix).  This means any user, password and/or port setting that appears
+   in a url must also appear in url to have a successful match.
+
 i18n.commitEncoding::
Character encoding the commit messages are stored in; Git itself
does not care per se, but this information is necessary e.g. when
diff --git a/http.c b/http.c
index 2d086ae..d6d3507 100644
--- a/http.c
+++ b/http.c
@@ -30,6 +30,34 @@ static CURL *curl_default;
 
 char curl_errorstr[CURL_ERROR_SIZE];
 
+enum http_option_type {
+   opt_post_buffer = 0,
+   opt_min_sessions,
+#ifdef USE_CURL_MULTI
+   opt_max_requests,
+#endif
+   opt_ssl_verify,
+   opt_ssl_try,
+   opt_ssl_cert,
+#if LIBCURL_VERSION_NUM = 0x070903
+   opt_ssl_key,
+#endif
+#if LIBCURL_VERSION_NUM = 0x070908
+   opt_ssl_capath,
+#endif
+   opt_ssl_cainfo,
+   opt_low_speed,
+   opt_low_time,
+   opt_no_epsv,
+   opt_http_proxy,
+   opt_cookie_file,
+   opt_user_agent,
+   opt_passwd_req,
+   opt_max
+};
+
+static size_t http_option_max_matched_len[opt_max];
+
 static int curl_ssl_verify = -1;
 static int curl_ssl_try;
 static const char *ssl_cert;
@@ -141,34 +169,122 @@ static void process_curl_messages(void)
 }
 #endif
 
+static size_t http_options_url_match_prefix(const char *url,
+   const char *url_prefix,
+   size_t url_prefix_len)
+{
+   /*
+* url_prefix matches url if url_prefix is an exact match for url or it
+* is a prefix of url and the match ends on a path component boundary.
+* Both url and url_prefix are considered to have an implicit '/' on the
+* end for matching purposes if they do not already.
+*
+* url must be NUL terminated.  url_prefix_len is the length of
+* url_prefix which need not be NUL terminated.
+*
+* The return value is the length of the match in characters (excluding
+* any final '/') or 0 for no match.  Passing / as url_prefix will
+* always cause 0 to be returned.
+*/
+   size_t url_len;
+   if (url_prefix_len  url_prefix[url_prefix_len - 1] == '/')
+   url_prefix_len--;
+   if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
+   return 0;
+   url_len = strlen(url);
+   if ( (url_len == url_prefix_len)  ||
+(url[url_prefix_len - 1] == '/') ||
+(url[url_prefix_len] == '/') )
+   return url[url_prefix_len - 1] == '/'
+  ? url_prefix_len - 1 : 

Re: [PATCH 7/7] push: document --lockref

2013-07-11 Thread Junio C Hamano
Johannes Sixt j...@kdbg.org writes:

 Or perhaps you were implicitly assuming that --lockref would
 automatically mean I know I am rewinding, so as soon as I say
 --lockref, I mean --allow-no-ff, and I did not realize that.

 That's what I mean, sort of. Because of your 4 cases of a ref update, I
 do not think that

 3. The update fast-forwards, but the ref to be updated is not at the
expected place; or

 is important to consider. The point of --lockref is to avoid data loss,
 but if the push is fast-forward, there is no data loss.

 If that is the semantics you are proposing, then I think it makes
 sense to make --force the big red button that lets anything go.

I have a reroll that goes in that direction.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-07-11 Thread Matthijs Kooijman
Hi folks,

while playing with shallow fetches, I've found that in some
circumstances running git fetch with --depth can return too many objects
(in particular, _all_ the objects for the requested revisions are
returned, even when some of those objects are already known to the
client).

This happens when a client issues a fetch with a depth bigger or equal
to the number of commits the server is ahead of the client. In this
case, the revisions to be sent over will be completely detached from any
revisions the client already has (history-wise), causing the server to
effectively ignore all objects the client has (as advertised using its
have lines) and just send over _all_ objects (needed for the revisions
it is sending over).

I've traced this down to the way do_rev_list in upload-pack.c works. If
I've poured over the code enough to understand it, this is what happens:
 - The new shallow roots are made into graft points without parents.
 - The want commits are added to the pending list (revs-pending)
 - The have commits are marked uninteresting and added to the pending list
 - prepare_revision_walk is called, which adds everything from the
   pending list into the commmit list (revs-commits)
 - limit_list is called, which traverses the history of each interesting
   commit in the commit list (i.e., all want revisions), up to excluding
   the first uninteresting commit (i.e. a have revision). The result of
   this is the new commit list.

   This means the commit list now contains all commits that the client
   wants, up to (excluding) any commits he already has or up to
   (including) any (new) shallow roots.
 - mark_edges_uninteresting is called, which marks the tree of every
   parent of each edge in the commit list as uninteresting (in practice,
   this marks the tree of each uninteresting parent, since those are by
   definition the only kinds of revisions that can be beyond the edge).
 - All trees and blobs that are referenced by trees in the commit list
   but are not marked as uninteresting, are passed to git-pack-objects
   to put into the pack.

Normally, the list of commits to send over is connected to the
client's existing commits (which are marked as uninteresting). This
means that only the trees of those uninteresting (have) commits that
are actually (direct) predecessors of the commits to send over are
marked as uninteresting. This is probably useful, since it prevents
having to go over all trees the client has (for other branches, for
example) and instead limits to the trees that are the most likely to
contain duplicate (or similar, for delta-ing) objects.

However, in the detached shallow fetch case, this assumption is no
longer valid. There will be no uninteresting commits as parents for
the commit list, since all edge commits will be shallow roots (hence
have no parents).  Ideally, one would find out which of the detached
have revisions are the closest to the new shallow roots, but with the
current code these shallow roots have their parents cut off long before
this code even runs, so this is probably not feasible.

Instead, what we can do in this case, is simply mark the trees of all
have commits as uninteresting. This prevents all objects that are
contained in the have commits themselves from being sent to the
client, which can be a big win for bigger repositories. Marking them all
is is probably more work than strictly needed, but is easy to implement.

I have created a mockup patch which does this, and also adds a test case
demonstrating the problem. Right now, the above fix is applied always,
even in cases where it isn't needed.

Looking at the code, I think it would be good to let
mark_edges_uninteresting look for shallow roots in the commit list (or
perhaps just add another loop over the commit list inside do_rev_list)
and only apply the fix if any shallow roots are in the commit list
(meaning at least a part of the history to send over is detached from
the clients current history). I haven't implemented this yet, wanting to
get some feedback first.

Also, I'm not quite sure how this fits in with the concept of thin
packs. There might be some opportunities missing here as well, though
git-pack-objects is called without --thin when shallow roots are
involved. I think this is related to the - prefixed commit sha's that
are sent to git-pack-objects, but I couldn't found any documentation on
what the - prefix is supposed to mean.

(On a somewhat related note, show_commit in upload-pack.c checks the
BOUNDARY flag, but AFAICS the revs-boundary flag is never set, so
BOUNDARY cannot ever be set in this case either?)

How does this patch look?

Gr.

Matthijs

---
 t/t5500-fetch-pack.sh | 11 +++
 upload-pack.c |  8 
 2 files changed, 19 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index fd2598e..a022d65 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -393,6 +393,17 @@ test_expect_success 'fetch in shallow repo unreachable 

Re: [PATCH 7/7] push: document --lockref

2013-07-11 Thread Junio C Hamano
Johannes Sixt j...@kdbg.org writes:

 Again: Why not just define +refspec as the way to achieve this check?

What justification do we have to break existing people's
configuration that says something like:

[remote ko]
url = kernel.org:/pub/scm/git/git.git
push = master
push = next
push = +pu
push = maint

by adding a _new_ requirement they may not be able to satisify?
Notice that the above is a typical push only publishing point,
where you do not need any remote tracking branches.

I am not opposed if your proposal were to introduce a new syntax
element that calls for this new feature, e.g.

[remote ko]
url = kernel.org:/pub/scm/git/git.git
push = *pu
fetch = +refs/heads/*:refs/remotes/ko/*

but changing what + means to something new will simply not fly.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/6] cache.h: move remote/connect API out of it

2013-07-11 Thread Junio C Hamano
The definition of struct ref in cache.h, a header file so
central to the system, always confused me.  This structure is not
about the local ref used by sha1-name API to name local objects.

It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values.  It belongs to remote.h together
with struct refspec.

While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/fetch-pack.c   |  2 ++
 builtin/receive-pack.c |  1 +
 builtin/send-pack.c|  1 +
 cache.h| 62 --
 connect.c  |  1 +
 connect.h  | 13 +++
 fetch-pack.c   |  1 +
 fetch-pack.h   |  1 +
 refs.c |  8 ---
 remote.c   |  8 +++
 remote.h   | 54 +++
 send-pack.c|  1 +
 transport.c|  2 ++
 transport.h|  1 +
 upload-pack.c  |  1 +
 15 files changed, 87 insertions(+), 70 deletions(-)
 create mode 100644 connect.h

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index aba4465..c6888c6 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -1,6 +1,8 @@
 #include builtin.h
 #include pkt-line.h
 #include fetch-pack.h
+#include remote.h
+#include connect.h
 
 static const char fetch_pack_usage[] =
 git fetch-pack [--all] [--stdin] [--quiet|-q] [--keep|-k] [--thin] 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e3eb5fc..7434d9b 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -8,6 +8,7 @@
 #include commit.h
 #include object.h
 #include remote.h
+#include connect.h
 #include transport.h
 #include string-list.h
 #include sha1-array.h
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 152c4ea..e86d3b5 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -5,6 +5,7 @@
 #include sideband.h
 #include run-command.h
 #include remote.h
+#include connect.h
 #include send-pack.h
 #include quote.h
 #include transport.h
diff --git a/cache.h b/cache.h
index dd0fb33..cb2891d 100644
--- a/cache.h
+++ b/cache.h
@@ -1035,68 +1035,6 @@ struct pack_entry {
struct packed_git *p;
 };
 
-struct ref {
-   struct ref *next;
-   unsigned char old_sha1[20];
-   unsigned char new_sha1[20];
-   char *symref;
-   unsigned int
-   force:1,
-   forced_update:1,
-   deletion:1,
-   matched:1;
-
-   /*
-* Order is important here, as we write to FETCH_HEAD
-* in numeric order. And the default NOT_FOR_MERGE
-* should be 0, so that xcalloc'd structures get it
-* by default.
-*/
-   enum {
-   FETCH_HEAD_MERGE = -1,
-   FETCH_HEAD_NOT_FOR_MERGE = 0,
-   FETCH_HEAD_IGNORE = 1
-   } fetch_head_status;
-
-   enum {
-   REF_STATUS_NONE = 0,
-   REF_STATUS_OK,
-   REF_STATUS_REJECT_NONFASTFORWARD,
-   REF_STATUS_REJECT_ALREADY_EXISTS,
-   REF_STATUS_REJECT_NODELETE,
-   REF_STATUS_REJECT_FETCH_FIRST,
-   REF_STATUS_REJECT_NEEDS_FORCE,
-   REF_STATUS_UPTODATE,
-   REF_STATUS_REMOTE_REJECT,
-   REF_STATUS_EXPECTING_REPORT
-   } status;
-   char *remote_status;
-   struct ref *peer_ref; /* when renaming */
-   char name[FLEX_ARRAY]; /* more */
-};
-
-#define REF_NORMAL (1u  0)
-#define REF_HEADS  (1u  1)
-#define REF_TAGS   (1u  2)
-
-extern struct ref *find_ref_by_name(const struct ref *list, const char *name);
-
-#define CONNECT_VERBOSE   (1u  0)
-extern struct child_process *git_connect(int fd[2], const char *url, const 
char *prog, int flags);
-extern int finish_connect(struct child_process *conn);
-extern int git_connection_is_socket(struct child_process *conn);
-struct extra_have_objects {
-   int nr, alloc;
-   unsigned char (*array)[20];
-};
-extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
-struct ref **list, unsigned int flags,
-struct extra_have_objects *);
-extern int server_supports(const char *feature);
-extern int parse_feature_request(const char *features, const char *feature);
-extern const char *server_feature_value(const char *feature, int *len_ret);
-extern const char *parse_feature_value(const char *feature_list, const char 
*feature, int *len_ret);
-
 extern struct packed_git *parse_pack_index(unsigned char *sha1, const char 
*idx_path);
 
 /* A hook for count-objects to report invalid files in pack directory */
diff --git a/connect.c b/connect.c
index a0783d4..a80ebd3 100644
--- a/connect.c
+++ b/connect.c
@@ -5,6 +5,7 @@

[PATCH v2 5/6] push --lockref: tie it all together

2013-07-11 Thread Junio C Hamano
This teaches the deepest part of the callchain for git push (and
git send-pack) to enforce the old value of the ref must be this,
otherwise fail this push (aka compare-and-swap / --lockref).

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/send-pack.c |  5 +
 remote.c| 49 -
 remote.h|  1 +
 send-pack.c |  1 +
 transport-helper.c  |  6 ++
 transport.c |  5 +
 6 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 6027ead..41dc512 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -55,6 +55,11 @@ static void print_helper_status(struct ref *ref)
msg = needs force;
break;
 
+   case REF_STATUS_REJECT_STALE:
+   res = error;
+   msg = stale info;
+   break;
+
case REF_STATUS_REJECT_ALREADY_EXISTS:
res = error;
msg = already exists;
diff --git a/remote.c b/remote.c
index 93e5b65..0109487 100644
--- a/remote.c
+++ b/remote.c
@@ -1396,12 +1396,13 @@ int match_push_refs(struct ref *src, struct ref **dst,
 }
 
 void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
-   int force_update)
+int force_update)
 {
struct ref *ref;
 
for (ref = remote_refs; ref; ref = ref-next) {
int force_ref_update = ref-force || force_update;
+   int reject_reason = 0;
 
if (ref-peer_ref)
hashcpy(ref-new_sha1, ref-peer_ref-new_sha1);
@@ -1416,6 +1417,26 @@ void set_ref_status_for_push(struct ref *remote_refs, 
int send_mirror,
}
 
/*
+* Bypass the usual must fast-forward check but
+* replace it with a weaker the old value must be
+* this value we observed.  If the remote ref has
+* moved and is now different from what we expect,
+* reject any push.
+*
+* It also is an error if the user told us to check
+* with the remote-tracking branch to find the value
+* to expect, but we did not have such a tracking
+* branch.
+*/
+   if (ref-expect_old_sha1) {
+   if (ref-expect_old_no_trackback ||
+   hashcmp(ref-old_sha1, ref-old_sha1_expect))
+   reject_reason = REF_STATUS_REJECT_STALE;
+   }
+
+   /*
+* The usual must fast-forward rules.
+*
 * Decide whether an individual refspec A:B can be
 * pushed.  The push will succeed if any of the
 * following are true:
@@ -1433,24 +1454,26 @@ void set_ref_status_for_push(struct ref *remote_refs, 
int send_mirror,
 * passing the --force argument
 */
 
-   if (!ref-deletion  !is_null_sha1(ref-old_sha1)) {
-   int why = 0; /* why would this push require --force? */
-
+   else if (!ref-deletion  !is_null_sha1(ref-old_sha1)) {
if (!prefixcmp(ref-name, refs/tags/))
-   why = REF_STATUS_REJECT_ALREADY_EXISTS;
+   reject_reason = 
REF_STATUS_REJECT_ALREADY_EXISTS;
else if (!has_sha1_file(ref-old_sha1))
-   why = REF_STATUS_REJECT_FETCH_FIRST;
+   reject_reason = REF_STATUS_REJECT_FETCH_FIRST;
else if (!lookup_commit_reference_gently(ref-old_sha1, 
1) ||
 !lookup_commit_reference_gently(ref-new_sha1, 
1))
-   why = REF_STATUS_REJECT_NEEDS_FORCE;
+   reject_reason = REF_STATUS_REJECT_NEEDS_FORCE;
else if (!ref_newer(ref-new_sha1, ref-old_sha1))
-   why = REF_STATUS_REJECT_NONFASTFORWARD;
-
-   if (!force_ref_update)
-   ref-status = why;
-   else if (why)
-   ref-forced_update = 1;
+   reject_reason = 
REF_STATUS_REJECT_NONFASTFORWARD;
}
+
+   /*
+* --force will defeat any rejection implemented
+* by the rules above.
+*/
+   if (!force_ref_update)
+   ref-status = reject_reason;
+   else if (reject_reason)
+   ref-forced_update = 1;
}
 }
 
diff --git a/remote.h b/remote.h
index 4c564c5..baa1c68 100644
--- a/remote.h
+++ b/remote.h
@@ -107,6 +107,7 @@ struct ref {
  

[PATCH v2 6/6] t5533: test push --lockref

2013-07-11 Thread Junio C Hamano
Prepare two repositories, src and dst, the latter of which is a
clone of the former (with tracking branches), and push from the
latter into the former, using --lockref=name (using tracking ref for
name when updating name), --lockref=name:value, --lockref=name:
(i.e. check creation), and --lockref (using tracking ref for
anything that we update).

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 t/t5533-push-cas.sh | 189 
 1 file changed, 189 insertions(+)
 create mode 100755 t/t5533-push-cas.sh

diff --git a/t/t5533-push-cas.sh b/t/t5533-push-cas.sh
new file mode 100755
index 000..ea1c789
--- /dev/null
+++ b/t/t5533-push-cas.sh
@@ -0,0 +1,189 @@
+#!/bin/sh
+
+test_description='compare  swap push force/delete safety'
+
+. ./test-lib.sh
+
+setup_srcdst_basic () {
+   rm -fr src dst 
+   git clone --no-local . src 
+   git clone --no-local src dst 
+   (
+   cd src  git checkout HEAD^0
+   )
+}
+
+test_expect_success setup '
+   : create template repository
+   test_commit A 
+   test_commit B 
+   test_commit C
+'
+
+test_expect_success 'push to update (protected)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   test_commit D 
+   test_must_fail git push --lockref=master:master origin master
+   ) 
+   git ls-remote . refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (protected, forced)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   test_commit D 
+   git push --force --lockref=master:master origin master
+   ) 
+   git ls-remote dst refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (protected, tracking)' '
+   setup_srcdst_basic 
+   (
+   cd src 
+   git checkout master 
+   test_commit D 
+   git checkout HEAD^0
+   ) 
+   git ls-remote src refs/heads/master expect 
+   (
+   cd dst 
+   test_commit E 
+   git ls-remote . refs/remotes/origin/master expect 
+   test_must_fail git push --lockref=master origin master 
+   git ls-remote . refs/remotes/origin/master actual 
+   test_cmp expect actual
+   ) 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (protected, tracking, forced)' '
+   setup_srcdst_basic 
+   (
+   cd src 
+   git checkout master 
+   test_commit D 
+   git checkout HEAD^0
+   ) 
+   (
+   cd dst 
+   test_commit E 
+   git ls-remote . refs/remotes/origin/master expect 
+   git push --force --lockref=master origin master
+   ) 
+   git ls-remote dst refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (allowed)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   test_commit D 
+   git push --lockref=master:master^ origin master
+   ) 
+   git ls-remote dst refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (allowed, tracking)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   test_commit D 
+   git push --lockref=master origin master
+   ) 
+   git ls-remote dst refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to update (allowed even though no-ff)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   git reset --hard HEAD^ 
+   test_commit D 
+   git push --lockref=master origin master
+   ) 
+   git ls-remote dst refs/heads/master expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to delete (protected)' '
+   setup_srcdst_basic 
+   git ls-remote src refs/heads/master expect 
+   (
+   cd dst 
+   test_must_fail git push --lockref=master:master^ origin :master
+   ) 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to delete (protected, forced)' '
+   setup_srcdst_basic 
+   (
+   cd dst 
+   git push --force --lockref=master:master^ origin :master
+   ) 
+   expect 
+   git ls-remote src refs/heads/master actual 
+   test_cmp expect actual
+'
+
+test_expect_success 'push to delete 

[PATCH v2 3/6] remote.c: add command line option parser for --lockref

2013-07-11 Thread Junio C Hamano
Update git push and git send-pack to parse this commnd line
option.

The intended sematics is:

 * --lockref alone, without specifying the details, will protect
   _all_ remote refs that are going to be updated by requiring their
   current value to be the same as the remote-tracking branch we
   have for them, unless otherwise specified;

 * --lockref=refname, without specifying the expected value, will
   protect that refname, if it is going to be updated, by requiring
   its current value to be the same as the remote-tracking branch we
   have for it;

 * --lockref=refname:value will protect that refname, if it is
   going to be updated, by requiring its current value to be the
   same as the specified value (which is allowed to be different
   from the remote-tracking branch we have for the refname, or we do
   not even have to have such a remote-tracking branch when this
   form is used); and

 * --no-lockref will cancel all the previous --lockref on the
   command line.

In any of the forms, when we try to use a remote-tracking branch for
the remote ref being updated, it is an error if there is no such
remote-tracking branch on our end.

Because the command line options are parsed _before_ we know which
remote we are pushing to, there needs further processing to the
parsed data after we instantiate the transport object to:

 * expand refname given by the user to a full refname to be
   matched with the list of struct ref used in match_push_refs()
   and set_ref_status_for_push(); and

 * learning the actual local ref that is the remote-tracking branch
   for the specified remote ref.

Further, some processing need to be deferred until we find the set
of remote refs and match_push_refs() returns in order to find the
ones that need to be checked after explicit ones have been processed
for --lockref (no specific details).

These post-processing will be the topic of the next patch.

Oh, of course, the option name is still not cast in stone.

Signed-off-by: Junio C Hamano gits...@pobox.com
---

 This round makes --lockref simply a weaker form of --force.

 The old version used to allow --lockref=refname: to say I want
 this push to create refname, any existing old value should make
 it fail, but the normal fast-forward check already implements it,
 so there is no need for such an option.  This round removes it.

 The old version allowed you to express I want the old value of
 'master' to be at X and also I want the update to fast-forward by
 giving only --lockref=master:X without --force.  Because you
 already know if the commit you are updating their 'master' with
 fast-forwards from X or not, such a combination is unnecessary to
 support.  This simplifies the UI greatly, as --force can again be
 the big red button that overrides all the safety.

 I think the option now should be renamed to --force-with-???; you
 are essentially taking a lease on the ref before you start your
 rebase, and making sure the ref hasn't been updated by other
 people, you are safely forcing your push of a rebased history.

 Perhaps --force-with-lease?  --force-leased-ref?
---
 Documentation/git-push.txt | 68 ++
 builtin/push.c |  6 
 builtin/send-pack.c| 17 
 remote.c   | 57 ++
 remote.h   | 22 +++
 5 files changed, 159 insertions(+), 11 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index f7dfe48..b0a793e 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,6 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [-n | --dry-run] 
[--receive-pack=git-receive-pack]
   [--repo=repository] [-f | --force] [--prune] [-v | --verbose] [-u 
| --set-upstream]
+  [--lockref[=refname[:expect]]]
   [--no-verify] [repository [refspec...]]
 
 DESCRIPTION
@@ -130,21 +131,66 @@ already exists on the remote side.
repository over ssh, and you do not have the program in
a directory on the default $PATH.
 
+--[no-]lockref::
+--lockref=refname::
+--lockref=refname:expect::
+   Usually, the command refuses to update a remote ref that is
+   not an ancestor of the local ref used to overwrite it.
++
+This option bypasses the check, but instead requires that the
+current value of the ref to be the expected value.
++
+Imagine that you have to rebase what you have already published.
+You will have to bypass the must fast-forward rule to replace the
+history you originally published with the rebased history.  If
+somebody else built on top of your original history while you are
+rebasing, the tip of the branch at the remote may advance with her
+commit, and blindly pushing with `--force` will lose her work.  By
+using this option to specify that you expect the history you are
+updating is what you rebased and want to replace, you can make sure
+other people's 

[PATCH v2 0/6] Less potent --force for git push

2013-07-11 Thread Junio C Hamano
So here is a reroll to make --lockref a weaker form of --force
that by itself makes git push bypass the usual must fast-forward
check but enforces a different check the old ref must be at X
instead, taking ideas from J6t.  This allows --force to be again
the big red button to bypass all the safety.

I still am not married to the lockref name, and as I said in 3/6,
I think --force-with-lease might be a better name for it.

The previous round is found at $gmane/229987.


[PATCH v2 1/6] cache.h: move remote/connect API out of it
[PATCH v2 2/6] builtin/push.c: use OPT_BOOL, not OPT_BOOLEAN

These are the same from v1

[PATCH v2 3/6] remote.c: add command line option parser for --lockref

The earlier one dug the system from both ends and met in the middle,
but this UI level change has been moved to the front---the series
now digs from the UI surface to the core.  The documentation has
also been moved to this patch, using the same text as the proposed
log message, as Michael Haggerty suggested.

[PATCH v2 4/6] push --lockref: implement logic to populate old_sha1_expect[]

This is unchanged.

[PATCH v2 5/6] push --lockref: tie it all together

This corresponds to old 3/7, but the semantics has been adjusted to
allow --force override it.

[PATCH v2 6/6] t5533: test push --lockref

This has been adjusted to the change in 5/6 (earlier --force would
not break the --lockref safety, but now it does).

Junio C Hamano (6):
  cache.h: move remote/connect API out of it
  builtin/push.c: use OPT_BOOL, not OPT_BOOLEAN
  remote.c: add command line option parser for --lockref
  push --lockref: implement logic to populate old_sha1_expect[]
  push --lockref: tie it all together
  t5533: test push --lockref

 Documentation/git-push.txt |  68 +---
 builtin/fetch-pack.c   |   2 +
 builtin/push.c |  18 -
 builtin/receive-pack.c |   1 +
 builtin/send-pack.c|  26 +++
 cache.h|  62 ---
 connect.c  |   1 +
 connect.h  |  13 
 fetch-pack.c   |   1 +
 fetch-pack.h   |   1 +
 refs.c |   8 --
 remote.c   | 175 +
 remote.h   |  83 
 send-pack.c|   2 +
 t/t5533-push-cas.sh| 189 +
 transport-helper.c |   6 ++
 transport.c|  13 
 transport.h|   5 ++
 upload-pack.c  |   1 +
 19 files changed, 578 insertions(+), 97 deletions(-)
 create mode 100644 connect.h
 create mode 100755 t/t5533-push-cas.sh

-- 
1.8.3.2-912-g65cf5cf

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/6] builtin/push.c: use OPT_BOOL, not OPT_BOOLEAN

2013-07-11 Thread Junio C Hamano
The command line parser of git push for --tags, --delete, and
--thin options still used outdated OPT_BOOLEAN.  Because these
options do not give escalating levels when given multiple times,
they should use OPT_BOOL.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/push.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/push.c b/builtin/push.c
index 2d84d10..342d792 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -427,15 +427,15 @@ int cmd_push(int argc, const char **argv, const char 
*prefix)
OPT_BIT( 0 , all, flags, N_(push all refs), 
TRANSPORT_PUSH_ALL),
OPT_BIT( 0 , mirror, flags, N_(mirror all refs),
(TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE)),
-   OPT_BOOLEAN( 0, delete, deleterefs, N_(delete refs)),
-   OPT_BOOLEAN( 0 , tags, tags, N_(push tags (can't be used 
with --all or --mirror))),
+   OPT_BOOL( 0, delete, deleterefs, N_(delete refs)),
+   OPT_BOOL( 0 , tags, tags, N_(push tags (can't be used with 
--all or --mirror))),
OPT_BIT('n' , dry-run, flags, N_(dry run), 
TRANSPORT_PUSH_DRY_RUN),
OPT_BIT( 0,  porcelain, flags, N_(machine-readable 
output), TRANSPORT_PUSH_PORCELAIN),
OPT_BIT('f', force, flags, N_(force updates), 
TRANSPORT_PUSH_FORCE),
{ OPTION_CALLBACK, 0, recurse-submodules, flags, N_(check),
N_(control recursive pushing of submodules),
PARSE_OPT_OPTARG, option_parse_recurse_submodules },
-   OPT_BOOLEAN( 0 , thin, thin, N_(use thin pack)),
+   OPT_BOOL( 0 , thin, thin, N_(use thin pack)),
OPT_STRING( 0 , receive-pack, receivepack, receive-pack, 
N_(receive pack program)),
OPT_STRING( 0 , exec, receivepack, receive-pack, 
N_(receive pack program)),
OPT_BIT('u', set-upstream, flags, N_(set upstream for git 
pull/status),
-- 
1.8.3.2-912-g65cf5cf

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/6] push --lockref: implement logic to populate old_sha1_expect[]

2013-07-11 Thread Junio C Hamano
This plugs the push_cas_option data collected by the command line
option parser to the transport system with a new function
apply_push_cas(), which is called after match_push_refs() has
already been called.

At this point, we know which remote we are talking to, and what
remote refs we are going to update, so we can fill in the details
that may have been missing from the command line, such as

 (1) what abbreviated refname the user gave us matches the actual
 refname at the remote; and

 (2) which remote tracking branch in our local repository to read the
 value of the object to expect at the remote.

to populate the old_sha1_expect[] field of each of the remote ref.

Still nobody uses this information, which is the topic of the next
patch.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/push.c  |  6 ++
 builtin/send-pack.c |  3 +++
 remote.c| 61 +
 remote.h|  6 ++
 transport.c |  6 ++
 transport.h |  4 
 6 files changed, 86 insertions(+)

diff --git a/builtin/push.c b/builtin/push.c
index 31a5ba0..b0e3691 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -299,6 +299,12 @@ static int push_with_options(struct transport *transport, 
int flags)
if (thin)
transport_set_option(transport, TRANS_OPT_THIN, yes);
 
+   if (!is_empty_cas(cas)) {
+   if (!transport-smart_options)
+   die(underlying transport does not support --lockref 
option);
+   transport-smart_options-cas = cas;
+   }
+
if (verbosity  0)
fprintf(stderr, _(Pushing to %s\n), transport-url);
err = transport_push(transport, refspec_nr, refspec, flags,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index a23b26d..6027ead 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -242,6 +242,9 @@ int cmd_send_pack(int argc, const char **argv, const char 
*prefix)
if (match_push_refs(local_refs, remote_refs, nr_refspecs, refspecs, 
flags))
return -1;
 
+   if (!is_empty_cas(cas))
+   apply_push_cas(cas, remote, remote_refs);
+
set_ref_status_for_push(remote_refs, args.send_mirror,
args.force_update);
 
diff --git a/remote.c b/remote.c
index 6014acd..93e5b65 100644
--- a/remote.c
+++ b/remote.c
@@ -1978,3 +1978,64 @@ int parseopt_push_cas_option(const struct option *opt, 
const char *arg, int unse
 {
return parse_push_cas_option(opt-value, arg, unset);
 }
+
+int is_empty_cas(const struct push_cas_option *cas)
+{
+   return !cas-use_tracking_for_rest  !cas-nr;
+}
+
+/*
+ * Look at remote.fetch refspec and see if we have a remote
+ * tracking branch for the refname there.  Fill its current
+ * value in sha1[].
+ * If we cannot do so, return negative to signal an error.
+ */
+static int remote_tracking(struct remote *remote, const char *refname,
+  unsigned char sha1[20])
+{
+   char *dst;
+
+   dst = apply_refspecs(remote-fetch, remote-fetch_refspec_nr, refname);
+   if (!dst)
+   return -1; /* no tracking ref for refname at remote */
+   if (read_ref(dst, sha1))
+   return -1; /* we know what the tracking ref is but we cannot 
read it */
+   return 0;
+}
+
+static void apply_cas(struct push_cas_option *cas,
+ struct remote *remote,
+ struct ref *ref)
+{
+   int i;
+
+   /* Find an explicit --lockref=name[:value] entry */
+   for (i = 0; i  cas-nr; i++) {
+   struct push_cas *entry = cas-entry[i];
+   if (!refname_match(entry-refname, ref-name, 
ref_rev_parse_rules))
+   continue;
+   ref-expect_old_sha1 = 1;
+   if (!entry-use_tracking)
+   hashcpy(ref-old_sha1_expect, cas-entry[i].expect);
+   else if (remote_tracking(remote, ref-name, 
ref-old_sha1_expect))
+   ref-expect_old_no_trackback = 1;
+   return;
+   }
+
+   /* Are we using --lockref to cover all? */
+   if (!cas-use_tracking_for_rest)
+   return;
+
+   ref-expect_old_sha1 = 1;
+   if (remote_tracking(remote, ref-name, ref-old_sha1_expect))
+   ref-expect_old_no_trackback = 1;
+}
+
+void apply_push_cas(struct push_cas_option *cas,
+   struct remote *remote,
+   struct ref *remote_refs)
+{
+   struct ref *ref;
+   for (ref = remote_refs; ref; ref = ref-next)
+   apply_cas(cas, remote, ref);
+}
diff --git a/remote.h b/remote.h
index 8070303..4c564c5 100644
--- a/remote.h
+++ b/remote.h
@@ -77,10 +77,13 @@ struct ref {
struct ref *next;
unsigned char old_sha1[20];
unsigned char new_sha1[20];
+   unsigned char old_sha1_expect[20]; /* used by expect-old */
char *symref;
unsigned int
   

[PATCH v5 0/5] allow more sources for config values

2013-07-11 Thread Heiko Voigt
Hi,

I finally got around rerolling this series.

The fourth iteration can be found here[1]. You can also find these patches as a
branch on my github[2].

There is not many changes since the last iteration. I have added some
documentation for the --blob option and now there is no rename of the
do_config_from() function anymore.

Cheers Heiko

[1] http://article.gmane.org/gmane.comp.version-control.git/223964
[2] https://github.com/hvoigt/git/commits/hv/strbuf_config_parsing-series5


Heiko Voigt (5):
  config: factor out config file stack management
  config: drop cf validity check in get_next_char()
  config: make parsing stack struct independent from actual data source
  teach config --blob option to parse config from database
  do not die when error in config parsing of buf occurs

 Documentation/git-config.txt |   7 ++
 builtin/config.c |  31 ++-
 cache.h  |   6 +-
 config.c | 217 +--
 t/t1307-config-blob.sh   |  70 ++
 5 files changed, 278 insertions(+), 53 deletions(-)
 create mode 100755 t/t1307-config-blob.sh

-- 
1.8.3.2.773.gcfaae5b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 1/5] config: factor out config file stack management

2013-07-11 Thread Heiko Voigt
Because a config callback may start parsing a new file, the
global context regarding the current config file is stored
as a stack. Currently we only need to manage that stack from
git_config_from_file. Let's factor it out to allow new
sources of config data.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 config.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/config.c b/config.c
index d04e815..c165c85 100644
--- a/config.c
+++ b/config.c
@@ -906,6 +906,32 @@ int git_default_config(const char *var, const char *value, 
void *dummy)
return 0;
 }
 
+/*
+ * The fields f and name of top need to be initialized before calling
+ * this function.
+ */
+static int do_config_from(struct config_file *top, config_fn_t fn, void *data)
+{
+   int ret;
+
+   /* push config-file parsing state stack */
+   top-prev = cf;
+   top-linenr = 1;
+   top-eof = 0;
+   strbuf_init(top-value, 1024);
+   strbuf_init(top-var, 1024);
+   cf = top;
+
+   ret = git_parse_file(fn, data);
+
+   /* pop config-file parsing state stack */
+   strbuf_release(top-value);
+   strbuf_release(top-var);
+   cf = top-prev;
+
+   return ret;
+}
+
 int git_config_from_file(config_fn_t fn, const char *filename, void *data)
 {
int ret;
@@ -915,22 +941,10 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
if (f) {
config_file top;
 
-   /* push config-file parsing state stack */
-   top.prev = cf;
top.f = f;
top.name = filename;
-   top.linenr = 1;
-   top.eof = 0;
-   strbuf_init(top.value, 1024);
-   strbuf_init(top.var, 1024);
-   cf = top;
-
-   ret = git_parse_file(fn, data);
-
-   /* pop config-file parsing state stack */
-   strbuf_release(top.value);
-   strbuf_release(top.var);
-   cf = top.prev;
+
+   ret = do_config_from(top, fn, data);
 
fclose(f);
}
-- 
1.8.3.2.773.gcfaae5b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 2/5] config: drop cf validity check in get_next_char()

2013-07-11 Thread Heiko Voigt
The global variable cf is set with an initialized value in all codepaths before
calling this function.

The complete call graph looks like this:

  git_config_from_file
- do_config_from
  - git_parse_file
- get_next_char
- get_value
- get_next_char
- parse_value
- get_next_char
- get_base_var
- get_next_char
- get_extended_base_var
- get_next_char

The variable is initialized in do_config_from.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 config.c | 29 +
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/config.c b/config.c
index c165c85..1ec73ad 100644
--- a/config.c
+++ b/config.c
@@ -169,26 +169,23 @@ int git_config_from_parameters(config_fn_t fn, void *data)
 static int get_next_char(void)
 {
int c;
-   FILE *f;
+   FILE *f = cf-f;
 
-   c = '\n';
-   if (cf  ((f = cf-f) != NULL)) {
+   c = fgetc(f);
+   if (c == '\r') {
+   /* DOS like systems */
c = fgetc(f);
-   if (c == '\r') {
-   /* DOS like systems */
-   c = fgetc(f);
-   if (c != '\n') {
-   ungetc(c, f);
-   c = '\r';
-   }
-   }
-   if (c == '\n')
-   cf-linenr++;
-   if (c == EOF) {
-   cf-eof = 1;
-   c = '\n';
+   if (c != '\n') {
+   ungetc(c, f);
+   c = '\r';
}
}
+   if (c == '\n')
+   cf-linenr++;
+   if (c == EOF) {
+   cf-eof = 1;
+   c = '\n';
+   }
return c;
 }
 
-- 
1.8.3.2.773.gcfaae5b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 3/5] config: make parsing stack struct independent from actual data source

2013-07-11 Thread Heiko Voigt
To simplify adding other sources we extract all functions needed for
parsing into a list of callbacks. We implement those callbacks for the
current file parsing. A new source can implement its own set of callbacks.

Instead of storing the concrete FILE pointer for parsing we store a void
pointer. A new source can use this to store its custom data.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 config.c | 64 +++-
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/config.c b/config.c
index 1ec73ad..71da389 100644
--- a/config.c
+++ b/config.c
@@ -10,20 +10,41 @@
 #include strbuf.h
 #include quote.h
 
-typedef struct config_file {
-   struct config_file *prev;
-   FILE *f;
+struct config_source {
+   struct config_source *prev;
+   union {
+   FILE *file;
+   } u;
const char *name;
int linenr;
int eof;
struct strbuf value;
struct strbuf var;
-} config_file;
 
-static config_file *cf;
+   int (*fgetc)(struct config_source *c);
+   int (*ungetc)(int c, struct config_source *conf);
+   long (*ftell)(struct config_source *c);
+};
+
+static struct config_source *cf;
 
 static int zlib_compression_seen;
 
+static int config_file_fgetc(struct config_source *conf)
+{
+   return fgetc(conf-u.file);
+}
+
+static int config_file_ungetc(int c, struct config_source *conf)
+{
+   return ungetc(c, conf-u.file);
+}
+
+static long config_file_ftell(struct config_source *conf)
+{
+   return ftell(conf-u.file);
+}
+
 #define MAX_INCLUDE_DEPTH 10
 static const char include_depth_advice[] =
 exceeded maximum include depth (%d) while including\n
@@ -168,15 +189,13 @@ int git_config_from_parameters(config_fn_t fn, void *data)
 
 static int get_next_char(void)
 {
-   int c;
-   FILE *f = cf-f;
+   int c = cf-fgetc(cf);
 
-   c = fgetc(f);
if (c == '\r') {
/* DOS like systems */
-   c = fgetc(f);
+   c = cf-fgetc(cf);
if (c != '\n') {
-   ungetc(c, f);
+   cf-ungetc(c, cf);
c = '\r';
}
}
@@ -336,7 +355,7 @@ static int get_base_var(struct strbuf *name)
}
 }
 
-static int git_parse_file(config_fn_t fn, void *data)
+static int git_parse_source(config_fn_t fn, void *data)
 {
int comment = 0;
int baselen = 0;
@@ -904,10 +923,11 @@ int git_default_config(const char *var, const char 
*value, void *dummy)
 }
 
 /*
- * The fields f and name of top need to be initialized before calling
+ * All source specific fields in the union, name and the callbacks
+ * fgetc, ungetc, ftell of top need to be initialized before calling
  * this function.
  */
-static int do_config_from(struct config_file *top, config_fn_t fn, void *data)
+static int do_config_from(struct config_source *top, config_fn_t fn, void 
*data)
 {
int ret;
 
@@ -919,7 +939,7 @@ static int do_config_from(struct config_file *top, 
config_fn_t fn, void *data)
strbuf_init(top-var, 1024);
cf = top;
 
-   ret = git_parse_file(fn, data);
+   ret = git_parse_source(fn, data);
 
/* pop config-file parsing state stack */
strbuf_release(top-value);
@@ -936,10 +956,13 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
 
ret = -1;
if (f) {
-   config_file top;
+   struct config_source top;
 
-   top.f = f;
+   top.u.file = f;
top.name = filename;
+   top.fgetc = config_file_fgetc;
+   top.ungetc = config_file_ungetc;
+   top.ftell = config_file_ftell;
 
ret = do_config_from(top, fn, data);
 
@@ -1074,7 +1097,6 @@ static int store_aux(const char *key, const char *value, 
void *cb)
 {
const char *ep;
size_t section_len;
-   FILE *f = cf-f;
 
switch (store.state) {
case KEY_SEEN:
@@ -1086,7 +1108,7 @@ static int store_aux(const char *key, const char *value, 
void *cb)
return 1;
}
 
-   store.offset[store.seen] = ftell(f);
+   store.offset[store.seen] = cf-ftell(cf);
store.seen++;
}
break;
@@ -1113,19 +1135,19 @@ static int store_aux(const char *key, const char 
*value, void *cb)
 * Do not increment matches: this is no match, but we
 * just made sure we are in the desired section.
 */
-   store.offset[store.seen] = ftell(f);
+   store.offset[store.seen] = cf-ftell(cf);
/* fallthru */
case SECTION_END_SEEN:
case START:
if (matches(key, value)) {
-   store.offset[store.seen] = ftell(f);
+   

[PATCH v5 4/5] teach config --blob option to parse config from database

2013-07-11 Thread Heiko Voigt
This can be used to read configuration values directly from git's
database. For example it is useful for reading to be checked out
.gitmodules files directly from the database.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 Documentation/git-config.txt |  7 
 builtin/config.c | 31 +---
 cache.h  |  6 +++-
 config.c | 86 ++--
 t/t1307-config-blob.sh   | 70 
 5 files changed, 193 insertions(+), 7 deletions(-)
 create mode 100755 t/t1307-config-blob.sh

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index fbad05e..5c9ddbe 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -127,6 +127,13 @@ See also FILES.
 --file config-file::
Use the given config file instead of the one specified by GIT_CONFIG.
 
+--blob blob::
+   Similar to '--file' but use the given blob instead of a file. E.g.
+   you can use 'master:.gitmodules' to read values from the file
+   '.gitmodules' in the master branch. See SPECIFYING REVISIONS
+   section in linkgit:gitrevisions[7] for a more complete list of
+   ways to spell blob names.
+
 --remove-section::
Remove the given section from the configuration file.
 
diff --git a/builtin/config.c b/builtin/config.c
index 7759671..4010c43 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -21,6 +21,7 @@ static char term = '\n';
 
 static int use_global_config, use_system_config, use_local_config;
 static const char *given_config_file;
+static const char *given_config_blob;
 static int actions, types;
 static const char *get_color_slot, *get_colorbool_slot;
 static int end_null;
@@ -53,6 +54,7 @@ static struct option builtin_config_options[] = {
OPT_BOOLEAN(0, system, use_system_config, N_(use system config 
file)),
OPT_BOOLEAN(0, local, use_local_config, N_(use repository config 
file)),
OPT_STRING('f', file, given_config_file, N_(file), N_(use given 
config file)),
+   OPT_STRING(0, blob, given_config_blob, N_(blob-id), N_(read 
config from given blob object)),
OPT_GROUP(N_(Action)),
OPT_BIT(0, get, actions, N_(get value: name [value-regex]), 
ACTION_GET),
OPT_BIT(0, get-all, actions, N_(get all values: key 
[value-regex]), ACTION_GET_ALL),
@@ -218,7 +220,8 @@ static int get_value(const char *key_, const char *regex_)
}
 
git_config_with_options(collect_config, values,
-   given_config_file, respect_includes);
+   given_config_file, given_config_blob,
+   respect_includes);
 
ret = !values.nr;
 
@@ -302,7 +305,8 @@ static void get_color(const char *def_color)
get_color_found = 0;
parsed_color[0] = '\0';
git_config_with_options(git_get_color_config, NULL,
-   given_config_file, respect_includes);
+   given_config_file, given_config_blob,
+   respect_includes);
 
if (!get_color_found  def_color)
color_parse(def_color, command line, parsed_color);
@@ -331,7 +335,8 @@ static int get_colorbool(int print)
get_diff_color_found = -1;
get_color_ui_found = -1;
git_config_with_options(git_get_colorbool_config, NULL,
-   given_config_file, respect_includes);
+   given_config_file, given_config_blob,
+   respect_includes);
 
if (get_colorbool_found  0) {
if (!strcmp(get_colorbool_slot, color.diff))
@@ -353,6 +358,12 @@ static int get_colorbool(int print)
return get_colorbool_found ? 0 : 1;
 }
 
+static void check_blob_write(void)
+{
+   if (given_config_blob)
+   die(writing config blobs is not supported);
+}
+
 int cmd_config(int argc, const char **argv, const char *prefix)
 {
int nongit = !startup_info-have_repository;
@@ -364,7 +375,8 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
 builtin_config_usage,
 PARSE_OPT_STOP_AT_NON_OPTION);
 
-   if (use_global_config + use_system_config + use_local_config + 
!!given_config_file  1) {
+   if (use_global_config + use_system_config + use_local_config +
+   !!given_config_file + !!given_config_blob  1) {
error(only one config file at a time.);
usage_with_options(builtin_config_usage, 
builtin_config_options);
}
@@ -443,6 +455,7 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
check_argc(argc, 0, 0);
if (git_config_with_options(show_all_config, NULL,
given_config_file,
+   given_config_blob,
   

[PATCH v5 5/5] do not die when error in config parsing of buf occurs

2013-07-11 Thread Heiko Voigt
If a config parsing error in a file occurs we can die and let the user
fix the issue. This is different for the buf parsing function since it
can be used to parse blobs of .gitmodules files. If a parsing error
occurs here we should proceed since otherwise a database containing such
an error in a single revision could be rendered unusable.

Signed-off-by: Heiko Voigt hvo...@hvoigt.net
---
 config.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/config.c b/config.c
index 75f6ad9..e13a7b6 100644
--- a/config.c
+++ b/config.c
@@ -21,6 +21,7 @@ struct config_source {
} buf;
} u;
const char *name;
+   int die_on_error;
int linenr;
int eof;
struct strbuf value;
@@ -442,7 +443,10 @@ static int git_parse_source(config_fn_t fn, void *data)
if (get_value(fn, data, var)  0)
break;
}
-   die(bad config file line %d in %s, cf-linenr, cf-name);
+   if (cf-die_on_error)
+   die(bad config file line %d in %s, cf-linenr, cf-name);
+   else
+   return error(bad config file line %d in %s, cf-linenr, 
cf-name);
 }
 
 static int parse_unit_factor(const char *end, uintmax_t *val)
@@ -950,7 +954,7 @@ int git_default_config(const char *var, const char *value, 
void *dummy)
 }
 
 /*
- * All source specific fields in the union, name and the callbacks
+ * All source specific fields in the union, die_on_error, name and the 
callbacks
  * fgetc, ungetc, ftell of top need to be initialized before calling
  * this function.
  */
@@ -987,6 +991,7 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
 
top.u.file = f;
top.name = filename;
+   top.die_on_error = 1;
top.fgetc = config_file_fgetc;
top.ungetc = config_file_ungetc;
top.ftell = config_file_ftell;
@@ -1007,6 +1012,7 @@ int git_config_from_buf(config_fn_t fn, const char *name, 
const char *buf,
top.u.buf.len = len;
top.u.buf.pos = 0;
top.name = name;
+   top.die_on_error = 0;
top.fgetc = config_buf_fgetc;
top.ungetc = config_buf_ungetc;
top.ftell = config_buf_ftell;
-- 
1.8.3.2.773.gcfaae5b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-07-11 Thread Junio C Hamano
Matthijs Kooijman matth...@stdin.nl writes:

[administrivia: you seem to have mail-followup-to that points at you
and the list; is that really needed???]

 This happens when a client issues a fetch with a depth bigger or equal
 to the number of commits the server is ahead of the client.

Do you mean smaller (not bigger)?

 diff --git a/upload-pack.c b/upload-pack.c
 index 59f43d1..5885f33 100644
 --- a/upload-pack.c
 +++ b/upload-pack.c
 @@ -122,6 +122,14 @@ static int do_rev_list(int in, int out, void *user_data)
   if (prepare_revision_walk(revs))
   die(revision walk setup failed);
   mark_edges_uninteresting(revs.commits, revs, show_edge);
 + /* In case we create a new shallow root, make sure that all
 +  * we don't send over objects that the client already has just
 +  * because their have revisions are no longer reachable from
 +  * the shallow root. */
 + for (i = 0; i  have_obj.nr; i++) {
 + struct commit *commit = (struct commit 
 *)have_obj.objects[i].item;
 + mark_tree_uninteresting(commit-tree);
 + }

Hmph.

In your discussion (including the comment), you talk about shallow
root (I think that is the same as what we call shallow boundary),
but in this added block, there is nothing that checks CLIENT_SHALLOW
or SHALLOW flags to special case that.

Is it a good idea to unconditionally do this for all have
revisions?

Also there is another loop that iterates over have revisions just
above the precontext.  I wonder if this added code belongs in that
loop.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] config: add support for http.url.* settings

2013-07-11 Thread Junio C Hamano
Kyle J. McKay mack...@gmail.com writes:

 +static size_t http_option_max_matched_len[opt_max];
 +
  static int curl_ssl_verify = -1;
  static int curl_ssl_try;
  static const char *ssl_cert;
 @@ -141,34 +169,122 @@ static void process_curl_messages(void)
  }
  #endif
  
 +static size_t http_options_url_match_prefix(const char *url,
 + const char *url_prefix,
 + size_t url_prefix_len)
 +{
 + /*
 +  * url_prefix matches url if url_prefix is an exact match for url or it
 +  * is a prefix of url and the match ends on a path component boundary.
 +  * Both url and url_prefix are considered to have an implicit '/' on the
 +  * end for matching purposes if they do not already.
 +  *
 +  * url must be NUL terminated.  url_prefix_len is the length of
 +  * url_prefix which need not be NUL terminated.
 +  *
 +  * The return value is the length of the match in characters (excluding
 +  * any final '/') or 0 for no match.  Passing / as url_prefix will
 +  * always cause 0 to be returned.
 +  */
 + size_t url_len;
 + if (url_prefix_len  url_prefix[url_prefix_len - 1] == '/')
 + url_prefix_len--;
 + if (!url_prefix_len || strncmp(url, url_prefix, url_prefix_len))
 + return 0;

URL=http://a.b/c/; vs URLprefix=http://a.b/c/; would not return
here, because we only check up to the length of the prefix after
stripping the trailing slash from the prefix..

URL=http://a.b/cd; vs URLprefix=http://a.b/c/; would not return
here, because we only check up to the length of the prefix after
stripping the trailing slash from the prefix.

The latter needs to be rejected in the next condition.

 + url_len = strlen(url);
 + if ( (url_len == url_prefix_len)  ||

I thought your plan was that you wanted to treat URL=http://a.b/c/;
and URL=http://a.b/c; the same way; taking strlen(url) and using it
for comparison without adjusting for the trailing slash makes it
smell somewhat fishy...

 +  (url[url_prefix_len - 1] == '/') ||
 +  (url[url_prefix_len] == '/') )

The prefix side no longer has slash at the end, and we know url and
the prefix match up to the length of the prefix at this point.  Can
url[prefix-len - 1] ever be '/'?  The latter (e.g. one past the
prefix length can be '/' so that the prefix http://a.b/c; can
match against url http://a.b/c/anything;) makes sense, though.

It is not immediately obvious if this function is correct, at
least to me.

 + return url[url_prefix_len - 1] == '/'
 +? url_prefix_len - 1 : url_prefix_len;
 + return 0;
 +}
 +
 +static int check_matched_len(enum http_option_type opt, size_t matchlen)
 +{
 + /*
 +  * Check the last matched length of option opt against matchlen
 +  * and return true if the last matched length is larger (meaning
 +  * the config setting should be ignored).  Otherwise return false
 +  * and record matchlen as the last matched length of option opt.
 +  */
 + if (http_option_max_matched_len[opt]  matchlen)
 + return 1;

If you had

http.http://a.b/c.opt = A

in your ~/.gitconfig and then

http.http://a.b/c.opt = B

to override it for a particular repository's .git/config, then we
read ~/.gitconfig first, remembering http://a.b/c; has matched, and
then we read .git/config, and encounter the same URLprefix.  As the
comparision is done with , not =, this allows the latter to
override the former.

Which may deserve to be added to the comment, perhaps

... length is larger (meaning the config setting should be
ignored).  Upon seeing the _same_ key (i.e. new key is the
same length as the longest match so far) is not ignored, but
overrides the previous settings.

or something.  It also would be nice to have a test to make sure
this will not be broken in any future change.

Other than that, overall the change looks nice.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 0/5] allow more sources for config values

2013-07-11 Thread Junio C Hamano
Thanks.

The differences since the last round I see are these.  And I think
the series overall makes sense (I haven't look hard enough to pick
any nits yet, though).

Thanks, will queue.

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 9ae2508..f0e179e 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -118,6 +118,13 @@ See also FILES.
 --file config-file::
Use the given config file instead of the one specified by GIT_CONFIG.
 
+--blob blob::
+   Similar to '--file' but use the given blob instead of a file. E.g.
+   you can use 'master:.gitmodules' to read values from the file
+   '.gitmodules' in the master branch. See SPECIFYING REVISIONS
+   section in linkgit:gitrevisions[7] for a more complete list of
+   ways to spell blob names.
+
 --remove-section::
Remove the given section from the configuration file.
 
diff --git a/config.c b/config.c
index a8d3dcf..680dd6d 100644
--- a/config.c
+++ b/config.c
@@ -948,7 +948,7 @@ int git_default_config(const char *var, const char *value, 
void *dummy)
  * fgetc, ungetc, ftell of top need to be initialized before calling
  * this function.
  */
-static int do_config_from_source(struct config_source *top, config_fn_t fn, 
void *data)
+static int do_config_from(struct config_source *top, config_fn_t fn, void 
*data)
 {
int ret;
 
@@ -986,7 +986,7 @@ int git_config_from_file(config_fn_t fn, const char 
*filename, void *data)
top.ungetc = config_file_ungetc;
top.ftell = config_file_ftell;
 
-   ret = do_config_from_source(top, fn, data);
+   ret = do_config_from(top, fn, data);
 
fclose(f);
}
@@ -1007,7 +1007,7 @@ int git_config_from_buf(config_fn_t fn, const char *name, 
const char *buf,
top.ungetc = config_buf_ungetc;
top.ftell = config_buf_ftell;
 
-   return do_config_from_source(top, fn, data);
+   return do_config_from(top, fn, data);
 }
 
 static int git_config_from_blob_sha1(config_fn_t fn,
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What's cooking in git.git (Jul 2013, #04; Thu, 11)

2013-07-11 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to master]

* af/rebase-i-merge-options (2013-07-02) 1 commit
  (merged to 'next' on 2013-07-08 at f411975)
 + Do not ignore merge options in interactive rebase

 git rebase -i now honors --strategy and -X options.


* jc/maint-diff-core-safecrlf (2013-06-25) 1 commit
  (merged to 'next' on 2013-07-03 at db8a2a6)
 + diff: demote core.safecrlf=true to core.safecrlf=warn

 git diff refused to even show difference when core.safecrlf is
 set to true (i.e. error out) and there are offending lines in the
 working tree files.


* jc/t1512-fix (2013-07-01) 2 commits
  (merged to 'next' on 2013-07-09 at a6c62bb)
 + get_short_sha1(): correctly disambiguate type-limited abbreviation
 + t1512: correct leftover constants from earlier edition

 A test that should have failed but didn't revealed a bug that needs
 to be corrected.


* jc/triangle-push-fixup (2013-06-24) 5 commits
  (merged to 'next' on 2013-06-26 at 73cbb69)
 + t/t5528-push-default: test pushdefault workflows
 + t/t5528-push-default: generalize test_push_*
 + push: change `simple` to accommodate triangular workflows
 + config doc: rewrite push.default section
 + t/t5528-push-default: remove redundant test_config lines

 Earlier remote.pushdefault (and per-branch branch.*.pushremote)
 were introduced as an additional mechanism to choose what
 repository to push into when git push did not say it from the
 command line, to help people who push to a repository that is
 different from where they fetch from.  This attempts to finish that
 topic by teaching the default mechanism to choose branch in the
 remote repository to be updated by such a push.

 The 'current', 'matching' and 'nothing' modes (specified by the
 push.default configuration variable) extend to such a triangular
 workflow naturally, but 'upstream' and 'simple' have to be updated.

 . 'upstream' is about pushing back to update the branch in the
 remote repository that the current branch fetches from and
 integrates with, it errors out in a triangular workflow.

 . 'simple' is meant to help new people by avoiding mistakes, and
 will be the safe default in Git 2.0.  In a non-triangular
 workflow, it will continue to act as a cross between 'upstream'
 and 'current' in that it pushes to the current branch's
 @{upstream} only when it is set to the same name as the current
 branch (e.g. your 'master' forks from the 'master' from the
 central repository).  In a triangular workflow, this series
 tentatively defines it as the same as 'current', but we may have
 to tighten it to avoid surprises in some way.


* jg/status-config (2013-06-24) 4 commits
  (merged to 'next' on 2013-07-03 at 6ac1ada)
 + status/commit: make sure --porcelain is not affected by user-facing config
 + commit: make it work with status.short
 + status: introduce status.branch to enable --branch by default
 + status: introduce status.short to enable --short by default

 git status learned status.branch and status.short configuration
 variables to use --branch and --short options by default (override
 with --no-branch and --no-short options from the command line).

 The bottom two has been graduated to 'master' but then reverted.
 The tip two are quick attempts to fix the fallout.  The one for
 status.short looks correct; the other one, while it is correct,
 is unfortunately overly complex in order not to introduce an
 unnecessary regression.


* jk/bash-completion (2013-06-30) 2 commits
  (merged to 'next' on 2013-07-01 at 6daca44)
 + completion: learn about --man-path
 + completion: handle unstuck form of base git options


* mh/maint-lockfile-overflow (2013-07-07) 1 commit
  (merged to 'next' on 2013-07-09 at e1a0eac)
 + lockfile: fix buffer overflow in path handling

 Will merge later to 'maint'.


* pb/stash-refuse-to-kill (2013-07-01) 2 commits
  (merged to 'next' on 2013-07-05 at 78ecc59)
 + git stash: avoid data loss when git stash save kills a directory
 + treat_directory(): do not declare submodules to be untracked

 git stash save is not just about saving the local changes, but
 also is to restore the working tree state to that of HEAD. If you
 changed a non-directory into a directory in the local change, you
 may have untracked files in that directory, which have to be killed
 while doing so, unless you run it with --include-untracked.  Teach
 the command to detect and error out before spreading the damage.

 This needed a small fix to ls-files --killed.


* rr/rebase-checkout-reflog (2013-06-17) 5 commits
  (merged to 'next' on 2013-07-01 at 27cfd27)
 + checkout: respect GIT_REFLOG_ACTION
  (merged to 'next' on 2013-06-27 at 4d99efa)
 + status: do 

merging commit history

2013-07-11 Thread Stephen Linda Smith
I'm working on a project that used to use a proprietary CM system (aka oldCM).  
 At a point in time, the state of the code was frozen and used as the basis for 
commits in SVN.

What I would like to to do is take the individal commits from the oldCM and 
place them into git knowing that the time/date stamps won't match.  Then I want 
to do whatever is necessary to
setup git so that I can run svn rebase to pull in the commits from the SVN 
repository.

What is the easy way to do this?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: merging commit history

2013-07-11 Thread Andrew Ardill
On 12 July 2013 09:43, Stephen  Linda Smith isch...@cox.net wrote:
 I'm working on a project that used to use a proprietary CM system (aka 
 oldCM).   At a point in time, the state of the code was frozen and used as 
 the basis for commits in SVN.

 What I would like to to do is take the individal commits from the oldCM and 
 place them into git knowing that the time/date stamps won't match.  Then I 
 want to do whatever is necessary to
 setup git so that I can run svn rebase to pull in the commits from the SVN 
 repository.

 What is the easy way to do this?


There may be other tools that make this easier, but if I had this
problem I would simply create two repositories, one for oldCM and one
for SVN. I would then merge the two together (as branches with
different roots) and do my rebase from there.

I haven't tried this, and maybe there is something I am missing, but
there shouldn't be too much pain going that way.


Regards,

Andrew Ardill
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Eric Sunshine
On Thu, Jul 11, 2013 at 3:07 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 With the introduction of check-mailmap, it is now possible to check
 .mailmap functionality directly rather than indirectly as a side-effect
 of other commands (such as git-shortlog), therefore, do so.

 Does this patch mean that we will now ignore future breakages in
 shortlog and blame if their mailmap integration becomes buggy?

 I am not convinced it is a good idea if that is what is going on.

I meant to mention in the cover letter that this patch was open for
debate, however, it does not eliminate all testing of these other
commands.

The tests in which git-check-mailmap is substituted for git-shortlog
all worked against a simplistic two-commit repository. Those tests
were checking the low-level mailmap functionality under various
conditions and configurations; they were not especially checking any
particular behavior of git-shortlog.

There still remain a final eight tests which cover git-shortlog,
git-log, and git-blame. These tests do check mailmap-related behavior
of those commands, and they do so using a more involved seven-commit
repository with complex mappings, so we have not necessarily lost
any checks of mailmap integration for those commands.

Would this patch become more palatable if I stated the above in the
commit message?

-- ES
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Jonathan Nieder
Eric Sunshine wrote:
 On Thu, Jul 11, 2013 at 3:07 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 With the introduction of check-mailmap, it is now possible to check
 .mailmap functionality directly rather than indirectly as a side-effect
 of other commands (such as git-shortlog), therefore, do so.

 Does this patch mean that we will now ignore future breakages in
 shortlog and blame if their mailmap integration becomes buggy?

 I am not convinced it is a good idea if that is what is going on.

 I meant to mention in the cover letter that this patch was open for
 debate, however, it does not eliminate all testing of these other
 commands.

 The tests in which git-check-mailmap is substituted for git-shortlog
 all worked against a simplistic two-commit repository. Those tests
 were checking the low-level mailmap functionality under various
 conditions and configurations; they were not especially checking any
 particular behavior of git-shortlog.

 There still remain a final eight tests which cover git-shortlog,
 git-log, and git-blame. These tests do check mailmap-related behavior
 of those commands, and they do so using a more involved seven-commit
 repository with complex mappings, so we have not necessarily lost
 any checks of mailmap integration for those commands.

 Would this patch become more palatable if I stated the above in the
 commit message?

My current thinking is no --- the patch has as a justification Now
we can test these aspects of .mailmap handling directly with a
low-level tool instead of using the tool most people will use, so do
so, which sounds an awful lot like Reduce test coverage of commonly
used tools, because we can.

But I imagine the actual motivation was something other than because
we can.  For example, maybe the idea is that after this patch, it
should be easier to make cosmetic improvements to the shortlog, log,
and blame output and only have to change those final 8 tests that are
adequately covering the output?  If you have such plans and this patch
makes them easier, it could sound like a good patch as a stepping
stone toward that.

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Eric Sunshine
On Thu, Jul 11, 2013 at 8:55 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Eric Sunshine wrote:
 On Thu, Jul 11, 2013 at 3:07 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 With the introduction of check-mailmap, it is now possible to check
 .mailmap functionality directly rather than indirectly as a side-effect
 of other commands (such as git-shortlog), therefore, do so.

 Does this patch mean that we will now ignore future breakages in
 shortlog and blame if their mailmap integration becomes buggy?

 I am not convinced it is a good idea if that is what is going on.

 I meant to mention in the cover letter that this patch was open for
 debate, however, it does not eliminate all testing of these other
 commands.

 The tests in which git-check-mailmap is substituted for git-shortlog
 all worked against a simplistic two-commit repository. Those tests
 were checking the low-level mailmap functionality under various
 conditions and configurations; they were not especially checking any
 particular behavior of git-shortlog.

 There still remain a final eight tests which cover git-shortlog,
 git-log, and git-blame. These tests do check mailmap-related behavior
 of those commands, and they do so using a more involved seven-commit
 repository with complex mappings, so we have not necessarily lost
 any checks of mailmap integration for those commands.

 Would this patch become more palatable if I stated the above in the
 commit message?

 My current thinking is no --- the patch has as a justification Now
 we can test these aspects of .mailmap handling directly with a
 low-level tool instead of using the tool most people will use, so do
 so, which sounds an awful lot like Reduce test coverage of commonly
 used tools, because we can.

 But I imagine the actual motivation was something other than because
 we can.

The motivation is as stated in the subject: Direct rather than
indirect testing of mailmap functionality. The tests in question are
properly crafted to check only low-level mailmap (not git-shortlog)
functionality under various conditions and configurations, yet the
mental load on the reader is high because he has to keep in mind the
authors and commits in the repository underlying git-shortlog's
output. When testing via git-check-mailmap, mental load is low: there
is a direct correlation between the Name email@address which goes
in and that which comes out. What is being tested is the same, but
it's easier to reason about and understand the the tests when done
directly.

When converting the tests, I also discovered an additional, perhaps
more important motivation. Most of the tests check only that name
remapping functions correctly, however, a couple also attempt to check
address remapping (b...@company.xx = b...@company.xy). Even though
the tests succeed, they don't actually prove that address remapping
was correct (or occurred at all) since git-shortlog does not display
the email address. git-check-mailmap always displays the email
address, thus the check actually tests the intended email address
remapping.

 For example, maybe the idea is that after this patch, it
 should be easier to make cosmetic improvements to the shortlog, log,
 and blame output and only have to change those final 8 tests that are
 adequately covering the output?  If you have such plans and this patch
 makes them easier, it could sound like a good patch as a stepping
 stone toward that.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Eric Sunshine
On Thu, Jul 11, 2013 at 3:04 PM, Junio C Hamano gits...@pobox.com wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:

 +DESCRIPTION
 +---
 +
 +For each ``Name $$email@address$$'' or ``$$email@address$$'' provided on
 +the command-line or standard input (when using `--stdin`), prints a line 
 with
 +the canonical contact information for that person according to the 'mailmap'
 +(see Mapping Authors below). If no mapping exists for the person, echoes 
 the
 +provided contact information.

 OK.  The last part needed a reading and a half for me to realize the
 echoes the provided contact information is the same thing as the
 input string is printed as-is, especially because contact
 information is not defined anywhere in the previous part of the
 DESCRIPTION section, though.  I might be worth starting the
 paragraph with:

 For each contact information (either in the form of ``Name
 user@host'' or ...)

 in order to clarify that the two forms of input is what you call
 contact information.

Is this easier to read?

For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
command-line or standard input (when using `--stdin`), print a line
showing either the canonical name and email address (see Mapping
Authors below), or the input ``Name $$user@host$$'' or
``$$user@host$$'' if there is no mapping for that person.

 diff --git a/builtin/check-mailmap.c b/builtin/check-mailmap.c
 new file mode 100644
 index 000..c36a61c
 --- /dev/null
 +++ b/builtin/check-mailmap.c
 @@ -0,0 +1,69 @@
 +#include builtin.h
 +#include mailmap.h
 +#include parse-options.h
 +#include string-list.h
 +
 +static int use_stdin;
 +static int null_out;

 Is there a reason why the variable name should not match that of the
 corresponding variables in check-ignore and check-attr, which you
 copied the bulk of the code from?

 If there isn't, use null_term_line like they do.

In check-attr, null_term_line indicates that _input_ lines are
null-terminated. In check-ignore, null_term_lines is overloaded (and
perhaps abused) to mean that both _input_ and _output_ lines are
null-terminated. In check-mailmap, -z affects only _output_ lines. A
reader of the code can easily be misled into thinking that the
variable controls the same function in all three programs, hence
null_out was chosen to make it clear that it applies only to output. A
lesser reason is that, in the future, someone might add an option to
null terminate input lines (distinct from output), and null_in would
be a reasonable name for that variable.

Other than the above reasoning, I don't feel strongly about it and can
revert the name if you prefer.

 +static void check_mailmap(struct string_list *mailmap, const char *contact)
 +{
 + const char *name, *mail;
 + size_t namelen, maillen;
 + struct ident_split ident;
 + char term = null_out ? '\0' : '\n';

 Is there a reason why the variable name term should not match that
 of the corresponding variables in check-ignore and check-attr, which
 you copied the bulk of the code from?

 If there isn't, use line_termination like they do.

No strong justification. As with 'buf' vs. 'buffer',
'line_termination' required more reading effort without conveying any
more information than 'term'.

 + if (split_ident_line(ident, contact, strlen(contact)))
 + die(_(unable to parse contact: %s), contact);
 +
 + name = ident.name_begin;
 + namelen = ident.name_end - ident.name_begin;
 + mail = ident.mail_begin;
 + maillen = ident.mail_end - ident.mail_begin;
 +
 + map_user(mailmap, mail, maillen, name, namelen);
 +
 + if (namelen)
 + printf(%.*s , (int)namelen, name);
 + printf(%.*s%c, (int)maillen, mail, term);
 +}
 +
 +int cmd_check_mailmap(int argc, const char **argv, const char *prefix)
 +{
 + int i;
 + struct string_list mailmap = STRING_LIST_INIT_NODUP;
 +
 + git_config(git_default_config, NULL);
 + argc = parse_options(argc, argv, prefix, check_mailmap_options,
 + check_mailmap_usage, 0);

 It is a bit distracting that the second line that is not indented
 enough.  Either (1) with enough HT and with minimum number of SP,
 align argc and check_mailmap_usage, or (2) with minimum number
 of HT and no SP, push check_mailmap_usage to the right of opening
 parenthesis of (argc.  Our code tend to prefer (1).

Okay.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-clone.txt: remove the restriction on pushing from a shallow clone

2013-07-11 Thread Nguyễn Thái Ngọc Duy
The document says one cannot push from a shallow clone. But that is
not true (maybe it was at some point in the past). The client does not
stop such a push nor does it give any indication to the receiver that
this is a shallow push. If the receiver accepts it, it's in.

Since 52fed6e (receive-pack: check connectivity before concluding git
push - 2011-09-02), receive-pack is prepared to deal with broken
push, a shallow push can't cause any corruption. Update the document
to reflect that.

Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 Documentation/git-clone.txt | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 450f158..85769b8 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -182,11 +182,13 @@ objects from the source repository into a pack in the 
cloned repository.
 --depth depth::
Create a 'shallow' clone with a history truncated to the
specified number of revisions.  A shallow repository has a
-   number of limitations (you cannot clone or fetch from
-   it, nor push from nor into it), but is adequate if you
-   are only interested in the recent history of a large project
-   with a long history, and would want to send in fixes
-   as patches.
+   number of limitations (you cannot clone or fetch from it, nor
+   push into it), but is adequate if you are only interested in
+   the recent history of a large project with a long history.
++
+Pushing from a shallow clone should be avoided if the git version on
+the receiver end is older than v1.7.10, or any other git
+implementation that does not perform connectivity check.
 
 --[no-]single-branch::
Clone only the history leading to the tip of a single branch,
-- 
1.8.2.82.gc24b958

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/4] builtin: add git-check-mailmap command

2013-07-11 Thread Junio C Hamano
Eric Sunshine sunsh...@sunshineco.com writes:

 For each contact information (either in the form of ``Name
 user@host'' or ...)

 in order to clarify that the two forms of input is what you call
 contact information.

 Is this easier to read?

 For each ``Name $$user@host$$'' or ``$$user@host$$'' from the
 command-line or standard input (when using `--stdin`), print a line
 showing either the canonical name and email address (see Mapping
 Authors below), or the input ``Name $$user@host$$'' or
 ``$$user@host$$'' if there is no mapping for that person.

I find it easier than your original, but I do not know if you would
want to repeat the Name... or user@host at the end.  It does not
seem to add much useful information and is distracting.

 If there isn't, use null_term_line like they do.

 In check-attr, null_term_line indicates that _input_ lines are
 null-terminated. In check-ignore, null_term_lines is overloaded (and
 perhaps abused) to mean that both _input_ and _output_ lines are
 null-terminated.

That is unfortunate but it is good that you found the breakage.  As
we do not have --nul-terminated-input and --nul-terminated-output
options separtely, -z should apply to both input and output.  What
b4666852 (check-attr: Add --stdin option, 2008-10-07) did is broken.
What check-ignore does 

We should find a way to fix it.  I have a feeling that silently
fixing it and seeing if anybody screams might be the best course of
action ;-).

Also git check-ignore -h advertises -z as only affecting --stdin,
which is also wrong.  It does affect both input and output as it should,
so it should be described as such, I think.

Thanks for noticing.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/4] t4203: test mailmap functionality directly rather than indirectly

2013-07-11 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 My current thinking is no --- the patch has as a justification Now
 we can test these aspects of .mailmap handling directly with a
 low-level tool instead of using the tool most people will use, so do
 so, which sounds an awful lot like Reduce test coverage of commonly
 used tools, because we can.

Yes, that was exactly my reaction that prompted my response.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-clone.txt: remove the restriction on pushing from a shallow clone

2013-07-11 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 The document says one cannot push from a shallow clone. But that is
 not true (maybe it was at some point in the past). The client does not
 stop such a push nor does it give any indication to the receiver that
 this is a shallow push. If the receiver accepts it, it's in.

 Since 52fed6e (receive-pack: check connectivity before concluding git
 push - 2011-09-02), receive-pack is prepared to deal with broken
 push, a shallow push can't cause any corruption. Update the document
 to reflect that.

Err, what are these lines in builtin/receive-pack.c doing, then?

int cmd_receive_pack(int argc, const char **argv, const char *prefix)
{
...
setup_path();

if (!enter_repo(dir, 0))
die('%s' does not appear to be a git repository, dir);

if (is_repository_shallow())
die(attempt to push into a shallow repository);
...

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git-clone.txt: remove the restriction on pushing from a shallow clone

2013-07-11 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 + number of limitations (you cannot clone or fetch from it, nor
 + push into it), but is adequate if you are only interested in
 + the recent history of a large project with a long history.

Ahh, sorry for the noise.  You still say you cannot push _into_ it.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html