Re: Wishlist: git fetch --reference

2014-08-22 Thread Dennis Kaarsemaker
On do, 2014-08-21 at 19:57 -0700, Howard Chu wrote:
 I maintain multiple copies of the same repo because I keep each one checked 
 out to different branch/rev levels. It would be nice if, similar to clone 
 --reference, we could also use git fetch --reference to reference a local 
 repo 
 when doing a fetch to pull in updates.

Alternatively, you can use Duy's multiple-work-trees patches to safely
make multiple checkouts of one repository. These patches are in next.  
-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


IMPORTANT OFFER!!!

2014-08-22 Thread Febo, Madeline
IMPORTANT OFFER!!!
I, Liliane authenticate this email, you can read about me on:
http://en.wikipedia.org/wiki/Liliane_Bettencourt
I write to you because I intend to give to you a portion of my
Net-worth which I have been banking. I want to cede it out as gift
hoping it would be of help to you and others too. Respond for
confirmation to this email address: lilianebettenc...@aol.fr
With love,
Liliane Bettencourt
lilianebettenc...@aol.fr --
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


strange histories from filter-branch

2014-08-22 Thread Bernard Clark
I tried sending this mail to the git for humans mailing list but got
no response, so I'll try here.

I've been running the git filter-branch described on this page:

http://stackoverflow.com/questions/14759345/how-to-split-a-git-repository-and-follow-directory-renames.

But the resulting history includes at least two extraneous commits,
i.e., commits that appear to affect no files in the subdirectory of
interest. (The corresponding commits in the original history do affect
files, but those files are in another subdirectory.)

I also noticed several duplicate parent errors during the git
filter-branch. The page
http://stackoverflow.com/questions/15161809/git-duplicate-parent-causes-half-the-history-to-to-disappear
says that those errors can produce an incomplete new history, so I'm
now wondering if, in addition to including extraneous commits, my new
history might also be missing some. Both that page and
http://stackoverflow.com/questions/7489713/git-duplicate-parent/7501703#7501703
say that the duplicate parent errors should disappear if git
filter-branch is first run with no filter. But that didn't work for
me.

Any ideas?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v20 43/48] refs.c: move the check for valid refname to lock_ref_sha1_basic

2014-08-22 Thread Michael Haggerty
On 08/20/2014 11:47 PM, Ronnie Sahlberg wrote:
 [...]
 Since we already display broken/unresolvable refs, I think the most
 consistent path is to also allow showing the refs broken/illegal-names
 too in the list. (when DO_FOR_EACH_INCLUDE_BROKEN is specified)
 Of course, an end user could fix this by deleting the file but since
 it is easy to add the special casing to 'git branch -D' to handle this
 case I think this would be more userfriendly since then the user can
 use git branch -D regardless of the reason why the ref is broken.

My concern with this idea is that some code relies on at least some of
the reference name constraints for its proper functioning; for example,

* The ref caching code would likely be confused by ill-formed refnames
like refs/heads//foo or /refs/heads/foo or refs/heads/foo/.  (I
understand that such references cannot exist as loose refs, but they
could be represented in the packed-refs file.)

* Any code that might try to read or write a loose reference would
likely be confused by refs/heads//foo or refs/heads/./foo or
refs/heads/../foo or /refs/heads/foo or refs/heads/foo/.  On
Windows there might also be problems with refs/heads\foo or
d:refs/heads/foo or prn:refs/heads/foo or //refs/heads/foo.

* The locking code could easily be confused by a reference named
refs/heads/foo.lock.

So to the extent that we loosen the checks on refnames when they are
read, we would have to re-vet any code that touches them to make sure
that it doesn't break in a horrible (and possibly security-compromising)
way.  This is why I would prefer to quarantine broken reference names in
the smallest possible part of the code.

I *think* that the biggest problems would be related to reference names
that do not map straightforwardly to relative filenames, so an
alternative would be to do some minimal checks in any case, but make it
possible to turn off the stricter checks (those that mostly exist to
make reference expression parsing possible) when necessary.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] teach fast-export an --anonymize option

2014-08-22 Thread Duy Nguyen
On Fri, Aug 22, 2014 at 6:21 AM, Jeff King p...@peff.net wrote:
 -- 8 --
 Subject: teach fast-export an --anonymize option

 Sometimes users want to report a bug they experience on
 their repository, but they are not at liberty to share the
 contents of the repository. It would be useful if they could
 produce a repository that has a similar shape to its history
 and tree, but without leaking any information. This
 anonymized repository could then be shared with developers
 (assuming it still replicates the original problem).

This is cool. Thanks Jeff. Steven could you try this with the repo
that failed shallow clone --no-single-branch the other day?


 This patch implements an --anonymize option to
 fast-export, which generates a stream that can recreate such
 a repository. Producing a single stream makes it easy for
 the caller to verify that they are not leaking any useful
 information. You can get an overview of what will be shared
 by running a command like:

   git fast-export --anonymize --all |
   perl -pe 's/\d+/X/g' |
   sort -u |
   less

 which will show every unique line we generate, modulo any
 numbers (each anonymized token is assigned a number, like
 User 0, and we replace it consistently in the output).

 In addition to anonymizing, this produces test cases that
 are relatively small (compared to the original repository)
 and fast to generate (compared to using filter-branch, or
 modifying the output of fast-export yourself). Here are
 numbers for git.git:

   $ time git fast-export --anonymize --all \
  --tag-of-filtered-object=drop output
   real0m2.883s
   user0m2.828s
   sys 0m0.052s

   $ gzip output
   $ ls -lh output.gz | awk '{print $5}'
   2.9M

 Signed-off-by: Jeff King p...@peff.net
 ---
  Documentation/git-fast-export.txt |   6 +
  builtin/fast-export.c | 300 
 --
  t/t9351-fast-export-anonymize.sh  | 117 +++
  3 files changed, 412 insertions(+), 11 deletions(-)
  create mode 100755 t/t9351-fast-export-anonymize.sh

 diff --git a/Documentation/git-fast-export.txt 
 b/Documentation/git-fast-export.txt
 index 221506b..52831fa 100644
 --- a/Documentation/git-fast-export.txt
 +++ b/Documentation/git-fast-export.txt
 @@ -105,6 +105,12 @@ marks the same across runs.
 in the commit (as opposed to just listing the files which are
 different from the commit's first parent).

 +--anonymize::
 +   Replace all refnames, paths, blob contents, commit and tag
 +   messages, names, and email addresses in the output with
 +   anonymized data, while still retaining the shape of history and
 +   of the stored tree.
 +
  --refspec::
 Apply the specified refspec to each ref exported. Multiple of them can
 be specified.
 diff --git a/builtin/fast-export.c b/builtin/fast-export.c
 index 92b4624..b8182c2 100644
 --- a/builtin/fast-export.c
 +++ b/builtin/fast-export.c
 @@ -18,6 +18,7 @@
  #include parse-options.h
  #include quote.h
  #include remote.h
 +#include blob.h

  static const char *fast_export_usage[] = {
 N_(git fast-export [rev-list-opts]),
 @@ -34,6 +35,7 @@ static int full_tree;
  static struct string_list extra_refs = STRING_LIST_INIT_NODUP;
  static struct refspec *refspecs;
  static int refspecs_nr;
 +static int anonymize;

  static int parse_opt_signed_tag_mode(const struct option *opt,
  const char *arg, int unset)
 @@ -81,6 +83,76 @@ static int has_unshown_parent(struct commit *commit)
 return 0;
  }

 +struct anonymized_entry {
 +   struct hashmap_entry hash;
 +   const char *orig;
 +   size_t orig_len;
 +   const char *anon;
 +   size_t anon_len;
 +};
 +
 +static int anonymized_entry_cmp(const void *va, const void *vb,
 +   const void *data)
 +{
 +   const struct anonymized_entry *a = va, *b = vb;
 +   return a-orig_len != b-orig_len ||
 +   memcmp(a-orig, b-orig, a-orig_len);
 +}
 +
 +/*
 + * Basically keep a cache of X-Y so that we can repeatedly replace
 + * the same anonymized string with another. The actual generation
 + * is farmed out to the generate function.
 + */
 +static const void *anonymize_mem(struct hashmap *map,
 +void *(*generate)(const void *, size_t *),
 +const void *orig, size_t *len)
 +{
 +   struct anonymized_entry key, *ret;
 +
 +   if (!map-cmpfn)
 +   hashmap_init(map, anonymized_entry_cmp, 0);
 +
 +   hashmap_entry_init(key, memhash(orig, *len));
 +   key.orig = orig;
 +   key.orig_len = *len;
 +   ret = hashmap_get(map, key, NULL);
 +
 +   if (!ret) {
 +   ret = xmalloc(sizeof(*ret));
 +   hashmap_entry_init(ret-hash, key.hash.hash);
 +   ret-orig = xstrdup(orig);
 +   ret-orig_len = *len;
 +   ret-anon = generate(orig, len);
 +   

Re: Shallow clones with explicit history cutoff?

2014-08-22 Thread Duy Nguyen
On Thu, Aug 21, 2014 at 10:39 PM, Matthias Urlichs matth...@urlichs.de wrote:
 What I would like to have, instead, is a version of shallow cloning which
 cuts off not at a pre-determined depth, but at a given branch (or set of
 branches). In other words, given

 +-J--K  (packaged)
//
   +-F--G--HI(clean)
  /   /
 A---B---C---D---E   (upstream)

 a command git clone --shallow-until upstream $REPO (or however that would
 be named) would create a shallow git archive which contains branches
 packaged+clean, with commits FGHIJK. In contrast, with --single-branch and
 --depth 4 I would get CGHIJK, which isn't what I'd want.

I would imagine a more generic mechanism git clone
--shallow-rev=rev $REPO where you could pass anything that git
rev-list can accept (maybe more restricted, and some verification
required). --shallow-rev could be repeated. So in your case it could
be git clone --shallow-rev=^A $REPO. We could even maybe turn
--depth into a generic thing that is accepted by rev-list so that it
could be easily combined with other rev-list options (--shallow-rev
and --depth are mutually exclusive).

 As I have not spent too much time with the git sources lately (as in None
 at all), some pointers where to start implementing this would be
 appreciated, assuming (a) this has a reasonable chance of landing in git and
 (b) nobody beats me to it. ;-)

I'd like to see this implemented. You are not the first one
complaining about the (lack of) flexibility of --depth. If you have
time, I may be able to support (I should not take on another topic
given my many ongoing/unfinished topics). The starting point is
upload-pack.c. And GIT_TRACE env variable will be your friend. Search
for get_shallow_commits(). There the function is supposed to traverse
down from want_obj and set/unset SHALLOW/NOT_SHALLOW flags  properly.

SHALLOW flag should be set right before the cut-out commit (e.g. B and
F if you want to cut A out). NOT_SHALLOW flags could be used to remove
shallow lines in the receiver repo. If you traverse past an existing
shallow point in the client (this is the fetch/pull case, not clone),
then you should set NOT_SHALLOW so the client knows to remove that
point from their $GIT_DIR/shallow. Once you set these properly, the
rest should work.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-22 Thread Steffen Prohaska

On Aug 22, 2014, at 12:26 AM, Junio C Hamano gits...@pobox.com wrote:

 Steffen Prohaska proha...@zib.de writes:
 
 Similar to testing expectations about malloc with GIT_ALLOC_LIMIT (see
 commit d41489), it can be useful to test expectations about mmap.
 
 This introduces a new environment variable GIT_MMAP_LIMIT to limit the
 largest allowed mmap length (in KB).  xmmap() is modified to check the
 limit.  Together with GIT_ALLOC_LIMIT tests can now easily confirm
 expectations about memory consumption.
 
 GIT_ALLOC_LIMIT will be used in the next commit to test that data will
 
 I smell the need for s/ALLOC/MMAP/ here, but perhaps you did mean
 ALLOC (I won't know until I check 3/3 ;-)

You are right.


 diff --git a/sha1_file.c b/sha1_file.c
 index 00c07f2..88d64c0 100644
 --- a/sha1_file.c
 +++ b/sha1_file.c
 @@ -663,10 +663,25 @@ void release_pack_memory(size_t need)
  ; /* nothing */
 }
 
 +static void mmap_limit_check(size_t length)
 +{
 +static int limit = -1;
 
 Perhaps you want ssize_t here?  I see mmap() as a tool to handle a
 lot more data than a single malloc() typically would ;-) so previous
 mistakes by other people would not be a good excuse.

Maybe; and ...


 +if (limit == -1) {
 +const char *env = getenv(GIT_MMAP_LIMIT);
 +limit = env ? atoi(env) * 1024 : 0;

... this should then be changed to atol(env), and ... 


 +}
 +if (limit  length  limit)
 +die(attempting to mmap %PRIuMAX over limit %d,
 +(intmax_t)length, limit);

... here PRIuMAX and (uintmax_t); (uintmax_t) also for length.

I'll fix it and also GIT_ALLOC_LIMIT.

Steffen


 +}
 +
 void *xmmap(void *start, size_t length,
  int prot, int flags, int fd, off_t offset)
 {
 -void *ret = mmap(start, length, prot, flags, fd, offset);
 +void *ret;
 +
 +mmap_limit_check(length);
 +ret = mmap(start, length, prot, flags, fd, offset);
  if (ret == MAP_FAILED) {
  if (!length)
  return NULL;

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/4] Change GIT_ALLOC_LIMIT check to use ssize_t for consistency

2014-08-22 Thread Steffen Prohaska
GIT_ALLOC_LIMIT limits xmalloc()'s size, which is of type size_t.  Use
ssize_t to store the limit for consistency.  The change has no direct
practical impact, because we use GIT_ALLOC_LIMIT to test small sizes.

The cast of size in the call to die() is changed to uintmax_t to match
the format string PRIuMAX.

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 wrapper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/wrapper.c b/wrapper.c
index bc1bfb8..e91f6e9 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -11,14 +11,14 @@ static void (*try_to_free_routine)(size_t size) = 
do_nothing;
 
 static void memory_limit_check(size_t size)
 {
-   static int limit = -1;
+   static ssize_t limit = -1;
if (limit == -1) {
const char *env = getenv(GIT_ALLOC_LIMIT);
-   limit = env ? atoi(env) * 1024 : 0;
+   limit = env ? atol(env) * 1024 : 0;
}
if (limit  size  limit)
-   die(attempting to allocate %PRIuMAX over limit %d,
-   (intmax_t)size, limit);
+   die(attempting to allocate %PRIuMAX over limit %PRIuMAX,
+   (uintmax_t)size, (uintmax_t)limit);
 }
 
 try_to_free_t set_try_to_free_routine(try_to_free_t routine)
-- 
2.1.0.6.gb452461

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 1/4] convert: Refactor would_convert_to_git() to single arg 'path'

2014-08-22 Thread Steffen Prohaska
It is only the path that matters in the decision whether to filter or
not.  Clarify this by making path the single argument of
would_convert_to_git().

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 convert.h   | 5 ++---
 sha1_file.c | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/convert.h b/convert.h
index 0c2143c..c638b33 100644
--- a/convert.h
+++ b/convert.h
@@ -40,10 +40,9 @@ extern int convert_to_working_tree(const char *path, const 
char *src,
   size_t len, struct strbuf *dst);
 extern int renormalize_buffer(const char *path, const char *src, size_t len,
  struct strbuf *dst);
-static inline int would_convert_to_git(const char *path, const char *src,
-  size_t len, enum safe_crlf checksafe)
+static inline int would_convert_to_git(const char *path)
 {
-   return convert_to_git(path, src, len, NULL, checksafe);
+   return convert_to_git(path, NULL, 0, NULL, 0);
 }
 
 /*
diff --git a/sha1_file.c b/sha1_file.c
index 3f70b1d..00c07f2 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3144,7 +3144,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st,
if (!S_ISREG(st-st_mode))
ret = index_pipe(sha1, fd, type, path, flags);
else if (size = big_file_threshold || type != OBJ_BLOB ||
-(path  would_convert_to_git(path, NULL, 0, 0)))
+(path  would_convert_to_git(path)))
ret = index_core(sha1, fd, size, type, path, flags);
else
ret = index_stream(sha1, fd, size, type, path, flags);
-- 
2.1.0.6.gb452461

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/4] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-22 Thread Steffen Prohaska
Similar to testing expectations about malloc with GIT_ALLOC_LIMIT (see
commit d41489), it can be useful to test expectations about mmap.

This introduces a new environment variable GIT_MMAP_LIMIT to limit the
largest allowed mmap length (in KB).  xmmap() is modified to check the
limit.  Together with GIT_ALLOC_LIMIT tests can now easily confirm
expectations about memory consumption.

GIT_MMAP_LIMIT will be used in the next commit to test that data will be
streamed to an external filter without mmaping the entire file.

[commit d41489]: d41489a6424308dc9a0409bc2f6845aa08bd4f7d Add more large
blob test cases

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 sha1_file.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index 00c07f2..603673b 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -663,10 +663,25 @@ void release_pack_memory(size_t need)
; /* nothing */
 }
 
+static void mmap_limit_check(size_t length)
+{
+   static ssize_t limit = -1;
+   if (limit == -1) {
+   const char *env = getenv(GIT_MMAP_LIMIT);
+   limit = env ? atol(env) * 1024 : 0;
+   }
+   if (limit  length  limit)
+   die(attempting to mmap %PRIuMAX over limit %PRIuMAX,
+   (uintmax_t)length, (uintmax_t)limit);
+}
+
 void *xmmap(void *start, size_t length,
int prot, int flags, int fd, off_t offset)
 {
-   void *ret = mmap(start, length, prot, flags, fd, offset);
+   void *ret;
+
+   mmap_limit_check(length);
+   ret = mmap(start, length, prot, flags, fd, offset);
if (ret == MAP_FAILED) {
if (!length)
return NULL;
-- 
2.1.0.6.gb452461

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 0/4] Stream fd to clean filter, GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT

2014-08-22 Thread Steffen Prohaska
Only changes since PATCH v3: Use ssize_t to store limits.

Steffen Prohaska (4):
  convert: Refactor would_convert_to_git() to single arg 'path'
  Introduce GIT_MMAP_LIMIT to allow testing expected mmap size
  Change GIT_ALLOC_LIMIT check to use ssize_t for consistency
  convert: Stream from fd to required clean filter instead of mmap

 convert.c | 60 +--
 convert.h | 10 ++---
 sha1_file.c   | 46 ---
 t/t0021-conversion.sh | 24 -
 wrapper.c |  8 +++
 5 files changed, 127 insertions(+), 21 deletions(-)

-- 
2.1.0.6.gb452461

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 4/4] convert: Stream from fd to required clean filter instead of mmap

2014-08-22 Thread Steffen Prohaska
The data is streamed to the filter process anyway.  Better avoid mapping
the file if possible.  This is especially useful if a clean filter
reduces the size, for example if it computes a sha1 for binary data,
like git media.  The file size that the previous implementation could
handle was limited by the available address space; large files for
example could not be handled with (32-bit) msysgit.  The new
implementation can filter files of any size as long as the filter output
is small enough.

The new code path is only taken if the filter is required.  The filter
consumes data directly from the fd.  The original data is not available
to git, so it must fail if the filter fails.

The environment variable GIT_MMAP_LIMIT, which has been introduced in
the previous commit is used to test that the expected code path is
taken.  A related test that exercises required filters is modified to
verify that the data actually has been modified on its way from the file
system to the object store.

Signed-off-by: Steffen Prohaska proha...@zib.de
---
 convert.c | 60 +--
 convert.h |  5 +
 sha1_file.c   | 27 ++-
 t/t0021-conversion.sh | 24 -
 4 files changed, 104 insertions(+), 12 deletions(-)

diff --git a/convert.c b/convert.c
index cb5fbb4..463f6de 100644
--- a/convert.c
+++ b/convert.c
@@ -312,11 +312,12 @@ static int crlf_to_worktree(const char *path, const char 
*src, size_t len,
 struct filter_params {
const char *src;
unsigned long size;
+   int fd;
const char *cmd;
const char *path;
 };
 
-static int filter_buffer(int in, int out, void *data)
+static int filter_buffer_or_fd(int in, int out, void *data)
 {
/*
 * Spawn cmd and feed the buffer contents through its stdin.
@@ -325,6 +326,7 @@ static int filter_buffer(int in, int out, void *data)
struct filter_params *params = (struct filter_params *)data;
int write_err, status;
const char *argv[] = { NULL, NULL };
+   int fd;
 
/* apply % substitution to cmd */
struct strbuf cmd = STRBUF_INIT;
@@ -355,7 +357,17 @@ static int filter_buffer(int in, int out, void *data)
 
sigchain_push(SIGPIPE, SIG_IGN);
 
-   write_err = (write_in_full(child_process.in, params-src, params-size) 
 0);
+   if (params-src) {
+   write_err = (write_in_full(child_process.in, params-src, 
params-size)  0);
+   } else {
+   /* dup(), because copy_fd() closes the input fd. */
+   fd = dup(params-fd);
+   if (fd  0)
+   write_err = error(failed to dup file descriptor.);
+   else
+   write_err = copy_fd(fd, child_process.in);
+   }
+
if (close(child_process.in))
write_err = 1;
if (write_err)
@@ -371,7 +383,7 @@ static int filter_buffer(int in, int out, void *data)
return (write_err || status);
 }
 
-static int apply_filter(const char *path, const char *src, size_t len,
+static int apply_filter(const char *path, const char *src, size_t len, int fd,
 struct strbuf *dst, const char *cmd)
 {
/*
@@ -392,11 +404,12 @@ static int apply_filter(const char *path, const char 
*src, size_t len,
return 1;
 
memset(async, 0, sizeof(async));
-   async.proc = filter_buffer;
+   async.proc = filter_buffer_or_fd;
async.data = params;
async.out = -1;
params.src = src;
params.size = len;
+   params.fd = fd;
params.cmd = cmd;
params.path = path;
 
@@ -747,6 +760,24 @@ static void convert_attrs(struct conv_attrs *ca, const 
char *path)
}
 }
 
+int would_convert_to_git_filter_fd(const char *path)
+{
+   struct conv_attrs ca;
+
+   convert_attrs(ca, path);
+   if (!ca.drv)
+   return 0;
+
+   /* Apply a filter to an fd only if the filter is required to succeed.
+* We must die if the filter fails, because the original data before
+* filtering is not available.
+*/
+   if (!ca.drv-required)
+   return 0;
+
+   return apply_filter(path, NULL, 0, -1, NULL, ca.drv-clean);
+}
+
 int convert_to_git(const char *path, const char *src, size_t len,
struct strbuf *dst, enum safe_crlf checksafe)
 {
@@ -761,7 +792,7 @@ int convert_to_git(const char *path, const char *src, 
size_t len,
required = ca.drv-required;
}
 
-   ret |= apply_filter(path, src, len, dst, filter);
+   ret |= apply_filter(path, src, len, -1, dst, filter);
if (!ret  required)
die(%s: clean filter '%s' failed, path, ca.drv-name);
 
@@ -778,6 +809,23 @@ int convert_to_git(const char *path, const char *src, 
size_t len,
return ret | ident_to_git(path, src, len, dst, ca.ident);
 }
 
+void convert_to_git_filter_fd(const char *path, int fd, struct strbuf *dst,
+

Re: Shallow clones with explicit history cutoff?

2014-08-22 Thread Matthias Urlichs
Hi,

Duy Nguyen:
 On Thu, Aug 21, 2014 at 10:39 PM, Matthias Urlichs matth...@urlichs.de 
 wrote:
  What I would like to have, instead, is a version of shallow cloning which
  cuts off not at a pre-determined depth, but at a given branch (or set of
  branches). In other words, given
 
  +-J--K  (packaged)
 //
+-F--G--HI(clean)
   /   /
  A---B---C---D---E   (upstream)
 
  a command git clone --shallow-until upstream $REPO (or however that would
  be named) would create a shallow git archive which contains branches
  packaged+clean, with commits FGHIJK. In contrast, with --single-branch and
  --depth 4 I would get CGHIJK, which isn't what I'd want.
 
 I would imagine a more generic mechanism git clone
 --shallow-rev=rev $REPO where you could pass anything that git
 rev-list can accept (maybe more restricted, and some verification
 required). --shallow-rev could be repeated. So in your case it could
 be git clone --shallow-rev=^A $REPO.

Umm, no. ^E (or ^upstream) would do what I want. Hopefully. ;-)

But you're right, that would fit far better into the existing git
paradigms.

  As I have not spent too much time with the git sources lately (as in None
  at all), some pointers where to start implementing this would be
  appreciated, assuming (a) this has a reasonable chance of landing in git and
  (b) nobody beats me to it. ;-)
 
 I'd like to see this implemented. You are not the first one
 complaining about the (lack of) flexibility of --depth. If you have
 time, I may be able to support (I should not take on another topic
 given my many ongoing/unfinished topics).

Welcome to the club. :-/

Thanks for the pointers. I'll see what I can do (and when).

-- 
-- Matthias Urlichs
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Edward Thomson
Keep poll's timeout at -1 when uploadpack.keepalive = 0, instead of
setting it to -1000, since some pedantic old systems (eg HP-UX) and
the gnulib compat/poll will treat only -1 as the valid value for
an infinite timeout.

Signed-off-by: Edward Thomson ethom...@microsoft.com
---
 upload-pack.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/upload-pack.c b/upload-pack.c
index 01de944..433211a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -167,7 +167,9 @@ static void create_pack_file(void)
if (!pollsize)
break;
 
-   ret = poll(pfd, pollsize, 1000 * keepalive);
+   ret = poll(pfd, pollsize,
+   keepalive  0 ? -1 : 1000 * keepalive);
+
if (ret  0) {
if (errno != EINTR) {
error(poll failed, resuming: %s,
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Jonathan Nieder
Hi,

Michael Haggerty wrote[1]:
 Jonathan Nieder wrote:

 The check-ref-format documentation is pretty unclear, but the
 intent is that it would be used like

  git check-ref-format heads/master

 (see the surviving examples in contrib/examples/). That way, it can
 enforce the rule (from git-check-ref-format(1))

  2. They must contain at least one /. This enforces the presence
  of a category like heads/, tags/ etc. but the actual names
  are not restricted.
[...]
 Thanks for the explanation and the pointer.

 I suppose that was a usage that was more popular in the past than
 now. I can hardly remember anyone talking about references like
 heads/master or tags/v1. It seems to me that when somebody wants
 to be unambiguous they usually use the whole reference name
 refs/heads/master or refs/tags/v1. When they want to be succinct
 they usually use just master or v1.

 To me it seems incongruous to have the heads/master syntax
 supported at this deep level of plumbing. In most cases, the caller
 could get the same effect by prepending refs/ to the string and
 then calling check_refname_format with a new
 REFNAME_REQUIRE_CATEGORY option that requires both a refs/ prefix
 and a total of at least *three* levels.

I agree that this piece of UI is pretty weird.  Worse, it's
underdocumented.

I wonder if it would make sense to have the check-ref-format commandline
utility require two slashes when its argument begins with refs/ (still
requiring only one slash when it doesn't, for backward compatibility)
and to start encouraging people to pass refnames with refs/ to it.

The alternative would be something like the following, which doesn't
seem too appealing.

Signed-off-by: Jonathan Nieder jrnie...@gmail.com
---
[1] https://code-review.googlesource.com/1017

 Documentation/git-check-ref-format.txt | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/Documentation/git-check-ref-format.txt 
b/Documentation/git-check-ref-format.txt
index fc02959..447e9fb 100644
--- a/Documentation/git-check-ref-format.txt
+++ b/Documentation/git-check-ref-format.txt
@@ -15,8 +15,8 @@ SYNOPSIS
 
 DESCRIPTION
 ---
-Checks if a given 'refname' is acceptable, and exits with a non-zero
-status if it is not.
+Checks if refs/refname is an acceptable ref name, and exits
+with a non-zero status if it is not.
 
 A reference is used in Git to specify branches and tags.  A
 branch head is stored in the `refs/heads` hierarchy, while
@@ -91,14 +91,14 @@ OPTIONS
components).  The default is `--no-allow-onelevel`.
 
 --refspec-pattern::
-   Interpret refname as a reference name pattern for a refspec
-   (as used with remote repositories).  If this option is
-   enabled, refname is allowed to contain a single `*`
-   in place of a one full pathname component (e.g.,
-   `foo/*/bar` but not `foo/bar*`).
+   Interpret refs/refname as a reference name pattern
+   for a refspec (as used with remote repositories).
+   If this option is enabled, refname is allowed
+   to contain a single `*` in place of a one full pathname
+   component (e.g., `foo/*/bar` but not `foo/bar*`).
 
 --normalize::
-   Normalize 'refname' by removing any leading slash (`/`)
+   Normalize refname by removing any leading slash (`/`)
characters and collapsing runs of adjacent slashes between
name components into a single slash.  Iff the normalized
refname is valid then print it to standard output and exit
@@ -118,7 +118,7 @@ $ git check-ref-format --branch @{-1}
 * Determine the reference name to use for a new branch:
 +
 
-$ ref=$(git check-ref-format --normalize refs/heads/$newbranch) ||
+$ ref=$(git check-ref-format --normalize heads/$newbranch) ||
 die we do not like '$newbranch' as a branch name.
 
 
-- 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Jeff King
On Fri, Aug 22, 2014 at 03:19:11PM +, Edward Thomson wrote:

 Keep poll's timeout at -1 when uploadpack.keepalive = 0, instead of
 setting it to -1000, since some pedantic old systems (eg HP-UX) and
 the gnulib compat/poll will treat only -1 as the valid value for
 an infinite timeout.

That makes sense, and POSIX only specifies the behavior for -1 anyway.
The patch itself looks obviously correct. Thanks.

Since we're now translating the keepalive value, and since there's no
way to set it to 0 (nor would that really have any meaning), I guess
we could switch the internal no keepalive value to 0, and do:

  ret = poll(pfd, pollsize, keepalive ? 1000 * keepalive : -1);

which would let us avoid setting it to -1 in some other spots.  I dunno
if that actually makes a real difference to maintainability, though.
Either way:

  Acked-by: Jeff King p...@peff.net

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Since we're now translating the keepalive value, and since there's no
 way to set it to 0 (nor would that really have any meaning), I guess
 we could switch the internal no keepalive value to 0, and do:

   ret = poll(pfd, pollsize, keepalive ? 1000 * keepalive : -1);

 which would let us avoid setting it to -1 in some other spots.  I dunno
 if that actually makes a real difference to maintainability, though.

Where we parse and set the value of the variable, we do this:

else if (!strcmp(uploadpack.keepalive, var)) {
keepalive = git_config_int(var, value);
if (!keepalive)
keepalive = -1;
}

The condition may have to become if (keepalive = 0).

 Either way:

   Acked-by: Jeff King p...@peff.net

 -Peff

Yeah, either way, the patch as-posted is good.  Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-22 Thread Junio C Hamano
Steffen Prohaska proha...@zib.de writes:

 +   if (limit == -1) {
 +   const char *env = getenv(GIT_MMAP_LIMIT);
 +   limit = env ? atoi(env) * 1024 : 0;

 ... this should then be changed to atol(env), and ... 

In the real codepath (not debugging aid like this) we try to avoid
atoi/atol so that we can catch errors like feeding 123Q and
parsing it as 123.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Jeff King
On Fri, Aug 22, 2014 at 08:56:12AM -0700, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  Since we're now translating the keepalive value, and since there's no
  way to set it to 0 (nor would that really have any meaning), I guess
  we could switch the internal no keepalive value to 0, and do:
 
ret = poll(pfd, pollsize, keepalive ? 1000 * keepalive : -1);
 
  which would let us avoid setting it to -1 in some other spots.  I dunno
  if that actually makes a real difference to maintainability, though.
 
 Where we parse and set the value of the variable, we do this:
 
   else if (!strcmp(uploadpack.keepalive, var)) {
   keepalive = git_config_int(var, value);
   if (!keepalive)
   keepalive = -1;
   }
 
 The condition may have to become if (keepalive = 0).

Yeah, I wasn't thinking we would get negative values from the user (we
don't document them at all), but we should probably do something
sensible. Let's just leave it at Ed's patch.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Relative submodule URLs

2014-08-22 Thread Marc Branchaud
On 14-08-19 12:07 PM, Robert Dailey wrote:
 On Mon, Aug 18, 2014 at 3:55 PM, Jonathan Nieder jrnie...@gmail.com wrote:
 Thanks for reporting.  The remote used is the default remote that git
 fetch without further arguments would use:

 get_default_remote () {
 curr_branch=$(git symbolic-ref -q HEAD)
 curr_branch=${curr_branch#refs/heads/}
 origin=$(git config --get branch.$curr_branch.remote)
 echo ${origin:-origin}
 }

 The documentation is wrong.  git-fetch(1) doesn't provide a name for
 this thing.  Any ideas for wording?
 
 I guess a good start would be to call it the tracked remote instead
 of remote origin. The word tracked here makes it very obvious that
 if I have a remote tracking branch setup, it will use the remote
 portion of that configuration.
 
 The real question is, how will `git submodule update` function if a
 tracking remote is not configured? Will it fail with some useful error
 message? I don't like the idea of it defaulting to self remote mode,
 where it will be relative to my repo directory. That seems like it
 would fail is subtle ways in a worst-case scenario (if I did by
 happenstance have a bare repository cloned up one directory level for
 other reasons).
 
 Currently there isn't, short of reconfiguring the remote used by
 default by git fetch.
 
 I wish there was a way to specify the remote on a per-branch basis
 separately from the tracking branch. I read a while back that someone
 proposed some changes to git to support decentralized tracking
 (concept of an upstream tracking branch and a separate one for origin,
 i think). If that were implemented, then relative submodules could
 utilize the 'upstream' remote by default for each branch, which would
 provide more predictable default behavior. Right now most people on my
 team would not be aware that if they tracked a branch on their fork,
 they would subsequently need to fork the submodules to that same
 remote.
 
 Various co-workers use the remote named central instead of
 upstream and fork instead of origin (because that just makes
 more sense to them and it's perfectly valid).

 However if relative submodules require 'origin' to exist AND also
 represent the upstream repository (in triangle workflow), then this
 breaks on several levels.

 Can you explain further?  In a triangle workflow, git fetch will
 pull from the 'origin' remote by default and will push to the remote
 named in the '[remote] pushdefault' setting (see remote.pushdefault
 in git-config(1)).  So you can do

 [remote]
 pushDefault = whereishouldpush

 and then 'git fetch' and 'git fetch --recurse-submodules' will fetch
 from origin and 'git push' will push to the whereishouldpush remote.
 
 I didn't know about this option, seems useful.
 
 A common workflow that we use on my team is to set the tracking branch
 to 'upstream' for convenient pulls with rebase. This means a feature
 branch of mine can track its parent branch on 'upstream', so that when
 other pull requests get merged in on the remote repo branch, I can
 just do `git pull` and my feature branch rebases onto the latest of
 that parent branch.
 
 Cases like these would work with relative submodules because
 'upstream' is the tracked remote (and most of the time we don't want
 to fork submodules). However sometimes I like to track the same pushed
 branch on origin (my fork), especially when it is up for pull request.
 In these cases, my submodule update will fail because I didn't fork my
 submodules when I changed my tracking branch. Is this correct?
 
 breaks on several levels was basically my way of saying that various
 workflow choices will break when you introduce submodules. One of the
 beautiful things about Git is that it allows everyone to choose their
 own workflow. But submodules seem to prevent that to some degree. I
 think addressing relative submodule usability issues is the best
 approach for the long term as they feel more sustainable and scalable.
 It's an absolute pain to move a submodule URL, I think we've all
 experienced it. It's even harder for me because I'm the go-to at work
 for help with Git. Most people that aren't advanced with Git will not
 know what to do without a ton of reading  such.
 
 It might make sense to introduce a new

 [remote]
 default = whereishouldfetch

 setting to allow the name origin above to be replaced, too.  Is that
 what you mean?

A couple of years ago I started to work on such a thing ([1] [2] [3]), mainly
because when we tried to change to relative submodules we got bitten when
someone used clone's -o option so that his super-repo had no origin remote
*and* his was checked out on a detached HEAD.  So get_default_remote() failed
for him.

I didn't have time to complete the work -- it ended up being quite involved.
 But Junio did come up with an excellent transition plan [4] for adopting a
default remote setting.

[1] 

Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Edward Thomson
On Fri, Aug 22, 2014 at 12:03:34PM -0400, Jeff King wrote:
 
 Yeah, I wasn't thinking we would get negative values from the user (we
 don't document them at all), but we should probably do something
 sensible. Let's just leave it at Ed's patch.

Thanks, both.  Apologies for the dumb question: is there anything
additional that I need to do (repost with your Acked-by, for example)
or is this adequate as-is?

Thanks-
-ed
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] Introduce GIT_MMAP_LIMIT to allow testing expected mmap size

2014-08-22 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Steffen Prohaska proha...@zib.de writes:

 +  if (limit == -1) {
 +  const char *env = getenv(GIT_MMAP_LIMIT);
 +  limit = env ? atoi(env) * 1024 : 0;

 ... this should then be changed to atol(env), and ... 

 In the real codepath (not debugging aid like this) we try to avoid
 atoi/atol so that we can catch errors like feeding 123Q and
 parsing it as 123.

Sorry for hitting SEND by mistake before finishing the paragraph,
which should have concluded with:

But it is OK to be loose in an debugging aid.  If I were doing
this code, I actually would call git_parse_ulong() and not
define it in terms of kilobytes, though.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 18/18] signed push: final protocol update

2014-08-22 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 There are a few gotchas I can certainly use help on, especially from
 a smart-http expert ;-).

  * pushed-to URL will identify the site and the repository, so
you cannot MITM my push to an experimental server and replay it
against the authoritative server.

However, the receiving end may not even know what name its users
call the repository being pushed into.  Obviously gethostname()
may not be what the pusher called us, and getcwd() may not match
the repository name without leading /var/repos/shard3/ path
components stripped, for example.

I am not sure if we even have the necessary information at
send-pack.c::send_pack() level, where it already has an
established connection to the server (hence it does not need to
know to whom it is talking to).


  * The receiving end will issue push-cert=nonce in its initial
capability advertisement, and this nonce will be given on the
PUSH_CERT_NONCE environment to the pre/post-receive hooks, to
allow the nonce nonce header in the signed certificate to be
checked against it.  You cannot capture my an earlier push to the
authoritative server and replay it later.

That would all work well within a single receive-pack process,
but with stateless RPC, it is unclear to me how we should
arrange the nonce the initial instance of receive-pack placed
on its capability advertisement to be securely passed to the
instance of receive-pack that actually receives the push
certificate.

A good nonce may be something like taking the SHA-1 hash of the
concatenation of the sitename, repo-path and the timestamp when the
receive-pack generated the nonce.  Replaying a push certificate
for a push to a repository at a site that gives such a nonce can
succeed at the same chance of finding a SHA-1 collision [*1*].  As
long as you exercise good hygiene and only push to repositories that
give such nonce, we can do without checking pushed-to that says
where the push went.

So nonce nonce is the only thing that is necessary to make them
impossible to replay.  For auditing purposes, pushed-to URL that
records the repository the pusher intended to push to may help but
probably not necessary [*2*].


[Footnote]

*1* And the old-sha1s recorded in the certificate has to match what
the repository being attacked currently has; otherwise the push
will fail with the ref moved while you were trying to push.

*2* When auditing the history for a repository at a site, the
certificate the auditors examine would be the ones accumulated
at that site for the repository, so we would implicitly know the
value for URL already.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/4] Stream fd to clean filter, GIT_MMAP_LIMIT, GIT_ALLOC_LIMIT

2014-08-22 Thread Junio C Hamano
Thanks; will replace what has been on 'pu' with this one with some
tweaks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Fri, Aug 22, 2014 at 03:19:11PM +, Edward Thomson wrote:

 Keep poll's timeout at -1 when uploadpack.keepalive = 0, instead of
 setting it to -1000, since some pedantic old systems (eg HP-UX) and
 the gnulib compat/poll will treat only -1 as the valid value for
 an infinite timeout.

 That makes sense, and POSIX only specifies the behavior for -1 anyway.
 The patch itself looks obviously correct. Thanks.

 Since we're now translating the keepalive value, and since there's no
 way to set it to 0 (nor would that really have any meaning), I guess
 we could switch the internal no keepalive value to 0, and do:

   ret = poll(pfd, pollsize, keepalive ? 1000 * keepalive : -1);

 which would let us avoid setting it to -1 in some other spots.  I dunno
 if that actually makes a real difference to maintainability, though.
 Either way:

   Acked-by: Jeff King p...@peff.net

 -Peff

There is 1000 * wakeup in credential-cache--daemon.c, by the way.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 There is 1000 * wakeup in credential-cache--daemon.c, by the way.

Ah, nevermind.  That uses an expiration computed, not some we can
choose to block indefinitely configuration.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] upload-pack: keep poll(2)'s timeout to -1

2014-08-22 Thread Junio C Hamano
Edward Thomson ethom...@edwardthomson.com writes:

 On Fri, Aug 22, 2014 at 12:03:34PM -0400, Jeff King wrote:
 
 Yeah, I wasn't thinking we would get negative values from the user (we
 don't document them at all), but we should probably do something
 sensible. Let's just leave it at Ed's patch.

 Thanks, both.  Apologies for the dumb question: is there anything
 additional that I need to do (repost with your Acked-by, for example)
 or is this adequate as-is?

I've picked it up and queued it on 'pu'.  Thanks.

commit 6c71f8b0d3d39beffe050f92f33a25dc30dffca3
Author: Edward Thomson ethom...@edwardthomson.com
Date:   Fri Aug 22 15:19:11 2014 +

upload-pack: keep poll(2)'s timeout to -1

Keep poll's timeout at -1 when uploadpack.keepalive = 0, instead of
setting it to -1000, since some pedantic old systems (eg HP-UX) and
the gnulib compat/poll will treat only -1 as the valid value for
an infinite timeout.

Signed-off-by: Edward Thomson ethom...@microsoft.com
Acked-by: Jeff King p...@peff.net
Signed-off-by: Junio C Hamano gits...@pobox.com
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Junio C Hamano
Jonathan Nieder jrnie...@gmail.com writes:

 Michael Haggerty wrote[1]:
 Jonathan Nieder wrote:

 The check-ref-format documentation is pretty unclear, but the
 intent is that it would be used like

 git check-ref-format heads/master

 (see the surviving examples in contrib/examples/). That way, it can
 enforce the rule (from git-check-ref-format(1))

 2. They must contain at least one /. This enforces the presence
 of a category like heads/, tags/ etc. but the actual names
 are not restricted.
 [...]
 Thanks for the explanation and the pointer.

I wanted to follow this discussion, especially the ellided [...]
pointer, but had a hart time finding what pointer was.

Anyway, the true origin of ONELEVEL as far as I recall was to give
us a way to say in this code path, we also expect to be fed HEAD,
ORIG_HEAD, etc., so please do not subject us to the 'at least one
slash' rule., implication of which is that the 'at least one slash'
rule was to expect things are 'refs/anything' so there will be at
least one.  Even back then, that anything alone had at least one
slash (e.g. heads/master), but the intention was *never* that we
would forbid anything that does not have a slash by feeding
anything part alone to check-ref-format, i.e. things like
refs/stash were designed to be allowed.

That the function does not reject what does not begin with refs/
when ONELEVEL is not in effect is just being loose.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] teach fast-export an --anonymize option

2014-08-22 Thread Philip Oakley

From: Jeff King p...@peff.net: Friday, August 22, 2014 12:21 AM

On Thu, Aug 21, 2014 at 06:49:10PM -0400, Jeff King wrote:


The few things I don't anonymize are:

  1. ref prefixes. We see the same distribution of refs/heads vs
 refs/tags, etc.

  2. refs/heads/master is left untouched, for convenience (and 
because
 it's not really a secret). The implementation is lazy, though, 
and
 would leave refs/heads/master-supersecret, as well. I can 
tighten

 that if we really want to be careful.

  3. gitlinks are left untouched, since sha1s cannot be reversed. 
This

 could leak some information (if your private repo points to a
 public, I can find out you have it as submodule). I doubt it
 matters, but we can also scramble the sha1s.


Here's a re-roll that addresses the latter two. I don't think any are 
a
big deal, but it's much easier to say it's handled than try to 
figure

out whether and when it's important.

This also includes the documentation update I sent earlier. The
interdiff is a bit noisy, as I also converted the anonymize_mem 
function
to take void pointers (since it doesn't know or care what it's 
storing,

and this makes storing unsigned chars for sha1s easier).



Just a bit of bikeshedding for future improvements..

The .gitignore is another potential user problem area that may benefit 
form not being anonymised when problems strike. For example, there's a 
current problem on the git-users list 
https://groups.google.com/forum/#!topic/git-users/JJFIEsI5HRQ about git 
clean vs git status re .gitignore, which would then also beg questions 
about retaining file extensions/suffixes (.txt, .o, .c, etc).


I've had a similar problem with an over zealous file compare routine 
where the same too much vs too little was an issue.


One thought is that the user should be able to, as an option, select the 
number of initial characters retained from filenames, and similarly, the 
option to retain the file extension, and possibly directory names, such 
that the full .gitignore still works in most cases, and the sort order 
works (as far as it goes on number of characters).


All things for future improvers to consider.

Philip


-- 8 --
Subject: teach fast-export an --anonymize option

Sometimes users want to report a bug they experience on
their repository, but they are not at liberty to share the
contents of the repository. It would be useful if they could
produce a repository that has a similar shape to its history
and tree, but without leaking any information. This
anonymized repository could then be shared with developers
(assuming it still replicates the original problem).

This patch implements an --anonymize option to
fast-export, which generates a stream that can recreate such
a repository. Producing a single stream makes it easy for
the caller to verify that they are not leaking any useful
information. You can get an overview of what will be shared
by running a command like:

 git fast-export --anonymize --all |
 perl -pe 's/\d+/X/g' |
 sort -u |
 less

which will show every unique line we generate, modulo any
numbers (each anonymized token is assigned a number, like
User 0, and we replace it consistently in the output).

In addition to anonymizing, this produces test cases that
are relatively small (compared to the original repository)
and fast to generate (compared to using filter-branch, or
modifying the output of fast-export yourself). Here are
numbers for git.git:

 $ time git fast-export --anonymize --all \
--tag-of-filtered-object=drop output
 real0m2.883s
 user0m2.828s
 sys 0m0.052s

 $ gzip output
 $ ls -lh output.gz | awk '{print $5}'
 2.9M

Signed-off-by: Jeff King p...@peff.net
---
[...] 


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Jonathan Nieder
Junio C Hamano wrote:

implication of which is that the 'at least one slash'
 rule was to expect things are 'refs/anything' so there will be at
 least one.  Even back then, that anything alone had at least one
 slash (e.g. heads/master), but the intention was *never* that we
 would forbid anything that does not have a slash by feeding
 anything part alone to check-ref-format, i.e. things like
 refs/stash were designed to be allowed.

Now I'm more confused.  Until 5f7b202a (2008-01-01), there was a
comment

if (level  2)
return -2; /* at least of form heads/blah */

and that behavior has been preserved since the beginning.

Why do most old callers pass a string that doesn't start with refs/
(e.g., see the callers in 03feddd6, 2005-10-13)?  Has the intent been
to relax the requirement since then?

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/3] Push the NATIVE_CRLF Makefile variable to C and added a test for native.

2014-08-22 Thread Torsten Bögershausen
Commit 95f31e9a correctly points out that the NATIVE_CRLF setting is
incorrectly set on Mingw git. However, the Makefile variable is not
propagated to the C preprocessor and results in no change. This patch
pushes the definition to the C code and adds a test to validate that
when core.eol as native is crlf, we actually normalize text files to this
line ending convention when core.autocrlf is false.

Signed-off-by: Pat Thoyts pattho...@users.sourceforge.net
Signed-off-by: Stepan Kasal ka...@ucw.cz
Signed-off-by: Torsten Bögershausen tbo...@web.de
---

This mini series mainly updates git.git with patches from msysgit:
Patch 1 is taken as is,
Patch 2 is taken as is,
and Patch 3 is the outcome of the code-review 

Thanks for careful reading


 Makefile  |  3 +++
 t/t0026-eol-config.sh | 18 ++
 2 files changed, 21 insertions(+)

diff --git a/Makefile b/Makefile
index 2320de5..517036e 100644
--- a/Makefile
+++ b/Makefile
@@ -1479,6 +1479,9 @@ ifdef NO_REGEX
COMPAT_CFLAGS += -Icompat/regex
COMPAT_OBJS += compat/regex/regex.o
 endif
+ifdef NATIVE_CRLF
+   BASIC_CFLAGS += -DNATIVE_CRLF
+endif
 
 ifdef USE_NED_ALLOCATOR
COMPAT_CFLAGS += -Icompat/nedmalloc
diff --git a/t/t0026-eol-config.sh b/t/t0026-eol-config.sh
index 4807b0f..43a580a 100755
--- a/t/t0026-eol-config.sh
+++ b/t/t0026-eol-config.sh
@@ -80,4 +80,22 @@ test_expect_success 'autocrlf=true overrides unset eol' '
test -z $onediff  test -z $twodiff
 '
 
+test_expect_success NATIVE_CRLF 'eol native is crlf' '
+
+   rm -rf native_eol  mkdir native_eol 
+   ( cd native_eol 
+   printf *.txt text\n  .gitattributes
+   printf one\r\ntwo\r\nthree\r\n  filedos.txt
+   printf one\ntwo\nthree\n  fileunix.txt
+   git init 
+   git config core.autocrlf false 
+   git config core.eol native 
+   git add filedos.txt fileunix.txt 
+   git commit -m first 
+   rm file*.txt 
+   git reset --hard HEAD 
+   has_cr filedos.txt  has_cr fileunix.txt
+   )
+'
+
 test_done
-- 
2.1.0.rc2.210.g636bceb


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] t0026: Add missing

2014-08-22 Thread Torsten Bögershausen
Fix the broken  chain

Reported-By: Eric Sunshine sunsh...@sunshineco.com
Signed-off-by: Torsten Bögershausen tbo...@web.de
---
 t/t0026-eol-config.sh | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/t/t0026-eol-config.sh b/t/t0026-eol-config.sh
index 43a580a..4806969 100755
--- a/t/t0026-eol-config.sh
+++ b/t/t0026-eol-config.sh
@@ -84,9 +84,9 @@ test_expect_success NATIVE_CRLF 'eol native is crlf' '
 
rm -rf native_eol  mkdir native_eol 
( cd native_eol 
-   printf *.txt text\n  .gitattributes
-   printf one\r\ntwo\r\nthree\r\n  filedos.txt
-   printf one\ntwo\nthree\n  fileunix.txt
+   printf *.txt text\n  .gitattributes 
+   printf one\r\ntwo\r\nthree\r\n  filedos.txt 
+   printf one\ntwo\nthree\n  fileunix.txt 
git init 
git config core.autocrlf false 
git config core.eol native 
-- 
2.1.0.rc2.210.g636bceb

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/3] MinGW: Update tests to handle a native eol of crlf

2014-08-22 Thread Torsten Bögershausen
Some of the tests were written with the assumption that the native eol would
always be lf. After defining NATIVE_CRLF on MinGW, these tests began failing.
This change will update the tests to also handle a native eol of crlf.

Signed-off-by: Brice Lambson brice...@live.com
Signed-off-by: Torsten Bögershausen tbo...@web.de
---
 t/t6038-merge-text-auto.sh | 54 +-
 t/test-lib.sh  |  1 +
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/t/t6038-merge-text-auto.sh b/t/t6038-merge-text-auto.sh
index d9c2d38..85c10b0 100755
--- a/t/t6038-merge-text-auto.sh
+++ b/t/t6038-merge-text-auto.sh
@@ -72,6 +72,10 @@ test_expect_success 'Merge after setting text=auto' '
same line
EOF
 
+   if test_have_prereq NATIVE_CRLF; then
+   append_cr expected expected.temp 
+   mv expected.temp expected
+   fi 
git config merge.renormalize true 
git rm -fr . 
rm -f .gitattributes 
@@ -86,6 +90,10 @@ test_expect_success 'Merge addition of text=auto' '
same line
EOF
 
+   if test_have_prereq NATIVE_CRLF; then
+   append_cr expected expected.temp 
+   mv expected.temp expected
+   fi 
git config merge.renormalize true 
git rm -fr . 
rm -f .gitattributes 
@@ -95,16 +103,19 @@ test_expect_success 'Merge addition of text=auto' '
 '
 
 test_expect_success 'Detect CRLF/LF conflict after setting text=auto' '
-   q_to_cr -\EOF expected 
-   
-   first line
-   same line
-   ===
-   first lineQ
-   same lineQ
-   
-   EOF
-
+   echo  expected 
+   if test_have_prereq NATIVE_CRLF; then
+   echo first line | append_cr expected 
+   echo same line | append_cr expected 
+   echo === | append_cr expected
+   else
+   echo first line expected 
+   echo same line expected 
+   echo === expected
+   fi 
+   echo first line | append_cr expected 
+   echo same line | append_cr expected 
+   echo  expected 
git config merge.renormalize false 
rm -f .gitattributes 
git reset --hard a 
@@ -114,16 +125,19 @@ test_expect_success 'Detect CRLF/LF conflict after 
setting text=auto' '
 '
 
 test_expect_success 'Detect LF/CRLF conflict from addition of text=auto' '
-   q_to_cr -\EOF expected 
-   
-   first lineQ
-   same lineQ
-   ===
-   first line
-   same line
-   
-   EOF
-
+   echo  expected 
+   echo first line | append_cr expected 
+   echo same line | append_cr expected 
+   if test_have_prereq NATIVE_CRLF; then
+   echo === | append_cr expected 
+   echo first line | append_cr expected 
+   echo same line | append_cr expected
+   else
+   echo === expected 
+   echo first line expected 
+   echo same line expected
+   fi 
+   echo  expected 
git config merge.renormalize false 
rm -f .gitattributes 
git reset --hard b 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index b1bc65b..aceb418 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -871,6 +871,7 @@ case $(uname -s) in
# exec does not inherit the PID
test_set_prereq MINGW
test_set_prereq NOT_CYGWIN
+   test_set_prereq NATIVE_CRLF
test_set_prereq SED_STRIPS_CR
test_set_prereq GREP_STRIPS_CR
GIT_TEST_CMP=mingw_test_cmp
-- 
2.1.0.rc2.210.g636bceb


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] resolved deltas

2014-08-22 Thread Martin von Gagern
On 21.08.2014 13:35, Petr Stodulka wrote:
 Hi guys,
 I wanted post you patch here for this bug, but I can't find primary
 source of this problem [0], because I don't understand some ideas in the
 code.

 […]
 
 Any next ideas/hints or explanation of these functions? I began study
 source code and mechanisms of the git this week, so don't beat me yet
 please :-)
 
 Regards,
 Petr
 
 [0] https://bugzilla.redhat.com/show_bug.cgi?id=1099919

Some pointers to related reports and investigations:

https://groups.google.com/forum/#!topic/mapsforge-dev/IF6mgmwvZmY
https://groups.google.com/forum/#!topic/mapsforge-dev/f2KvFALlkvo
https://code.google.com/p/support/issues/detail?id=31571
https://groups.google.com/forum/#!topic/mapsforge-dev/nomzr5dkkqc
http://thread.gmane.org/gmane.comp.version-control.git/254626

The last is my own bug report to this mailing list here, which
unfortunately received no reaction yet. In that report, I can confirm
that the commit introducing the assertion is the same commit which
causes things to fail:
https://github.com/git/git/commit/7218a215efc7ae46f7ca8d82442f354e

In this https://code.google.com/p/mapsforge/ repository, resolve_delta
gets called twice for some delta. The first time, type and real_type are
identical, but by the second pass in find_unresolved_deltas the real
type will have chaned to OBJ_TREE. This caused the old code to simply
scip the object, but the new code aborts instead.

So far my understanding. I'm not sure whether this kind of duplicate
resolution is something normal or indicates some breakage in the
repository in question. But aborting seems a bad solution in either case.

Greetings,
 Martin von Gagern


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Stefan Beller
On 20.08.2014 00:06, Junio C Hamano wrote:
 While signed tags and commits assert that the objects thusly signed
 came from you, who signed these objects, there is not a good way to
 assert that you wanted to have a particular object at the tip of a
 particular branch.  My signing v2.0.1 tag only means I want to call
 the version v2.0.1, and it does not mean I want to push it out to my
 'master' branch---it is likely that I only want it in 'maint', so
 the signature on the object alone is insufficient.
 
 The only assurance to you that 'maint' points at what I wanted to
 place there comes from your trust on the hosting site and my
 authentication with it, which cannot easily audited later.
 
 This series introduces a cryptographic assurance for ref updates
 done by git push by introducing a mechanism that allows you to
 sign a push certificate (for the lack of better name) every time
 you push.  Think of it as working on an axis orthogonal to the
 traditional signed tags.

What kind of additional benefit do we gain? (i.e. why?)
The described problem, the lacking auditability of what's pushed
at which time, could be worked around like this:

Whenever you do a push, you just tag the latest commit in that branch.
So there would be tags like:
master_2014_08_21
master_2014_08_22
...
maint_2014_08_13
maint_2014_08_21
and so on. Whenever there is no tag at the tip of the branch, we'd
know there is something wrong.

My guess would be usability as tagging so many branches is cumbersome
for a maintainer?

Looking at my made-up workaround again:
That would produce lots of tags. So I could imagine there would also be
lots of push certs. The number of push certs would
roughly scale linear to time passed. May this result in slowness over time?

Does this patch series mean, we'd get another object type (additional to
blobs, commits, tags, trees)?

I did not read the code yet, it's just first thoughts,
so this weigh this input lightly.

Stefan




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Junio C Hamano
Stefan Beller stefanbel...@gmail.com writes:

 So there would be tags like:
   master_2014_08_21
   master_2014_08_22
   ...
   maint_2014_08_13
   maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.

Who creates that tag?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Stefan Beller
On 22.08.2014 22:03, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:
 
 So there would be tags like:
  master_2014_08_21
  master_2014_08_22
  ...
  maint_2014_08_13
  maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.
 
 Who creates that tag?
 

 My guess would be usability as tagging so many branches is cumbersome
for a maintainer?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 00/19] Signed push

2014-08-22 Thread Junio C Hamano
The first round is found at $gmane/255520

While signed tags and commits assert that the objects thusly signed
came from you, who signed these objects, there is not a good way to
assert that you wanted to have a particular object at the tip of a
particular branch.  My signing v2.0.1 tag only means I want to call
the version v2.0.1, and it does not mean I want to push it out to my
'master' branch---it is likely that I only want it in 'maint', so
the signature on the object alone is insufficient.

The only assurance to you that 'maint' points at what I wanted to
place there comes from your trust on the hosting site and my
authentication with it, which cannot easily audited later.

This series introduces a cryptographic assurance for ref updates
done by git push by introducing a mechanism that allows you to
sign a push certificate (for the lack of better name) every time
you push.  Think of it as working on an axis orthogonal to the
traditional signed tags.

Notable changes from the first iteration are:

 - push --signed against a receiver that does not support push
   certificates used to downgrade to an unsigned push with a
   warning; this round makes it die.

 - The push-cert capability the receiver sends now with a value,
   nonce; the certificate must include this value on the nonce
   header to prevent a valid push certificate that was used to push
   elsewhere from being replayed to push to an unrelated repository.

And several typofixes here and there.

Junio C Hamano (19):
  receive-pack: do not overallocate command structure
  receive-pack: parse feature request a bit earlier
  receive-pack: do not reuse old_sha1[] for other things
  receive-pack: factor out queueing of command
  send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher
  send-pack: refactor decision to send update per ref
  send-pack: always send capabilities
  send-pack: factor out capability string generation
  send-pack: rename new_refs to need_pack_data
  send-pack: refactor inspecting and resetting status and sending
commands
  send-pack: clarify that cmds_sent is a boolean
  gpg-interface: move parse_gpg_output() to where it should be
  gpg-interface: move parse_signature() to where it should be
  pack-protocol doc: typofix for PKT-LINE
  the beginning of the signed push
  receive-pack: GPG-validate push certificates
  send-pack: send feature request on push-cert packet
  signed push: signed push: remove duplicated protocol info
  signed push: fortify against replay attacks

 Documentation/git-push.txt|   9 +-
 Documentation/git-receive-pack.txt|  41 -
 Documentation/technical/pack-protocol.txt |  25 ++-
 Documentation/technical/protocol-capabilities.txt |  13 +-
 builtin/push.c|   1 +
 builtin/receive-pack.c| 199 ++
 commit.c  |  36 
 gpg-interface.c   |  57 +++
 gpg-interface.h   |  18 +-
 send-pack.c   | 197 -
 send-pack.h   |   1 +
 t/t5534-push-signed.sh|  82 +
 tag.c |  20 ---
 tag.h |   1 -
 transport.c   |   4 +
 transport.h   |   5 +
 16 files changed, 564 insertions(+), 145 deletions(-)
 create mode 100755 t/t5534-push-signed.sh

-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 01/19] receive-pack: do not overallocate command structure

2014-08-22 Thread Junio C Hamano
An update command in the protocol exchange consists of 40-hex old
object name, SP, 40-hex new object name, SP, and a refname, but the
first instance is further followed by a NUL with feature requests.

The command structure, which has a flex-array member that stores the
refname at the end, was allocated based on the whole length of the
update command, without excluding the trailing feature requests.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/receive-pack.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f93ac45..1663beb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -872,10 +872,11 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
if (parse_feature_request(feature_list, quiet))
quiet = 1;
}
-   cmd = xcalloc(1, sizeof(struct command) + len - 80);
+   cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
hashcpy(cmd-old_sha1, old_sha1);
hashcpy(cmd-new_sha1, new_sha1);
-   memcpy(cmd-ref_name, line + 82, len - 81);
+   memcpy(cmd-ref_name, refname, reflen);
+   cmd-ref_name[reflen] = '\0';
*p = cmd;
p = cmd-next;
}
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 02/19] receive-pack: parse feature request a bit earlier

2014-08-22 Thread Junio C Hamano
Ideally, we should have also allowed the first shallow to carry
the feature request trailer, but that is water under the bridge
now.  This makes the next step to factor out the queuing of commands
easier to review.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/receive-pack.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1663beb..a91eec8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -840,7 +840,7 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
unsigned char old_sha1[20], new_sha1[20];
struct command *cmd;
char *refname;
-   int len, reflen;
+   int len, reflen, linelen;
 
line = packet_read_line(0, len);
if (!line)
@@ -853,7 +853,18 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
continue;
}
 
-   if (len  83 ||
+   linelen = strlen(line);
+   if (linelen  len) {
+   const char *feature_list = line + linelen + 1;
+   if (parse_feature_request(feature_list, 
report-status))
+   report_status = 1;
+   if (parse_feature_request(feature_list, 
side-band-64k))
+   use_sideband = LARGE_PACKET_MAX;
+   if (parse_feature_request(feature_list, quiet))
+   quiet = 1;
+   }
+
+   if (linelen  83 ||
line[40] != ' ' ||
line[81] != ' ' ||
get_sha1_hex(line, old_sha1) ||
@@ -862,16 +873,7 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
line);
 
refname = line + 82;
-   reflen = strlen(refname);
-   if (reflen + 82  len) {
-   const char *feature_list = refname + reflen + 1;
-   if (parse_feature_request(feature_list, 
report-status))
-   report_status = 1;
-   if (parse_feature_request(feature_list, 
side-band-64k))
-   use_sideband = LARGE_PACKET_MAX;
-   if (parse_feature_request(feature_list, quiet))
-   quiet = 1;
-   }
+   reflen = linelen - 82;
cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
hashcpy(cmd-old_sha1, old_sha1);
hashcpy(cmd-new_sha1, new_sha1);
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 03/19] receive-pack: do not reuse old_sha1[] for other things

2014-08-22 Thread Junio C Hamano
This piece of code reads object names of shallow boundaries, not
old_sha1[], i.e. the current value the ref points at, which is to be
replaced by what is in new_sha1[].

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/receive-pack.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a91eec8..c9b92bf 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -847,9 +847,11 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
break;
 
if (len == 48  starts_with(line, shallow )) {
-   if (get_sha1_hex(line + 8, old_sha1))
-   die(protocol error: expected shallow sha, got 
'%s', line + 8);
-   sha1_array_append(shallow, old_sha1);
+   unsigned char sha1[20];
+   if (get_sha1_hex(line + 8, sha1))
+   die(protocol error: expected shallow sha, got 
'%s',
+   line + 8);
+   sha1_array_append(shallow, sha1);
continue;
}
 
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 04/19] receive-pack: factor out queueing of command

2014-08-22 Thread Junio C Hamano
Make a helper function to accept a line of a protocol message and
queue an update command out of the code from read_head_info().

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 builtin/receive-pack.c | 50 +-
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c9b92bf..341bb46 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -831,16 +831,40 @@ static void execute_commands(struct command *commands,
  the reported refs above);
 }
 
+static struct command **queue_command(struct command **p,
+ const char *line,
+ int linelen)
+{
+   unsigned char old_sha1[20], new_sha1[20];
+   struct command *cmd;
+   const char *refname;
+   int reflen;
+
+   if (linelen  83 ||
+   line[40] != ' ' ||
+   line[81] != ' ' ||
+   get_sha1_hex(line, old_sha1) ||
+   get_sha1_hex(line + 41, new_sha1))
+   die(protocol error: expected old/new/ref, got '%s', line);
+
+   refname = line + 82;
+   reflen = linelen - 82;
+   cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
+   hashcpy(cmd-old_sha1, old_sha1);
+   hashcpy(cmd-new_sha1, new_sha1);
+   memcpy(cmd-ref_name, refname, reflen);
+   cmd-ref_name[reflen] = '\0';
+   *p = cmd;
+   return cmd-next;
+}
+
 static struct command *read_head_info(struct sha1_array *shallow)
 {
struct command *commands = NULL;
struct command **p = commands;
for (;;) {
char *line;
-   unsigned char old_sha1[20], new_sha1[20];
-   struct command *cmd;
-   char *refname;
-   int len, reflen, linelen;
+   int len, linelen;
 
line = packet_read_line(0, len);
if (!line)
@@ -866,23 +890,7 @@ static struct command *read_head_info(struct sha1_array 
*shallow)
quiet = 1;
}
 
-   if (linelen  83 ||
-   line[40] != ' ' ||
-   line[81] != ' ' ||
-   get_sha1_hex(line, old_sha1) ||
-   get_sha1_hex(line + 41, new_sha1))
-   die(protocol error: expected old/new/ref, got '%s',
-   line);
-
-   refname = line + 82;
-   reflen = linelen - 82;
-   cmd = xcalloc(1, sizeof(struct command) + reflen + 1);
-   hashcpy(cmd-old_sha1, old_sha1);
-   hashcpy(cmd-new_sha1, new_sha1);
-   memcpy(cmd-ref_name, refname, reflen);
-   cmd-ref_name[reflen] = '\0';
-   *p = cmd;
-   p = cmd-next;
+   p = queue_command(p, line, linelen);
}
return commands;
 }
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 05/19] send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher

2014-08-22 Thread Junio C Hamano
20e8b465 (refactor ref status logic for pushing, 2010-01-08)
restructured the code to set status for each ref to be pushed, but
did not quite go far enough.  We inspect the status set earlier by
set_refs_status_for_push() and then perform yet another update to
the status of a ref with an otherwise OK status to be deleted to
mark it with REF_STATUS_REJECT_NODELETE when the protocol tells us
never to delete.

Split the latter into a separate loop that comes before we enter the
per-ref loop.  This way we would have one less condition to check in
the main loop.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 6129b0f..7428ae6 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -231,6 +231,15 @@ int send_pack(struct send_pack_args *args,
return 0;
}
 
+   /*
+* NEEDSWORK: why is delete-refs so specific to send-pack
+* machinery that set_ref_status_for_push() cannot set this
+* bit for us???
+*/
+   for (ref = remote_refs; ref; ref = ref-next)
+   if (ref-deletion  !allow_deleting_refs)
+   ref-status = REF_STATUS_REJECT_NODELETE;
+
if (!args-dry_run)
advertise_shallow_grafts_buf(req_buf);
 
@@ -249,17 +258,13 @@ int send_pack(struct send_pack_args *args,
case REF_STATUS_REJECT_FETCH_FIRST:
case REF_STATUS_REJECT_NEEDS_FORCE:
case REF_STATUS_REJECT_STALE:
+   case REF_STATUS_REJECT_NODELETE:
case REF_STATUS_UPTODATE:
continue;
default:
; /* do nothing */
}
 
-   if (ref-deletion  !allow_deleting_refs) {
-   ref-status = REF_STATUS_REJECT_NODELETE;
-   continue;
-   }
-
if (!ref-deletion)
new_refs++;
 
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 08/19] send-pack: factor out capability string generation

2014-08-22 Thread Junio C Hamano
A run of 'var ?  var : ' fed to a long printf string in a deeply
nested block was hard to read.  Move it outside the loop and format
it into a strbuf.

As an added bonus, the trick to add agent=agent-name by using
two conditionals is replaced by a more readable version.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 2fa6c34..c1c64ee 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -218,6 +218,7 @@ int send_pack(struct send_pack_args *args,
int in = fd[0];
int out = fd[1];
struct strbuf req_buf = STRBUF_INIT;
+   struct strbuf cap_buf = STRBUF_INIT;
struct ref *ref;
int new_refs;
int allow_deleting_refs = 0;
@@ -251,6 +252,15 @@ int send_pack(struct send_pack_args *args,
return 0;
}
 
+   if (status_report)
+   strbuf_addstr(cap_buf,  report-status);
+   if (use_sideband)
+   strbuf_addstr(cap_buf,  side-band-64k);
+   if (quiet_supported  (args-quiet || !args-progress))
+   strbuf_addstr(cap_buf,  quiet);
+   if (agent_supported)
+   strbuf_addf(cap_buf,  agent=%s, git_user_agent_sanitized());
+
/*
 * NEEDSWORK: why is delete-refs so specific to send-pack
 * machinery that set_ref_status_for_push() cannot set this
@@ -279,18 +289,12 @@ int send_pack(struct send_pack_args *args,
} else {
char *old_hex = sha1_to_hex(ref-old_sha1);
char *new_hex = sha1_to_hex(ref-new_sha1);
-   int quiet = quiet_supported  (args-quiet || 
!args-progress);
 
if (!cmds_sent)
packet_buf_write(req_buf,
-%s %s %s%c%s%s%s%s%s,
+%s %s %s%c%s,
 old_hex, new_hex, ref-name, 0,
-status_report ?  
report-status : ,
-use_sideband ?  
side-band-64k : ,
-quiet ?  quiet : ,
-agent_supported ?  agent= : 
,
-agent_supported ? 
git_user_agent_sanitized() : 
-   );
+cap_buf.buf);
else
packet_buf_write(req_buf, %s %s %s,
 old_hex, new_hex, ref-name);
@@ -311,6 +315,7 @@ int send_pack(struct send_pack_args *args,
packet_flush(out);
}
strbuf_release(req_buf);
+   strbuf_release(cap_buf);
 
if (use_sideband  cmds_sent) {
memset(demux, 0, sizeof(demux));
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 06/19] send-pack: refactor decision to send update per ref

2014-08-22 Thread Junio C Hamano
A new helper function ref_update_to_be_sent() decides for each ref
if the update is to be sent based on the status previously set by
set_ref_status_for_push() and also if this is a mirrored push.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 7428ae6..f3c5ebe 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -190,6 +190,26 @@ static void advertise_shallow_grafts_buf(struct strbuf *sb)
for_each_commit_graft(advertise_shallow_grafts_cb, sb);
 }
 
+static int ref_update_to_be_sent(const struct ref *ref, const struct 
send_pack_args *args)
+{
+   if (!ref-peer_ref  !args-send_mirror)
+   return 0;
+
+   /* Check for statuses set by set_ref_status_for_push() */
+   switch (ref-status) {
+   case REF_STATUS_REJECT_NONFASTFORWARD:
+   case REF_STATUS_REJECT_ALREADY_EXISTS:
+   case REF_STATUS_REJECT_FETCH_FIRST:
+   case REF_STATUS_REJECT_NEEDS_FORCE:
+   case REF_STATUS_REJECT_STALE:
+   case REF_STATUS_REJECT_NODELETE:
+   case REF_STATUS_UPTODATE:
+   return 0;
+   default:
+   return 1;
+   }
+}
+
 int send_pack(struct send_pack_args *args,
  int fd[], struct child_process *conn,
  struct ref *remote_refs,
@@ -248,23 +268,9 @@ int send_pack(struct send_pack_args *args,
 */
new_refs = 0;
for (ref = remote_refs; ref; ref = ref-next) {
-   if (!ref-peer_ref  !args-send_mirror)
+   if (!ref_update_to_be_sent(ref, args))
continue;
 
-   /* Check for statuses set by set_ref_status_for_push() */
-   switch (ref-status) {
-   case REF_STATUS_REJECT_NONFASTFORWARD:
-   case REF_STATUS_REJECT_ALREADY_EXISTS:
-   case REF_STATUS_REJECT_FETCH_FIRST:
-   case REF_STATUS_REJECT_NEEDS_FORCE:
-   case REF_STATUS_REJECT_STALE:
-   case REF_STATUS_REJECT_NODELETE:
-   case REF_STATUS_UPTODATE:
-   continue;
-   default:
-   ; /* do nothing */
-   }
-
if (!ref-deletion)
new_refs++;
 
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 07/19] send-pack: always send capabilities

2014-08-22 Thread Junio C Hamano
We tried to avoid sending one extra byte, NUL and nothing behind it
to signal there is no protocol capabilities being sent, on the first
command packet on the wire, but it just made the code look ugly.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index f3c5ebe..2fa6c34 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -281,8 +281,7 @@ int send_pack(struct send_pack_args *args,
char *new_hex = sha1_to_hex(ref-new_sha1);
int quiet = quiet_supported  (args-quiet || 
!args-progress);
 
-   if (!cmds_sent  (status_report || use_sideband ||
-  quiet || agent_supported)) {
+   if (!cmds_sent)
packet_buf_write(req_buf,
 %s %s %s%c%s%s%s%s%s,
 old_hex, new_hex, ref-name, 0,
@@ -292,7 +291,6 @@ int send_pack(struct send_pack_args *args,
 agent_supported ?  agent= : 
,
 agent_supported ? 
git_user_agent_sanitized() : 
);
-   }
else
packet_buf_write(req_buf, %s %s %s,
 old_hex, new_hex, ref-name);
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 10/19] send-pack: refactor inspecting and resetting status and sending commands

2014-08-22 Thread Junio C Hamano
The main loop over remote_refs list inspects the ref status
to see if we need to generate pack data (i.e. a delete-only push
does not need to send any additional data), resets it to expecting
the status report state, and formats the actual update commands
to be sent.

Split the former two out of the main loop, as it will become
conditional in later steps.

Besides, we should have code that does real thing here, before the
Finally, tell the other end! part ;-)

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 49 ++---
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 590eb0a..f3262f2 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -274,7 +274,8 @@ int send_pack(struct send_pack_args *args,
advertise_shallow_grafts_buf(req_buf);
 
/*
-* Finally, tell the other end!
+* Clear the status for each ref and see if we need to send
+* the pack data.
 */
for (ref = remote_refs; ref; ref = ref-next) {
if (!ref_update_to_be_sent(ref, args))
@@ -283,25 +284,35 @@ int send_pack(struct send_pack_args *args,
if (!ref-deletion)
need_pack_data = 1;
 
-   if (args-dry_run) {
+   if (args-dry_run || !status_report)
ref-status = REF_STATUS_OK;
-   } else {
-   char *old_hex = sha1_to_hex(ref-old_sha1);
-   char *new_hex = sha1_to_hex(ref-new_sha1);
-
-   if (!cmds_sent)
-   packet_buf_write(req_buf,
-%s %s %s%c%s,
-old_hex, new_hex, ref-name, 0,
-cap_buf.buf);
-   else
-   packet_buf_write(req_buf, %s %s %s,
-old_hex, new_hex, ref-name);
-   ref-status = status_report ?
-   REF_STATUS_EXPECTING_REPORT :
-   REF_STATUS_OK;
-   cmds_sent++;
-   }
+   else
+   ref-status = REF_STATUS_EXPECTING_REPORT;
+   }
+
+   /*
+* Finally, tell the other end!
+*/
+   for (ref = remote_refs; ref; ref = ref-next) {
+   char *old_hex, *new_hex;
+
+   if (args-dry_run)
+   continue;
+
+   if (!ref_update_to_be_sent(ref, args))
+   continue;
+
+   old_hex = sha1_to_hex(ref-old_sha1);
+   new_hex = sha1_to_hex(ref-new_sha1);
+   if (!cmds_sent)
+   packet_buf_write(req_buf,
+%s %s %s%c%s,
+old_hex, new_hex, ref-name, 0,
+cap_buf.buf);
+   else
+   packet_buf_write(req_buf, %s %s %s,
+old_hex, new_hex, ref-name);
+   cmds_sent++;
}
 
if (args-stateless_rpc) {
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 09/19] send-pack: rename new_refs to need_pack_data

2014-08-22 Thread Junio C Hamano
The variable counts how many non-deleting command is being sent, but
is only checked with 0-ness to decide if we need to send the pack
data.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index c1c64ee..590eb0a 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -220,7 +220,7 @@ int send_pack(struct send_pack_args *args,
struct strbuf req_buf = STRBUF_INIT;
struct strbuf cap_buf = STRBUF_INIT;
struct ref *ref;
-   int new_refs;
+   int need_pack_data = 0;
int allow_deleting_refs = 0;
int status_report = 0;
int use_sideband = 0;
@@ -276,13 +276,12 @@ int send_pack(struct send_pack_args *args,
/*
 * Finally, tell the other end!
 */
-   new_refs = 0;
for (ref = remote_refs; ref; ref = ref-next) {
if (!ref_update_to_be_sent(ref, args))
continue;
 
if (!ref-deletion)
-   new_refs++;
+   need_pack_data = 1;
 
if (args-dry_run) {
ref-status = REF_STATUS_OK;
@@ -327,7 +326,7 @@ int send_pack(struct send_pack_args *args,
in = demux.out;
}
 
-   if (new_refs  cmds_sent) {
+   if (need_pack_data  cmds_sent) {
if (pack_objects(out, remote_refs, extra_have, args)  0) {
for (ref = remote_refs; ref; ref = ref-next)
ref-status = REF_STATUS_NONE;
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 11/19] send-pack: clarify that cmds_sent is a boolean

2014-08-22 Thread Junio C Hamano
We use it to make sure that the feature request is sent only once on
the very first request packet (ignoring the shallow  line, which
was an unfortunate mistake we cannot retroactively fix with existing
receive-pack already deployed in the field) and we set it to true
with cmds_sent++, not because we care about the actual number of
updates sent but because it is merely an old idiomatic way.

Set it explicitly to one to clarify that the code that uses this
variable only cares about its zero-ness.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index f3262f2..05926d2 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -304,15 +304,16 @@ int send_pack(struct send_pack_args *args,
 
old_hex = sha1_to_hex(ref-old_sha1);
new_hex = sha1_to_hex(ref-new_sha1);
-   if (!cmds_sent)
+   if (!cmds_sent) {
packet_buf_write(req_buf,
 %s %s %s%c%s,
 old_hex, new_hex, ref-name, 0,
 cap_buf.buf);
-   else
+   cmds_sent = 1;
+   } else {
packet_buf_write(req_buf, %s %s %s,
 old_hex, new_hex, ref-name);
-   cmds_sent++;
+   }
}
 
if (args-stateless_rpc) {
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 12/19] gpg-interface: move parse_gpg_output() to where it should be

2014-08-22 Thread Junio C Hamano
Earlier, ffb6d7d5 (Move commit GPG signature verification to
commit.c, 2013-03-31) moved this helper that used to be in pretty.c
(i.e. the output code path) to commit.c for better reusability.

It was a good first step in the right direction, but still suffers a
myopic view that commits will be the only thing we would ever want
to sign---we would actually want to be able to reuse it even wider.

The function interprets what GPG said; gpg-interface is obviously a
better place.  Move it there.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 commit.c| 36 
 gpg-interface.c | 36 
 gpg-interface.h | 17 -
 3 files changed, 48 insertions(+), 41 deletions(-)

diff --git a/commit.c b/commit.c
index ae7f2b1..01cdad2 100644
--- a/commit.c
+++ b/commit.c
@@ -1220,42 +1220,6 @@ free_return:
free(buf);
 }
 
-static struct {
-   char result;
-   const char *check;
-} sigcheck_gpg_status[] = {
-   { 'G', \n[GNUPG:] GOODSIG  },
-   { 'B', \n[GNUPG:] BADSIG  },
-   { 'U', \n[GNUPG:] TRUST_NEVER },
-   { 'U', \n[GNUPG:] TRUST_UNDEFINED },
-};
-
-static void parse_gpg_output(struct signature_check *sigc)
-{
-   const char *buf = sigc-gpg_status;
-   int i;
-
-   /* Iterate over all search strings */
-   for (i = 0; i  ARRAY_SIZE(sigcheck_gpg_status); i++) {
-   const char *found, *next;
-
-   if (!skip_prefix(buf, sigcheck_gpg_status[i].check + 1, 
found)) {
-   found = strstr(buf, sigcheck_gpg_status[i].check);
-   if (!found)
-   continue;
-   found += strlen(sigcheck_gpg_status[i].check);
-   }
-   sigc-result = sigcheck_gpg_status[i].result;
-   /* The trust messages are not followed by key/signer 
information */
-   if (sigc-result != 'U') {
-   sigc-key = xmemdupz(found, 16);
-   found += 17;
-   next = strchrnul(found, '\n');
-   sigc-signer = xmemdupz(found, next - found);
-   }
-   }
-}
-
 void check_commit_signature(const struct commit* commit, struct 
signature_check *sigc)
 {
struct strbuf payload = STRBUF_INIT;
diff --git a/gpg-interface.c b/gpg-interface.c
index ff07012..3c9624c 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -21,6 +21,42 @@ void signature_check_clear(struct signature_check *sigc)
sigc-key = NULL;
 }
 
+static struct {
+   char result;
+   const char *check;
+} sigcheck_gpg_status[] = {
+   { 'G', \n[GNUPG:] GOODSIG  },
+   { 'B', \n[GNUPG:] BADSIG  },
+   { 'U', \n[GNUPG:] TRUST_NEVER },
+   { 'U', \n[GNUPG:] TRUST_UNDEFINED },
+};
+
+void parse_gpg_output(struct signature_check *sigc)
+{
+   const char *buf = sigc-gpg_status;
+   int i;
+
+   /* Iterate over all search strings */
+   for (i = 0; i  ARRAY_SIZE(sigcheck_gpg_status); i++) {
+   const char *found, *next;
+
+   if (!skip_prefix(buf, sigcheck_gpg_status[i].check + 1, 
found)) {
+   found = strstr(buf, sigcheck_gpg_status[i].check);
+   if (!found)
+   continue;
+   found += strlen(sigcheck_gpg_status[i].check);
+   }
+   sigc-result = sigcheck_gpg_status[i].result;
+   /* The trust messages are not followed by key/signer 
information */
+   if (sigc-result != 'U') {
+   sigc-key = xmemdupz(found, 16);
+   found += 17;
+   next = strchrnul(found, '\n');
+   sigc-signer = xmemdupz(found, next - found);
+   }
+   }
+}
+
 void set_signing_key(const char *key)
 {
free(configured_signing_key);
diff --git a/gpg-interface.h b/gpg-interface.h
index 37c23da..8d677cc 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -5,16 +5,23 @@ struct signature_check {
char *payload;
char *gpg_output;
char *gpg_status;
-   char result; /* 0 (not checked),
- * N (checked but no further result),
- * U (untrusted good),
- * G (good)
- * B (bad) */
+
+   /*
+* possible result:
+* 0 (not checked)
+* N (checked but no further result)
+* U (untrusted good)
+* G (good)
+* B (bad)
+*/
+   char result;
char *signer;
char *key;
 };
 
 extern void signature_check_clear(struct signature_check *sigc);
+extern void parse_gpg_output(struct signature_check *);
+
 extern int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const 
char *signing_key);
 extern int verify_signed_buffer(const char *payload, size_t payload_size, 
const char *signature, size_t 

[PATCH v2 14/19] pack-protocol doc: typofix for PKT-LINE

2014-08-22 Thread Junio C Hamano
Everywhere else we use PKT-LINE to denote the pkt-line formatted
data, but shallow/deepen messages are described with PKT_LINE().

Fix them.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/technical/pack-protocol.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index 18dea8d..a845d51 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -212,9 +212,9 @@ out of what the server said it could do with the first 
'want' line.
   want-list =  first-want
   *additional-want
 
-  shallow-line  =  PKT_LINE(shallow SP obj-id)
+  shallow-line  =  PKT-LINE(shallow SP obj-id)
 
-  depth-request =  PKT_LINE(deepen SP depth)
+  depth-request =  PKT-LINE(deepen SP depth)
 
   first-want=  PKT-LINE(want SP obj-id SP capability-list LF)
   additional-want   =  PKT-LINE(want SP obj-id LF)
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 15/19] the beginning of the signed push

2014-08-22 Thread Junio C Hamano
While signed tags and commits assert that the objects thusly signed
came from you, who signed these objects, there is not a good way to
assert that you wanted to have a particular object at the tip of a
particular branch.  My signing v2.0.1 tag only means I want to call
the version v2.0.1, and it does not mean I want to push it out to my
'master' branch---it is likely that I only want it in 'maint'.

Introduce a mechanism that allows you to sign a push certificate
(for the lack of better name) every time you push, asserting that
what object you are pushing to update which ref that used to point
at what other object.  Think of it as a cryptographic protection for
ref updates, similar to signed tags/commits but working on an
orthogonal axis.

The basic flow based on this mechanism goes like this:

 1. You push out your work with git push --signed.

 2. The sending side learns where the remote refs are as usual,
together with what protocol extension the receiving end
supports.  If the receiving end does not advertise the protocol
extension push-cert, an attempt to git push --signed fails.

Otherwise, a text file, that looks like the following, is
prepared in core:

certificate version 0.1
pusher Junio C Hamano gits...@pobox.com 1315427886 -0700

7339ca65... 21580ecb... refs/heads/master
3793ac56... 12850bec... refs/heads/next

The file begins with a few header lines, which may grow as we
gain more experience.  The 'pusher' header records the name of
the signer (the value of user.signingkey configuration variable,
falling back to GIT_COMMITTER_{NAME|EMAIL}) and the time of the
certificate generation.  After the header, a blank line follows,
followed by a copy of the protocol message lines.

Each line shows the old and the new object name at the tip of
the ref this push tries to update, in the way identical to how
the underlying git push protocol exchange tells the ref
updates to the receiving end (by recording the old object
name, the push certificate also protects against replaying).  It
is expected that new command packet types other than the
old-new-refname kind will be included in push certificate in the
same way as would appear in the plain vanilla command packets in
unsigned pushes.

The user then is asked to sign this push certificate using GPG,
formatted in a way similar to how signed tag objects are signed,
and the result is sent to the other side (i.e. receive-pack).

In the protocol exchange, this step comes immediately before the
sender tells what the result of the push should be, which in
turn comes before it sends the pack data.

 3. When the receiving end sees a push certificate, the certificate
is written out as a blob.  The pre-receive hook can learn about
the certificate by checking GIT_PUSH_CERT environment variable,
which, if present, tells the object name of this blob, and make
the decision to allow or reject this push.  Additionally, the
post-receive hook can also look at the certificate, which may be
a good place to log all the received certificates for later
audits.

Because a push certificate carry the same information as the usual
command packets in the protocol exchange, we can omit the latter
when a push certificate is in use and reduce the protocol overhead.
This however is not included in this patch to make it easier to
review (in other words, the series at this step should never be
released without the remainder of the series, as it implements an
interim protocol that will be incompatible with the final one,
merely for reviewing purposes).  As such, the documentation update
for the protocol is left out of this step.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/git-push.txt |  9 +-
 Documentation/git-receive-pack.txt | 16 +-
 builtin/push.c |  1 +
 builtin/receive-pack.c | 43 -
 send-pack.c| 64 ++
 send-pack.h|  1 +
 t/t5534-push-signed.sh | 63 +
 transport.c|  4 +++
 transport.h|  5 +++
 9 files changed, 203 insertions(+), 3 deletions(-)
 create mode 100755 t/t5534-push-signed.sh

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 21cd455..21b3f29 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -10,7 +10,8 @@ SYNOPSIS
 
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [-n | --dry-run] 
[--receive-pack=git-receive-pack]
-  [--repo=repository] [-f | --force] [--prune] [-v | --verbose] [-u 
| --set-upstream]
+  [--repo=repository] [-f | --force] [--prune] [-v | --verbose]
+  [-u | --set-upstream] [--signed]
   

[PATCH v2 17/19] send-pack: send feature request on push-cert packet

2014-08-22 Thread Junio C Hamano
We would want to update the interim protocol so that we do not send
the usual update commands when the push certificate feature is in
use, as the same information is in the certificate.  Once that
happens, the push-cert packet may become the only protocol command,
but then there is no packet to put the feature request behind, like
we always did.

As we have prepared the receiving end that understands the push-cert
feature to accept the feature request on the first protocol packet
(other than shallow , which was an unfortunate historical mistake
that has to come before everything else), we can give the feature
request on the push-cert packet instead of the first update protocol
packet, in preparation for the next step to actually update to the
final protocol.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 send-pack.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 10c907f..4125892 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -225,9 +225,10 @@ static const char *next_line(const char *line, size_t len)
return nl + 1;
 }
 
-static void generate_push_cert(struct strbuf *req_buf,
-  const struct ref *remote_refs,
-  struct send_pack_args *args)
+static int generate_push_cert(struct strbuf *req_buf,
+ const struct ref *remote_refs,
+ struct send_pack_args *args,
+ const char *cap_string)
 {
const struct ref *ref;
char stamp[60];
@@ -256,7 +257,7 @@ static void generate_push_cert(struct strbuf *req_buf,
if (sign_buffer(cert, cert, signing_key))
die(_(failed to sign the push certificate));
 
-   packet_buf_write(req_buf, push-cert\n);
+   packet_buf_write(req_buf, push-cert%c%s, 0, cap_string);
for (cp = cert.buf; cp  cert.buf + cert.len; cp = np) {
np = next_line(cp, cert.buf + cert.len - cp);
packet_buf_write(req_buf,
@@ -267,6 +268,7 @@ static void generate_push_cert(struct strbuf *req_buf,
 free_return:
free(signing_key);
strbuf_release(cert);
+   return 1;
 }
 
 int send_pack(struct send_pack_args *args,
@@ -335,7 +337,8 @@ int send_pack(struct send_pack_args *args,
advertise_shallow_grafts_buf(req_buf);
 
if (!args-dry_run  args-push_cert)
-   generate_push_cert(req_buf, remote_refs, args);
+   cmds_sent = generate_push_cert(req_buf, remote_refs, args,
+  cap_buf.buf);
 
/*
 * Clear the status for each ref and see if we need to send
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Junio C Hamano
Stefan Beller stefanbel...@gmail.com writes:

 On 22.08.2014 22:03, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:
 
 So there would be tags like:
 master_2014_08_21
 master_2014_08_22
 ...
 maint_2014_08_13
 maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.
 
 Who creates that tag?
 

 My guess would be usability as tagging so many branches is cumbersome
 for a maintainer?

Did you answer my question?  Who creates these tags?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 13/19] gpg-interface: move parse_signature() to where it should be

2014-08-22 Thread Junio C Hamano
Our signed-tag objects set the standard format used by Git to store
GPG-signed payload (i.e. the payload followed by its detached
signature), and it made sense to have a helper to find the boundary
between the payload and its signature in tag.c back then.

Newer code added later to parse other kinds of objects that learned
to use the same format to store GPG-signed payload (e.g. signed
commits), however, kept using the helper from the same location.

Move it to gpg-interface; the helper is no longer about signed tag,
but it is how our code and data interact with GPG.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 gpg-interface.c | 21 +
 gpg-interface.h |  1 +
 tag.c   | 20 
 tag.h   |  1 -
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/gpg-interface.c b/gpg-interface.c
index 3c9624c..0dd11ea 100644
--- a/gpg-interface.c
+++ b/gpg-interface.c
@@ -7,6 +7,9 @@
 static char *configured_signing_key;
 static const char *gpg_program = gpg;
 
+#define PGP_SIGNATURE -BEGIN PGP SIGNATURE-
+#define PGP_MESSAGE -BEGIN PGP MESSAGE-
+
 void signature_check_clear(struct signature_check *sigc)
 {
free(sigc-payload);
@@ -57,6 +60,24 @@ void parse_gpg_output(struct signature_check *sigc)
}
 }
 
+/*
+ * Look at GPG signed content (e.g. a signed tag object), whose
+ * payload is followed by a detached signature on it.  Return the
+ * offset where the embedded detached signature begins, or the end of
+ * the data when there is no such signature.
+ */
+size_t parse_signature(const char *buf, unsigned long size)
+{
+   char *eol;
+   size_t len = 0;
+   while (len  size  !starts_with(buf + len, PGP_SIGNATURE) 
+   !starts_with(buf + len, PGP_MESSAGE)) {
+   eol = memchr(buf + len, '\n', size - len);
+   len += eol ? eol - (buf + len) + 1 : size - len;
+   }
+   return len;
+}
+
 void set_signing_key(const char *key)
 {
free(configured_signing_key);
diff --git a/gpg-interface.h b/gpg-interface.h
index 8d677cc..93c76c2 100644
--- a/gpg-interface.h
+++ b/gpg-interface.h
@@ -20,6 +20,7 @@ struct signature_check {
 };
 
 extern void signature_check_clear(struct signature_check *sigc);
+extern size_t parse_signature(const char *buf, unsigned long size);
 extern void parse_gpg_output(struct signature_check *);
 
 extern int sign_buffer(struct strbuf *buffer, struct strbuf *signature, const 
char *signing_key);
diff --git a/tag.c b/tag.c
index 82d841b..5b0ac62 100644
--- a/tag.c
+++ b/tag.c
@@ -4,9 +4,6 @@
 #include tree.h
 #include blob.h
 
-#define PGP_SIGNATURE -BEGIN PGP SIGNATURE-
-#define PGP_MESSAGE -BEGIN PGP MESSAGE-
-
 const char *tag_type = tag;
 
 struct object *deref_tag(struct object *o, const char *warn, int warnlen)
@@ -143,20 +140,3 @@ int parse_tag(struct tag *item)
free(data);
return ret;
 }
-
-/*
- * Look at a signed tag object, and return the offset where
- * the embedded detached signature begins, or the end of the
- * data when there is no such signature.
- */
-size_t parse_signature(const char *buf, unsigned long size)
-{
-   char *eol;
-   size_t len = 0;
-   while (len  size  !starts_with(buf + len, PGP_SIGNATURE) 
-   !starts_with(buf + len, PGP_MESSAGE)) {
-   eol = memchr(buf + len, '\n', size - len);
-   len += eol ? eol - (buf + len) + 1 : size - len;
-   }
-   return len;
-}
diff --git a/tag.h b/tag.h
index bc8a1e4..f4580ae 100644
--- a/tag.h
+++ b/tag.h
@@ -17,6 +17,5 @@ extern int parse_tag_buffer(struct tag *item, const void 
*data, unsigned long si
 extern int parse_tag(struct tag *item);
 extern struct object *deref_tag(struct object *, const char *, int);
 extern struct object *deref_tag_noverify(struct object *);
-extern size_t parse_signature(const char *buf, unsigned long size);
 
 #endif /* TAG_H */
-- 
2.1.0-304-g950f846

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 19/19] signed push: fortify against replay attacks

2014-08-22 Thread Junio C Hamano
In order to prevent a valid push certificate for pushing into an
repository from getting replayed to push to an unrelated one, send a
nonce string from the receive-pack process and have the signer
include it in the push certificate.  The original nonce is exported
as GIT_PUSH_CERT_NONCE for the hooks to examine and match against
the value on the nonce header in the certificate to notice a replay.

Because the built-in nonce generation may not be suitable for all
situations, allow the server to invoke receive-pack with pregenerated
nonce from the command line argument.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/git-receive-pack.txt|  8 +
 Documentation/technical/pack-protocol.txt |  5 +--
 Documentation/technical/protocol-capabilities.txt |  7 ++--
 builtin/receive-pack.c| 42 ---
 send-pack.c   | 19 +++---
 t/t5534-push-signed.sh| 17 +
 6 files changed, 78 insertions(+), 20 deletions(-)

diff --git a/Documentation/git-receive-pack.txt 
b/Documentation/git-receive-pack.txt
index 60151a6..983b24e 100644
--- a/Documentation/git-receive-pack.txt
+++ b/Documentation/git-receive-pack.txt
@@ -72,6 +72,13 @@ GIT_PUSH_CERT_STATUS::
using the same mnemonic as used in `%G?` format of `git log`
family of commands (see linkgit:git-log[1]).
 
+GIT_PUSH_CERT_NONCE::
+   The nonce string the process asked the signer to include
+   in the push certificate.  If this does not match the value
+   recorded on the nonce header in the push certificate, it
+   may indicate that the certificate is a valid one that is
+   being replayed from a separate git push session.
+
 This hook is called before any refname is updated and before any
 fast-forward checks are performed.
 
@@ -147,6 +154,7 @@ service:
if test -n ${GIT_PUSH_CERT-}  test ${GIT_PUSH_CERT_STATUS} = G
then
(
+   echo expected nonce is ${GIT_PUSH_NONCE}
git cat-file blob ${GIT_PUSH_CERT}
) | mail -s push from $GIT_PUSH_CERT_SIGNER push-log@mydomain
fi
diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index b86580b..67dd3c9 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -483,10 +483,11 @@ references.
 
   push-cert = PKT-LINE(push-cert NUL capability-list LF)
  PKT-LINE(certificate version 0.1 LF)
- PKT-LINE(pusher ident LF)
+ PKT-LINE(pusher SP ident LF)
+ PKT-LINE(nonce SP nonce LF)
  PKT-LINE(LF)
  *PKT-LINE(command LF)
- *PKT-LINE(GPG signature lines LF)
+ *PKT-LINE(detached GPG signature lines LF)
  PKT-LINE(push-cert-end LF)
 
   pack-file = PACK 28*(OCTET)
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index a478cc4..0c92dee 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -251,10 +251,11 @@ If the upload-pack server advertises this capability, 
fetch-pack may
 send want lines with SHA-1s that exist at the server but are not
 advertised by upload-pack.
 
-push-cert
--
+push-cert=nonce
+-
 
 The receive-pack server that advertises this capability is willing
-to accept a signed push certificate.  A send-pack client MUST NOT
+to accept a signed push certificate, and asks the nonce to be
+included in the push certificate.  A send-pack client MUST NOT
 send a push-cert packet unless the receive-pack server advertises
 this capability.
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 991e417..8ad4d9b 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -51,6 +51,7 @@ static const char *alt_shallow_file;
 static struct strbuf push_cert = STRBUF_INIT;
 static unsigned char push_cert_sha1[20];
 static struct signature_check sigcheck;
+static const char *push_cert_nonce;
 
 static enum deny_action parse_deny_action(const char *var, const char *value)
 {
@@ -142,15 +143,18 @@ static void show_ref(const char *path, const unsigned 
char *sha1)
if (ref_is_hidden(path))
return;
 
-   if (sent_capabilities)
+   if (sent_capabilities) {
packet_write(1, %s %s\n, sha1_to_hex(sha1), path);
-   else
-   packet_write(1, %s %s%c%s%s agent=%s\n,
+   } else {
+   packet_write(1, %s %s%c%s%s%s%s agent=%s\n,
 sha1_to_hex(sha1), path, 0,
- report-status delete-refs side-band-64k quiet 
push-cert,
+ report-status delete-refs 

[PATCH v2 18/19] signed push: remove duplicated protocol info

2014-08-22 Thread Junio C Hamano
With the interim protocol, we used to send the update commands even
though we already send a signed copy of the same information when
push certificate is in use.  Update the send-pack/receive-pack pair
not to do so.

The notable thing on the receive-pack side is that it makes sure
that there is no command sent over the traditional protocol packet
outside the push certificate.  Otherwise a pusher can claim to be
pushing one set of ref updates in the signed certificate while
issuing commands to update unrelated refs, and such an update will
evade later audits.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/technical/pack-protocol.txt | 20 ++-
 Documentation/technical/protocol-capabilities.txt | 12 +++--
 builtin/receive-pack.c| 30 +--
 send-pack.c   |  2 +-
 4 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt 
b/Documentation/technical/pack-protocol.txt
index a845d51..b86580b 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -465,7 +465,7 @@ contain all the objects that the server will need to 
complete the new
 references.
 
 
-  update-request=  *shallow command-list [pack-file]
+  update-request=  *shallow ( command-list | push-cert ) [pack-file]
 
   shallow   =  PKT-LINE(shallow SP obj-id)
 
@@ -481,12 +481,30 @@ references.
   old-id=  obj-id
   new-id=  obj-id
 
+  push-cert = PKT-LINE(push-cert NUL capability-list LF)
+ PKT-LINE(certificate version 0.1 LF)
+ PKT-LINE(pusher ident LF)
+ PKT-LINE(LF)
+ *PKT-LINE(command LF)
+ *PKT-LINE(GPG signature lines LF)
+ PKT-LINE(push-cert-end LF)
+
   pack-file = PACK 28*(OCTET)
 
 
 If the receiving end does not support delete-refs, the sending end MUST
 NOT ask for delete command.
 
+If the receiving end does not support push-cert, the sending end MUST
+NOT send a push-cert command.
+
+When a push-cert command is sent, command-list MUST NOT be sent; the
+commands recorded in the push certificate is used instead.  The GPG
+signature lines are a detached signature for the contents recorded in
+the push certificate before the signature block begins and are used
+to certify that the commands were given by the pusher, who must be
+the signer.
+
 The pack-file MUST NOT be sent if the only command used is 'delete'.
 
 A pack-file MUST be sent if either create or update command is used,
diff --git a/Documentation/technical/protocol-capabilities.txt 
b/Documentation/technical/protocol-capabilities.txt
index e174343..a478cc4 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -18,8 +18,8 @@ was sent.  Server MUST NOT ignore capabilities that client 
requested
 and server advertised.  As a consequence of these rules, server MUST
 NOT advertise capabilities it does not understand.
 
-The 'report-status', 'delete-refs', and 'quiet' capabilities are sent and
-recognized by the receive-pack (push to server) process.
+The 'report-status', 'delete-refs', 'quiet', and 'push-cert' capabilities
+are sent and recognized by the receive-pack (push to server) process.
 
 The 'ofs-delta' and 'side-band-64k' capabilities are sent and recognized
 by both upload-pack and receive-pack protocols.  The 'agent' capability
@@ -250,3 +250,11 @@ allow-tip-sha1-in-want
 If the upload-pack server advertises this capability, fetch-pack may
 send want lines with SHA-1s that exist at the server but are not
 advertised by upload-pack.
+
+push-cert
+-
+
+The receive-pack server that advertises this capability is willing
+to accept a signed push certificate.  A send-pack client MUST NOT
+send a push-cert packet unless the receive-pack server advertises
+this capability.
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index abdc296..991e417 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -880,7 +880,7 @@ static void execute_commands(struct command *commands,
  the reported refs above);
 }
 
-static struct command **queue_command(struct command **p,
+static struct command **queue_command(struct command **tail,
  const char *line,
  int linelen)
 {
@@ -903,10 +903,32 @@ static struct command **queue_command(struct command **p,
hashcpy(cmd-new_sha1, new_sha1);
memcpy(cmd-ref_name, refname, reflen);
cmd-ref_name[reflen] = '\0';
-   *p = cmd;
+   *tail = cmd;
return cmd-next;
 }
 
+static void queue_commands_from_cert(struct command **tail,
+struct strbuf *push_cert)
+{
+   const char *boc, *eoc;
+
+   if 

[PATCH v2 16/19] receive-pack: GPG-validate push certificates

2014-08-22 Thread Junio C Hamano
Reusing the GPG signature check helpers we already have, verify
the signature in receive-pack and give the results to the hooks
via GIT_PUSH_CERT_{SIGNER,KEY,STATUS} environment variables.

Policy decisions, such as accepting or rejecting a good signature by
a key that is not fully trusted, is left to the hook and kept
outside of the core.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 Documentation/git-receive-pack.txt | 27 ++-
 builtin/receive-pack.c | 29 +
 t/t5534-push-signed.sh | 18 --
 3 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-receive-pack.txt 
b/Documentation/git-receive-pack.txt
index 6c458af..60151a6 100644
--- a/Documentation/git-receive-pack.txt
+++ b/Documentation/git-receive-pack.txt
@@ -56,7 +56,21 @@ sha1-old and sha1-new should be valid objects in the 
repository.
 When accepting a signed push (see linkgit:git-push[1]), the signed
 push certificate is stored in a blob and an environment variable
 `GIT_PUSH_CERT` can be consulted for its object name.  See the
-description of `post-receive` hook for an example.
+description of `post-receive` hook for an example.  In addition, the
+certificate is verified using GPG and the result is exported with
+the following environment variables:
+
+GIT_PUSH_CERT_SIGNER::
+   The name and the e-mail address of the owner of the key that
+   signed the push certificate.
+
+GIT_PUSH_CERT_KEY::
+   The GPG key ID of the key that signed the push certificate.
+
+GIT_PUSH_CERT_STATUS::
+   The status of GPG verification of the push certificate,
+   using the same mnemonic as used in `%G?` format of `git log`
+   family of commands (see linkgit:git-log[1]).
 
 This hook is called before any refname is updated and before any
 fast-forward checks are performed.
@@ -106,13 +120,14 @@ the update.  Refs that were created will have sha1-old 
equal to
 0\{40}, otherwise sha1-old and sha1-new should be valid objects in
 the repository.
 
-The `GIT_PUSH_CERT` environment variable can be inspected, just as
+The `GIT_PUSH_CERT*` environment variables can be inspected, just as
 in `pre-receive` hook, after accepting a signed push.
 
 Using this hook, it is easy to generate mails describing the updates
 to the repository.  This example script sends one mail message per
 ref listing the commits pushed to the repository, and logs the push
-certificates of signed pushes to a file:
+certificates of signed pushes with good signatures to a logger
+service:
 
#!/bin/sh
# mail out commit update information.
@@ -129,9 +144,11 @@ certificates of signed pushes to a file:
mail -s Changes to ref $ref commit-list@mydomain
done
# log signed push certificate, if any
-   if test -n ${GIT_PUSH_CERT-}
+   if test -n ${GIT_PUSH_CERT-}  test ${GIT_PUSH_CERT_STATUS} = G
then
-   git cat-file blob ${GIT_PUSH_CERT} /var/log/push-log
+   (
+   git cat-file blob ${GIT_PUSH_CERT}
+   ) | mail -s push from $GIT_PUSH_CERT_SIGNER push-log@mydomain
fi
exit 0
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f30df8a..abdc296 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -15,6 +15,8 @@
 #include connected.h
 #include argv-array.h
 #include version.h
+#include tag.h
+#include gpg-interface.h
 
 static const char receive_pack_usage[] = git receive-pack git-dir;
 
@@ -48,6 +50,7 @@ static int shallow_update;
 static const char *alt_shallow_file;
 static struct strbuf push_cert = STRBUF_INIT;
 static unsigned char push_cert_sha1[20];
+static struct signature_check sigcheck;
 
 static enum deny_action parse_deny_action(const char *var, const char *value)
 {
@@ -260,12 +263,38 @@ static void prepare_push_cert_sha1(struct child_process 
*proc)
struct argv_array env = ARGV_ARRAY_INIT;
 
if (!already_done) {
+   struct strbuf gpg_output = STRBUF_INIT;
+   struct strbuf gpg_status = STRBUF_INIT;
+   int bogs /* beginning_of_gpg_sig */;
+
already_done = 1;
if (write_sha1_file(push_cert.buf, push_cert.len, blob, 
push_cert_sha1))
hashclr(push_cert_sha1);
+
+   memset(sigcheck, '\0', sizeof(sigcheck));
+   sigcheck.result = 'N';
+
+   bogs = parse_signature(push_cert.buf, push_cert.len);
+   if (verify_signed_buffer(push_cert.buf, bogs,
+push_cert.buf + bogs, push_cert.len - 
bogs,
+gpg_output, gpg_status)  0) {
+   ; /* error running gpg */
+   } else {
+   sigcheck.payload = push_cert.buf;
+   sigcheck.gpg_output = gpg_output.buf;
+   sigcheck.gpg_status = 

Re: [PATCH 00/18] Signed push

2014-08-22 Thread Stefan Beller
On 22.08.2014 22:33, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:
 
 On 22.08.2014 22:03, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:

 So there would be tags like:
master_2014_08_21
master_2014_08_22
...
maint_2014_08_13
maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.

 Who creates that tag?


 My guess would be usability as tagging so many branches is cumbersome
 for a maintainer?
 
 Did you answer my question?  Who creates these tags?
 

It would be up to the one who pushes, the user, or in our case you!

This way of working would require the informal notion of
'always tag the last commit before pushing.'

As I wrote in the first email, I made up this workaround and wanted to
see, what's so bad about that workaround and how to overcome the
problems. And all I could find was a burden on the maintainer/user.

Sorry,
Stefan

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Junio C Hamano
Stefan Beller stefanbel...@gmail.com writes:

 On 22.08.2014 22:33, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:
 
 On 22.08.2014 22:03, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:

 So there would be tags like:
   master_2014_08_21
   master_2014_08_22
   ...
   maint_2014_08_13
   maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.

 Who creates that tag?


 My guess would be usability as tagging so many branches is cumbersome
 for a maintainer?
 
 Did you answer my question?  Who creates these tags?
 

 It would be up to the one who pushes, the user, or in our case you!
 ...
 As I wrote in the first email, I made up this workaround and wanted to
 see, what's so bad about that workaround and how to overcome the
 problems. And all I could find was a burden on the maintainer/user.

burden is not an issue, as I'll be signing the push certificate
anyway when I push.  A signed tag or a signed commit and signed push
certificate solves two completely separate and orthogonal issues.

What happens if you break into GitHub or k.org and did

$ git tag maint_2014_08_22 master_2014_08_22

to create an extra tag out of the tag signed by me?  If you want,
you could also remove the original while at it.  The goal is to let
us validate without having to trust the hosting site, its management
and its software, which is what creates the tag there, controls
where the tag sits in refs/ hierarchy and how it is shown to the
outside world.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/18] Signed push

2014-08-22 Thread Stefan Beller
On 23.08.2014 00:32, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:
 
 On 22.08.2014 22:33, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:

 On 22.08.2014 22:03, Junio C Hamano wrote:
 Stefan Beller stefanbel...@gmail.com writes:

 So there would be tags like:
  master_2014_08_21
  master_2014_08_22
  ...
  maint_2014_08_13
  maint_2014_08_21
 and so on. Whenever there is no tag at the tip of the branch, we'd
 know there is something wrong.

 Who creates that tag?


 My guess would be usability as tagging so many branches is cumbersome
 for a maintainer?

 Did you answer my question?  Who creates these tags?


 It would be up to the one who pushes, the user, or in our case you!
 ...
 As I wrote in the first email, I made up this workaround and wanted to
 see, what's so bad about that workaround and how to overcome the
 problems. And all I could find was a burden on the maintainer/user.
 
 burden is not an issue, as I'll be signing the push certificate
 anyway when I push.  A signed tag or a signed commit and signed push
 certificate solves two completely separate and orthogonal issues.
 
 What happens if you break into GitHub or k.org and did
 
 $ git tag maint_2014_08_22 master_2014_08_22

Ok, I personally haven't used tags a lot.
I just tried to
git tag -s testbreaktag v2.1.0
git show testbreaktag
# However it would still read:
tag v2.1.0
Tagger: Junio C Hamano gits...@pobox.com
Date:   Fri Aug 15 15:09:28 2014 -0700

So as I do not posess your private key I could not create signed tags
even if I were to break into github/k.org



 
 to create an extra tag out of the tag signed by me?  If you want,
 you could also remove the original while at it. 

Considering I'm in the hosting server,
could I delete the push cert as well?
Now that I deleted the push certificate,
I could pretend Junio just forgot to sign the push cert today
and we're back at the tag solution?

Ah wait! the subsequent push certs would not match,
I'd need to delete them as well.


 The goal is to let
 us validate without having to trust the hosting site, its management
 and its software, which is what creates the tag there, controls
 where the tag sits in refs/ hierarchy and how it is shown to the
 outside world.
 

Ok, I got the goal. :)

Thanks for your patience,
Stefan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/3] Push the NATIVE_CRLF Makefile variable to C and added a test for native.

2014-08-22 Thread Junio C Hamano
Torsten Bögershausen tbo...@web.de writes:

 Commit 95f31e9a correctly points out that the NATIVE_CRLF setting is
 incorrectly set on Mingw git. However, the Makefile variable is not
 propagated to the C preprocessor and results in no change. This patch
 pushes the definition to the C code and adds a test to validate that
 when core.eol as native is crlf, we actually normalize text files to this
 line ending convention when core.autocrlf is false.

 Signed-off-by: Pat Thoyts pattho...@users.sourceforge.net
 Signed-off-by: Stepan Kasal ka...@ucw.cz
 Signed-off-by: Torsten Bögershausen tbo...@web.de
 ---

Who should I record as the author of this patch?


 This mini series mainly updates git.git with patches from msysgit:
 Patch 1 is taken as is,
 Patch 2 is taken as is,
 and Patch 3 is the outcome of the code-review 

 Thanks for careful reading


  Makefile  |  3 +++
  t/t0026-eol-config.sh | 18 ++
  2 files changed, 21 insertions(+)

 diff --git a/Makefile b/Makefile
 index 2320de5..517036e 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -1479,6 +1479,9 @@ ifdef NO_REGEX
   COMPAT_CFLAGS += -Icompat/regex
   COMPAT_OBJS += compat/regex/regex.o
  endif
 +ifdef NATIVE_CRLF
 + BASIC_CFLAGS += -DNATIVE_CRLF
 +endif
  
  ifdef USE_NED_ALLOCATOR
 COMPAT_CFLAGS += -Icompat/nedmalloc
 diff --git a/t/t0026-eol-config.sh b/t/t0026-eol-config.sh
 index 4807b0f..43a580a 100755
 --- a/t/t0026-eol-config.sh
 +++ b/t/t0026-eol-config.sh
 @@ -80,4 +80,22 @@ test_expect_success 'autocrlf=true overrides unset eol' '
   test -z $onediff  test -z $twodiff
  '
  
 +test_expect_success NATIVE_CRLF 'eol native is crlf' '
 +
 + rm -rf native_eol  mkdir native_eol 
 + ( cd native_eol 
 + printf *.txt text\n  .gitattributes
 + printf one\r\ntwo\r\nthree\r\n  filedos.txt
 + printf one\ntwo\nthree\n  fileunix.txt

I think I saw a few style fixes sent against the previous round of
patches; have you missed them?

 + git init 
 + git config core.autocrlf false 
 + git config core.eol native 
 + git add filedos.txt fileunix.txt 
 + git commit -m first 
 + rm file*.txt 
 + git reset --hard HEAD 
 + has_cr filedos.txt  has_cr fileunix.txt
 + )
 +'
 +
  test_done
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 18/18] signed push: final protocol update

2014-08-22 Thread Shawn Pearce
On Fri, Aug 22, 2014 at 10:59 AM, Junio C Hamano gits...@pobox.com wrote:
 Junio C Hamano gits...@pobox.com writes:

 There are a few gotchas I can certainly use help on, especially from
 a smart-http expert ;-).

  * pushed-to URL will identify the site and the repository, so
you cannot MITM my push to an experimental server and replay it
against the authoritative server.

However, the receiving end may not even know what name its users
call the repository being pushed into.  Obviously gethostname()
may not be what the pusher called us, and getcwd() may not match
the repository name without leading /var/repos/shard3/ path
components stripped, for example.

I am not sure if we even have the necessary information at
send-pack.c::send_pack() level, where it already has an
established connection to the server (hence it does not need to
know to whom it is talking to).


  * The receiving end will issue push-cert=nonce in its initial
capability advertisement, and this nonce will be given on the
PUSH_CERT_NONCE environment to the pre/post-receive hooks, to
allow the nonce nonce header in the signed certificate to be
checked against it.  You cannot capture my an earlier push to the
authoritative server and replay it later.

That would all work well within a single receive-pack process,
but with stateless RPC, it is unclear to me how we should
arrange the nonce the initial instance of receive-pack placed
on its capability advertisement to be securely passed to the
instance of receive-pack that actually receives the push
certificate.

 A good nonce may be something like taking the SHA-1 hash of the
 concatenation of the sitename, repo-path and the timestamp when the
 receive-pack generated the nonce.  Replaying a push certificate
 for a push to a repository at a site that gives such a nonce can
 succeed at the same chance of finding a SHA-1 collision [*1*].  As
 long as you exercise good hygiene and only push to repositories that
 give such nonce, we can do without checking pushed-to that says
 where the push went.

Yes, this is an interesting solution.

As you know, the stateless HTTP thing doesn't allow the nonce on the
server to be carried from the initial ref advertisement into the final
receive-pack. We would either need to write the nonce to disk and load
it back up later (ick), or use some sort of stateless nonce.

A stateless nonce could look like:

  nonce = HMAC_SHA1( SHA1(site+path) + '.' + now, site_key )

where site_key is a private key known to the server. It doesn't have
to be per-repo.

receive-pack would then be willing to accept any nonce whose timestamp
is within a window, e.g. 10 minutes of the current time, and whose
signature verifies in the HMAC. The 10 minute window is important to
allow clients time to generate the object list, perform delta
compression, and begin transmitting to the server.

 So nonce nonce is the only thing that is necessary to make them
 impossible to replay.  For auditing purposes, pushed-to URL that
 records the repository the pusher intended to push to may help but
 probably not necessary [*2*].

So pushed-to is still useful as a documentation/historical artifact,
but isn't important for verifying the certificate. That fixes a lot of
problems with repositories having different paths under e.g. git://
vs. ssh:// vs. https:// on the same host.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Jonathan Nieder
Junio C Hamano wrote:
 Michael Haggerty wrote[1]:
 Jonathan Nieder wrote:

 The check-ref-format documentation is pretty unclear, but the
 intent is that it would be used like

git check-ref-format heads/master

 (see the surviving examples in contrib/examples/). That way, it can
 enforce the rule (from git-check-ref-format(1))
[...]
 Thanks for the explanation and the pointer.

 I wanted to follow this discussion, especially the ellided [...]
 pointer, but had a hart time finding what pointer was.

I missed this question before.

The discussion happened at https://code-review.googlesource.com/1017.
It's easier to see after clicking the 'Expand All' button, but even
then it's hard to see the signal in the 'Patch Set n was rebased'
noise.

The pointer Michael mentioned was to the git-check-ref-format(1)
manpage and old 'check-ref-format' callers that can be found in
contrib/examples/.  Sorry for the lack of context.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/5] fix pack-refs pruning of refs/foo

2014-08-22 Thread Jeff King
I noticed that git pack-refs --all will pack a top-level ref like
refs/foo, but will not actually prune $GIT_DIR/refs/foo. I do not
see the point in having a policy not to pack refs/foo if --all is
given. But even if we did have such a policy, this seems broken; we
should either pack and prune, or leave them untouched. I don't see any
indication that the existing behavior was intentional.

The problem is that pack-refs's prune_ref calls lock_ref_sha1, which
enforces this no toplevel behavior. I am not sure there is any real
point to this, given that most callers use lock_any_ref_for_update,
which is exactly equivalent but without the toplevel check.

The first two patches deal with this by switching pack-refs to use
lock_any_ref_for_update. This will conflict slightly with Ronnie's
ref-transaction work, as he gets rid of lock_ref_sha1 entirely, and
moves the code directly into prune_ref. This can be trivially resolved
in favor of my patch, I think.

The third patch is a cleanup I noticed while looking around, and I think
should not conflict with anyone (and is a good thing to do).

The last two are trickier. I wondered if we could get rid of
lock_ref_sha1 entirely. After pack-refs, there are two callers:
fast-import.c and walker.c. After converting the first, it occurred to
me that Ronnie might be touching the same areas, and I see that yes,
indeed, there's quite a bit of conflict (and he reaches the same end
goal of dropping it entirely).

So in that sense I do not mind dropping the final two patches. Ronnie's
endpoint is much better, moving to a ref_transaction. However, there is
actually a buffer overflow in the existing code. Ronnie's series fixes
it in a similar way (moving to a strbuf), and I'm fine with that
endpoint. But given that the ref transaction code is not yet merged (and
would certainly never be maint-track), I think it is worth applying the
buffer overflow fix separately.

I think the final patch can probably be dropped, then. It is a clean-up,
but one that we can just get when Ronnie's series is merged.

  [1/5]: git-prompt: do not look for refs/stash in $GIT_DIR
  [2/5]: pack-refs: prune top-level refs like refs/foo
  [3/5]: fast-import: clean up pack_data pointer in end_packfile
  [4/5]: fast-import: fix buffer overflow in dump_tags
  [5/5]: fast-import: stop using lock_ref_sha1

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] git-prompt: do not look for refs/stash in $GIT_DIR

2014-08-22 Thread Jeff King
Since dd0b72c (bash prompt: use bash builtins to check stash
state, 2011-04-01), git-prompt checks whether we have a
stash by looking for $GIT_DIR/refs/stash. Generally external
programs should never do this, because they would miss
packed-refs.

That commit claims that packed-refs does not pack
refs/stash, but that is not quite true. It does pack the
ref, but due to a bug, fails to prune the ref. When we fix
that bug, we would want to be doing the right thing here.

Signed-off-by: Jeff King p...@peff.net
---
I know we are pretty sensitive to forks in the prompt code (after all,
that was the point of dd0b72c). This patch is essentially a reversion of
this hunk of dd0b72c, and is definitely safe.

I think we could probably get by with:

  [ -r $g/logs/ref/stash ]

since reflogs are always in the filesystem. However, that seems somewhat
short-sighted, as the work Ronnie is doing is moving in the direction of
more abstraction here. I hope a day will come soon when reflogs do not
have to be stored in $GIT_DIR/logs, and then we would end up updating
this code again.

 contrib/completion/git-prompt.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 9d684b1..c5473dc 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -468,7 +468,8 @@ __git_ps1 ()
fi
fi
if [ -n ${GIT_PS1_SHOWSTASHSTATE-} ] 
-  [ -r $g/refs/stash ]; then
+  git rev-parse --verify --quiet refs/stash /dev/null
+   then
s=$
fi
 
-- 
2.1.0.346.ga0367b9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/5] pack-refs: prune top-level refs like refs/foo

2014-08-22 Thread Jeff King
After we have packed all refs, we prune any loose refs that
correspond to what we packed. We do so by first taking a
lock with lock_ref_sha1, and then deleting the loose ref
file.

However, lock_ref_sha1 will refuse to take a lock on any
refs that exist at the top-level of the refs/ directory,
and we skip pruning the ref.  This is almost certainly not
what we want to happen here. The criteria to be pruned
should not differ from that to be packed; if a ref makes it
to prune_ref, it's because we want it both packed and
pruned (if there are refs you do not want to be packed, they
should be omitted much earlier by pack_ref_is_possible,
which we do in this case if --all is not given).

We can fix this by switching to lock_any_ref_for_update.
This behaves exactly the same with the exception of this
top-level check.

Signed-off-by: Jeff King p...@peff.net
---
 refs.c   | 3 ++-
 t/t3210-pack-refs.sh | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 27927f2..82e5b1b 100644
--- a/refs.c
+++ b/refs.c
@@ -2387,7 +2387,8 @@ static void try_remove_empty_parents(char *name)
 /* make sure nobody touched the ref, and unlink */
 static void prune_ref(struct ref_to_prune *r)
 {
-   struct ref_lock *lock = lock_ref_sha1(r-name + 5, r-sha1);
+   struct ref_lock *lock = lock_any_ref_for_update(r-name, r-sha1,
+   0, NULL);
 
if (lock) {
unlink_or_warn(git_path(%s, r-name));
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 1a2080e..3a017bf 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -151,4 +151,11 @@ test_expect_success 'delete ref while another dangling 
packed ref' '
test_cmp /dev/null result
 '
 
+test_expect_success 'pack ref directly below refs/' '
+   git update-ref refs/top HEAD 
+   git pack-refs --all --prune 
+   grep refs/top .git/packed-refs 
+   test_path_is_missing .git/refs/top
+'
+
 test_done
-- 
2.1.0.346.ga0367b9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] fast-import: clean up pack_data pointer in end_packfile

2014-08-22 Thread Jeff King
We have a global pointer pack_data pointing to the current
pack we have open. Inside end_packfile we have two new
pointers, old_p and new_p. The latter points to pack_data,
and the former points to the new installed version of the
packfile we get when we hand the file off to the regular
sha1_file machinery. When then free old_p.

Presumably the extra old_p pointer was there so that we
could overwrite pack_data with new_p and still free old_p,
but we don't do that. We just leave pack_data pointing to
bogus memory, and don't overwrite it until we call
start_packfile again (if ever).

This can cause problems for our die routine, which calls
end_packfile to clean things up. If we die at the wrong
moment, we can end up looking at invalid memory in
pack_data left after the last end_packfile().

Instead, let's make sure we set pack_data to NULL after we
free it, and make calling endfile() again with a NULL
pack_data a noop (there is nothing to end).

We can further make things less confusing by dropping old_p
entirely, and moving new_p closer to its point of use.

Signed-off-by: Jeff King p...@peff.net
---
Noticed while running fast-import under valgrind to debug the next
commit. :)

 fast-import.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index d73f58c..f25a4ae 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -946,10 +946,12 @@ static void unkeep_all_packs(void)
 
 static void end_packfile(void)
 {
-   struct packed_git *old_p = pack_data, *new_p;
+   if (!pack_data)
+   return;
 
clear_delta_base_cache();
if (object_count) {
+   struct packed_git *new_p;
unsigned char cur_pack_sha1[20];
char *idx_name;
int i;
@@ -991,10 +993,11 @@ static void end_packfile(void)
pack_id++;
}
else {
-   close(old_p-pack_fd);
-   unlink_or_warn(old_p-pack_name);
+   close(pack_data-pack_fd);
+   unlink_or_warn(pack_data-pack_name);
}
-   free(old_p);
+   free(pack_data);
+   pack_data = NULL;
 
/* We can't carry a delta across packfiles. */
strbuf_release(last_blob.data);
-- 
2.1.0.346.ga0367b9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/5] fast-import: fix buffer overflow in dump_tags

2014-08-22 Thread Jeff King
When creating a new annotated tag, we sprintf the refname
into a static-sized buffer. If we have an absurdly long
tagname, like:

  git init repo 
  cd repo 
  git commit --allow-empty -m foo 
  git tag -m message mytag 
  git fast-export mytag |
  perl -lpe '/^tag/ and s/mytag/a x 8192/e' |
  git fast-import input

we'll overflow the buffer. We can fix it by using a strbuf.

Signed-off-by: Jeff King p...@peff.net
---
I'm not sure how easily exploitable this is. The buffer is on the stack,
and we definitely demolish the return address. But we never actually
return from the function, since lock_ref_sha1 will fail in such a case
and die (it cannot succeed because the name is longer than PATH_MAX,
which we check when concatenating it with $GIT_DIR).

Still, there is no limit to the size of buffer you can feed it, so it's
entirely possible you can overwrite something else and cause some
mischief. So I wouldn't call it trivially exploitable, but I would not
bet my servers that it is not (and of course it is easy to trigger if
you can convince somebody to run fast-import a stream, so any remote
helpers produce a potentially vulnerable situation).

 fast-import.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index f25a4ae..a1479e9 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1734,14 +1734,16 @@ static void dump_tags(void)
static const char *msg = fast-import;
struct tag *t;
struct ref_lock *lock;
-   char ref_name[PATH_MAX];
+   struct strbuf ref_name = STRBUF_INIT;
 
for (t = first_tag; t; t = t-next_tag) {
-   sprintf(ref_name, tags/%s, t-name);
-   lock = lock_ref_sha1(ref_name, NULL);
+   strbuf_reset(ref_name);
+   strbuf_addf(ref_name, tags/%s, t-name);
+   lock = lock_ref_sha1(ref_name.buf, NULL);
if (!lock || write_ref_sha1(lock, t-sha1, msg)  0)
-   failure |= error(Unable to update %s, ref_name);
+   failure |= error(Unable to update %s, ref_name.buf);
}
+   strbuf_release(ref_name);
 }
 
 static void dump_marks_helper(FILE *f,
-- 
2.1.0.346.ga0367b9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/5] fast-import: stop using lock_ref_sha1

2014-08-22 Thread Jeff King
We can use lock_any_ref_for_update instead. Besides being
more flexible, the only difference between the two is that
lock_ref_sha1 does not allow top-level refs like
refs/foo to be updated. However, we know that we do not
have such a ref, because we explicitly add the tags/
prefix ourselves.

Note that we now must feed the whole name refs/tags/X
instead of just tags/X to the function. As a result, our
failure error message is uses the longer name. This is
probably a good thing, though.

As an interesting side note, if we forgot to switch this
input to the function, the tests do not currently catch it.
We import a tag X and then check that we can access it at
tags/X. If we accidentally created tags/X at the
top-level $GIT_DIR instead of under refs/, we would still
find it due to our ref lookup procedure!

We do not make such a mistake in this patch, of course, but
while we're thinking about it, let's make the fast-import
tests more robust by checking for fully qualified
refnames.

Signed-off-by: Jeff King p...@peff.net
---
As I mentioned, I'd be OK with dropping this one in favor of just
waiting for Ronnie's transaction patches to graduate.

 fast-import.c  | 4 ++--
 t/t9300-fast-import.sh | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fast-import.c b/fast-import.c
index a1479e9..04a85a4 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1738,8 +1738,8 @@ static void dump_tags(void)
 
for (t = first_tag; t; t = t-next_tag) {
strbuf_reset(ref_name);
-   strbuf_addf(ref_name, tags/%s, t-name);
-   lock = lock_ref_sha1(ref_name.buf, NULL);
+   strbuf_addf(ref_name, refs/tags/%s, t-name);
+   lock = lock_any_ref_for_update(ref_name.buf, NULL, 0, NULL);
if (!lock || write_ref_sha1(lock, t-sha1, msg)  0)
failure |= error(Unable to update %s, ref_name.buf);
}
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 5fc9ef2..f4c6673 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -153,7 +153,7 @@ tag series-A
 An annotated tag without a tagger
 EOF
 test_expect_success 'A: verify tag/series-A' '
-   git cat-file tag tags/series-A actual 
+   git cat-file tag refs/tags/series-A actual 
test_cmp expect actual
 '
 
@@ -165,7 +165,7 @@ tag series-A-blob
 An annotated tag that annotates a blob.
 EOF
 test_expect_success 'A: verify tag/series-A-blob' '
-   git cat-file tag tags/series-A-blob actual 
+   git cat-file tag refs/tags/series-A-blob actual 
test_cmp expect actual
 '
 
@@ -232,8 +232,8 @@ EOF
 test_expect_success \
'A: tag blob by sha1' \
'git fast-import input 
-   git cat-file tag tags/series-A-blob-2 actual 
-   git cat-file tag tags/series-A-blob-3 actual 
+   git cat-file tag refs/tags/series-A-blob-2 actual 
+   git cat-file tag refs/tags/series-A-blob-3 actual 
test_cmp expect actual'
 
 test_tick
-- 
2.1.0.346.ga0367b9
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Jeff King
On Fri, Aug 22, 2014 at 11:45:15AM -0700, Jonathan Nieder wrote:

 Junio C Hamano wrote:
 
 implication of which is that the 'at least one slash'
  rule was to expect things are 'refs/anything' so there will be at
  least one.  Even back then, that anything alone had at least one
  slash (e.g. heads/master), but the intention was *never* that we
  would forbid anything that does not have a slash by feeding
  anything part alone to check-ref-format, i.e. things like
  refs/stash were designed to be allowed.
 
 Now I'm more confused.  Until 5f7b202a (2008-01-01), there was a
 comment
 
   if (level  2)
   return -2; /* at least of form heads/blah */
 
 and that behavior has been preserved since the beginning.
 
 Why do most old callers pass a string that doesn't start with refs/
 (e.g., see the callers in 03feddd6, 2005-10-13)?  Has the intent been
 to relax the requirement since then?

Yeah, this weird do not allow refs/foo behavior has continually
confused me. Coincidentally I just noticed a case today where
pack-refs treats refs/foo specially for no good reason:

  http://thread.gmane.org/gmane.comp.version-control.git/255729

After much head scratching over the years, I am of the opinion that
nobody every really _meant_ to prevent refs/foo, and that code
comments like the one you quote above were an attempt to document
existing buggy behavior that was really trying to differentiate HEAD
from refs/*. That's just my opinion, though. :) I'd be happy if all of
the special-treatment of refs/foo went away and check_refname_format
always got the full ref.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: check-ref-format: include refs/ in the argument or to strip it?

2014-08-22 Thread Junio C Hamano
On Fri, Aug 22, 2014 at 10:46 PM, Jeff King p...@peff.net wrote:

 After much head scratching over the years, I am of the opinion that
 nobody every really _meant_ to prevent refs/foo...

Yup, that matches my understanding.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html