Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Jeff King
On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:

 For a single user repo this is not a big deal, the lock can 
 always be cleaned up manually (and it is a rare occurrence).  
 However, in a multi user server environment, possibly even 
 from multiple hosts over a shared filesystem such as NFS, 
 stale locks could lead to serious downtime and risky recovery 
 (since it is currently hard to figure out if a lock really is 
 stale).  Even though stale locks are probably rare even today 
 in the larger shared repo case, as git scales to even larger 
 shared repositories, this will eventually become more of a 
 problem *1.  Naturally, this has me thinking that git should 
 possibly consider moving towards a lockless design for refs 
 in the long term.

FWIW, I am involved in cleaning up stale locks for a very large git
hosting site. It actually happens surprisingly little. I think it is
mostly because git holds actual locks for a very short period of time
(just enough to check that the value is unchanged from when we started a
lengthy operation, and then atomically write the new value).

So I agree it would be cool (and maybe open up new realms of
scalability) for git to be lockless, but in my experience, this isn't
that pressing a problem (and any solutions are not going to be backwards
compatible, so there is going to be a high deployment cost).

 My idea is based on using filenames to store sha1s instead of
 file contents.  To do this, the sha1 of a ref would be
 stored in a file in a directory named after the loose ref.  I
 believe this would then make it possible to have lockless
 atomic ref updates by renaming the file.
 
 To more fully illustrate the idea, imagine that any file
 (except for the null file) in the directory will represent the
 value of the ref with its name; then the following
 transitions can represent atomic state changes to a ref's
 value and existence:

Hmm. So basically you are relying on atomic rename() to move the value
around within a directory, rather than using write to move it around
within a file. Atomic rename is usually something we have on local
filesystems (and I think we rely on it elsewhere). Though I would not be
surprised if it is not atomic on all networked filesystems (though it is
on NFS, at least).

 1) To update the value from a known value to a new value
 atomically, simply rename the file to the new value.  This
 operation should only succeed if the file exists and is still
 named the old value before the rename.  This should even be
 faster than today's approach, especially on remote filesystems,
 since it would require only 1 round trip in the success case
 instead of 3!

OK. Makes sense.
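To make the compare-and-swap concrete, here is a minimal sketch of the idea (the directory layout and helper names are hypothetical illustrations, not anything in git): the ref's current value is the name of the lone file inside the ref's directory, and an update is a single rename() that fails if the expected old value is already gone.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical sketch: the ref's current value is the *name* of the
 * single file inside ref_dir.  Updating old->new is one rename(),
 * which fails with ENOENT if another writer already moved the file. */
static int update_ref_by_rename(const char *ref_dir,
                                const char *old_sha1,
                                const char *new_sha1)
{
	char oldpath[4096], newpath[4096];
	snprintf(oldpath, sizeof(oldpath), "%s/%s", ref_dir, old_sha1);
	snprintf(newpath, sizeof(newpath), "%s/%s", ref_dir, new_sha1);
	return rename(oldpath, newpath); /* 0 on success, -1 if we lost the race */
}

/* Self-check: one successful update, then one stale update that must fail. */
int demo(void)
{
	char dir[] = "/tmp/refcas-XXXXXX";
	char path[4096];
	int fd;

	if (!mkdtemp(dir))
		return 0;
	snprintf(path, sizeof(path), "%s/aaaa", dir);
	fd = open(path, O_CREAT | O_WRONLY, 0644);
	if (fd < 0)
		return 0;
	close(fd);

	return update_ref_by_rename(dir, "aaaa", "bbbb") == 0 && /* CAS wins  */
	       update_ref_by_rename(dir, "aaaa", "cccc") != 0;   /* CAS loses */
}
```

Because rename() within a filesystem is atomic, two racing updaters cannot both succeed from the same old value; the loser simply sees ENOENT and can re-read and retry.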

 2) To delete the ref, simply delete the filename representing 
 the current value of the ref.  This ensures that you are 
 deleting the ref from a specific value.  I am not sure if git 
 needs to be able to delete refs without knowing their values?  
 If so, this would require reading the value and looping until 
 the delete succeeds, this may be a bit slow for a constantly 
 updated ref, but likely a rare situation (and not likely 
 worse than trying to acquire the ref-lock today).  Overall, 
 this again would likely be faster than today's approach.

We do sometimes delete without knowing the value. In most cases we would
not want to do this, but for some "force"-type commands, we do. You
would actually have the same problem with updating above, as we
sometimes update with the intent to overwrite whatever is there.
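The read-then-retry loop that deleting with an unknown value would require might look like the following sketch (again a hypothetical illustration of the proposed layout, not git code):

```c
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch only: to delete a ref without knowing its value, read the
 * current value (the lone filename) and try to unlink it; if another
 * writer renamed it away first, unlink fails with ENOENT and we retry. */
static int delete_ref_any_value(const char *ref_dir)
{
	for (;;) {
		DIR *dh = opendir(ref_dir);
		struct dirent *ent;
		char path[4096];
		int found = 0;

		if (!dh)
			return -1; /* ref does not exist at all */
		while ((ent = readdir(dh))) {
			if (ent->d_name[0] == '.')
				continue; /* skip "." and ".." */
			snprintf(path, sizeof(path), "%s/%s", ref_dir, ent->d_name);
			found = 1;
			break;
		}
		closedir(dh);
		if (!found)
			return -1; /* empty directory: treat as "no such ref" */
		if (!unlink(path))
			return 0; /* deleted, whatever the value was */
		if (errno != ENOENT)
			return -1; /* real error */
		/* lost the race; re-read and retry */
	}
}

/* Self-check: create a fake ref and delete it without naming its value. */
int demo(void)
{
	char dir[] = "/tmp/refdel-XXXXXX";
	char path[4096];
	int fd;

	if (!mkdtemp(dir))
		return 0;
	snprintf(path, sizeof(path), "%s/abcd", dir);
	fd = open(path, O_CREAT | O_WRONLY, 0644);
	if (fd < 0)
		return 0;
	close(fd);
	return delete_ref_any_value(dir) == 0 && access(path, F_OK) != 0;
}
```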

 3) To create a ref, it must be renamed from the null file (sha 
 ...) to the new value just as if it were being updated 
 from any other value, but there is one extra condition: 
 before renaming the null file, a full directory scan must be 
 done to ensure that the null file is the only file in the 
 directory (this condition exists because creating the 
 directory and null file cannot be atomic unless the filesystem 
 supports atomic directory renames, an expectation git does 
 not currently make).  I am not sure how this compares to 
 today's approach, but including the setup costs (described 
 below), I suspect it is slower.

Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
create the correct sha1 file?  A simultaneous creator would fail on the
mkdir and abort. A simultaneous reader might see the directory, but it
would either see it as empty, or with the correct file. In the former
case, it would treat that the same as if the directory did not exist.
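The mkdir-based creation described above could be sketched like this (hypothetical helper, not git code): the atomic mkdir is the race arbiter, and the value file is written afterwards.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sketch only: create a ref by mkdir (atomic) and then writing the
 * value file.  A simultaneous creator loses on the mkdir with EEXIST;
 * a reader that sees an empty directory treats the ref as absent. */
static int create_ref(const char *ref_dir, const char *sha1)
{
	char path[4096];
	int fd;

	if (mkdir(ref_dir, 0777) < 0)
		return -1; /* EEXIST: someone else created the ref first */
	snprintf(path, sizeof(path), "%s/%s", ref_dir, sha1);
	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
	if (fd < 0)
		return -1;
	close(fd);
	return 0;
}

/* Self-check: first creation wins, a second creation must fail. */
int demo(void)
{
	char parent[] = "/tmp/refmk-XXXXXX";
	char ref[4096];

	if (!mkdtemp(parent))
		return 0;
	snprintf(ref, sizeof(ref), "%s/master", parent);
	return create_ref(ref, "aaaa") == 0 && create_ref(ref, "bbbb") < 0;
}
```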

Speaking of which, you did not cover reading at all, but it would have
to be:

  dh = opendir(ref);
  if (!dh) {
  	if (errno == ENOENT)
  		return 0; /* no such ref */
  	else
  		return error("couldn't read ref");
  }

  while ((ent = readdir(dh))) {
  	if (ent->d_name[0] == '.')
  		/*
  		 * skip "." and "..", and leave room for annotating
  		 * refs via dot-files
  		 */
  		continue;
Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Jeff King
On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:

 Hmm, actually I believe that with a small modification to the 
 semantics described here it would be possible to make multi 
 repo/branch commits work.   Simply allow the ref filename to 
 be locked by a transaction by appending the transaction ID to 
 the filename.  So if transaction 123 wants to lock master 
 which points currently to abcde, then it will move 
 master/abcde to master/abcde_123.  If transaction 123 is 
 designed so that any process can commit/complete/abort it 
 without requiring any locks which can go stale, then this ref 
 lock will never go stale either (easy as long as it writes 
 all its proposed updates somewhere upfront and has atomic 
 semantics for starting, committing and aborting).  On commit, 
 the ref lock gets updated to its new value: master/newsha and 
 on abort it gets unlocked: master/abcde.

Hmm. I thought our goal was to avoid locks? Isn't this just locking by
another name?

I guess your point is to have no locks in the normal case, and have
locked transactions as an optional add-on?
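Martin's lock-by-renaming scheme can be sketched in the same style as the earlier steps (hypothetical layout and names, purely illustrative): each state transition is still one atomic rename(), so there is no separate lock file that can go stale on its own.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Sketch only: lock a ref for transaction `txn` by renaming the value
 * file "<sha1>" to "<sha1>_<txn>"; commit renames it to the new value,
 * abort would rename it back.  Every step is one atomic rename(). */
static int ref_txn_lock(const char *dir, const char *sha1, const char *txn)
{
	char from[4096], to[4096];
	snprintf(from, sizeof(from), "%s/%s", dir, sha1);
	snprintf(to, sizeof(to), "%s/%s_%s", dir, sha1, txn);
	return rename(from, to);
}

static int ref_txn_commit(const char *dir, const char *sha1,
                          const char *txn, const char *new_sha1)
{
	char from[4096], to[4096];
	snprintf(from, sizeof(from), "%s/%s_%s", dir, sha1, txn);
	snprintf(to, sizeof(to), "%s/%s", dir, new_sha1);
	return rename(from, to);
}

/* Self-check: txn 123 locks master@abcde, txn 456 loses, 123 commits. */
int demo(void)
{
	char dir[] = "/tmp/reftxn-XXXXXX";
	char path[4096];
	int fd;

	if (!mkdtemp(dir))
		return 0;
	snprintf(path, sizeof(path), "%s/abcde", dir);
	fd = open(path, O_CREAT | O_WRONLY, 0644);
	if (fd < 0)
		return 0;
	close(fd);

	if (ref_txn_lock(dir, "abcde", "123"))      /* abcde -> abcde_123 */
		return 0;
	if (ref_txn_lock(dir, "abcde", "456") == 0) /* second txn must lose */
		return 0;
	if (ref_txn_commit(dir, "abcde", "123", "newsha"))
		return 0;
	snprintf(path, sizeof(path), "%s/newsha", dir);
	return access(path, F_OK) == 0;
}
```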

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Lockless Refs?

2012-12-29 Thread Jeff King
On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:

 Martin Fick mf...@codeaurora.org writes:
 
  Hmm, actually I believe that with a small modification to the 
  semantics described here it would be possible to make multi 
  repo/branch commits work
 
  Shawn talked about adding multi repo/branch transaction 
  semantics to jgit, this might be something that git wants to 
  support also at some point?
 
 Shawn may have talked about it and you may have listened to it, but
 others wouldn't have any idea what kind of multi repo/branch
 transaction you are talking about.  Is it about "I want to push
 this ref to that repo and push this other ref to that other repo",
 in what situation will it be used/useful, what are the failure
 modes, what are failure tolerances by the expected use cases, ...?
 
 Care to explain?

I cannot speak for Martin, but I am assuming the point is to atomically
update 2 (or more) refs on the same repo. That is, if I have a branch
refs/heads/foo and a ref pointing to meta-information (say, notes
about commits in foo, in refs/notes/meta/foo), I would want to "git
push" them, and only update them if _both_ will succeed, and otherwise
fail and update nothing.

I think Shawn mentioned this at the last GitTogether as a stumbling
block for pushing more of Gerrit's meta-information as refs over the git
protocol. But I might be mis-remembering.

-Peff


Re: [RFC] pack-objects: compression level for non-blobs

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:

  I think I tried the partial decompression for commit header and it did
  not help much (or I misremember it, not so sure).
 
 I'll see if I can dig up the reference, as it was something I was going
 to look at next.

I tried the simple patch below, but it actually made things slower!  I'm
assuming it is because the streaming setup is not micro-optimized very
well. A custom read_sha1_until_blank_line() could probably do better.

diff --git a/commit.c b/commit.c
index e8eb0ae..efd6c06 100644
--- a/commit.c
+++ b/commit.c
@@ -8,6 +8,7 @@
 #include "notes.h"
 #include "gpg-interface.h"
 #include "mergesort.h"
+#include "streaming.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -306,6 +307,39 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long size)
 	return 0;
 }
 
+static void *read_commit_header(const unsigned char *sha1,
+				enum object_type *type,
+				unsigned long *size)
+{
+	static const int chunk_size = 256;
+	struct strbuf buf = STRBUF_INIT;
+	struct git_istream *st;
+
+	st = open_istream(sha1, type, size, NULL);
+	if (!st)
+		return NULL;
+	while (1) {
+		size_t start = buf.len;
+		ssize_t readlen;
+
+		strbuf_grow(&buf, chunk_size);
+		readlen = read_istream(st, buf.buf + start, chunk_size);
+		if (readlen < 0) {
+			close_istream(st);
+			strbuf_release(&buf);
+			return NULL;
+		}
+		buf.buf[start + readlen] = '\0';
+		buf.len += readlen;
+
+		if (!readlen || strstr(buf.buf + start, "\n\n"))
+			break;
+	}
+
+	close_istream(st);
+	return strbuf_detach(&buf, size);
+}
+
 int parse_commit(struct commit *item)
 {
 	enum object_type type;
@@ -317,7 +351,11 @@ int parse_commit(struct commit *item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+
+	if (!save_commit_buffer)
+		buffer = read_commit_header(item->object.sha1, &type, &size);
+	else
+		buffer = read_sha1_file(item->object.sha1, &type, &size);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));


Re: [RFC] pack-objects: compression level for non-blobs

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 04:05:58AM -0500, Jeff King wrote:

 On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:
 
   I think I tried the partial decompression for commit header and it did
   not help much (or I misremember it, not so sure).
  
  I'll see if I can dig up the reference, as it was something I was going
  to look at next.
 
 I tried the simple patch below, but it actually made things slower!  I'm
 assuming it is because the streaming setup is not micro-optimized very
 well. A custom read_sha1_until_blank_line() could probably do better.

Something like the patch below, which does speed things up. But not
nearly as much as I'd hoped:

  [before]
  $ best-of-five git rev-list --count --all
  real    0m4.197s
  user    0m4.112s
  sys     0m0.072s

  [after]
  $ best-of-five git rev-list --count --all
  real    0m3.782s
  user    0m3.708s
  sys     0m0.064s

Only about a 10% speedup (versus ~75% with uncompressed commits).

---
diff --git a/cache.h b/cache.h
index 18fdd18..a494d3b 100644
--- a/cache.h
+++ b/cache.h
@@ -724,6 +724,7 @@ int offset_1st_component(const char *path);
 
 /* object replacement */
 #define READ_SHA1_FILE_REPLACE 1
+#define READ_SHA1_FILE_HEADER 2
 extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag);
 static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size)
 {
@@ -1059,7 +1060,7 @@ extern int is_pack_valid(struct packed_git *);
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t);
 extern off_t find_pack_entry_one(const unsigned char *, struct packed_git *);
 extern int is_pack_valid(struct packed_git *);
-extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
+extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *, int);
 extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsigned long len, enum object_type *type, unsigned long *sizep);
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
diff --git a/commit.c b/commit.c
index e8eb0ae..0a088dc 100644
--- a/commit.c
+++ b/commit.c
@@ -312,12 +312,13 @@ int parse_commit(struct commit *item)
 	void *buffer;
 	unsigned long size;
 	int ret;
+	int flags = save_commit_buffer ? 0 : READ_SHA1_FILE_HEADER;
 
 	if (!item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+	buffer = read_sha1_file_extended(item->object.sha1, &type, &size, flags);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));
diff --git a/fast-import.c b/fast-import.c
index c2a814e..a140d57 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1303,7 +1303,7 @@ static void *gfi_unpack_entry(
 */
 		p->pack_size = pack_size + 20;
 	}
-	return unpack_entry(p, oe->idx.offset, type, sizep);
+	return unpack_entry(p, oe->idx.offset, type, sizep, 0);
 }
 
 static const char *get_mode(const char *str, uint16_t *modep)
diff --git a/pack-check.c b/pack-check.c
index 63a595c..e4a43c0 100644
--- a/pack-check.c
+++ b/pack-check.c
@@ -116,7 +116,7 @@ static int verify_packfile(struct packed_git *p,
 				sha1_to_hex(entries[i].sha1),
 				p->pack_name, (uintmax_t)offset);
 		}
-		data = unpack_entry(p, entries[i].offset, &type, &size);
+		data = unpack_entry(p, entries[i].offset, &type, &size, 0);
 		if (!data)
 			err = error("cannot unpack %s from %s at offset %"PRIuMAX,
 				    sha1_to_hex(entries[i].sha1), p->pack_name,
diff --git a/sha1_file.c b/sha1_file.c
index 40b2329..1f1f31a 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1469,16 +1469,19 @@
 	return *hdr ? -1 : type_from_string(type);
 }
 
-static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1)
+static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1, int stop_at_blank)
 {
 	int ret;
 	git_zstream stream;
-	char hdr[8192];
+	char hdr[512];
 
 	ret = unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr));
 	if (ret < Z_OK || (*type = parse_sha1_header(hdr, size)) < 0)
 		return NULL;
-
+	if (stop_at_blank && strstr(hdr, "\n\n")) {
+		*size = strlen(hdr);
+		return xstrdup(hdr);
+	}

Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jonathan Nieder
Hi Peff,

Jeff King wrote:

 --- a/read-cache.c
 +++ b/read-cache.c
 @@ -1800,6 +1800,8 @@ int write_index(struct index_state *istate, int newfd)
 		continue;
 	if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
 		ce_smudge_racily_clean_entry(ce);
 +	if (is_null_sha1(ce->sha1))
 +		return error("cache entry has null sha1: %s", ce->name);
 	if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
 		return -1;

Quick heads up: this is tripping for me in the git am --abort
codepath:

  $ cd ~/src/linux
  $ git checkout v3.2.35
  $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch 
  Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
  Using index info to reconstruct a base tree...
  Falling back to patching base and 3-way merge...
  error: Your local changes to the following files would be overwritten by 
merge:
sound/usb/midi.c
  Please, commit your changes or stash them before you can merge.
  Aborting
  Failed to merge in the changes.
  Patch failed at 0001 ALSA: usb-audio: Fix missing autopm for MIDI input
  When you have resolved this problem run git am --resolved.
  If you would prefer to skip this patch, instead run git am --skip.
  To restore the original branch and stop patching run git am --abort.
  $ git am --abort
  error: cache entry has null sha1: sound/usb/midi.c
  fatal: unable to write new index file
  Unstaged changes after reset:
  M   sound/usb/midi.c
  $

Reproducible using v1.8.1-rc3 and master.  Bisects to 4337b585 (do not
write null sha1s to on-disk index, 2012-07-28).  For comparison, older
gits produced

  $ git am --abort
  Unstaged changes after reset:
  M   sound/usb/midi.c

which is also not great but at least didn't involve any obviously
invalid behavior.  Known problem?

Thanks,
Jonathan


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 02:03:46AM -0800, Jonathan Nieder wrote:

  --- a/read-cache.c
  +++ b/read-cache.c
  @@ -1800,6 +1800,8 @@ int write_index(struct index_state *istate, int newfd)
  		continue;
  	if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
  		ce_smudge_racily_clean_entry(ce);
  +	if (is_null_sha1(ce->sha1))
  +		return error("cache entry has null sha1: %s", ce->name);
  	if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
  		return -1;
 
 Quick heads up: this is tripping for me in the git am --abort
 codepath:

Thanks. The intent was that this should never happen, and we are
protecting against bogus sha1s slipping into the index. So either we
have found a bug in am --abort, or the assumption that it should never
happen was wrong. :)

   $ cd ~/src/linux
   $ git checkout v3.2.35
   $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch 
   Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
   Using index info to reconstruct a base tree...
   Falling back to patching base and 3-way merge...
   error: Your local changes to the following files would be overwritten by merge:
   	sound/usb/midi.c
   Please, commit your changes or stash them before you can merge.
   Aborting
   Failed to merge in the changes.
   Patch failed at 0001 ALSA: usb-audio: Fix missing autopm for MIDI input
   When you have resolved this problem run "git am --resolved".
   If you would prefer to skip this patch, instead run "git am --skip".
   To restore the original branch and stop patching run "git am --abort".
   $ git am --abort
   error: cache entry has null sha1: sound/usb/midi.c
   fatal: unable to write new index file
   Unstaged changes after reset:
   M   sound/usb/midi.c
   $

I can't reproduce here. I can checkout v3.2.35, and I guess that the
patch you are applying comes from f5f1654, but I don't know your
local modification to sound/usb/midi.c. If I just make a fake
modification, I get this:

  $ git checkout v3.2.35
  $ echo fake > sound/usb/midi.c
  $ git format-patch --stdout -1 f5f1654 >~/patch && git am -3 ~/patch
  [same errors as you]
  $ git am --abort
  Unstaged changes after reset:
  M   sound/usb/midi.c

So I wonder if there is something in the way your working tree is
modified. Can you give more details?

 Reproducible using v1.8.1-rc3 and master.  Bisects to 4337b585 (do not
 write null sha1s to on-disk index, 2012-07-28).  For comparison, older
 gits produced
 
   $ git am --abort
   Unstaged changes after reset:
   M   sound/usb/midi.c

What does your index look like afterwards? Does it have a null sha1 in
it (check ls-files -s)?

-Peff


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jonathan Nieder
Jeff King wrote:

 I can't reproduce here. I can checkout v3.2.35, and I guess that the
 patch you are applying comes from f5f1654, but I don't know your
 local modification to sound/usb/midi.c.

No local modification.  The unstaged change after git am --abort to
recover from a conflicted git am is a longstanding bug (at least a
couple of years old).

The patch creating conflicts is

queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch

from git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

[...]
   $ git am --abort
   Unstaged changes after reset:
   M   sound/usb/midi.c

 What does your index look like afterwards? Does it have a null sha1 in
 it (check ls-files -s)?

$ git diff-index --abbrev HEAD
:100644 100644 eeefbce3873c... ... M	sound/usb/midi.c
$ git ls-files --abbrev -s sound/usb/midi.c
100644 eeefbce3873c 0   sound/usb/midi.c


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jonathan Nieder
Jeff King wrote:

 Can you give more details?

$ GIT_TRACE=1 git am --abort
trace: exec: 'git-am' '--abort'
trace: run_command: 'git-am' '--abort'
trace: built-in: git 'rev-parse' '--parseopt' '--' '--abort'
trace: built-in: git 'rev-parse' '--git-dir'
trace: built-in: git 'rev-parse' '--show-prefix'
trace: built-in: git 'rev-parse' '--show-toplevel'
trace: built-in: git 'var' 'GIT_COMMITTER_IDENT'
trace: built-in: git 'rev-parse' '--verify' '-q' 'HEAD'
trace: built-in: git 'config' '--bool' '--get' 'am.keepcr'
trace: built-in: git 'rerere' 'clear'
trace: built-in: git 'rev-parse' '--verify' '-q' 'HEAD'
trace: built-in: git 'read-tree' '--reset' '-u' 'HEAD' 'ORIG_HEAD'
error: cache entry has null sha1: sound/usb/midi.c
fatal: unable to write new index file
trace: built-in: git 'reset' 'ORIG_HEAD'
Unstaged changes after reset:
M   sound/usb/midi.c


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 02:34:30AM -0800, Jonathan Nieder wrote:

 Jeff King wrote:
 
  I can't reproduce here. I can checkout v3.2.35, and I guess that the
  patch you are applying comes from f5f1654, but I don't know your
  local modification to sound/usb/midi.c.
 
 No local modification.  The unstaged change after git am --abort to
 recover from a conflicted git am is a longstanding bug (at least a
 couple of years old).
 
 The patch creating conflicts is
 
   queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
 
 from git://git.kernel.org/pub/scm/linux/kernel/git/bwh/linux-3.2.y-queue.git

Hrm. But your output does not say there is a conflict. It says you have
a local modification and it does not try the merge:

 $ git am -3sc /tmp/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
 Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
 Using index info to reconstruct a base tree...
 Falling back to patching base and 3-way merge...
 error: Your local changes to the following files would be overwritten by merge:
 	sound/usb/midi.c
 Please, commit your changes or stash them before you can merge.
 Aborting

If I try to apply it, I get a real conflict:

  $ git checkout v3.2.35
  $ git am -3sc linux-3.2.y-queue/queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
  Applying: ALSA: usb-audio: Fix missing autopm for MIDI input
  Using index info to reconstruct a base tree...
  M   sound/usb/midi.c
  Falling back to patching base and 3-way merge...
  Auto-merging sound/usb/midi.c
  CONFLICT (content): Merge conflict in sound/usb/midi.c

Although running git am --abort after that does seem to produce the
cache entry has null sha1 error. So I can start investigating from
there.

-Peff


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jonathan Nieder
Jeff King wrote:

 Hrm. But your output does not say there is a conflict. It says you have
 a local modification and it does not try the merge:

That's probably operator error on my part when gathering output to
paste into the email.

In other words, nothing to see there. :)  Sorry for the confusion.

[...]
 If I try to apply it, I get a real conflict:
[...]
 Although running git am --abort after that does seem to produce the
 cache entry has null sha1 error.

Yep, that is what my report should have said.

'night,
Jonathan


Re: [PATCH 2/3] do not write null sha1s to on-disk index

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 02:34:30AM -0800, Jonathan Nieder wrote:

$ git am --abort
Unstaged changes after reset:
M   sound/usb/midi.c
 
  What does your index look like afterwards? Does it have a null sha1 in
  it (check ls-files -s)?
 
 $ git diff-index --abbrev HEAD
 :100644 100644 eeefbce3873c... ... M  sound/usb/midi.c
 $ git ls-files --abbrev -s sound/usb/midi.c
 100644 eeefbce3873c 0 sound/usb/midi.c

Hmm. It looks like am --abort overwrites the index again after the
read-tree which complains.  If I downgrade the error in write_index to a
warning, like this:

diff --git a/read-cache.c b/read-cache.c
index fda78bc..70a6d86 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1797,7 +1797,7 @@ int write_index(struct index_state *istate, int newfd)
 		if (!ce_uptodate(ce) && is_racy_timestamp(istate, ce))
 			ce_smudge_racily_clean_entry(ce);
 		if (is_null_sha1(ce->sha1))
-			return error("cache entry has null sha1: %s", ce->name);
+			warning("cache entry has null sha1: %s", ce->name);
 		if (ce_write_entry(&c, newfd, ce, previous_name) < 0)
 			return -1;
 	}

and then just run this:

  [clear state from last run]
  $ rm -rf .git/rebase-apply
  $ git reset --hard

  [apply the patch; we get a conflict]
  $ git am -3sc queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch

  [now run just the read-tree from am --abort]
  $ git.compile read-tree --reset -u HEAD ORIG_HEAD
  warning: cache entry has null sha1: sound/usb/midi.c

  [and now check our index]
  $ git ls-files -s sound/usb/midi.c
  100644 0000000000000000000000000000000000000000 0	sound/usb/midi.c

  [yes, this index is bogus]
  $ git write-tree
  error: invalid object 100644 0000000000000000000000000000000000000000 for 'sound/usb/midi.c'
  fatal: git-write-tree: error building trees

So I think this check may actually be finding a real bug. I have seen
these null sha1s in the wild, but I was never able to track down the
actual cause. Maybe this will give us a clue. Now we just need to work
backwards and figure out who is putting it in the in-memory index and
why.

I'll try to work on it tomorrow, but please don't let that stop you if
you want to keep digging in the meantime.

-Peff


Re: [PATCH 0/4] pre-push hook support

2012-12-29 Thread Aaron Schrab

At 18:01 -0800 28 Dec 2012, Junio C Hamano gits...@pobox.com wrote:

 One lesson we learned long time ago while doing hooks is to avoid
 unbound number of command line arguments and instead feed them from
 the standard input.  I think this should do the same.

Good point.  I had been trying to keep the interface for this hook as
close as possible to the ones for other client-side hooks, on the theory
that less development effort may go into those than for server-side
hooks.  But thinking on that more, I certainly see that this could
easily run into limits on argument length on some systems, especially
when each of those arguments is likely to be over 100 bytes long.

I'll work on an updated version which sends the variable-length
information over a pipe, using the command-line arguments only to pass
the remote name and URL.

 How does the hook communicate its decision to the calling Git?

 Will it be all-or-none, or "I'll allow these but not those"?

Currently it just uses the exit code to communicate that back, so it's
all-or-none.  I think I'll keep that in the updated version as well.

A future enhancement could modify the protocol to support reading from
the hook's stdout the names of remote refs which are to be rejected, but
I think that just having the option for all-or-nothing is a good
starting point.



Re: [PATCH 1/4] hooks: Add function to check if a hook exists

2012-12-29 Thread Aaron Schrab

At 18:08 -0800 28 Dec 2012, Junio C Hamano gits...@pobox.com wrote:

 Aaron Schrab aa...@schrab.com writes:

  Create find_hook() function to determine if a given hook exists and is
  executable.  If it is, the path to the script will be returned;
  otherwise NULL is returned.

 Sounds like a sensible thing to do.  To make sure the API is also
 sensible, all the existing hooks should be updated to use this API,
 no?

I'd been trying to keep the changes limited.  I'll see about modifying
the existing places that run hooks in v2 of the series.

  This is in support for an upcoming run_hook_argv() function which will
  expect the full path to the hook script as the first element in the
  argv_array.

 There is currently a public function called run_hook() that squats
 on the good name with a kludgy API that is too specific to using a
 separate index file.  Back when it was a private helper in the
 implementation of "git commit", it was perfectly fine, but it was
 exported without giving much thought on the API.

 If you are introducing a new run_hook_* function, give it a generic
 enough API that lets all the existing hook callers use it.  I
 would imagine that the API requirement may be modelled after the
 run_command() API so that we can pass argv[] and tweak the hook's
 environ[], as well as feeding its stdin and possibly reading from
 its stdout.  That would be very useful.

I think the attraction of the run_hook() API is its simplicity.  It's
currently a fairly thin wrapper around the run_command() API.  I suspect
that if the run_hook() API were made generic enough to support all of
the existing hook callers it would greatly complicate the existing calls
to run_hook() while not providing much benefit to hook callers which
can't currently use it beyond what run_command() offers.

Since I'm going to be changing the interface for this hook in v2 of the
series so that it will be more complicated than can be readily addressed
with the run_hook() API (and will use a fixed number of arguments
anyway), I'll be dropping the run_hook_argv() function.


Re: Hold your fire, please

2012-12-29 Thread Adam Spiers
On Fri, Dec 28, 2012 at 11:52 PM, Junio C Hamano gits...@pobox.com wrote:
 Junio C Hamano gits...@pobox.com writes:

 Primarily in order to force me concentrate on the releng for the
 upcoming release, and also to encourage contributors to focus on
 finding and fixing any last minute regressions (rather than
 distracting others by showing publicly scratching their itches), I
 won't be queuing any patch that is not a regression fix, at least
 for the next few days.  I may not even review them.

 Thanks.

 Just as a friendly reminder, I am reposting this to remind people.

Ah, I thought this period had elapsed already.  I assume this applies
to check-ignore, in which case how long should I hold off before
submitting v4?

 And have a nice holiday if you are in areas where it is a holiday
 season ;-)

You too :-)


Re: [PATCH 0/4] pre-push hook support

2012-12-29 Thread Junio C Hamano
Aaron Schrab aa...@schrab.com writes:

 At 18:01 -0800 28 Dec 2012, Junio C Hamano gits...@pobox.com wrote:
Will it be all-or-none, or "I'll allow these but not those"?

 Currently it just uses the exit code to communicate that back, so it's
 all-or-none.  I think I'll keep that in the updated version as well.

Thanks; that sounds sensible.


Re: [PATCH 1/4] hooks: Add function to check if a hook exists

2012-12-29 Thread Junio C Hamano
Aaron Schrab aa...@schrab.com writes:

 Since I'm going to be changing the interface for this hook in v2 of
 the series so that it will be more complicated than can be readily
 addressed with the run_hook() API (and will use a fixed number of
 arguments anyway) I'll be dropping the run_hook_argv() function.

Just to make sure there is no misunderstanding (sorry for sending
the message without finishing it with this clarification at the end
in the first place).  I didn't mean that converting all of the
existing callers must come earlier than introducing a new hook
invoker.

I just wanted to make sure that we are aware that we are adding to
our technical debt, if we are adding another that is also
specialized; as the proposed interface looked sufficiently generic,
it would be the ideal one to make _other_ ones thin wrappers around
it to unify the various codepaths.

Thanks.


Fail to push over HTTP with MySQL authentication (Apache2)

2012-12-29 Thread Davide Baldini
Hi,

I'm not able to setup a public Git repository over plain HTTP with
MySQL authentication.
Both HTTP and authentication are provided by Apache2.

SETUP:
-

This setup is performed on Debian 6.0.4.

Apache2 (v. 2.2), with modules:
auth_mysql
WebDAV

Git (v. 1.7.8.3)
Git repository location:
local, for webserver: /var/www/public/GT_rulesets/GT00.git
public, for Git:  http://greatturn.org:8081/GT00.git

Git repository has been configured as:
cd /var/www/public/GT_rulesets/GT00.git
git init --bare
mv hooks/post-update.sample hooks/post-update
git update-server-info
chmod 777 /var/www/public/GT_rulesets/GT00.git  # for testing.


FACTS:
-

The Apache side of my setup seems to work:
_   HTTP, MySQL authentication:
I point Iceweasel to http://greatturn.org:8081/ .
It asks for authentication; I authenticate with a username/
password pair taken from MySQL database (which doesn't exist as
a system user); It works, and I can see all the content of
the git repository GT00.git.
_   WebDAV:
I point Konqueror to webdav://greatturn.org:8081/ .
Works exactly as previous point.
_   Git:
Git can fetch the repository without problems:
git clone http://username:passw...@greatturn.org:8081/GT00.git

Pushing the locally fetched repository back to the remote one doesn't
work:
git push http://greatturn.org:8081/GT00.git master
asks for username and password:
 Username for 'greatturn.org:8081':
 Password for 'greatturn.org:8081':

I enter my credentials, then git outputs the following and exits:
 error: Cannot access URL http://greatturn.org:8081/GT00.git/,
return code 22
 fatal: git-http-push failed

On Apache's access.log, git produces all and no more than the
following:
 87.19.240.177 - - [29/Dec/2012:16:43:22 +0100] "GET /GT00.git/info/refs?service=git-receive-pack HTTP/1.1" 401 767 "-" "git/1.7.8.3"
 87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] "GET /GT00.git/info/refs?service=git-receive-pack HTTP/1.1" 401 767 "-" "git/1.7.8.3"
 87.19.240.177 - davide [29/Dec/2012:16:43:26 +0100] "GET /GT00.git/info/refs?service=git-receive-pack HTTP/1.1" 200 233 "-" "git/1.7.8.3"
 87.19.240.177 - davide [29/Dec/2012:16:43:26 +0100] "GET /GT00.git/HEAD HTTP/1.1" 200 258 "-" "git/1.7.8.3"
 87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] "PROPFIND /GT00.git/ HTTP/1.1" 401 767 "-" "git/1.7.8.3"



[BUG] two-way read-tree can write null sha1s into index

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 06:05:41AM -0500, Jeff King wrote:

   [clear state from last run]
   $ rm -rf .git/rebase-apply
   $ git reset --hard
 
   [apply the patch; we get a conflict]
   $ git am -3sc 
 queue-3.2/alsa-usb-audio-fix-missing-autopm-for-midi-input.patch
 
   [now run just the read-tree from am --abort]
   $ git.compile read-tree --reset -u HEAD ORIG_HEAD
   warning: cache entry has null sha1: sound/usb/midi.c
 
   [and now check our index]
   $ git ls-files -s sound/usb/midi.c
   100644 0000000000000000000000000000000000000000 0 sound/usb/midi.c
 
   [yes, this index is bogus]
   $ git write-tree
   error: invalid object 100644 0000000000000000000000000000000000000000 for
 'sound/usb/midi.c'
   fatal: git-write-tree: error building trees
 
 So I think this check may actually be finding a real bug. I have seen
 these null sha1s in the wild, but I was never able to track down the
 actual cause. Maybe this will give us a clue. Now we just need to work
 backwards and figure out who is putting it in the in-memory index and
 why.

I made some progress on this, but I'd like a sanity check from others
(especially Junio). As far as I can tell, this is a bug in read-tree.

When we call read-tree --reset -u HEAD ORIG_HEAD, the first thing we
do with the index is call read_cache_unmerged. Originally that would
read the index, leaving aside any unmerged entries. However, as of
d1a43f2 (reset --hard/read-tree --reset -u: remove unmerged new paths,
2008-10-15), it actually creates a new cache entry. This serves as a
placeholder, so that we later know to update the working tree.

However, we later noticed that the sha1 of that unmerged entry was
just copied from some higher stage, leaving you with random content in
the index.  That was fixed by e11d7b5 (reset --merge: fix unmerged
case, 2009-12-31), which instead puts the null sha1 into the newly
created entry, and sets a CE_CONFLICTED flag. At the same time, it
teaches the unpack-trees machinery to pay attention to this flag, so
that oneway_merge throws away the current value.

However, it did not update the code paths for  twoway_merge, which is
where we end up in the read-tree above. We notice that the HEAD and
ORIG_HEAD versions are the same, and say "oh, we can just reuse the
current version". But that's not true. The current version is bogus.

So I think we need to update twoway_merge to recognize unmerged entries,
which gives us two options:

  1. Reject the merge.

  2. Throw away the current unmerged entry in favor of the new entry
 (when old and new are the same, of course; otherwise we would
 reject).

I think (2) is the right thing. It fixes the entry of the bogus sha1
into the index, _and_ it solves the problem that git am --abort leaves
the conflicted entry as a modification. It should just go away. But
maybe I am forgetting some other case where read-tree should be more
conservative, and (1) is a safer choice.

Something like this patch:

diff --git a/unpack-trees.c b/unpack-trees.c
index 6d96366..e06e01f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1746,14 +1746,19 @@ int twoway_merge(struct cache_entry **src, struct unpack_trees_options *o)
 		newtree = NULL;
 
 	if (current) {
-		if ((!oldtree && !newtree) || /* 4 and 5 */
-		    (!oldtree && newtree &&
-		     same(current, newtree)) || /* 6 and 7 */
-		    (oldtree && newtree &&
-		     same(oldtree, newtree)) || /* 14 and 15 */
-		    (oldtree && newtree &&
-		     !same(oldtree, newtree) && /* 18 and 19 */
-		     same(current, newtree))) {
+		if (current->ce_flags & CE_CONFLICTED) {
+			if (same(oldtree, newtree))
+				return merged_entry(newtree, current, o);
+			return o->gently ? -1 : reject_merge(current, o);
+		}
+		else if ((!oldtree && !newtree) || /* 4 and 5 */
+			 (!oldtree && newtree &&
+			  same(current, newtree)) || /* 6 and 7 */
+			 (oldtree && newtree &&
+			  same(oldtree, newtree)) || /* 14 and 15 */
+			 (oldtree && newtree &&
+			  !same(oldtree, newtree) && /* 18 and 19 */
+			  same(current, newtree))) {
 			return keep_entry(current, o);
 		}
 		else if (oldtree && !newtree && same(current, oldtree)) {

I suspect threeway_merge may need a similar update, but I haven't looked
too carefully yet.

-Peff


Re: Fail to push over HTTP with MySQL authentication (Apache2)

2012-12-29 Thread Jeff King
On Sat, Dec 29, 2012 at 08:54:32PM +0100, Davide Baldini wrote:

 SETUP:
 -
 [...]
 Git repository has been configured as:
 cd /var/www/public/GT_rulesets/GT00.git
 git init --bare
 mv hooks/post-update.sample hooks/post-update
 git update-server-info
 chmod 777 /var/www/public/GT_rulesets/GT00.git  # for testing.

Should this last line be a chmod -R? Git init will create many
subdirectories, and you want to make sure they are all writable for
push.

 _   Git:
 Git can fetch the repository without problems:
 git clone http://username:passw...@greatturn.org:8081/GT00.git
 
 Pushing the locally fetched repository back to the remote one doesn't
 work:
 [...]
  87.19.240.177 - - [29/Dec/2012:16:43:26 +0100] PROPFIND /GT00.git/ 
  HTTP/1.1 401 767 - git/1.7.8.3

If fetch is working and push is not, I'd suspect WebDAV configuration
problems (and indeed, your credentials seem fine, but the PROPFIND is
returning a 401). Fetch works over stock HTTP and does not use WebDAV at
all. There is some documentation on setting up DAV here:

  
https://github.com/git/git/blob/master/Documentation/howto/setup-git-server-over-http.txt

but I have no idea if it is up-to-date or not.

However, before trying to investigate that avenue, have you considered
using git's smart-http backend instead of WebDAV? It's significantly
more efficient. You can get details and example apache configuration
from git help http-backend.

-Peff


Re: Lockless Refs?

2012-12-29 Thread Martin Fick


Jeff King p...@peff.net wrote:

On Fri, Dec 28, 2012 at 09:15:52AM -0800, Junio C Hamano wrote:

 Martin Fick mf...@codeaurora.org writes:
 
  Hmm, actually I believe that with a small modification to the 
  semantics described here it would be possible to make multi 
  repo/branch commits work
 
  Shawn talked about adding multi repo/branch transaction 
  semantics to jgit, this might be something that git wants to 
  support also at some point?
 
 Shawn may have talked about it and you may have listened to it, but
 others wouldn't have any idea what kind of multi repo/branch
 transaction you are talking about.  Is it about "I want to push
 this ref to that repo and push this other ref to that other repo",
 in what situation will it be used/useful, what are the failure
 modes, what are failure tolerances by the expected use cases, ...?
 
 Care to explain?

I cannot speak for Martin, but I am assuming the point is to atomically
update 2 (or more) refs on the same repo. That is, if I have a branch
refs/heads/foo and a ref pointing to meta-information (say, notes
about commits in foo, in refs/notes/meta/foo), I would want to git
push them, and only update them if _both_ will succeed, and otherwise
fail and update nothing.

My use case was cross repo/branch dependencies in Gerrit (which do not yet 
exist). Users want to be able to define several changes (destined for different 
project/branches) which can only be merged together.  If one change cannot be 
merged, the others should fail too.  The solutions we can think of generally 
need to hold ref locks while acquiring more ref locks, this drastically 
increases the opportunities for stale locks over the simple lock, check, 
update, unlock mode which git locks are currently used for.

I was perhaps making too big of a leap to assume that there would be other non 
Gerrit use cases for this?  I assumed that other git projects which are spread 
across several git repos would need this? But maybe this simply wouldn't be 
practical with other git server solutions?

-Martin

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora 
Forum


Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Martin Fick
Jeff King p...@peff.net wrote:

On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:

 Hmm, actually I believe that with a small modification to the 
 semantics described here it would be possible to make multi 
 repo/branch commits work.   Simply allow the ref filename to 
 be locked by a transaction by appending the transaction ID to 
 the filename.  So if transaction 123 wants to lock master 
 which points currently to abcde, then it will move 
 master/abcde to master/abcde_123.  If transaction 123 is 
 designed so that any process can commit/complete/abort it 
 without requiring any locks which can go stale, then this ref 
 lock will never go stale either (easy as long as it writes 
 all its proposed updates somewhere upfront and has atomic 
 semantics for starting, committing and aborting).  On commit, 
 the ref lock gets updated to its new value: master/newsha and 
 on abort it gets unlocked: master/abcde.

Hmm. I thought our goal was to avoid locks? Isn't this just locking by
another name?

It is a lock, but it is a lock with an owner: the transaction.  If the 
transaction has reliable recovery semantics, then the lock will be recoverable 
also.  This is possible if we have lock ownership (the transaction) which does 
not exist today for the ref locks.  With good lock ownership we gain the 
ability to reliably delete locks for a specific owner without the risk of 
deleting the lock when held by another owner (putting the owner in the filename 
is good, while putting the owner in the filecontents is not).   Lastly, for 
reliable recovery of stale locks we need the ability to determine when an owner 
has abandoned a lock.  I believe that the transaction semantics laid out below 
give this.
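
To make the owner-in-the-filename idea concrete, here is a toy sketch in
Python (all names here are invented for illustration; this is not git's
API) of claiming and releasing a ref lock with single atomic renames:

```python
import os

def lock_ref(ref_dir, old_sha, txn_id):
    """Claim a ref for a transaction by renaming its value file.

    The ref's current value is the name of the only file in ref_dir
    (e.g. refs.d/master/<sha1>).  rename() is atomic, so at most one
    transaction can claim it, and the owner is visible in the name.
    """
    locked = os.path.join(ref_dir, old_sha + "_" + txn_id)
    try:
        os.rename(os.path.join(ref_dir, old_sha), locked)
        return True
    except OSError:
        return False  # ref moved, or another transaction holds the lock

def unlock_ref(ref_dir, old_sha, txn_id, new_sha=None):
    """Commit (new_sha given) or abort (new_sha is None) an owned lock."""
    locked = os.path.join(ref_dir, old_sha + "_" + txn_id)
    os.rename(locked, os.path.join(ref_dir, new_sha or old_sha))
```

Because the owner is encoded in the filename, a recovery sweep can delete
locks belonging to a dead transaction without any risk of removing a lock
held by a live one.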


I guess your point is to have no locks in the normal case, and have
locked transactions as an optional add-on?

Basically.  If we design the transaction into the git semantics we could ensure 
that it is recoverable and we should not need to expose these reflocks outside 
of the transaction APIs.

To illustrate a simple transaction approach (borrowing some of Shawn's ideas), 
we could designate a directory to hold transaction files *1.  To prepare a 
transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new 
(in a stable sorted order based on repo:ref to prevent deadlocks).  This is not 
a state change and thus this file could be deleted by any process at anytime 
(preferably after a long grace period).

If file renames are atomic on the filesystem holding the transaction files then 
1, 2, 3 below will be atomic state changes.  It does not matter who performs 
state transitions 2 or 3.  It does not matter who implements the work following 
any of the 3 transitions, many processes could attempt the work in parallel (so 
could a human).
 
1) To start the transaction, rename the id.new file to id.  If the rename 
fails, start over if desired/still possible.  On success, ref locks for each 
entry should be acquired in listed order (to prevent deadlocks), using 
transaction id and oldvalue.  It is never legal to unlock a ref in this state 
(because a block could cause the unlock to be delayed until the commit phase).  
However, it is legal for any process to transition to abort at any time from 
this state, perhaps because of a failure to acquire a lock (held by another 
transaction), and definitely if a ref has changed (is no longer oldvalue).

2) To abort the transaction, rename the id file to id.abort.  This should only 
ever fail if commit was achieved first.  Once in this state, any process 
may/should unlock any ref locks belonging to this transaction id.  Once all 
refs are unlocked, id.abort may be deleted (it could be deleted earlier, but 
then cleanup will take longer).

3) To commit the transaction, rename the file to id.commit.  This should only 
ever fail if abort was achieved first. This transition should never be done 
until every listed ref is locked by the current transaction id.  Once in this 
phase, all refs may/should be moved to their new values and unlocked by any 
process. Once all refs are unlocked, id.commit may be deleted. 
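
The three rename-driven state transitions above can be modeled with a toy
sketch (file layout and class names are invented for illustration, not a
git design):

```python
import os

class RefTransaction:
    """Toy model of the proposed transaction states.

    A transaction lives in txn_dir as one file whose name encodes its
    state: <id>.new (prepared), <id> (started), <id>.commit, <id>.abort.
    Every transition is a single atomic rename, so any process can
    drive (or recover) the transaction without holding a lock.
    """
    def __init__(self, txn_dir, txn_id, updates):
        self.dir, self.id = txn_dir, txn_id
        # Stable sorted order by repo:ref, so all transactions take
        # their ref locks in the same order and cannot deadlock.
        lines = sorted("%s:%s:%s:%s" % u for u in updates)
        with open(self._path("new"), "w") as f:
            f.write("\n".join(lines))

    def _path(self, suffix=None):
        name = self.id if suffix is None else "%s.%s" % (self.id, suffix)
        return os.path.join(self.dir, name)

    def start(self):
        os.rename(self._path("new"), self._path())     # 1) id.new -> id

    def abort(self):
        os.rename(self._path(), self._path("abort"))   # 2) id -> id.abort

    def commit(self):
        os.rename(self._path(), self._path("commit"))  # 3) id -> id.commit
```

Once the file is named id, commit and abort simply race through rename():
the loser's rename fails because the source no longer exists, which is
exactly the "should only ever fail if the other state was achieved first"
property.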

Since any process attempting any of the work in these transactions could block 
at any time for an indefinite amount of time, these processes may wake after 
the transaction is aborted or committed and the transaction files are cleaned 
up.  I believe that in these cases the only action which could succeed by 
these waking processes is the ref locking action.  All such abandoned ref locks 
may/should be unlocked by any process.  This last rule means that no 
transaction ids should ever be reused.
-Martin


*1 We may want to adapt the simple model illustrated above to use git 
mechanisms such as refs to hold transaction info instead of files in a 
directory, and git submodule files to hold the list of refs to update.  

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora 
Forum

Re: Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref)

2012-12-29 Thread Martin Fick
Jeff King p...@peff.net wrote:

On Thu, Dec 27, 2012 at 04:11:51PM -0700, Martin Fick wrote:
 My idea is based on using filenames to store sha1s instead of 
 file contents.  To do this, the sha1 of a ref would be 
 stored in a file in a directory named after the loose ref.  I 
 believe this would then make it possible to have lockless 
 atomic ref updates by renaming the file.
 
 To more fully illustrate the idea, imagine that any file 
 (except for the null file) in the directory will represent the 
 value of the ref with its name, then the following 
 transitions can represent atomic state changes to a refs 
 value and existence:

Hmm. So basically you are relying on atomic rename() to move the value
around within a directory, rather than using write to move it around
within a file. Atomic rename is usually something we have on local
filesystems (and I think we rely on it elsewhere). Though I would not be
surprised if it is not atomic on all networked filesystems (though it is
on NFS, at least).

Yes.  I assume this is OK because doesn't git already rely on atomic renames?  
For example to rename the new packed-refs file to unlock it?
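
Under the quoted proposal, a ref update reduces to a single rename that
acts as a compare-and-swap. A minimal sketch, assuming a hypothetical
directory-per-ref layout (this is illustration, not git code):

```python
import os

def read_ref(ref_dir):
    """A ref's value is the name of the single file in its directory."""
    entries = os.listdir(ref_dir)
    assert len(entries) == 1
    return entries[0]

def update_ref(ref_dir, old_sha, new_sha):
    """Compare-and-swap: succeeds only if the ref still equals old_sha.

    One atomic rename replaces the current lock/check/write/rename
    dance; failure means we lost a race and should re-read and retry.
    """
    try:
        os.rename(os.path.join(ref_dir, old_sha),
                  os.path.join(ref_dir, new_sha))
        return True
    except OSError:
        return False
```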

...

 3) To create a ref, it must be renamed from the null file (sha 
 ...) to the new value just as if it were being updated 
 from any other value, but there is one extra condition: 
 before renaming the null file, a full directory scan must be 
 done to ensure that the null file is the only file in the 
 directory (this condition exists because creating the 
 directory and null file cannot be atomic unless the filesystem 
 supports atomic directory renames, an expectation git does 
 not currently make).  I am not sure how this compares to 
 today's approach, but including the setup costs (described 
 below), I suspect it is slower.

Hmm. mkdir is atomic. So wouldn't it be sufficient to just mkdir and
create the correct sha1 file?

But then a process could mkdir and die leaving a stale empty dir with no 
reliable recovery mechanism.


Unfortunately, I think I see another flaw though! :( I should have known that I 
cannot separate an important check from its state transitioning action.  The 
following could happen:

 A does mkdir
 A creates null file
 A checks dir - no other files 
 B checks dir - no other files
 A renames null file to abcd
 C creates second null file 
 B renames second null file to defg

One way to fix this is to rely on directory renames, but I believe this is 
something git does not want to require of every FS? If we did, we could change 
#3 to be:

3) To create a ref, it must be renamed from the null file (sha ...) to the 
new value just as if it were being updated from any other value. (No more scan)

Then, with reliable directory renames, a process could do what you suggested to 
a temporary directory, mkdir + create null file, then rename the temporary dir 
to refname.  This would prevent duplicate null files.  With a grace period, the 
temporary dirs could be cleaned up in case a process dies before the rename.  
This is your approach with reliable recovery.
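
Assuming atomic directory renames, that creation scheme could be sketched
like this (Python pseudocode with invented names, not git code; note the
assumption that a live ref directory is never empty, since rename() onto
an existing empty directory would silently succeed on POSIX):

```python
import os, tempfile

NULL_SHA = "0" * 40

def create_ref(refs_root, ref_name, new_sha):
    """Create a ref race-free via an atomic directory rename.

    Build the directory (holding the null file) off to the side, then
    rename the whole directory into place; the rename fails if the ref
    already exists, because its directory is non-empty.  A crash before
    the rename leaves only a temporary directory that a grace-period
    sweep can delete, so there is no unrecoverable state.
    """
    tmp = tempfile.mkdtemp(dir=refs_root, prefix="tmp-")
    open(os.path.join(tmp, NULL_SHA), "w").close()
    try:
        os.rename(tmp, os.path.join(refs_root, ref_name))
    except OSError:
        os.remove(os.path.join(tmp, NULL_SHA))  # lost the race; clean up
        os.rmdir(tmp)
        return False
    # The ref now exists with the null value; move it to its real value.
    os.rename(os.path.join(refs_root, ref_name, NULL_SHA),
              os.path.join(refs_root, ref_name, new_sha))
    return True
```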


 I don't know how this new scheme could be made to work with 
 the current scheme, it seems like perhaps new git releases 
 could be made to understand both the old and the new, and a 
 config option could be used to tell it which method to write 
 new refs with.  Since in this new scheme ref directory names 
 would conflict with old ref filenames, this would likely 
 prevent both schemes from erroneously being used 
 simultaneously (so they shouldn't corrupt each other), except 
 for the fact that refs can be nested in directories which 
 confuses things a bit.  I am not sure what a good solution to 
 this is?

I think you would need to bump core.repositoryformatversion, and just
never let old versions of git access the repository directly. Not the
end of the world, but it certainly increases deployment effort. If we
were going to do that, it would probably make sense to think about
solving the D/F conflict issues at the same time (i.e., start calling
refs/heads/foo in the filesystem refs.d/heads.d/foo.ref so that it
cannot conflict with refs.d/heads.d/foo.d/bar.ref).

Wouldn't you want to use an illegal ref character instead of dot? And without 
locks, we free up more of the ref namespace too, I think? (Refs could end in 
.lock)

-Martin

Employee of Qualcomm Innovation Center,Inc. which is a member of Code Aurora 
Forum


[RFC/PATCH] gitk: Visualize a merge commit with a right-click in gitk

2012-12-29 Thread Jason Holden
When first doing a merge in git-gui, the Visualize Merge button is
quite helpful to visualize the changes due to a merge.
But once the merge is complete, there's not a similarly convenient
way to recreate that merge view in gitk.

This commit adds to gitk the ability to right-click on a merge commit and
bring up a new gitk window displaying only those commits involved in
the merge.

When right-clicking on a non-merge commit, this option is grayed out.  This
patch also supports correct visualization of octopus merges.

Signed-off-by: Jason Holden jason.k.holden.sw...@gmail.com
---
 gitk | 33 +
 1 file changed, 33 insertions(+)

diff --git a/gitk b/gitk
index 379582a..17e1fcb 100755
--- a/gitk
+++ b/gitk
@@ -2551,6 +2551,7 @@ proc makewindow {} {
 	{mc "Compare with marked commit" command compare_commits}
 	{mc "Diff this -> marked commit" command {diffvsmark 0}}
 	{mc "Diff marked commit -> this" command {diffvsmark 1}}
+	{mc "Visualize this merge" command visualize_merge}
 }
 $rowctxmenu configure -tearoff 0
 
@@ -2590,6 +2591,31 @@ proc makewindow {} {
 $diff_menu configure -tearoff 0
 }
 
+# Return the number of parents for a given sha1 id
+proc get_numparents_from_id {id} {
+    global parentlist
+    set row [rowofcommit $id]
+    return [llength [lindex $parentlist $row]]
+}
+
+proc visualize_merge {} {
+    global parents currentid parentlist
+    global rowmenuid
+
+    set num_parents [get_numparents_from_id $rowmenuid]
+    set row [rowofcommit $rowmenuid]
+
+    if {$num_parents >= 2} {
+	set revlist $rowmenuid
+	for {set i 1} {$i < $num_parents} {incr i} {
+
+	    set revlist "$revlist [lindex $parentlist $row 0]..[lindex $parentlist $row $i] $rowmenuid"
+	}
+
+	eval exec gitk $revlist &
+    }
+}
+
 # Windows sends all mouse wheel events to the current focused window, not
 # the one where the mouse hovers, so bind those events here and redirect
 # to the correct window
@@ -8577,6 +8603,13 @@ proc rowmenu {x y id} {
$menu entryconfigure 9 -state $mstate
$menu entryconfigure 10 -state $mstate
$menu entryconfigure 11 -state $mstate
+
+	# Disable visualize-merge on only one parent
+	if {[get_numparents_from_id $id] == 1} {
+	    $menu entryconfigure 15 -state disabled
+	} else {
+	    $menu entryconfigure 15 -state normal
+	}
 } else {
set menu $fakerowmenu
 }
-- 
1.8.1.rc3.28.g0ab5d1f



Re: Fail to push over HTTP with MySQL authentication (Apache2)

2012-12-29 Thread Davide Baldini
On 12/29/12 22:01, Jeff King wrote:

 However, before trying to investigate that avenue, have you considered
 using git's smart-http backend instead of WebDAV? It's significantly
 more efficient. You can get details and example apache configuration
 from git help http-backend.

Thank you for the suggestions, I've been able to have the setup running.
I abandoned the horrible WebDAV in favour of Smart HTTP, and it's been
quite straightforward to configure and run successfully.

There is some documentation on setting up DAV here:
 
   
 https://github.com/git/git/blob/master/Documentation/howto/setup-git-server-over-http.txt
 
 but I have no idea if it is up-to-date or not.

No, it's not up-to-date. It seems to refer to Apache versions up to 2.0.
Some of its config directives do not exist in modern Apache releases.

- Davide