Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Jeff King
On Wed, Aug 30, 2017 at 12:55:55AM -0400, Jeff King wrote:

> > I feel like right now we meet (1) and not (2). But I think if we keep to
> > that lower bar of (1), it might not be that bad. We're assuming now that
> > there's no race on the tempfile->active flag, for instance. We could
> > probably make a similar assumption about putting items onto or taking
> > them off of a linked list (it's not really atomic, but a single pointer
> > assignment is probably "atomic enough" for our purposes).
> 
> Something like this, which AFAICT is about as safe as the existing code
> in its list manipulation. It retains the "active" flag as an additional
> check which I think isn't strictly necessary, but potentially catches
> some logic errors.

The patch below demonstrates how this could be used to turn some
"xcalloc" lock_files into stack variables that are allowed to go out of
scope after we commit or rollback. This solves the three lock-related
leaks reported by valgrind when running t.

_But_ it also demonstrates an interesting downside of this approach.
Some functions are lazy in their error paths. For instance, look at
write_index_as_tree(). We take the lock early in the function, but may
return before a commit or rollback if we hit an error.  With the current
code this "leaks" the tempfile, which is wrong. But nobody notices
because in practice the program exits soon after and we clean up the
tempfile then.

But with a stack variable lock_file, that minor problem becomes a major
one: our stack variable got added to a global linked list and the
atexit() handler will try to read it. But now of course it will contain
garbage, since the variable went out of scope.

So it's probably safer to just let tempfile.c handle the whole lifetime,
and have it put all live tempfiles on the heap, and free them when
they're deactivated. That means our creation signature becomes more
like:

  struct tempfile *create_tempfile(const char *path);

and delete_tempfile() actually calls free() on it (though we'd probably
want to skip the free() from a signal handler for the usual reasons).

I don't have a patch for that, but just food for thought while exploring
the implications of my earlier suggestion.

---
diff --git a/builtin/update-index.c b/builtin/update-index.c
index d955cd56b3..bf7420b808 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -917,7 +917,7 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
struct refresh_params refresh_args = {0, &has_errors};
int lock_error = 0;
int split_index = -1;
-   struct lock_file *lock_file;
+   struct lock_file lock_file = LOCK_INIT;
struct parse_opt_ctx_t ctx;
strbuf_getline_fn getline_fn;
int parseopt_state = PARSE_OPT_UNKNOWN;
@@ -1016,11 +1016,8 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
 
git_config(git_default_config, NULL);
 
-   /* We can't free this memory, it becomes part of a linked list parsed 
atexit() */
-   lock_file = xcalloc(1, sizeof(struct lock_file));
-
/* we will diagnose later if it turns out that we need to update it */
-   newfd = hold_locked_index(lock_file, 0);
+   newfd = hold_locked_index(&lock_file, 0);
if (newfd < 0)
lock_error = errno;
 
@@ -1155,11 +1152,11 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
exit(128);
unable_to_lock_die(get_index_file(), lock_error);
}
-   if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
+   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
die("Unable to write new index file");
}
 
-   rollback_lock_file(lock_file);
+   rollback_lock_file(&lock_file);
 
return has_errors ? 1 : 0;
 }
diff --git a/cache-tree.c b/cache-tree.c
index 2440d1dc89..440f92aeca 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -603,19 +603,15 @@ static struct cache_tree *cache_tree_find(struct 
cache_tree *it, const char *pat
 int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, 
const char *index_path, int flags, const char *prefix)
 {
int entries, was_valid, newfd;
-   struct lock_file *lock_file;
+   struct lock_file lock_file = LOCK_INIT;
 
-   /*
-* We can't free this memory, it becomes part of a linked list
-* parsed atexit()
-*/
-   lock_file = xcalloc(1, sizeof(struct lock_file));
-
-   newfd = hold_lock_file_for_update(lock_file, index_path, 
LOCK_DIE_ON_ERROR);
+   newfd = hold_lock_file_for_update(&lock_file, index_path, 
LOCK_DIE_ON_ERROR);
 
entries = read_index_from(index_state, index_path);
-   if (entries < 0)
+   if (entries < 0) {
+   rollback_lock_file(&lock_file);
return WRITE_TREE_UNREADABLE_INDEX;
+   }
if (flags & WRITE_TREE_IGNORE_CACHE_TREE)

Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 05:48:09PM -0400, Jeff King wrote:

> On Tue, Aug 29, 2017 at 09:22:08PM +0200, Martin Ågren wrote:
> 
> > One take-away for me from this thread is that memory-leak-plugging seems
> > to be appreciated, i.e., it's not just an academic problem. I think I'll
> > look into it some more, i.e., slowly pursue option 2. :-) That might help
> > give some insights on how to identify interesting leaks.
> 
> Yes, I think if it's tractable that's the best path forward. It's not
> clear to me how many of the things valgrind reports are false positives
> (and in what way). Certainly the lock_file patch I just posted seems
> like something worth addressing, even if we _tend_ to only leak a small
> number of such structs per program invocation. It's not a problem now,
> but it's a sign of a bad pattern.

I wanted to see what it would take to get t to pass with "definite"
leak-checking on, as in:

diff --git a/t/valgrind/valgrind.sh b/t/valgrind/valgrind.sh
index 669ebaf68b..5f01c06280 100755
--- a/t/valgrind/valgrind.sh
+++ b/t/valgrind/valgrind.sh
@@ -10,7 +10,7 @@ test-*)
;;
 esac
 
-TOOL_OPTIONS='--leak-check=no'
+TOOL_OPTIONS='--leak-check=full --errors-for-leak-kinds=definite'
 
 test -z "$GIT_VALGRIND_ENABLED" &&
 exec "$program" "$@"

The results gave an interesting overview of the problems I'd expect to
face for the rest of the tests. I started with the config lock_file fix
I sent earlier, but we needed a few more.

This one in cmd_update_index() can become static in the same way:

diff --git a/builtin/update-index.c b/builtin/update-index.c
index d955cd56b3..3bff7de720 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -908,6 +908,7 @@ static int reupdate_callback(struct parse_opt_ctx_t *ctx,
 
 int cmd_update_index(int argc, const char **argv, const char *prefix)
 {
+   static struct lock_file lock_file;
int newfd, entries, has_errors = 0, nul_term_line = 0;
enum uc_mode untracked_cache = UC_UNSPECIFIED;
int read_from_stdin = 0;
@@ -917,7 +918,6 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
struct refresh_params refresh_args = {0, &has_errors};
int lock_error = 0;
int split_index = -1;
-   struct lock_file *lock_file;
struct parse_opt_ctx_t ctx;
strbuf_getline_fn getline_fn;
int parseopt_state = PARSE_OPT_UNKNOWN;
@@ -1016,11 +1016,8 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
 
git_config(git_default_config, NULL);
 
-   /* We can't free this memory, it becomes part of a linked list parsed 
atexit() */
-   lock_file = xcalloc(1, sizeof(struct lock_file));
-
/* we will diagnose later if it turns out that we need to update it */
-   newfd = hold_locked_index(lock_file, 0);
+   newfd = hold_locked_index(&lock_file, 0);
if (newfd < 0)
lock_error = errno;
 
@@ -1155,11 +1152,11 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
exit(128);
unable_to_lock_die(get_index_file(), lock_error);
}
-   if (write_locked_index(&the_index, lock_file, COMMIT_LOCK))
+   if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
die("Unable to write new index file");
}
 
-   rollback_lock_file(lock_file);
+   rollback_lock_file(&lock_file);
 
return has_errors ? 1 : 0;
 }

But there's another one in write_index_as_tree(), which is less
obviously a candidate for turning into a single static variable. We
might plausibly have multiple indices. I hacked around it with:

diff --git a/cache-tree.c b/cache-tree.c
index 2440d1dc89..f2a5aa138f 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -603,7 +603,7 @@ static struct cache_tree *cache_tree_find(struct cache_tree 
*it, const char *pat
 int write_index_as_tree(unsigned char *sha1, struct index_state *index_state, 
const char *index_path, int flags, const char *prefix)
 {
int entries, was_valid, newfd;
-   struct lock_file *lock_file;
+   static struct lock_file *lock_file;
 
/*
 * We can't free this memory, it becomes part of a linked list

which means that if we call this function once per program we'll keep a
single static pointer, and valgrind will consider it "reachable". But if
we call it multiple times, the earlier allocations really will leak as
before. So it's definitely a
hack, though it was enough for t. I think I'd favor loosening the
"you can never free locks" requirement instead, which makes this problem
just go away (we could even declare it as a stack variable then).
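
To spell out why the static pointer only hides the problem from
valgrind rather than fixing it, here is a tiny stand-alone illustration
(simplified: it ignores the tempfile list entirely and only shows how
valgrind classifies the allocations):

  #include <stdlib.h>

  static void *static_slot;       /* stands in for the static lock_file pointer */

  int main(void)
  {
          int i;

          /* Pretend write_index_as_tree() runs three times in one process. */
          for (i = 0; i < 3; i++)
                  static_slot = calloc(1, 128);

          /*
           * At exit, only the last calloc() is "still reachable" through
           * static_slot; the first two are reported as lost, just as the
           * earlier lock_files would be.
           */
          return 0;
  }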

Next up is the fact that setup_git_env() happily overwrites its own
pointers without freeing them if it's called twice. I expect this will
be a reasonably common problem for things like config or option parsing
where we might see multiple values. In the simple cases, we just need to
make sure we do:

  free(foo);
  foo 
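
For illustration, a hedged sketch of that free-before-overwrite pattern
(the names here are made up; git's own code would use xstrdup() and its
real environment variables):

  #include <stdlib.h>
  #include <string.h>

  static char *cached_value;      /* hypothetical lazily-set pointer */

  /* Safe to call repeatedly: release the old value before overwriting it. */
  static void set_cached_value(const char *value)
  {
          free(cached_value);
          cached_value = value ? strdup(value) : NULL;
  }

  int main(void)
  {
          set_cached_value(".git");
          set_cached_value("../other/.git"); /* second call no longer leaks */
          set_cached_value(NULL);
          return 0;
  }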

Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Jeff King
On Wed, Aug 30, 2017 at 12:31:47AM -0400, Jeff King wrote:

> > It was surprisingly hard trying to get that code to do the right thing,
> > non-racily, in every error path. Since `remove_tempfiles()` can be
> > called any time (even from a signal handler), the linked list of
> > `tempfile` objects has to be kept valid at all times but can't use
> > mutexes. I didn't have the energy to keep going and make the lock
> > objects freeable.
> > 
> > I suppose the task could be made a bit easier by using `sigprocmask(2)`
> > or `pthread_sigmask(3)` to make its running environment a little bit
> > less hostile.
> 
> I think there are really two levels of carefulness here:
> 
>   1. Avoiding complicated things during a signal handler that may rely
>  on having a sane state from the rest of the program (e.g.,
>  half-formed entries, stdio locks, etc).
> 
>   2. Being truly race-free in the face of a signal arriving while we're
>  running arbitrary code that might have a tempfile struct in a funny
>  state.
> 
> I feel like right now we meet (1) and not (2). But I think if we keep to
> that lower bar of (1), it might not be that bad. We're assuming now that
> there's no race on the tempfile->active flag, for instance. We could
> probably make a similar assumption about putting items onto or taking
> them off of a linked list (it's not really atomic, but a single pointer
> assignment is probably "atomic enough" for our purposes).

Something like this, which AFAICT is about as safe as the existing code
in its list manipulation. It retains the "active" flag as an additional
check which I think isn't strictly necessary, but potentially catches
some logic errors.

The strbuf_reset() calls become strbuf_release(), since we're promising
the caller that they could now free the struct (or let it go out of
scope) if they chose. Probably during a signal handler we should skip
that (we know the struct is off the list and non-active at that point,
but we could possibly hit libc's free() mutex).

diff --git a/tempfile.c b/tempfile.c
index 6843710670..a7d964ebf8 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -69,7 +69,6 @@ static void remove_tempfiles(int skip_fclose)
tempfile_list->fp = NULL;
delete_tempfile(tempfile_list);
}
-   tempfile_list = tempfile_list->next;
}
 }
 
@@ -85,39 +84,54 @@ static void remove_tempfiles_on_signal(int signo)
raise(signo);
 }
 
-/*
- * Initialize *tempfile if necessary and add it to tempfile_list.
- */
-static void prepare_tempfile_object(struct tempfile *tempfile)
+static void init_tempfile(struct tempfile *tempfile)
+{
+   tempfile->fd = -1;
+   tempfile->fp = NULL;
+   tempfile->active = 0;
+   tempfile->owner = 0;
+   tempfile->next = NULL;
+   strbuf_init(&tempfile->filename, 0);
+}
+
+static void activate_tempfile(struct tempfile *tempfile)
 {
-   if (!tempfile_list) {
-   /* One-time initialization */
+   static volatile int initialized;
+
+   if (!initialized) {
sigchain_push_common(remove_tempfiles_on_signal);
atexit(remove_tempfiles_on_exit);
+   initialized = 1;
}
 
if (tempfile->active)
-   die("BUG: prepare_tempfile_object called for active object");
-   if (!tempfile->on_list) {
-   /* Initialize *tempfile and add it to tempfile_list: */
-   tempfile->fd = -1;
-   tempfile->fp = NULL;
-   tempfile->active = 0;
-   tempfile->owner = 0;
-   strbuf_init(&tempfile->filename, 0);
-   tempfile->next = tempfile_list;
-   tempfile_list = tempfile;
-   tempfile->on_list = 1;
-   } else if (tempfile->filename.len) {
-   /* This shouldn't happen, but better safe than sorry. */
-   die("BUG: prepare_tempfile_object called for improperly-reset 
object");
+   die("BUG: activate_tempfile called for active object");
+
+   tempfile->next = tempfile_list;
+   tempfile_list = tempfile;
+   tempfile->active = 1;
+}
+
+static void deactivate_tempfile(struct tempfile *tempfile)
+{
+   struct tempfile *volatile *p;
+
+   if (!tempfile->active)
+   return;
+
+   tempfile->active = 0;
+   for (p = &tempfile_list; *p; p = &(*p)->next) {
+   if (*p == tempfile) {
+   *p = tempfile->next;
+   break;
+   }
}
 }
 
 /* Make sure errno contains a meaningful value on error */
 int create_tempfile(struct tempfile *tempfile, const char *path)
 {
-   prepare_tempfile_object(tempfile);
+   init_tempfile(tempfile);
 
strbuf_add_absolute_path(&tempfile->filename, path);
tempfile->fd = open(tempfile->filename.buf,
@@ -127,11 +141,11 @@ int create_tempfile(struct tempfile *tempfile, const char 
*path)
tempfile->fd = 

Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Michael Haggerty
On Wed, Aug 30, 2017 at 6:31 AM, Jeff King  wrote:
> On Wed, Aug 30, 2017 at 05:25:18AM +0200, Michael Haggerty wrote:
>> It was surprisingly hard trying to get that code to do the right thing,
>> non-racily, in every error path. Since `remove_tempfiles()` can be
>> called any time (even from a signal handler), the linked list of
>> `tempfile` objects has to be kept valid at all times but can't use
>> mutexes. I didn't have the energy to keep going and make the lock
>> objects freeable.
>>
>> I suppose the task could be made a bit easier by using `sigprocmask(2)`
>> or `pthread_sigmask(3)` to make its running environment a little bit
>> less hostile.
>
> I think there are really two levels of carefulness here:
>
>   1. Avoiding complicated things during a signal handler that may rely
>  on having a sane state from the rest of the program (e.g.,
>  half-formed entries, stdio locks, etc).
>
>   2. Being truly race-free in the face of a signal arriving while we're
>  running arbitrary code that might have a tempfile struct in a funny
>  state.
>
> I feel like right now we meet (1) and not (2). But I think if we keep to
> that lower bar of (1), it might not be that bad. We're assuming now that
> there's no race on the tempfile->active flag, for instance. We could
> probably make a similar assumption about putting items onto or taking
> them off of a linked list (it's not really atomic, but a single pointer
> assignment is probably "atomic enough" for our purposes).
>
> Or I dunno. There's a lot of "volatile" modifiers sprinkled around.
> Maybe those are enough to give us (2) as well (though in that case, I
> think we'd still be as OK with the list manipulation as we are with the
> active flag manipulation).

I did in fact strive for both (1) and (2), though I know that there
are a couple of short races remaining in the code (plus who knows how
many bugs).

Actually, I think you're right that a single "probably-atomic" pointer
assignment would be enough to remove an entry from the list safely. I
was thinking about what it would take to make the list thread-safe
(which it currently is not, but probably doesn't need to be(?)).
Modifying the linked list is not that difficult if you assume that
there is only a single thread modifying the list at any given time.

Michael


Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Jeff King
On Wed, Aug 30, 2017 at 05:25:18AM +0200, Michael Haggerty wrote:

> >>> In the long run we may want to drop the "tempfiles must remain forever"
> >>> rule. This is certainly not the first time it has caused confusion or
> >>> leaks. And I don't think it's a fundamental issue, just the way the code
> >>> is written. But in the interim, this fix is probably worth doing.
> > 
> > The main issue is that the caller needs to make sure they're removed
> > from the list (via commit or rollback) before being freed.
> > 
> > As far as I know anyway. This restriction dates back to the very early
> > days of the lockfile code and has been carried through the various
> > tempfile-cleanup refactorings over the years (mostly because each was
> > afraid to make functional changes).
> > 
> > +cc Michael, who did most comprehensive cleanup of that code.
> 
> It was surprisingly hard trying to get that code to do the right thing,
> non-racily, in every error path. Since `remove_tempfiles()` can be
> called any time (even from a signal handler), the linked list of
> `tempfile` objects has to be kept valid at all times but can't use
> mutexes. I didn't have the energy to keep going and make the lock
> objects freeable.
> 
> I suppose the task could be made a bit easier by using `sigprocmask(2)`
> or `pthread_sigmask(3)` to make its running environment a little bit
> less hostile.

I think there are really two levels of carefulness here:

  1. Avoiding complicated things during a signal handler that may rely
 on having a sane state from the rest of the program (e.g.,
 half-formed entries, stdio locks, etc).

  2. Being truly race-free in the face of a signal arriving while we're
 running arbitrary code that might have a tempfile struct in a funny
 state.

I feel like right now we meet (1) and not (2). But I think if we keep to
that lower bar of (1), it might not be that bad. We're assuming now that
there's no race on the tempfile->active flag, for instance. We could
probably make a similar assumption about putting items onto or taking
them off of a linked list (it's not really atomic, but a single pointer
assignment is probably "atomic enough" for our purposes).

Or I dunno. There's a lot of "volatile" modifiers sprinkled around.
Maybe those are enough to give us (2) as well (though in that case, I
think we'd still be as OK with the list manipulation as we are with the
active flag manipulation).

-Peff


Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Michael Haggerty
On 08/29/2017 09:12 PM, Jeff King wrote:
> On Tue, Aug 29, 2017 at 12:09:28PM -0700, Brandon Williams wrote:
> 
>>> -- >8 --
>>> Subject: [PATCH] config: use a static lock_file struct
>>>
>>> When modifying git config, we xcalloc() a struct lock_file
>>> but never free it. This is necessary because the tempfile
>>> code (upon which the locking code is built) requires that
>>> the resulting struct remain valid through the life of the
>>> program. However, it also confuses leak-checkers like
>>> valgrind because only the inner "struct tempfile" is still
>>> reachable; no pointer to the outer lock_file is kept.
>>
>> Is this just due to a limitation in the tempfile code?  Would it be
>> possible to improve the tempfile code such that we don't need to require
>> that a tempfile, once created, is required to exist for the remaining
>> life of the program?
> 
> Yes. Like I wrote below:
> 
>>> ---
>>> In the long run we may want to drop the "tempfiles must remain forever"
>>> rule. This is certainly not the first time it has caused confusion or
>>> leaks. And I don't think it's a fundamental issue, just the way the code
>>> is written. But in the interim, this fix is probably worth doing.
> 
> The main issue is that the caller needs to make sure they're removed
> from the list (via commit or rollback) before being freed.
> 
> As far as I know anyway. This restriction dates back to the very early
> days of the lockfile code and has been carried through the various
> tempfile-cleanup refactorings over the years (mostly because each was
> afraid to make functional changes).
> 
> +cc Michael, who did most comprehensive cleanup of that code.

It was surprisingly hard trying to get that code to do the right thing,
non-racily, in every error path. Since `remove_tempfiles()` can be
called any time (even from a signal handler), the linked list of
`tempfile` objects has to be kept valid at all times but can't use
mutexes. I didn't have the energy to keep going and make the lock
objects freeable.

I suppose the task could be made a bit easier by using `sigprocmask(2)`
or `pthread_sigmask(3)` to make its running environment a little bit
less hostile.

Michael


Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:

> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.

Great, thank you for checking.

> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://" worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.

The key thing for ssh is not whether you support passing environment
variables. It's whether you quietly ignore unknown variables rather than
cutting off the connection.

To support the v2 protocol you'd need to pass the new variables, but
you'd also need to modify your server to actually do something useful
with them anyway. At this point we're mostly concerned with whether we
can safely pass the variables to current implementations unconditionally
and get a reasonable outcome.

To be honest, I'm not all that worried about VSTS either way. It's a
centralized service which will likely get v2 extensions implemented in a
timely manner (once they exist). I'm much more concerned about
shrink-wrap implementations deployed in the wild, which may hang around
without being upgraded for years. If v2-aware clients send requests that
cause those old implementations to choke, users won't be happy.

-Peff


Re: [PATCH 04/10] packed_delete_refs(): implement method

2017-08-29 Thread Michael Haggerty
On 08/29/2017 08:07 PM, Brandon Williams wrote:
> On 08/29, Michael Haggerty wrote:
>> [...]
>> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
>> index d19b3bfba5..83a088118f 100644
>> --- a/refs/packed-backend.c
>> +++ b/refs/packed-backend.c
>> @@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct 
>> ref_store *ref_store,
>>  static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
>>   struct string_list *refnames, unsigned int flags)
>>  {
>> -die("BUG: not implemented yet");
>> +struct packed_ref_store *refs =
>> +packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
>> +struct strbuf err = STRBUF_INIT;
>> +struct ref_transaction *transaction;
>> +struct string_list_item *item;
>> +int ret;
>> +
>> +(void)refs; /* We need the check above, but don't use the variable */
> 
> Can't say I've seen a line like this before. Is the intent just to mark
> that this variable won't be used, like you mention in the comment?


The `(void)refs;` part tells the compiler not to complain about a
variable that is defined but never used.

So why define the variable in the first place? I could instead do

(void)packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");

My reason (which people might disagree with) is that I like to keep this
boilerplate consistent from method to method. To switch to the second
variant, not only would the line look different, but it would also have
to be moved down several lines, past the other variables' declarations,
to avoid the dreaded "ISO C90 forbids mixed declarations and code".

I'd prefer to be able to mix declarations and code, but the project's
style prohibits it.
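
For readers who haven't run into the idiom, here is a small stand-alone
illustration (hypothetical names, not the refs code) of how the
`(void)` cast keeps the side-effecting check while counting as a use of
the variable, so the compiler's unused-variable warning stays quiet:

  #include <stdio.h>
  #include <stdlib.h>

  struct store { int writable; };

  /* Stand-in for packed_downcast(): performs a check and returns the store. */
  static struct store *downcast_check(struct store *s, const char *caller)
  {
          if (!s || !s->writable) {
                  fprintf(stderr, "BUG: %s requires a writable store\n", caller);
                  exit(1);
          }
          return s;
  }

  static int delete_refs(struct store *ref_store)
  {
          struct store *refs = downcast_check(ref_store, "delete_refs");
          int ret = 0;

          (void)refs; /* keep the check above, but mark the variable as used */

          /* ... the rest of the method would go here ... */
          return ret;
  }

  int main(void)
  {
          struct store s = { 1 };

          return delete_refs(&s);
  }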

Michael


Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list

2017-08-29 Thread Michael Haggerty
v3 looks good to me. Thanks!

Reviewed-by: Michael Haggerty 

Michael


Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 09:22:08PM +0200, Martin Ågren wrote:

> One take-away for me from this thread is that memory-leak-plugging seems
> to be appreciated, i.e., it's not just an academic problem. I think I'll
> look into it some more, i.e., slowly pursue option 2. :-) That might help
> give some insights on how to identify interesting leaks.

Yes, I think if it's tractable that's the best path forward. It's not
clear to me how many of the things valgrind reports are false positives
(and in what way). Certainly the lock_file patch I just posted seems
like something worth addressing, even if we _tend_ to only leak a small
number of such structs per program invocation. It's not a problem now,
but it's a sign of a bad pattern.

-Peff


Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Brandon Williams
On 08/29, Jeff Hostetler wrote:
> 
> 
> On 8/25/2017 1:35 PM, Jonathan Nieder wrote:
> >Hi,
> >
> >Jeff King wrote:
> >>On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> >
> >>>Another version of Git's wire protocol is a topic that has been discussed 
> >>>and
> >>>attempted by many in the community over the years.  The biggest challenge, 
> >>>as
> >>>far as I understand, has been coming up with a transition plan to using 
> >>>the new
> >>>server without breaking existing clients and servers.  As such this RFC is
> >>>really only concerned with solidifying a transition plan.  Once it has been
> >>>decided how we can transition to a new protocol we can get into deciding 
> >>>what
> >>>this new protocol would look like (though it would obviously eliminate the 
> >>>ref
> >>>advertisement ;).
> >>
> >
> >>I don't think libgit2 implements the server side. That leaves probably
> >>JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> >>and GitLab use.
> >
> >I'd be happy if someone tests the patches against those. :)
> 
> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.
> 
> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://" worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.
> 
> Jeff
> 

Thanks for checking on this, I really appreciate it.  Please let me know
if anything I haven't thought of becomes an issue.

I'm currently working on getting these patches into a more polished
state to be used (as discussed elsewhere on this thread) as a precursor
to an actual v2.

-- 
Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Jeff Hostetler



On 8/25/2017 1:35 PM, Jonathan Nieder wrote:

Hi,

Jeff King wrote:

On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:



Another version of Git's wire protocol is a topic that has been discussed and
attempted by many in the community over the years.  The biggest challenge, as
far as I understand, has been coming up with a transition plan to using the new
server without breaking existing clients and servers.  As such this RFC is
really only concerned with solidifying a transition plan.  Once it has been
decided how we can transition to a new protocol we can get into deciding what
this new protocol would look like (though it would obviously eliminate the ref
advertisement ;).





I don't think libgit2 implements the server side. That leaves probably
JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
and GitLab use.


I'd be happy if someone tests the patches against those. :)


I just wanted to jump in here and say I've done some initial
testing of this against VSTS and so far it seems fine.  And yes,
we have a custom git server.

VSTS doesn't support the "git://" protocol, so the double-null trick
isn't an issue for us.  But "https://" worked just fine.  I'm still
asking around internally whether we support passing SSH environment
variables.

Jeff



Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()

2017-08-29 Thread Martin Ågren
On 29 August 2017 at 20:53, Jeff King  wrote:
> On Tue, Aug 29, 2017 at 06:51:52PM +0100, Lars Schneider wrote:
>> What if we run a few selected tests with valgrind and count all files that
>> valgrind mentions (a single leak has multiple file mentions because of
>> the stack trace and other leak indicators). We record these counts and let
>> TravisCI scream if one of the numbers increases.
>>
>> I wonder how stable/fragile such a metric would be as a simple refactoring
>> could easily change these numbers. Below I ran valgrind on t5510 before and
>> after Martin's patch. The diff below clearly shows the pkt-line leak.
>>
>> Would it make sense to pursue something like this in TravisCI to avoid
>> "pkt-line" kind of leaks in the future?
>
> I don't think that would work, because simply adding new tests would
> bump the leak count, without the code actually growing worse.
>
> But think about your strategy for a moment: what you're really trying to
> do is say "these existing leaks are OK because they are too many for us
> to count, but we want to make sure we don't add _new_ ones". We already
> have two techniques for distinguishing old ones from new ones:
>
>   1. Mark existing ones with valgrind suppressions so they do not warn
>  at all.
>
>   2. Fix them, so that the "existing" count drops to zero.
>
> Option (2), of course, has the added side effect that it's actually
> fixing potential problems. :)

One idea would be to report that "t1234 used to complete without leaks,
but now there is a leak". Of course, the leak could be triggered by some
"irrelevant" setup step that was just added and not by the functionality
that is actually tested. But maybe then the reaction might be "oh, let's
fix this one leak so that this test is leak-free again". If that then
also reduces the leakiness of some other tests, good. That might help
against the "there are too many leaks, where should I start?"-effect.

Or similarly "This stack-trace was reported. I haven't seen it before."
It could be a refactoring, it could be an old leak that hasn't been
triggered in that way before, or it could be a new leak.

But to be honest, I think that only works -- in a psychological sense --
once the number of leaks is small enough, however much that is.

One take-away for me from this thread is that memory-leak-plugging seems
to be appreciated, i.e., it's not just an academic problem. I think I'll
look into it some more, i.e., slowly pursue option 2. :-) That might help
give some insights on how to identify interesting leaks.

Martin


Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 12:09:28PM -0700, Brandon Williams wrote:

> > -- >8 --
> > Subject: [PATCH] config: use a static lock_file struct
> > 
> > When modifying git config, we xcalloc() a struct lock_file
> > but never free it. This is necessary because the tempfile
> > code (upon which the locking code is built) requires that
> > the resulting struct remain valid through the life of the
> > program. However, it also confuses leak-checkers like
> > valgrind because only the inner "struct tempfile" is still
> > reachable; no pointer to the outer lock_file is kept.
> 
> Is this just due to a limitation in the tempfile code?  Would it be
> possible to improve the tempfile code such that we don't need to require
> that a tempfile, once created, is required to exist for the remaining
> life of the program?

Yes. Like I wrote below:

> > ---
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.

The main issue is that the caller needs to make sure they're removed
from the list (via commit or rollback) before being freed.

As far as I know anyway. This restriction dates back to the very early
days of the lockfile code and has been carried through the various
tempfile-cleanup refactorings over the years (mostly because each was
afraid to make functional changes).

+cc Michael, who did most comprehensive cleanup of that code.

-Peff


Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Brandon Williams
On 08/29, Brandon Williams wrote:
> On 08/29, Jeff King wrote:
> > On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:
> > 
> > > It looks like the config code has a minor-ish leak. Patch to follow.
> > 
> > Here it is.
> > 
> > -- >8 --
> > Subject: [PATCH] config: use a static lock_file struct
> > 
> > When modifying git config, we xcalloc() a struct lock_file
> > but never free it. This is necessary because the tempfile
> > code (upon which the locking code is built) requires that
> > the resulting struct remain valid through the life of the
> > program. However, it also confuses leak-checkers like
> > valgrind because only the inner "struct tempfile" is still
> > reachable; no pointer to the outer lock_file is kept.
> 
> Is this just due to a limitation in the tempfile code?  Would it be
> possible to improve the tempfile code such that we don't need to require
> that a tempfile, once created, is required to exist for the remaining
> life of the program?
> 
> > 
> > Other code paths solve this by using a single static lock
> > struct. We can do the same here, because we know that we'll
> > only lock and modify one config file at a time (and
> > assertions within the lockfile code will ensure that this
> > remains the case).
> > 
> > That removes a real leak (when we fail to free the struct
> > after locking fails) as well as removes the valgrind false
> > positive. It also means that doing N sequential
> > config-writes will use a constant amount of memory, rather
> > than leaving stale lock_files for each.
> > 
> > Note that since "lock" is no longer a pointer, it can't be
> > NULL anymore. But that's OK. We used that feature only to
> > avoid calling rollback_lock_file() on an already-committed
> > lock. Since the lockfile code keeps its own "active" flag,
> > it's a noop to rollback an inactive lock, and we don't have
> > to worry about this ourselves.
> > 
> > Signed-off-by: Jeff King 
> > ---
> > In the long run we may want to drop the "tempfiles must remain forever"
> > rule. This is certainly not the first time it has caused confusion or
> > leaks. And I don't think it's a fundamental issue, just the way the code
> > is written. But in the interim, this fix is probably worth doing.

I didn't read far enough apparently :)  I took a look at this once and
found that in order to do this we would just need to be careful in
modifying a list of tempfiles.

> > 
> >  config.c | 24 +++-
> >  1 file changed, 7 insertions(+), 17 deletions(-)
> > 
> > diff --git a/config.c b/config.c
> > index d0d8ce823a..1603f96e40 100644
> > --- a/config.c
> > +++ b/config.c
> > @@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char 
> > *config_filename,
> >  {
> > int fd = -1, in_fd = -1;
> > int ret;
> > -   struct lock_file *lock = NULL;
> > +   static struct lock_file lock;
> > char *filename_buf = NULL;
> > char *contents = NULL;
> > size_t contents_sz;
> > @@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char 
> > *config_filename,
> >  * The lock serves a purpose in addition to locking: the new
> >  * contents of .git/config will be written into it.
> >  */
> > -   lock = xcalloc(1, sizeof(struct lock_file));
> > -   fd = hold_lock_file_for_update(lock, config_filename, 0);
> > +   fd = hold_lock_file_for_update(&lock, config_filename, 0);
> > if (fd < 0) {
> > error_errno("could not lock config file %s", config_filename);
> > free(store.key);
> > @@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char 
> > *config_filename,
> > close(in_fd);
> > in_fd = -1;
> >  
> > -   if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
> > -   error_errno("chmod on %s failed", 
> > get_lock_file_path(lock));
> > +   if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
> > +   error_errno("chmod on %s failed", 
> > get_lock_file_path(&lock));
> > ret = CONFIG_NO_WRITE;
> > goto out_free;
> > }
> > @@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const 
> > char *config_filename,
> > contents = NULL;
> > }
> >  
> > -   if (commit_lock_file(lock) < 0) {
> > +   if (commit_lock_file(&lock) < 0) {
> > error_errno("could not write config file %s", config_filename);
> > ret = CONFIG_NO_WRITE;
> > -   lock = NULL;
> > goto out_free;
> > }
> >  
> > -   /*
> > -* lock is committed, so don't try to roll it back below.
> > -* NOTE: Since lockfile.c keeps a linked list of all created
> > -* lock_file structures, it isn't safe to free(lock).  It's
> > -* better to just leave it hanging around.
> > -*/
> > -   lock = NULL;
> > ret = 0;
> >  
> > /* Invalidate the config cache */
> > git_config_clear();
> >  
> >  out_free:
> > -   if 

Re: [PATCH] config: use a static lock_file struct

2017-08-29 Thread Brandon Williams
On 08/29, Jeff King wrote:
> On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:
> 
> > It looks like the config code has a minor-ish leak. Patch to follow.
> 
> Here it is.
> 
> -- >8 --
> Subject: [PATCH] config: use a static lock_file struct
> 
> When modifying git config, we xcalloc() a struct lock_file
> but never free it. This is necessary because the tempfile
> code (upon which the locking code is built) requires that
> the resulting struct remain valid through the life of the
> program. However, it also confuses leak-checkers like
> valgrind because only the inner "struct tempfile" is still
> reachable; no pointer to the outer lock_file is kept.

Is this just due to a limitation in the tempfile code?  Would it be
possible to improve the tempfile code such that we don't need to require
that a tempfile, once created, is required to exist for the remaining
life of the program?

> 
> Other code paths solve this by using a single static lock
> struct. We can do the same here, because we know that we'll
> only lock and modify one config file at a time (and
> assertions within the lockfile code will ensure that this
> remains the case).
> 
> That removes a real leak (when we fail to free the struct
> after locking fails) as well as removes the valgrind false
> positive. It also means that doing N sequential
> config-writes will use a constant amount of memory, rather
> than leaving stale lock_files for each.
> 
> Note that since "lock" is no longer a pointer, it can't be
> NULL anymore. But that's OK. We used that feature only to
> avoid calling rollback_lock_file() on an already-committed
> lock. Since the lockfile code keeps its own "active" flag,
> it's a noop to rollback an inactive lock, and we don't have
> to worry about this ourselves.
> 
> Signed-off-by: Jeff King 
> ---
> In the long run we may want to drop the "tempfiles must remain forever"
> rule. This is certainly not the first time it has caused confusion or
> leaks. And I don't think it's a fundamental issue, just the way the code
> is written. But in the interim, this fix is probably worth doing.
> 
>  config.c | 24 +++-
>  1 file changed, 7 insertions(+), 17 deletions(-)
> 
> diff --git a/config.c b/config.c
> index d0d8ce823a..1603f96e40 100644
> --- a/config.c
> +++ b/config.c
> @@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
>  {
>   int fd = -1, in_fd = -1;
>   int ret;
> - struct lock_file *lock = NULL;
> + static struct lock_file lock;
>   char *filename_buf = NULL;
>   char *contents = NULL;
>   size_t contents_sz;
> @@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
>* The lock serves a purpose in addition to locking: the new
>* contents of .git/config will be written into it.
>*/
> - lock = xcalloc(1, sizeof(struct lock_file));
> - fd = hold_lock_file_for_update(lock, config_filename, 0);
> + fd = hold_lock_file_for_update(&lock, config_filename, 0);
>   if (fd < 0) {
>   error_errno("could not lock config file %s", config_filename);
>   free(store.key);
> @@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
>   close(in_fd);
>   in_fd = -1;
>  
> - if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
> - error_errno("chmod on %s failed", 
> get_lock_file_path(lock));
> + if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
> + error_errno("chmod on %s failed", 
> get_lock_file_path(&lock));
>   ret = CONFIG_NO_WRITE;
>   goto out_free;
>   }
> @@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
>   contents = NULL;
>   }
>  
> - if (commit_lock_file(lock) < 0) {
> + if (commit_lock_file(&lock) < 0) {
>   error_errno("could not write config file %s", config_filename);
>   ret = CONFIG_NO_WRITE;
> - lock = NULL;
>   goto out_free;
>   }
>  
> - /*
> -  * lock is committed, so don't try to roll it back below.
> -  * NOTE: Since lockfile.c keeps a linked list of all created
> -  * lock_file structures, it isn't safe to free(lock).  It's
> -  * better to just leave it hanging around.
> -  */
> - lock = NULL;
>   ret = 0;
>  
>   /* Invalidate the config cache */
>   git_config_clear();
>  
>  out_free:
> - if (lock)
> - rollback_lock_file(lock);
> + rollback_lock_file(&lock);
>   free(filename_buf);
>   if (contents)
>   munmap(contents, contents_sz);
> @@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char 
> *config_filename,
>   return ret;
>  
>  write_err_out:
> - ret = 

[PATCH] config: use a static lock_file struct

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 02:53:41PM -0400, Jeff King wrote:

> It looks like the config code has a minor-ish leak. Patch to follow.

Here it is.

-- >8 --
Subject: [PATCH] config: use a static lock_file struct

When modifying git config, we xcalloc() a struct lock_file
but never free it. This is necessary because the tempfile
code (upon which the locking code is built) requires that
the resulting struct remain valid through the life of the
program. However, it also confuses leak-checkers like
valgrind because only the inner "struct tempfile" is still
reachable; no pointer to the outer lock_file is kept.

Other code paths solve this by using a single static lock
struct. We can do the same here, because we know that we'll
only lock and modify one config file at a time (and
assertions within the lockfile code will ensure that this
remains the case).

That removes a real leak (when we fail to free the struct
after locking fails) as well as removes the valgrind false
positive. It also means that doing N sequential
config-writes will use a constant amount of memory, rather
than leaving stale lock_files for each.

Note that since "lock" is no longer a pointer, it can't be
NULL anymore. But that's OK. We used that feature only to
avoid calling rollback_lock_file() on an already-committed
lock. Since the lockfile code keeps its own "active" flag,
it's a noop to rollback an inactive lock, and we don't have
to worry about this ourselves.

Signed-off-by: Jeff King 
---
In the long run we may want to drop the "tempfiles must remain forever"
rule. This is certainly not the first time it has caused confusion or
leaks. And I don't think it's a fundamental issue, just the way the code
is written. But in the interim, this fix is probably worth doing.

 config.c | 24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/config.c b/config.c
index d0d8ce823a..1603f96e40 100644
--- a/config.c
+++ b/config.c
@@ -2450,7 +2450,7 @@ int git_config_set_multivar_in_file_gently(const char 
*config_filename,
 {
int fd = -1, in_fd = -1;
int ret;
-   struct lock_file *lock = NULL;
+   static struct lock_file lock;
char *filename_buf = NULL;
char *contents = NULL;
size_t contents_sz;
@@ -2469,8 +2469,7 @@ int git_config_set_multivar_in_file_gently(const char 
*config_filename,
 * The lock serves a purpose in addition to locking: the new
 * contents of .git/config will be written into it.
 */
-   lock = xcalloc(1, sizeof(struct lock_file));
-   fd = hold_lock_file_for_update(lock, config_filename, 0);
+   fd = hold_lock_file_for_update(&lock, config_filename, 0);
if (fd < 0) {
error_errno("could not lock config file %s", config_filename);
free(store.key);
@@ -2583,8 +2582,8 @@ int git_config_set_multivar_in_file_gently(const char 
*config_filename,
close(in_fd);
in_fd = -1;
 
-   if (chmod(get_lock_file_path(lock), st.st_mode & 07777) < 0) {
-   error_errno("chmod on %s failed", 
get_lock_file_path(lock));
+   if (chmod(get_lock_file_path(&lock), st.st_mode & 07777) < 0) {
+   error_errno("chmod on %s failed", 
get_lock_file_path(&lock));
ret = CONFIG_NO_WRITE;
goto out_free;
}
@@ -2639,28 +2638,19 @@ int git_config_set_multivar_in_file_gently(const char 
*config_filename,
contents = NULL;
}
 
-   if (commit_lock_file(lock) < 0) {
+   if (commit_lock_file(&lock) < 0) {
error_errno("could not write config file %s", config_filename);
ret = CONFIG_NO_WRITE;
-   lock = NULL;
goto out_free;
}
 
-   /*
-* lock is committed, so don't try to roll it back below.
-* NOTE: Since lockfile.c keeps a linked list of all created
-* lock_file structures, it isn't safe to free(lock).  It's
-* better to just leave it hanging around.
-*/
-   lock = NULL;
ret = 0;
 
/* Invalidate the config cache */
git_config_clear();
 
 out_free:
-   if (lock)
-   rollback_lock_file(lock);
+   rollback_lock_file(&lock);
free(filename_buf);
if (contents)
munmap(contents, contents_sz);
@@ -2669,7 +2659,7 @@ int git_config_set_multivar_in_file_gently(const char 
*config_filename,
return ret;
 
 write_err_out:
-   ret = write_error(get_lock_file_path(lock));
+   ret = write_error(get_lock_file_path(&lock));
goto out_free;
 
 }
-- 
2.14.1.721.gc5bc1565f1



Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 06:51:52PM +0100, Lars Schneider wrote:

> I set $TOOL_OPTIONS in valgrind.sh: to 
> '--leak-check=full --errors-for-leak-kinds=definite'
> 
> ... but I also had to adjust t/test-lib-functions.sh:test_create_repo
> as I ran into the error "cannot run git init -- have you built things yet?".

Yeah, this is a general problem with run-time analyzers. If we're not
mostly error-free, then the setup code often breaks before you even get
to the interesting part of the test.

It looks like the config code has a minor-ish leak. Patch to follow.

> What if we run a few selected tests with valgrind and count all files that
> valgrind mentions (a single leak has multiple file mentions because of
> the stack trace and other leak indicators). We record these counts and let 
> TravisCI scream if one of the numbers increases.
> 
> I wonder how stable/fragile such a metric would be as a simple refactoring 
> could easily change these numbers. Below I ran valgrind on t5510 before and
> after Martin's patch. The diff below clearly shows the pkt-line leak.
> 
> Would it make sense to pursue something like this in TravisCI to avoid 
> "pkt-line" kind of leaks in the future?

I don't think that would work, because simply adding new tests would
bump the leak count, without the code actually growing worse.

But think about your strategy for a moment: what you're really trying to
do is say "these existing leaks are OK because they are too many for us
to count, but we want to make sure we don't add _new_ ones". We already
have two techniques for distinguishing old ones from new ones:

  1. Mark existing ones with valgrind suppressions so they do not warn
 at all.

  2. Fix them, so that the "existing" count drops to zero.

Option (2), of course, has the added side effect that it's actually
fixing potential problems. :)

-Peff


Re: [PATCH 00/10] Implement transactions for the packed ref store

2017-08-29 Thread Brandon Williams
On 08/29, Michael Haggerty wrote:
> This should be the second-to-last patch series in the quest for
> mmapped packed-refs.
> 
> Before this patch series, there is still more coupling than necessary
> between `files_ref_store` and `packed_ref_store`:
> 
> * `files_ref_store` adds packed references (e.g., during `git clone`
>   and `git pack-refs`) by inserting them into the `packed_ref_store`'s
>   internal `ref_cache`, then calling `commit_packed_refs()`. But
>   `files_ref_store` shouldn't even *know* that `packed_ref_store` uses
>   a `ref_cache`, let alone muck about in it.
> 
> * `files_ref_store` deletes packed references by calling
>   `repack_without_refs()`.
> 
> Instead, implement reference transactions and `delete_refs()` for
> `packed_ref_store`, and change `files_ref_store` to use these standard
> methods rather than internal `packed_ref_store` methods.
> 
> This patch series finishes up the encapsulation of `packed_ref_store`.
> At the end of the series, the outside world only interacts with it via
> the refs API plus a couple of locking-related functions. That will
> make it easy for the next patch series to change the internal workings
> of `packed_ref_store` without affecting `files_ref_store`.
> 
> Along the way, we fix some longstanding problems with reference
> updates. Quoting from patch [08/10]:
> 
> First, the old code didn't obtain the `packed-refs` lock until
> `files_transaction_finish()`. This means that a failure to acquire the
> `packed-refs` lock (e.g., due to contention with another process)
> wasn't detected until it was too late (problems like this are supposed
> to be detected in the "prepare" phase). The new code acquires the
> `packed-refs` lock in `files_transaction_prepare()`, the same stage of
> the processing when the loose reference locks are being acquired,
> removing another reason why the "prepare" phase might succeed and the
> "finish" phase might nevertheless fail.
> 
> Second, the old code deleted the loose version of a reference before
> deleting any packed version of the same reference. This left a moment
> when another process might think that the packed version of the
> reference is current, which is incorrect. (Even worse, the packed
> version of the reference can be arbitrarily old, and might even point
> at an object that has since been garbage-collected.)
> 
> Third, if a reference deletion fails to acquire the `packed-refs` lock
> altogether, then the old code might leave the repository in the
> incorrect state (possibly corrupt) described in the previous
> paragraph.
> 
> Now we activate the new "packed-refs" file (sans any references that
> are being deleted) *before* deleting the corresponding loose
> references. But we hold the "packed-refs" lock until after the loose
> references have been finalized, thus preventing a simultaneous
> "pack-refs" process from packing the loose version of the reference in
> the time gap, which would otherwise defeat our attempt to delete it.
> 
> This patch series is also available as branch
> `packed-ref-transactions` in my fork on GitHub [1]. A draft of the
> patch series to change `packed_ref_store` to use mmap and get rid of
> its `ref_cache` is available as branch `mmap-packed-refs` from the
> same source.

Overall the patches look sane to me, though I don't believe I'm
qualified in this area to give you a complete thumbs up since I don't
understand the refs code super well yet.  I do like reading patches from
you as you do a great job of laying out what you are doing in code,
comments and commit messages, something I'm trying to get better at :)

-- 
Brandon Williams


Re: [PATCH 09/10] packed-backend: rip out some now-unused code

2017-08-29 Thread Brandon Williams
On 08/29, Michael Haggerty wrote:
> Now the outside world interacts with the packed ref store only via the
> generic refs API plus a few lock-related functions. This allows us to
> delete some functions that are no longer used, thereby completing the
> encapsulation of the packed ref store.

Love seeing patches which only remove old code :)

> 
> Signed-off-by: Michael Haggerty 
> ---
>  refs/packed-backend.c | 193 
> --
>  refs/packed-backend.h |   8 ---
>  2 files changed, 201 deletions(-)
> 
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 83a088118f..90f44c1fbb 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -91,19 +91,6 @@ struct ref_store *packed_ref_store_create(const char *path,
>   return ref_store;
>  }
>  
> -/*
> - * Die if refs is not the main ref store. caller is used in any
> - * necessary error messages.
> - */
> -static void packed_assert_main_repository(struct packed_ref_store *refs,
> -   const char *caller)
> -{
> - if (refs->store_flags & REF_STORE_MAIN)
> - return;
> -
> - die("BUG: operation %s only allowed for main ref store", caller);
> -}
> -
>  /*
>   * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is
>   * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't
> @@ -321,40 +308,6 @@ static struct ref_dir *get_packed_refs(struct 
> packed_ref_store *refs)
>   return get_packed_ref_dir(get_packed_ref_cache(refs));
>  }
>  
> -/*
> - * Add or overwrite a reference in the in-memory packed reference
> - * cache. This may only be called while the packed-refs file is locked
> - * (see packed_refs_lock()). To actually write the packed-refs file,
> - * call commit_packed_refs().
> - */
> -void add_packed_ref(struct ref_store *ref_store,
> - const char *refname, const struct object_id *oid)
> -{
> - struct packed_ref_store *refs =
> - packed_downcast(ref_store, REF_STORE_WRITE,
> - "add_packed_ref");
> - struct ref_dir *packed_refs;
> - struct ref_entry *packed_entry;
> -
> - if (!is_lock_file_locked(&refs->lock))
> - die("BUG: packed refs not locked");
> -
> - if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
> - die("Reference has invalid format: '%s'", refname);
> -
> - packed_refs = get_packed_refs(refs);
> - packed_entry = find_ref_entry(packed_refs, refname);
> - if (packed_entry) {
> - /* Overwrite the existing entry: */
> - oidcpy(&packed_entry->u.value.oid, oid);
> - packed_entry->flag = REF_ISPACKED;
> - oidclr(&packed_entry->u.value.peeled);
> - } else {
> - packed_entry = create_ref_entry(refname, oid, REF_ISPACKED);
> - add_ref_entry(packed_refs, packed_entry);
> - }
> -}
> -
>  /*
>   * Return the ref_entry for the given refname from the packed
>   * references.  If it does not exist, return NULL.
> @@ -592,152 +545,6 @@ int packed_refs_is_locked(struct ref_store *ref_store)
>  static const char PACKED_REFS_HEADER[] =
>   "# pack-refs with: peeled fully-peeled \n";
>  
> -/*
> - * Write the current version of the packed refs cache from memory to
> - * disk. The packed-refs file must already be locked for writing (see
> - * packed_refs_lock()). Return zero on success. On errors, rollback
> - * the lockfile, write an error message to `err`, and return a nonzero
> - * value.
> - */
> -int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err)
> -{
> - struct packed_ref_store *refs =
> - packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
> - "commit_packed_refs");
> - struct packed_ref_cache *packed_ref_cache =
> - get_packed_ref_cache(refs);
> - int ok;
> - int ret = -1;
> - struct strbuf sb = STRBUF_INIT;
> - FILE *out;
> - struct ref_iterator *iter;
> - char *packed_refs_path;
> -
> - if (!is_lock_file_locked(&refs->lock))
> - die("BUG: commit_packed_refs() called when unlocked");
> -
> - /*
> -  * If packed-refs is a symlink, we want to overwrite the
> -  * symlinked-to file, not the symlink itself. Also, put the
> -  * staging file next to it:
> -  */
> - packed_refs_path = get_locked_file_path(&refs->lock);
> - strbuf_addf(&sb, "%s.new", packed_refs_path);
> - if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
> - strbuf_addf(err, "unable to create file %s: %s",
> - sb.buf, strerror(errno));
> - strbuf_release(&sb);
> - goto out;
> - }
> - strbuf_release(&sb);
> -
> - out = fdopen_tempfile(>tempfile, "w");
> - if (!out) {
> - strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
> - strerror(errno));
> - goto error;
> - }
> -
> - 

Re: [PATCH 04/10] packed_delete_refs(): implement method

2017-08-29 Thread Brandon Williams
On 08/29, Michael Haggerty wrote:
> Implement `packed_delete_refs()` using a reference transaction. This
> means that `files_delete_refs()` can use `refs_delete_refs()` instead
> of `repack_without_refs()` to delete any packed references, decreasing
> the coupling between the classes.
> 
> Signed-off-by: Michael Haggerty 
> ---
>  refs/files-backend.c  |  2 +-
>  refs/packed-backend.c | 45 -
>  2 files changed, 45 insertions(+), 2 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index fccbc24ac4..2c78f63494 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1157,7 +1157,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>   if (packed_refs_lock(refs->packed_ref_store, 0, &err))
>   goto error;
>  
> - if (repack_without_refs(refs->packed_ref_store, refnames, &err)) {
> + if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
>   packed_refs_unlock(refs->packed_ref_store);
>   goto error;
>   }
> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index d19b3bfba5..83a088118f 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
>  static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
>struct string_list *refnames, unsigned int flags)
>  {
> - die("BUG: not implemented yet");
> + struct packed_ref_store *refs =
> + packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
> + struct strbuf err = STRBUF_INIT;
> + struct ref_transaction *transaction;
> + struct string_list_item *item;
> + int ret;
> +
> + (void)refs; /* We need the check above, but don't use the variable */

Can't say I've seen a line like this before. Is the intent just to mark
that this variable won't be used, as you mention in the comment?
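
A tiny standalone illustration of the idiom, with made-up names rather than
the code under review: the cast discards the value on purpose, so the
compiler does not warn about a variable kept only for the side effect of the
call that produced it.

#include <stdio.h>

static int validate_key(const char *key)
{
        return key && *key ? 0 : -1;    /* stand-in for a check that could die() */
}

static void frob(const char *key)
{
        int unused = validate_key(key); /* called only for its check */

        (void)unused;   /* explicitly discard; silences unused-variable warnings */
        printf("frobbing %s\n", key);
}

int main(void)
{
        frob("core.editor");
        return 0;
}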

> +
> + if (!refnames->nr)
> + return 0;
> +
> + /*
> +  * Since we don't check the references' old_oids, the
> +  * individual updates can't fail, so we can pack all of the
> +  * updates into a single transaction.
> +  */
> +
> + transaction = ref_store_transaction_begin(ref_store, &err);
> + if (!transaction)
> + return -1;
> +
> + for_each_string_list_item(item, refnames) {
> + if (ref_transaction_delete(transaction, item->string, NULL,
> +flags, msg, &err)) {
> + warning(_("could not delete reference %s: %s"),
> + item->string, err.buf);
> + strbuf_reset(&err);
> + }
> + }
> +
> + ret = ref_transaction_commit(transaction, &err);
> +
> + if (ret) {
> + if (refnames->nr == 1)
> + error(_("could not delete reference %s: %s"),
> +   refnames->items[0].string, err.buf);
> + else
> + error(_("could not delete references: %s"), err.buf);
> + }
> +
> + ref_transaction_free(transaction);
> + strbuf_release(&err);
> + return ret;
>  }
>  
>  static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
> -- 
> 2.14.1
> 

-- 
Brandon Williams


Re: [PATCH] pkt-line: re-'static'-ify buffer in packet_write_fmt_1()

2017-08-29 Thread Lars Schneider

> On 28 Aug 2017, at 06:11, Martin Ågren  wrote:
> 
> On 28 August 2017 at 01:23, Jeff King  wrote:
>> On Sun, Aug 27, 2017 at 10:04:55PM +0200, Lars Schneider wrote:
>> 
>>> I did run all tests under valgrind using "make valgrind" and I found
>>> the following files with potential issues:
>>> 
>>> cat valgrind.out | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | uniq -c
>>> 7102
>>>  2 clone.c
>>> 33 common-main.c
>>>  6 connect.c
>>> 64 git.c
>>>  4 ls-remote.c
>>> 126 run-command.c
>>> 12 transport.c
>>>  7 worktree.c
>> 
>> I'm not sure where valgrind.out comes from. The individual
>> test-results/*.out files may have valgrind output, but I don't think
>> they usually contain leak output.
>> 
>> Doing "valgrind ./git-upload-pack . /dev/null" mentions
>> leaked memory but not the locations. Adding --leak-check=full shows that
>> most of it comes from format_packet().
>> 
>> And applying Martin's patch drops the "definitely lost" category down to
>> 0 bytes (there's still 550k in "still reachable", but those are in the
>> "exit will free them for us" category).
>> 
>>> No mention of "pkt-line.c". Did you run Git with valgrind on one of
>>> your repositories to find it?
>> 
>> I'm curious, too. I don't think the valgrind setup in our test suite is
>> great for finding leaks right now.
> 
> Sorry for being brief. I've patched t/valgrind/valgrind.sh to say
> "--leak-check=yes". Then I run "./t --valgrind", simply because
> running the complete suite gives more reports than I could possibly
> process.

I set $TOOL_OPTIONS in valgrind.sh to 
'--leak-check=full --errors-for-leak-kinds=definite'

... but I also had to adjust t/test-lib-functions.sh:test_create_repo
as I ran into the error "cannot run git init -- have you built things yet?".

With these changes I was able to see the leak running valgrind with t5510.

---

What if we run a few selected tests with valgrind and count all files that
valgrind mentions (a single leak has multiple file mentions because of
the stack trace and other leak indicators)? We could record these counts and
let TravisCI scream if one of the numbers increases.

I wonder how stable/fragile such a metric would be as a simple refactoring 
could easily change these numbers. Below I ran valgrind on t5510 before and
after Martin's patch. The diff below clearly shows the pkt-line leak.

Would it make sense to pursue something like this in TravisCI to avoid 
"pkt-line" kind of leaks in the future? 


## Valgrind run with leak

$ ./t5510-fetch.sh --valgrind | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | 
uniq -c
15529
  39 abspath.c
  30 add.c
  34 branch.c
  10 bundle.c
 268 clone.c
  13 commit.c
 471 common-main.c
  52 config.c
  50 connect.c
   4 connected.c
   1 diff-lib.c
 102 dir.c
 120 environment.c
   4 fetch-pack.c
  47 fetch.c
  14 files-backend.c
   1 git-compat-util.h
 871 git.c
  96 init-db.c
  26 iterator.c
   1 list-objects.c
   4 log-tree.c
   2 object.c
   1 pack-objects.c
   6 parse-options.c
  10 pathspec.c
  83 pkt-line.c
   6 precompose_utf8.c
   2 push.c
  67 refs.c
  96 remote.c
  90 repository.c
   4 rev-list.c
  10 revision.c
 165 run-command.c
  42 setup.c
 288 strbuf.c
  51 strbuf.h
   2 tag.c
 107 transport.c
  10 tree-diff.c
  85 upload-pack.c
   1 usage.c
 455 wrapper.c


## Valgrind run with Martin's patch to fix the leak
$ ./t5510-fetch.sh --valgrind | perl -nE 'say /^==.+\((.+)\:.+\)$/' | sort | 
uniq -c
14931
  39 abspath.c
  30 add.c
  34 branch.c
  10 bundle.c
 268 clone.c
  13 commit.c
 433 common-main.c
  52 config.c
  50 connect.c
   6 connected.c
   1 diff-lib.c
 102 dir.c
 120 environment.c
   4 fetch-pack.c
  53 fetch.c
  14 files-backend.c
   1 git-compat-util.h
 879 git.c
  96 init-db.c
   1 iterator.c
   1 list-objects.c
   4 log-tree.c
   2 object.c
   1 pack-objects.c
   6 parse-options.c
  10 pathspec.c
   6 precompose_utf8.c
   2 push.c
  26 refs.c
  96 remote.c
  90 repository.c
   6 rev-list.c
  14 revision.c
 171 run-command.c
  42 setup.c
 246 strbuf.c
  50 strbuf.h
   2 tag.c
 107 transport.c
  10 tree-diff.c
   2 upload-pack.c
   1 usage.c
 415 wrapper.c


## Diff

15529| 14931 
  39 abspath.c   39 abspath.c
  30 add.c   30 add.c
  34 branch.c34 branch.c
  10 bundle.c10 bundle.c
 268 clone.c268 clone.c
  13 commit.c13 commit.c
 471 common-main.c   |  433 common-main.c
  52 config.c52 config.c
  50 connect.c   50 connect.c
   4 connected.c |6 connected.c
   1 diff-lib.c   1 diff-lib.c
 102 dir.c  102 dir.c
 120 environment.c  120 environment.c
   4 fetch-pack.c 4 fetch-pack.c
  47 fetch.c |   53 fetch.c
  14 files-backend.c 14 files-backend.c
   1 git-compat-util.h1 git-compat-util.h
 871 git.c   |  879 git.c
  96 init-db.c   96 init-db.c
  26 

[PATCH v3 3/3] refs/files-backend: correct return value in lock_ref_for_update

2017-08-29 Thread Martin Ågren
In one code path we return a literal -1 and not a symbolic constant. The
value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is
wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other
return value we have to choose from).

Noticed-by: Michael Haggerty 
Signed-off-by: Martin Ågren 
---
v3: this is a new patch, as suggested by Michael

 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a2b3df21b..ad05d1d5f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2311,7 +2311,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
strbuf_addf(err, "cannot lock ref '%s': "
"error reading reference",
original_update_refname(update));
-   ret = -1;
+   ret = TRANSACTION_GENERIC_ERROR;
goto out;
}
} else if (check_old_oid(update, &lock->old_oid, err)) {
-- 
2.14.1.151.g45c1275a3.dirty
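
A compact sketch of why the literal -1 above is more than a style issue; the
constant values below are reproduced from refs.h as I recall them, so treat
them as an assumption.

#include <stdio.h>

#define TRANSACTION_NAME_CONFLICT -1    /* name clashes with an existing ref */
#define TRANSACTION_GENERIC_ERROR -2    /* any other failure */

static const char *classify(int ret)
{
        switch (ret) {
        case 0:                         return "ok";
        case TRANSACTION_NAME_CONFLICT: return "name conflict";
        case TRANSACTION_GENERIC_ERROR: return "generic error";
        default:                        return "unknown";
        }
}

int main(void)
{
        /* a caller switching on the result misreports a literal -1 */
        printf("-1 -> %s\n-2 -> %s\n", classify(-1), classify(-2));
        return 0;
}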



[PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update

2017-08-29 Thread Martin Ågren
After the previous patch, none of the functions we call hold on to
`referent.buf`, so we can safely release the string buffer before
returning.

Signed-off-by: Martin Ågren 
---
v3: identical to v2

 refs/files-backend.c | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73615d869..a2b3df21b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2266,7 +2266,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
struct strbuf referent = STRBUF_INIT;
int mustexist = (update->flags & REF_HAVE_OLD) &&
!is_null_oid(&update->old_oid);
-   int ret;
+   int ret = 0;
struct ref_lock *lock;
 
files_assert_main_repository(refs, "lock_ref_for_update");
@@ -2278,7 +2278,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
ret = split_head_update(update, transaction, head_ref,
affected_refnames, err);
if (ret)
-   return ret;
+   goto out;
}
 
ret = lock_raw_ref(refs, update->refname, mustexist,
@@ -2292,7 +2292,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
strbuf_addf(err, "cannot lock ref '%s': %s",
original_update_refname(update), reason);
free(reason);
-   return ret;
+   goto out;
}
 
update->backend_data = lock;
@@ -2311,10 +2311,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
strbuf_addf(err, "cannot lock ref '%s': "
"error reading reference",
original_update_refname(update));
-   return -1;
+   ret = -1;
+   goto out;
}
} else if (check_old_oid(update, &lock->old_oid, err)) {
-   return TRANSACTION_GENERIC_ERROR;
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto out;
}
} else {
/*
@@ -2328,13 +2330,15 @@ static int lock_ref_for_update(struct files_ref_store *refs,
  referent.buf, transaction,
  affected_refnames, err);
if (ret)
-   return ret;
+   goto out;
}
} else {
struct ref_update *parent_update;
 
-   if (check_old_oid(update, &lock->old_oid, err))
-   return TRANSACTION_GENERIC_ERROR;
+   if (check_old_oid(update, &lock->old_oid, err)) {
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto out;
+   }
 
/*
 * If this update is happening indirectly because of a
@@ -2371,7 +2375,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
"cannot update ref '%s': %s",
update->refname, write_err);
free(write_err);
-   return TRANSACTION_GENERIC_ERROR;
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto out;
} else {
update->flags |= REF_NEEDS_COMMIT;
}
@@ -2385,10 +2390,14 @@ static int lock_ref_for_update(struct files_ref_store *refs,
if (close_ref(lock)) {
strbuf_addf(err, "couldn't close '%s.lock'",
update->refname);
-   return TRANSACTION_GENERIC_ERROR;
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto out;
}
}
-   return 0;
+
+out:
+   strbuf_release(&referent);
+   return ret;
 }
 
 /*
-- 
2.14.1.151.g45c1275a3.dirty



[PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list

2017-08-29 Thread Martin Ågren
split_symref_update() receives a string-pointer `referent` and adds it
to the list of `affected_refnames`. The list simply holds on to the
pointers it is given; it does not copy the strings and it never frees
them. The `referent` string in split_symref_update() belongs to a
string buffer in the caller. After we return, the string will be leaked.

In the next patch, we want to properly release the string buffer in the
caller, but we can't safely do so until we've made sure that
`affected_refnames` will not be holding on to a pointer to the string.
We could configure the list to handle its own resources, but it would
mean some alloc/free-churning. The list is already handling other
strings (through other code paths) which we do not need to worry about,
and we'd be memory-churning those strings too, completely unnecessary.

Observe that split_symref_update() creates a `new_update`-object through
ref_transaction_add_update(), after which `new_update->refname` is a
copy of `referent`. The difference is, this copy will be freed, and it
will be freed *after* `affected_refnames` has been cleared.

Rearrange the handling of `referent`, so that we don't add it directly
to `affected_refnames`. Instead, first just check whether `referent`
exists in the string list, and later add `new_update->refname`.

Helped-by: Michael Haggerty 
Signed-off-by: Martin Ågren 
---
v3: "O(lg N)" in comment; if-BUG() instead of assert()

 refs/files-backend.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index fccbc24ac..73615d869 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2144,13 +2144,12 @@ static int split_symref_update(struct files_ref_store *refs,
 
/*
 * First make sure that referent is not already in the
-* transaction. This insertion is O(N) in the transaction
+* transaction. This check is O(lg N) in the transaction
 * size, but it happens at most once per symref in a
 * transaction.
 */
-   item = string_list_insert(affected_refnames, referent);
-   if (item->util) {
-   /* An entry already existed */
+   if (string_list_has_string(affected_refnames, referent)) {
+   /* An entry already exists */
strbuf_addf(err,
"multiple updates for '%s' (including one "
"via symref '%s') are not allowed",
@@ -2185,6 +2184,17 @@ static int split_symref_update(struct files_ref_store *refs,
update->flags |= REF_LOG_ONLY | REF_NODEREF;
update->flags &= ~REF_HAVE_OLD;
 
+   /*
+* Add the referent. This insertion is O(N) in the transaction
+* size, but it happens at most once per symref in a
+* transaction. Make sure to add new_update->refname, which will
+* be valid as long as affected_refnames is in use, and NOT
+* referent, which might soon be freed by our caller.
+*/
+   item = string_list_insert(affected_refnames, new_update->refname);
+   if (item->util)
+   BUG("%s unexpectedly found in affected_refnames",
+   new_update->refname);
item->util = new_update;
 
return 0;
-- 
2.14.1.151.g45c1275a3.dirty



Re: [PATCH] diff-highlight: Add clean target to Makefile

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 12:23:11PM +0100, Daniel Watkins wrote:

> Now that `make` produces a file, we should have a clean target to remove
> it.
> [...]
> +clean:
> + $(RM) diff-highlight
> +

Makes sense. Thanks!

-Peff


Re: [PATCH 2/3] merge-recursive: remove return value from get_files_dirs

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 03:58:00PM +, Kevin Willford wrote:

> > > The return value of the get_files_dirs call is never being used.
> > > Looking at the history of the file and it was originally only
> > > being used for debug output statements.  Also when
> > > read_tree_recursive return value is non zero it is changed to
> > > zero.  This leads me to believe that it doesn't matter if
> > > read_tree_recursive gets an error.
> > 
> > Or that the function is buggy. :)
> 
> That was one of my questions as well.  Should read_tree_recursive
> be propagating a -1, with merge_trees checking for that and bailing
> when the return from get_files_dirs is < 0?  I made a commit with
> this change and ran the tests and they all still passed, so either this
> return really doesn't matter or there are not sufficient tests covering
> it.

Right, in a normal flow I'd say that it should propagate the -1, etc.
But since the only possible error is parse_tree() failing, which happens
in a corrupt repository, I think it would be OK to just die(). That
saves having to deal with error propagation. It's not very graceful,
perhaps, but the important thing is that we don't quietly ignore the
error.
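
A minimal sketch of the two styles being weighed here, using hypothetical
names rather than the real merge-recursive/read_tree_recursive code.

#include <stdio.h>
#include <stdlib.h>

static void die_(const char *msg)
{
        fprintf(stderr, "fatal: %s\n", msg);
        exit(128);
}

static int parse_tree_stub(int corrupt)
{
        return corrupt ? -1 : 0;        /* stand-in for parse_tree() failing */
}

/* Style 1: propagate the error and make every caller check it. */
static int collect_propagate(int corrupt)
{
        if (parse_tree_stub(corrupt) < 0)
                return -1;
        return 0;
}

/* Style 2: the only failure mode is a corrupt repository, so just die()
 * and spare the callers the error plumbing. */
static void collect_or_die(int corrupt)
{
        if (parse_tree_stub(corrupt) < 0)
                die_("corrupt tree object");
}

int main(void)
{
        if (collect_propagate(0) < 0)
                return 1;
        collect_or_die(0);
        return 0;
}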

> I went with this change because it was not changing any of the
> current functionality and if we find a case where it matters that
> read_tree_recursive fails due to bad tree or something else we
> can address it then.

I think it's worth doing while we're thinking about it now, but it
certainly can be a separate patch.

-Peff


RE: [PATCH 2/3] merge-recursive: remove return value from get_files_dirs

2017-08-29 Thread Kevin Willford
> 
> On Mon, Aug 28, 2017 at 02:28:28PM -0600, Kevin Willford wrote:
> 
> > The return value of the get_files_dirs call is never being used.
> > Looking at the history of the file and it was originally only
> > being used for debug output statements.  Also when
> > read_tree_recursive return value is non zero it is changed to
> > zero.  This leads me to believe that it doesn't matter if
> > read_tree_recursive gets an error.
> 
> Or that the function is buggy. :)

That was one of my questions as well.  Should read_tree_recursive
be propagating a -1, with merge_trees checking for that and bailing
when the return from get_files_dirs is < 0?  I made a commit with
this change and ran the tests and they all still passed, so either this
return really doesn't matter or there are not sufficient tests covering
it.

I went with this change because it was not changing any of the
current functionality and if we find a case where it matters that
read_tree_recursive fails due to bad tree or something else we
can address it then.

> 
> I'm tempted to say that we should probably die() when
> read_tree_recursive fails. This should only happen if we fail to parse
> the tree, or if our callback (save_files_dirs here) returns failure, and
> the latter looks like it never happens.
> 
> > Since the debug output has been removed and the caller isn't
> > checking the return value there is no reason to keep calulating
> > and returning a value.
> 
> Agreed, and I'm happy to see dead code go.
> 
> Minor nit: s/calulating/calculating/ in your commit message.

When will that spell checker for git messages be ready? ;)

> 
> -Peff


Re: [PATCH v5 35/40] Add Documentation/technical/external-odb.txt

2017-08-29 Thread Christian Couder
On Mon, Aug 28, 2017 at 8:59 PM, Ben Peart  wrote:
>
> On 8/3/2017 5:19 AM, Christian Couder wrote:
>>
>> +Helpers
>> +===
>> +
>> +ODB helpers are commands that have to be registered using either the
>> +"odb..subprocessCommand" or the "odb..scriptCommand"
>> +config variables.
>> +
>> +Registering such a command tells Git that an external odb called
>> + exists and that the registered command should be used to
>> +communicate with it.
>
> What order are the odb handlers called? Are they called before or after the
> regular object store code for loose, pack and alternates?  Is the order
> configurable?

For get_*_object instructions the regular code is called before the odb helpers.
(So the odb helper code is at the end of stat_sha1_file() and of
open_sha1_file() in sha1_file.c.)

For put_*_object instructions the regular code is called after the odb helpers.
(So the odb helper code is at the beginning of write_sha1_file() in
sha1_file.c.)

And no this order is not configurable, but of course it could be made
configurable.
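
A standalone sketch of that ordering, with made-up function names standing in
for the real hooks in sha1_file.c; it only shows the control flow, not the
actual API.

#include <stdio.h>

/* Stubs standing in for the real stores and helpers. */
static int read_from_local_stores(const char *id) { (void)id; return -1; }
static int read_from_odb_helpers(const char *id)  { printf("helper get %s\n", id); return 0; }
static int write_to_odb_helpers(const char *id)   { printf("helper put %s\n", id); return 0; }
static int write_to_local_store(const char *id)   { printf("local put %s\n", id); return 0; }

/* 'get_*': regular loose/pack/alternate stores first, helpers as fallback. */
static int get_object(const char *id)
{
        if (!read_from_local_stores(id))
                return 0;
        return read_from_odb_helpers(id);
}

/* 'put_*': offer the object to the helpers first, then the regular store. */
static int put_object(const char *id)
{
        write_to_odb_helpers(id);       /* helper refusal is not fatal here */
        return write_to_local_store(id);
}

int main(void)
{
        get_object("deadbeef");
        put_object("cafebabe");
        return 0;
}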

>> + - 'get_direct '
>> +
>> +This instruction is similar as the other 'get_*' instructions except
>> +that no object should be sent from the helper to Git. Instead the
>> +helper should directly write the requested object into a loose object
>> +file in the ".git/objects" directory.
>> +
>> +After the helper has sent the "status=success" packet and the
>> +following flush packet in process mode, or after it has exited in the
>> +script mode, Git should lookup again for a loose object file with the
>> +requested sha1.
>
> When will git call get_direct vs one of the other get_* functions?

It is called just before exiting when git cannot find an object.
It is not exactly at the same place as other get_* instructions as I
tried to reuse your code and as it looks like it makes it easier to
retry the regular code after the odb helper code.

> Could the
> functionality of enabling a helper to populate objects into the regular
> object store be provided by having a ODB helper that returned the object
> data as requested by get_git_obj or get_raw_obj but also stored it in the
> regular object store as a loose object (or pack file) for future calls?

I am not sure I understand what you mean.
If a helper returns the object data as requested by get_git_obj or
get_raw_obj, then currently Git will itself store the object locally
in its regular object store, so it is redundant for the helper to also
store or try to store the object in the regular object store.


Re: git describe and "the smallest number of commits possible"

2017-08-29 Thread Michael J Gruber
Stefan Beller venit, vidit, dixit 28.08.2017 20:24:
> On Sat, Aug 26, 2017 at 7:47 AM, Kévin Le Gouguec
>  wrote:
>> Hi,
>>
>> I've asked this question on the git-users Google Groups list[1], and
>> while the answers there were interesting, I still cannot figure
>> whether my problem comes from an actual bug, a misleading manpage, my
>> inability to understand the code and/or the manpage, or a combination
>> of the three.
>>
>> I noticed this problem on 2.1.4 (Debian oldstable); I can reproduce it
>> on next (7ef03bb281b2220d0f2414365e4949beb2337042). Quoting
>> git-describe(1)'s SEARCH STRATEGY section:
>>
>>> If multiple tags were found during the walk then the tag which has
>>> the fewest commits different from the input commit-ish will be
>>> selected and output. Here fewest commits different is defined as the
>>> number of commits which would be shown by `git log tag..input` will
>>> be the smallest number of commits possible.
> 
> Maybe relevant
> https://public-inbox.org/git/20160421113004.ga3...@aepfle.de/
> (specifically the discussion after this patch, it sheds light on the
> heuristics used, though your case here doesn't use these heuristics)

git describe is driving me crazy sometimes, too - both the results and
the code ;)

[long read with lots of debug output follows, sorry]

What is going on here is the following:
git describe --tags decides based on "depth".

git tag --merged dcc3ef3ee7 lists both emacs-25.1 and emacs-25.2 (among
many others).

git rev-list --count emacs-25.1..dcc3ef3ee7
5126

git rev-list --count emacs-25.2..dcc3ef3ee7

4793

So, you would rightly expect 25.2 to be preferred over 25.1.

Now, "git describe" does not do a "rev-list --count" for each tag but,
instead, walks back from dcc3ef3ee7 and notes when it reaches a tagged
commit, and how many steps it takes to get there. The information
"reachable by the i-th tag that I encountered" is kept by a field of
width 27, which limits the --candidates values. Once the walk passes
more tags it is aborted.

Then the possible matches are sorted.

Then the depth calculation is finished.

Funny order of actions, isn't it? Smells like BUG.
At least it makes the debug output meaningless.

Here's the debug output before and after finish_depth_calculation():

searching to describe HEAD
 lightweight 4069 emacs-25.1
 lightweight 4086 emacs-25.1-rc2
 lightweight 4126 emacs-25.1-rc1
 annotated   4185 emacs-25.2
 annotated   4220 emacs-25.2-rc2
 lightweight 4226 emacs-25.0.95
 annotated   4236 emacs-25.2-rc1
 annotated   4280 emacs-25.1.91
 lightweight 4305 emacs-25.0.94
 lightweight 4329 emacs-25.1.90
traversed 4462 commits
more than 10 tags found; listed 10 most recent
gave up search at 5c587fdff164e8b90beb47f6da64b4884290e40a
[finish_depth_calculation()]
 lightweight   129847 emacs-25.1
 lightweight 4086 emacs-25.1-rc2
 lightweight 4126 emacs-25.1-rc1
 annotated   4185 emacs-25.2
 annotated   4220 emacs-25.2-rc2
 lightweight 4226 emacs-25.0.95
 annotated   4236 emacs-25.2-rc1
 annotated   4280 emacs-25.1.91
 lightweight 4305 emacs-25.0.94
 lightweight 4329 emacs-25.1.90
traversed 130257 commits
more than 10 tags found; listed 10 most recent
gave up search at 5c587fdff164e8b90beb47f6da64b4884290e40a
emacs-25.1-129847-gdcc3ef3ee7

Now, I can't claim that I can wrap my head around
finish_depth_calculation() and the way depth is calculated here, let
alone those huge numbers (probably per not-contained tag or so), but
that order of operations is clearly wrong.

Now, if we fix finish_depth_calculation() to actually do the work for
all possible candidates, and the sort afterwards, the result is:

LANG=C devgit describe --tags --debug --candidates=10
searching to describe HEAD
 lightweight   129847 emacs-25.1
 lightweight   129864 emacs-25.1-rc2
 lightweight   129904 emacs-25.1-rc1
 annotated 129981 emacs-25.2
 lightweight   130004 emacs-25.0.95
 annotated 130016 emacs-25.2-rc2
 annotated 130032 emacs-25.2-rc1
 annotated 130076 emacs-25.1.91
 lightweight   130083 emacs-25.0.94
 lightweight   130125 emacs-25.1.90
traversed 130257 commits
more than 10 tags found; listed 10 most recent
gave up search at 5c587fdff164e8b90beb47f6da64b4884290e40a
emacs-25.1-129847-gdcc3ef3ee7

LANG=C devgit describe --tags --debug --candidates=100

searching to describe HEAD
 annotated  13470 emacs-24.5-rc3-fixed
 lightweight13471 emacs-24.5-rc3
 lightweight13476 emacs-24.5-rc2
 lightweight13482 emacs-24.5-rc1
 lightweight13511 emacs-24.4.91
 lightweight13554 emacs-24.4.90
 lightweight13899 emacs-24.4
 lightweight13902 emacs-24.4-rc1
 lightweight13959 emacs-24.3.94
 annotated  13972 mh-e-8.6
 lightweight14049 emacs-24.3.93
 lightweight14176 emacs-24.3.92
 lightweight   129847 emacs-25.1
 lightweight   129864 emacs-25.1-rc2
 lightweight   129904 emacs-25.1-rc1
 lightweight   129971 emacs-25.0.92
 

[PATCH] diff-highlight: Add clean target to Makefile

2017-08-29 Thread Daniel Watkins
Now that `make` produces a file, we should have a clean target to remove
it.

Signed-off-by: Daniel Watkins 
---
 contrib/diff-highlight/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/contrib/diff-highlight/Makefile b/contrib/diff-highlight/Makefile
index fbf5c5824..f2be7cc92 100644
--- a/contrib/diff-highlight/Makefile
+++ b/contrib/diff-highlight/Makefile
@@ -17,4 +17,7 @@ shebang.perl: FORCE
 test: all
$(MAKE) -C t
 
+clean:
+   $(RM) diff-highlight
+
 .PHONY: FORCE
-- 
2.14.1



Re: [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update

2017-08-29 Thread Martin Ågren
On 29 August 2017 at 10:39, Michael Haggerty  wrote:
> On 08/28/2017 10:32 PM, Martin Ågren wrote:
>> After the previous patch, none of the functions we call hold on to
>> `referent.buf`, so we can safely release the string buffer before
>> returning.
>
> This patch looks good to me, but I did notice a pre-existing problem in
> the area...
>
>> ---
>>  refs/files-backend.c | 31 ---
>>  1 file changed, 20 insertions(+), 11 deletions(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index bdb0e22e5..15f34b10e 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> [...]
>> @@ -2305,10 +2305,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>>   strbuf_addf(err, "cannot lock ref '%s': "
>>   "error reading reference",
>>   original_update_refname(update));
>> - return -1;
>> + ret = -1;
>> + goto out;
>
> It is incorrect to return -1 here. First of all, stylistically, the
> return value should be a symbolic constant. But in fact, it should be
> returning `TRANSACTION_GENERIC_ERROR` here, whereas -1 is
> `TRANSACTION_NAME_CONFLICT`. So the code is not just stylistically
> wrong; it is functionally wrong.
>
> I know that this is not your mistake, but would you like to add another
> patch to your series to fix this up? I'd do it myself, but it's a little
> bit awkward because the fix will conflict with your patch.

Sure. I'll send out a v3 later today. I'll fix this in a third patch,
and I'll also address your comments on the first patch.

Thanks a lot.

Martin


Re: [PATCH v5 35/40] Add Documentation/technical/external-odb.txt

2017-08-29 Thread Christian Couder
On Fri, Aug 25, 2017 at 11:23 PM, Jonathan Tan  wrote:
> On Fri, 25 Aug 2017 08:14:08 +0200
> Christian Couder  wrote:
>
>> As Git is used more and more by people having different needs, I
>> think it is not realistic to expect that we can optimize its object
>> storage for all these different needs. So a better strategy is to just
>> let them store objects in external stores.
> [snip]
>> About these many use cases, I gave the "really big binary files"
>> example which is why Git LFS exists (and which GitLab is interested in
>> better solving), and the "really big number of files that are fetched
>> only as needed" example which Microsoft is interested in solving. I
>> could also imagine that some people have both big text files and big
>> binary files in which case the "core.bigfilethreshold" might not work
>> well, or that some people already have blobs in some different stores
>> (like HTTP servers, Docker registries, artifact stores, ...) and want
>> to fetch them from there as much as possible.
>
> Thanks for explaining the use cases - this makes sense, especially the
> last one which motivates the different modes for the "get" command
> (return raw bytes vs populating the Git repository with loose/packed
> objects).

Thanks for this comment. Now the beginning of the "Purpose" section
looks like this:


The purpose of this mechanism is to make it possible to handle Git
objects, especially blobs, in much more flexible ways.

Currently Git can store its objects only in the form of loose objects
in separate files or packed objects in a pack file. These existing
object stores cannot be easily optimized for many different kind of
contents.

So the current stores are not flexible enough for some important use
cases like handling really big binary files or handling a really big
number of files that are fetched only as needed. And it is not
realistic to expect that Git could fully natively handle many such
use cases. Git would need to natively implement different internal
stores, which would be a huge burden and could lead to re-implementing
things like HTTP servers, Docker registries or artifact
stores that already exist outside Git.


>> And then letting people
>> use different stores can make clones or fetches restartable which
>> would solve another problem people have long been complaining about...
>
> This is unrelated to the rest of my e-mail, but out of curiosity, how
> would a different store make clones or fetches restartable? Do you mean
> that Git would invoke a "fetch" command through the ODB protocol instead
> of using its own native protocol?

Yeah, the idea is that during a clone (or a fetch), Git could first
fetch some refs with "meta information", for example refs/odb/magic/*
(where "magic" is the odb name) which would tell if some (new) bundles
are available.
If there are (new) bundles available then during the 'init'
instruction, which takes place just after this first fetch, the
external odb helper will notice that and "fetch" the (new) bundles
using a restartable protocol, for example HTTP.

If something goes wrong when the helper "fetches" a bundle, the helper
could force the clone (or the fetch) to error out (after maybe
retrying), and when the user (or the helper itself) tries again to
clone (or fetch), the helper would restart its bundle "fetch" (using
the restartable protocol).

When this "fetch" eventually succeeds, then the helper will unbundle
what it received, and then give back control to the second regular
part of the clone (or fetch). This regular part of the clone (or
fetch) will then try to fetch the usual refs, but as the unbundling
has already updated the content of the usual refs (as well as the
object stores) this fetch will find that everything is up-to-date.

Ok, maybe everything is not quite up-to-date and there are still
things to fetch, but anyway the biggest part of the clone (or fetch)
has already been made using a restartable protocol, so we are doing
much better than if we are not restartable at all.

There are examples in t0430 at the end of the patch series of
restartable clones. They don't really test that one can indeed restart
a clone, but they show that things "just work" when `git clone` is
split into an initial fetch (that updates "meta information" refs),
then the 'init' instruction sent to an helper (that fetches and
unbundles a bundle based on the "meta information" refs) and then the
regular part of the clone.

>> >> +Furthermore many improvements that are dependent on specific setups
>> >> +could be implemented in the way Git objects are managed if it was
>> >> +possible to customize how the Git objects are handled. For example a
>> >> +restartable clone using the bundle mechanism has often been requested,
>> >> +but implementing that would go against the current strict rules under
>> >> +which the Git objects are currently handled.
>> >
>> > So in this example, you would 

Re: [PATCH 3/3] merge-recursive: change current file dir string_lists to hashmap

2017-08-29 Thread Jeff King
On Mon, Aug 28, 2017 at 02:28:29PM -0600, Kevin Willford wrote:

> The code was using two string_lists, one for the directories and
> one for the files.  The code never checks the lists independently
> so we should be able to only use one list.  The string_list also
> is a O(log n) for lookup and insertion.  Switching this to use a
> hashmap will give O(1) which will save some time when there are
> millions of paths that will be checked.

Makes sense. I think when this code was written we didn't have the
generalized hashmap.c, which is why it relies on string lists.

> diff --git a/merge-recursive.c b/merge-recursive.c
> index d47f40ea81..ef4fe5f09f 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -24,6 +24,26 @@
>  #include "dir.h"
>  #include "submodule.h"
>  
> +struct path_hashmap_entry {
> + struct hashmap_entry;
> + char path[FLEX_ARRAY];
> +};

This seems like a good use of FLEX_ARRAY.
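
For anyone new to the idiom, a standalone sketch of a flexible-array-member
entry; git normally hides the allocation behind its FLEX_ALLOC_* helpers, so
the hand-rolled malloc below is only illustrative.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct path_entry {
        unsigned int hash;      /* stand-in for struct hashmap_entry */
        char path[];            /* flexible array member: path stored inline */
};

static struct path_entry *path_entry_new(unsigned int hash, const char *path)
{
        size_t len = strlen(path);
        /* one allocation holds both the header and the string */
        struct path_entry *e = malloc(sizeof(*e) + len + 1);

        if (!e)
                exit(1);
        e->hash = hash;
        memcpy(e->path, path, len + 1);
        return e;
}

int main(void)
{
        struct path_entry *e = path_entry_new(42, "dir/file.c");

        printf("%u %s\n", e->hash, e->path);
        free(e);        /* one free() releases the entry and its path together */
        return 0;
}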

> +static unsigned int (*pathhash)(const char *path);
> +static int (*pathcmp)(const char *a, const char *b);

These global function pointers are sufficiently out of the ordinary that
it might be worth adding a comment about how they're intended to be
set and used.

I also suspect they could be stuffed into the merge_options struct along
with the hashmap itself, and then passed via the cmp_data pointer (which
is relatively new-ish; I won't be surprised if you wrote this patch
before that existed and forward-ported it).

> +static int path_hashmap_cmp(const void *cmp_data,
> + const void *entry,
> + const void *entry_or_key,
> + const void *keydata)
> +{
> + const struct path_hashmap_entry *a = entry;
> + const struct path_hashmap_entry *b = entry_or_key;
> + const char *key = keydata;
> +
> + return pathcmp(a->path, key ? key : b->path);
> +}

The hashmap interface makes its cmp functions a bit tricky to write, but
this all looks correct to me.
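
A minimal stand-alone illustration of the cmp_data idea, not git's actual
hashmap API: the map carries an opaque pointer that is handed back to the
comparison callback, so no globals are needed.

#include <stdio.h>
#include <string.h>
#include <strings.h>    /* strcasecmp (POSIX) */

struct toy_map {
        int (*cmp)(const void *cmp_data, const char *a, const char *b);
        const void *cmp_data;   /* e.g. an "ignore case?" flag or an options struct */
};

static int path_cmp(const void *cmp_data, const char *a, const char *b)
{
        int ignore_case = *(const int *)cmp_data;

        return ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

int main(void)
{
        int ignore_case = 1;
        struct toy_map map = { path_cmp, &ignore_case };

        /* prints 0: the two names compare equal when case is ignored */
        printf("%d\n", map.cmp(map.cmp_data, "Makefile", "makefile"));
        return 0;
}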

> @@ -642,6 +662,8 @@ static void add_flattened_path(struct strbuf *out, const char *s)
>  
>  static char *unique_path(struct merge_options *o, const char *path, const char *branch)
>  {
> + struct path_hashmap_entry dummy;
> + struct path_hashmap_entry *entry;
>   struct strbuf newpath = STRBUF_INIT;
>   int suffix = 0;
>   size_t base_len;
> @@ -650,14 +672,17 @@ static char *unique_path(struct merge_options *o, const char *path, const char *
>   add_flattened_path(&newpath, branch);
>  
>   base_len = newpath.len;
> - while (string_list_has_string(&o->current_file_set, newpath.buf) ||
> -string_list_has_string(&o->current_directory_set, newpath.buf) ||
> + hashmap_entry_init(&dummy, pathhash(newpath.buf));
> + while (hashmap_get(&o->current_file_dir_set, &dummy, newpath.buf) ||
>  (!o->call_depth && file_exists(newpath.buf))) {
>   strbuf_setlen(&newpath, base_len);
>   strbuf_addf(&newpath, "_%d", suffix++);
> + hashmap_entry_init(&dummy, pathhash(newpath.buf));
>   }

I think you can use hashmap_get_from_hash() here to avoid dealing with
"dummy" yourself.

> @@ -1941,8 +1966,7 @@ int merge_trees(struct merge_options *o,
>   if (unmerged_cache()) {
>   struct string_list *entries, *re_head, *re_merge;
>   int i;
> - string_list_clear(&o->current_file_set, 1);
> - string_list_clear(&o->current_directory_set, 1);
> + hashmap_init(&o->current_file_dir_set, path_hashmap_cmp, NULL, 512);

This hunk seems funny. string_list_clear() assumes the input is an
already-initialized string list, and we drop all of its entries. But
hashmap_init assumes we have an _uninitialized_ chunk of memory, and
throws away whatever was there.

Do we ever hit this code path when there's actually something in the
hashmap?

I'd think we would want to hashmap_init() at the same place we used to
string_list_init(). I.e., in init_merge_options.

> @@ -1978,6 +2002,8 @@ int merge_trees(struct merge_options *o,
>   string_list_clear(re_head, 0);
>   string_list_clear(entries, 1);
>  
> + hashmap_free(&o->current_file_dir_set, 1);

Ah, I think this answers my question above. We only care about the
contents during this one bit of code, so we do an init/free pair. And
since there's no "clear" function for a hashmap that drops the entries
without resetting things like the comparison function, we _have_ to
call hashmap_init in the earlier hunk.

That does make me wonder if it would be sane to just declare the hashmap
local to this block, which would make its lifetime much more clear. I
guess having it in merge_options is convenient for passing it around,
though.

> @@ -2179,8 +2205,8 @@ void init_merge_options(struct merge_options *o)
> [...]

The rest of the patch looks like a straight-forward conversion. With the
understanding I have above, I think everything here is correct, but
there are a few minor bits it might be worth addressing to make the 

Re: [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update

2017-08-29 Thread Michael Haggerty
On 08/28/2017 10:32 PM, Martin Ågren wrote:
> After the previous patch, none of the functions we call hold on to
> `referent.buf`, so we can safely release the string buffer before
> returning.

This patch looks good to me, but I did notice a pre-existing problem in
the area...

> ---
>  refs/files-backend.c | 31 ---
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index bdb0e22e5..15f34b10e 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> [...]
> @@ -2305,10 +2305,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>   strbuf_addf(err, "cannot lock ref '%s': "
>   "error reading reference",
>   original_update_refname(update));
> - return -1;
> + ret = -1;
> + goto out;

It is incorrect to return -1 here. First of all, stylistically, the
return value should be a symbolic constant. But in fact, it should be
returning `TRANSACTION_GENERIC_ERROR` here, whereas -1 is
`TRANSACTION_NAME_CONFLICT`. So the code is not just stylistically
wrong; it is functionally wrong.

I know that this is not your mistake, but would you like to add another
patch to your series to fix this up? I'd do it myself, but it's a little
bit awkward because the fix will conflict with your patch.

Thanks,
Michael


Re: [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list

2017-08-29 Thread Michael Haggerty
On 08/28/2017 10:32 PM, Martin Ågren wrote:
> split_symref_update() receives a string-pointer `referent` and adds it
> to the list of `affected_refnames`. The list simply holds on to the
> pointers it is given, it does not copy the strings and it never frees
> them. The `referent` string in split_symref_update() belongs to a string
> buffer in the caller. After we return, the string will be leaked.
> 
> In the next patch, we want to properly release the string buffer in the
> caller, but we can't safely do so until we've made sure that
> `affected_refnames` will not be holding on to a pointer to the string.
> We could configure the list to handle its own resources, but it would
> mean some alloc/free-churning. The list is already handling other
> strings (through other code paths) which we do not need to worry about,
> and we'd be memory-churning those strings too, completely unnecessary.
> 
> Observe that split_symref_update() creates a `new_update`-object and
> that `new_update->refname` is then a copy of `referent`. The difference
> is, this copy will be freed, and it will be freed *after*
> `affected_refnames` has been cleared.
> 
> Rearrange the handling of `referent`, so that we don't add it directly
> to `affected_refnames`. Instead, first just check whether `referent`
> exists in the string list, and later add `new_update->refname`.
> 
> Helped-by: Michael Haggerty 
> Signed-off-by: Martin Ågren 
> ---
> Thanks Junio and Michael for your comments on the first version. This
> first patch is now completely different and much much better (thanks
> Michael!). The commit message should also be better (sorry Junio...).
> The second one has a new commit message, but the diff is the same.
> 
> Martin
> 
>  refs/files-backend.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 5cca55510..bdb0e22e5 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2140,13 +2140,12 @@ static int split_symref_update(struct files_ref_store *refs,
>  
>   /*
>* First make sure that referent is not already in the
> -  * transaction. This insertion is O(N) in the transaction
> +  * transaction. This check is O(N) in the transaction

The check is not O(N); it is O(lg N) because it is implemented via a
binary search in the (sorted) `affected_refnames`. The insertion below
is still O(N), because it has to shift entries later in the list to the
right to make room for the new entry.
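
A tiny standalone sketch of the two costs, mirroring (but not copying) what
string_list does: a binary search for the membership check and a
memmove-style shift for the insertion.

#include <stdio.h>
#include <string.h>

#define MAX 16

static const char *items[MAX] = { "refs/heads/main", "refs/tags/v1.0" };
static int nr = 2;      /* kept sorted, like a string_list */

/* O(lg N): binary search, as string_list_has_string() does. */
static int has_string(const char *s)
{
        int lo = 0, hi = nr;

        while (lo < hi) {
                int mid = lo + (hi - lo) / 2;
                int cmp = strcmp(items[mid], s);

                if (!cmp)
                        return 1;
                if (cmp < 0)
                        lo = mid + 1;
                else
                        hi = mid;
        }
        return 0;
}

/* O(N): find the position, then shift the tail to make room,
 * as string_list_insert() does. */
static void insert_string(const char *s)
{
        int pos = 0;

        if (nr >= MAX)
                return;
        while (pos < nr && strcmp(items[pos], s) < 0)
                pos++;
        memmove(&items[pos + 1], &items[pos], (nr - pos) * sizeof(*items));
        items[pos] = s;
        nr++;
}

int main(void)
{
        printf("%d\n", has_string("refs/heads/main"));
        insert_string("refs/heads/topic");
        for (int i = 0; i < nr; i++)
                printf("%s\n", items[i]);
        return 0;
}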

>* size, but it happens at most once per symref in a
>* transaction.
>*/
> - item = string_list_insert(affected_refnames, referent);
> - if (item->util) {
> - /* An entry already existed */
> + if (string_list_has_string(affected_refnames, referent)) {
> + /* An entry already exists */
>   strbuf_addf(err,
>   "multiple updates for '%s' (including one "
>   "via symref '%s') are not allowed",
> @@ -2181,6 +2180,15 @@ static int split_symref_update(struct files_ref_store *refs,
>   update->flags |= REF_LOG_ONLY | REF_NODEREF;
>   update->flags &= ~REF_HAVE_OLD;
>  
> + /*
> +  * Add the referent. This insertion is O(N) in the transaction
> +  * size, but it happens at most once per symref in a
> +  * transaction. Make sure to add new_update->refname, which will
> +  * be valid as long as affected_refnames is in use, and NOT
> +  * referent, which might soon be freed by our caller.
> +  */
> + item = string_list_insert(affected_refnames, new_update->refname);
> + assert(!item->util);

We generally avoid using `assert()`. It would be preferable to use

if (item->util)
BUG("%s unexpectedly found in affected_refnames",
new_update->refname);

>   item->util = new_update;
>  
>   return 0;
> 

Otherwise, looks good!

Michael


[PATCH 03/10] packed_ref_store: implement reference transactions

2017-08-29 Thread Michael Haggerty
Implement the methods needed to support reference transactions for
the packed-refs backend. The new methods are not yet used.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 313 +-
 refs/packed-backend.h |   9 ++
 2 files changed, 319 insertions(+), 3 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 7e348feac3..d19b3bfba5 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -744,25 +744,332 @@ static int packed_init_db(struct ref_store *ref_store, struct strbuf *err)
return 0;
 }
 
+/*
+ * Write the packed-refs from the cache to the packed-refs tempfile,
+ * incorporating any changes from `updates`. `updates` must be a
+ * sorted string list whose keys are the refnames and whose util
+ * values are `struct ref_update *`. On error, rollback the tempfile,
+ * write an error message to `err`, and return a nonzero value.
+ *
+ * The packfile must be locked before calling this function and will
+ * remain locked when it is done.
+ */
+static int write_with_updates(struct packed_ref_store *refs,
+ struct string_list *updates,
+ struct strbuf *err)
+{
+   struct ref_iterator *iter = NULL;
+   size_t i;
+   int ok;
+   FILE *out;
+   struct strbuf sb = STRBUF_INIT;
+   char *packed_refs_path;
+
+   if (!is_lock_file_locked(&refs->lock))
+   die("BUG: write_with_updates() called while unlocked");
+
+   /*
+* If packed-refs is a symlink, we want to overwrite the
+* symlinked-to file, not the symlink itself. Also, put the
+* staging file next to it:
+*/
+   packed_refs_path = get_locked_file_path(&refs->lock);
+   strbuf_addf(&sb, "%s.new", packed_refs_path);
+   free(packed_refs_path);
+   if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
+   strbuf_addf(err, "unable to create file %s: %s",
+   sb.buf, strerror(errno));
+   strbuf_release(&sb);
+   return -1;
+   }
+   strbuf_release(&sb);
+
+   out = fdopen_tempfile(&refs->tempfile, "w");
+   if (!out) {
+   strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
+   strerror(errno));
+   goto error;
+   }
+
+   if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0)
+   goto write_error;
+
+   /*
+* We iterate in parallel through the current list of refs and
+* the list of updates, processing an entry from at least one
+* of the lists each time through the loop. When the current
+* list of refs is exhausted, set iter to NULL. When the list
+* of updates is exhausted, leave i set to updates->nr.
+*/
+   iter = packed_ref_iterator_begin(&refs->base, "",
+DO_FOR_EACH_INCLUDE_BROKEN);
+   if ((ok = ref_iterator_advance(iter)) != ITER_OK)
+   iter = NULL;
+
+   i = 0;
+
+   while (iter || i < updates->nr) {
+   struct ref_update *update = NULL;
+   int cmp;
+
+   if (i >= updates->nr) {
+   cmp = -1;
+   } else {
+   update = updates->items[i].util;
+
+   if (!iter)
+   cmp = +1;
+   else
+   cmp = strcmp(iter->refname, update->refname);
+   }
+
+   if (!cmp) {
+   /*
+* There is both an old value and an update
+* for this reference. Check the old value if
+* necessary:
+*/
+   if ((update->flags & REF_HAVE_OLD)) {
+   if (is_null_oid(&update->old_oid)) {
+   strbuf_addf(err, "cannot update ref '%s': "
+   "reference already exists",
+   update->refname);
+   goto error;
+   } else if (oidcmp(&update->old_oid, iter->oid)) {
+   strbuf_addf(err, "cannot update ref '%s': "
+   "is at %s but expected %s",
+   update->refname,
+   oid_to_hex(iter->oid),
+   oid_to_hex(&update->old_oid));
+   goto error;
+   }
+   }
+
+   /* Now figure out what to use for the new value: */
+   if ((update->flags & REF_HAVE_NEW)) {
+   /*
+* The 
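
The write_with_updates() loop quoted above is a classic merge of two sorted
sequences; here is a standalone sketch of the same control flow, with plain
arrays standing in for the ref iterator and the updates list.

#include <stdio.h>
#include <string.h>

int main(void)
{
        /* both inputs must be sorted by refname, as in write_with_updates() */
        const char *existing[] = { "refs/heads/main", "refs/tags/v1.0" };
        const char *updates[]  = { "refs/heads/topic", "refs/tags/v1.0" };
        size_t ne = 2, nu = 2, e = 0, u = 0;

        while (e < ne || u < nu) {
                int cmp;

                if (e >= ne)
                        cmp = +1;       /* only updates left */
                else if (u >= nu)
                        cmp = -1;       /* only existing refs left */
                else
                        cmp = strcmp(existing[e], updates[u]);

                if (cmp < 0)
                        printf("copy   %s\n", existing[e++]);   /* unchanged ref */
                else if (cmp > 0)
                        printf("create %s\n", updates[u++]);    /* pure addition */
                else {
                        printf("update %s\n", existing[e]);     /* both sides present */
                        e++, u++;
                }
        }
        return 0;
}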

[PATCH 02/10] struct ref_transaction: add a place for backends to store data

2017-08-29 Thread Michael Haggerty
`packed_ref_store` is going to want to store some transaction-wide
data, so make a place for it.

Signed-off-by: Michael Haggerty 
---
 refs/refs-internal.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index b02dc5a7e3..d7d344de73 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -242,6 +242,7 @@ struct ref_transaction {
size_t alloc;
size_t nr;
enum ref_transaction_state state;
+   void *backend_data;
 };
 
 /*
-- 
2.14.1
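
A minimal sketch of the pattern this enables, with invented names: the
generic transaction carries an opaque pointer that only the backend running
it interprets and owns.

#include <stdio.h>
#include <stdlib.h>

/* generic object shared by all backends */
struct toy_transaction {
        int nr_updates;
        void *backend_data;     /* opaque slot, owned by whichever backend runs it */
};

/* state one particular backend carries from prepare to finish */
struct toy_backend_data {
        int lock_held;
};

static void toy_prepare(struct toy_transaction *tx)
{
        struct toy_backend_data *bd = calloc(1, sizeof(*bd));

        bd->lock_held = 1;
        tx->backend_data = bd;
}

static void toy_finish(struct toy_transaction *tx)
{
        struct toy_backend_data *bd = tx->backend_data;

        printf("finishing %d updates, lock_held=%d\n",
               tx->nr_updates, bd->lock_held);
        free(bd);
        tx->backend_data = NULL;
}

int main(void)
{
        struct toy_transaction tx = { 3, NULL };

        toy_prepare(&tx);
        toy_finish(&tx);
        return 0;
}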



[PATCH 10/10] files_transaction_finish(): delete reflogs before references

2017-08-29 Thread Michael Haggerty
If the deletion steps unexpectedly fail, it is less bad to leave a
reference without its reflog than it is to leave a reflog without its
reference, since the latter is an invalid repository state.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 35 +--
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 4f4c47b9db..36f81b4f28 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2630,6 +2630,27 @@ static int files_transaction_finish(struct ref_store 
*ref_store,
}
}
 
+   /*
+* Now that updates are safely completed, we can perform
+* deletes. First delete the reflogs of any references that
+* will be deleted, since (in the unexpected event of an
+* error) leaving a reference without a reflog is less bad
+* than leaving a reflog without a reference (the latter is a
+* mildly invalid repository state):
+*/
+   for (i = 0; i < transaction->nr; i++) {
+   struct ref_update *update = transaction->updates[i];
+   if (update->flags & REF_DELETING &&
+   !(update->flags & REF_LOG_ONLY) &&
+   !(update->flags & REF_ISPRUNING)) {
+   strbuf_reset(&sb);
+   files_reflog_path(refs, &sb, update->refname);
+   if (!unlink_or_warn(sb.buf))
+   try_remove_empty_parents(refs, update->refname,
+   REMOVE_EMPTY_PARENTS_REFLOG);
+   }
+   }
+
/*
 * Perform deletes now that updates are safely completed.
 *
@@ -2666,20 +2687,6 @@ static int files_transaction_finish(struct ref_store *ref_store,
}
}
 
-   /* Delete the reflogs of any references that were deleted: */
-   for (i = 0; i < transaction->nr; i++) {
-   struct ref_update *update = transaction->updates[i];
-   if (update->flags & REF_DELETING &&
-   !(update->flags & REF_LOG_ONLY) &&
-   !(update->flags & REF_ISPRUNING)) {
-   strbuf_reset(&sb);
-   files_reflog_path(refs, &sb, update->refname);
-   if (!unlink_or_warn(sb.buf))
-   try_remove_empty_parents(refs, update->refname,
-   REMOVE_EMPTY_PARENTS_REFLOG);
-   }
-   }
-
clear_loose_ref_cache(refs);
 
 cleanup:
-- 
2.14.1



[PATCH 09/10] packed-backend: rip out some now-unused code

2017-08-29 Thread Michael Haggerty
Now the outside world interacts with the packed ref store only via the
generic refs API plus a few lock-related functions. This allows us to
delete some functions that are no longer used, thereby completing the
encapsulation of the packed ref store.

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 193 --
 refs/packed-backend.h |   8 ---
 2 files changed, 201 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 83a088118f..90f44c1fbb 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -91,19 +91,6 @@ struct ref_store *packed_ref_store_create(const char *path,
return ref_store;
 }
 
-/*
- * Die if refs is not the main ref store. caller is used in any
- * necessary error messages.
- */
-static void packed_assert_main_repository(struct packed_ref_store *refs,
- const char *caller)
-{
-   if (refs->store_flags & REF_STORE_MAIN)
-   return;
-
-   die("BUG: operation %s only allowed for main ref store", caller);
-}
-
 /*
  * Downcast `ref_store` to `packed_ref_store`. Die if `ref_store` is
  * not a `packed_ref_store`. Also die if `packed_ref_store` doesn't
@@ -321,40 +308,6 @@ static struct ref_dir *get_packed_refs(struct packed_ref_store *refs)
return get_packed_ref_dir(get_packed_ref_cache(refs));
 }
 
-/*
- * Add or overwrite a reference in the in-memory packed reference
- * cache. This may only be called while the packed-refs file is locked
- * (see packed_refs_lock()). To actually write the packed-refs file,
- * call commit_packed_refs().
- */
-void add_packed_ref(struct ref_store *ref_store,
-   const char *refname, const struct object_id *oid)
-{
-   struct packed_ref_store *refs =
-   packed_downcast(ref_store, REF_STORE_WRITE,
-   "add_packed_ref");
-   struct ref_dir *packed_refs;
-   struct ref_entry *packed_entry;
-
-   if (!is_lock_file_locked(&refs->lock))
-   die("BUG: packed refs not locked");
-
-   if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL))
-   die("Reference has invalid format: '%s'", refname);
-
-   packed_refs = get_packed_refs(refs);
-   packed_entry = find_ref_entry(packed_refs, refname);
-   if (packed_entry) {
-   /* Overwrite the existing entry: */
-   oidcpy(&packed_entry->u.value.oid, oid);
-   packed_entry->flag = REF_ISPACKED;
-   oidclr(&packed_entry->u.value.peeled);
-   } else {
-   packed_entry = create_ref_entry(refname, oid, REF_ISPACKED);
-   add_ref_entry(packed_refs, packed_entry);
-   }
-}
-
 /*
  * Return the ref_entry for the given refname from the packed
  * references.  If it does not exist, return NULL.
@@ -592,152 +545,6 @@ int packed_refs_is_locked(struct ref_store *ref_store)
 static const char PACKED_REFS_HEADER[] =
"# pack-refs with: peeled fully-peeled \n";
 
-/*
- * Write the current version of the packed refs cache from memory to
- * disk. The packed-refs file must already be locked for writing (see
- * packed_refs_lock()). Return zero on success. On errors, rollback
- * the lockfile, write an error message to `err`, and return a nonzero
- * value.
- */
-int commit_packed_refs(struct ref_store *ref_store, struct strbuf *err)
-{
-   struct packed_ref_store *refs =
-   packed_downcast(ref_store, REF_STORE_WRITE | REF_STORE_MAIN,
-   "commit_packed_refs");
-   struct packed_ref_cache *packed_ref_cache =
-   get_packed_ref_cache(refs);
-   int ok;
-   int ret = -1;
-   struct strbuf sb = STRBUF_INIT;
-   FILE *out;
-   struct ref_iterator *iter;
-   char *packed_refs_path;
-
-   if (!is_lock_file_locked(&refs->lock))
-   die("BUG: commit_packed_refs() called when unlocked");
-
-   /*
-* If packed-refs is a symlink, we want to overwrite the
-* symlinked-to file, not the symlink itself. Also, put the
-* staging file next to it:
-*/
-   packed_refs_path = get_locked_file_path(&refs->lock);
-   strbuf_addf(&sb, "%s.new", packed_refs_path);
-   if (create_tempfile(&refs->tempfile, sb.buf) < 0) {
-   strbuf_addf(err, "unable to create file %s: %s",
-   sb.buf, strerror(errno));
-   strbuf_release(&sb);
-   goto out;
-   }
-   strbuf_release(&sb);
-
-   out = fdopen_tempfile(&refs->tempfile, "w");
-   if (!out) {
-   strbuf_addf(err, "unable to fdopen packed-refs tempfile: %s",
-   strerror(errno));
-   goto error;
-   }
-
-   if (fprintf(out, "%s", PACKED_REFS_HEADER) < 0) {
-   strbuf_addf(err, "error writing to %s: %s",
-   get_tempfile_path(&refs->tempfile), strerror(errno));
-   goto error;
- 

[PATCH 08/10] files_ref_store: use a transaction to update packed refs

2017-08-29 Thread Michael Haggerty
When processing a `files_ref_store` transaction, it is sometimes
necessary to delete some references from the "packed-refs" file. Do
that using a reference transaction conducted against the
`packed_ref_store`.

This change further decouples `files_ref_store` from
`packed_ref_store`. It also fixes multiple problems, including the two
revealed by test cases added in the previous commit.

First, the old code didn't obtain the `packed-refs` lock until
`files_transaction_finish()`. This means that a failure to acquire the
`packed-refs` lock (e.g., due to contention with another process)
wasn't detected until it was too late (problems like this are supposed
to be detected in the "prepare" phase). The new code acquires the
`packed-refs` lock in `files_transaction_prepare()`, the same stage of
the processing when the loose reference locks are being acquired,
removing another reason why the "prepare" phase might succeed and the
"finish" phase might nevertheless fail.

Second, the old code deleted the loose version of a reference before
deleting any packed version of the same reference. This left a moment
when another process might think that the packed version of the
reference is current, which is incorrect. (Even worse, the packed
version of the reference can be arbitrarily old, and might even point
at an object that has since been garbage-collected.)

Third, if a reference deletion fails to acquire the `packed-refs` lock
altogether, then the old code might leave the repository in the
incorrect state (possibly corrupt) described in the previous
paragraph.

Now we activate the new "packed-refs" file (sans any references that
are being deleted) *before* deleting the corresponding loose
references. But we hold the "packed-refs" lock until after the loose
references have been finalized, thus preventing a simultaneous
"pack-refs" process from packing the loose version of the reference in
the time gap, which would otherwise defeat our attempt to delete it.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 132 +--
 t/t1404-update-ref-errors.sh |   4 +-
 2 files changed, 103 insertions(+), 33 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 29c7c78602..4f4c47b9db 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2391,13 +2391,22 @@ static int lock_ref_for_update(struct files_ref_store *refs,
return 0;
 }
 
+struct files_transaction_backend_data {
+   struct ref_transaction *packed_transaction;
+   int packed_refs_locked;
+};
+
 /*
  * Unlock any references in `transaction` that are still locked, and
  * mark the transaction closed.
  */
-static void files_transaction_cleanup(struct ref_transaction *transaction)
+static void files_transaction_cleanup(struct files_ref_store *refs,
+ struct ref_transaction *transaction)
 {
size_t i;
+   struct files_transaction_backend_data *backend_data =
+   transaction->backend_data;
+   struct strbuf err = STRBUF_INIT;
 
for (i = 0; i < transaction->nr; i++) {
struct ref_update *update = transaction->updates[i];
@@ -2409,6 +2418,17 @@ static void files_transaction_cleanup(struct ref_transaction *transaction)
}
}
 
+   if (backend_data->packed_transaction &&
+   ref_transaction_abort(backend_data->packed_transaction, &err)) {
+   error("error aborting transaction: %s", err.buf);
+   strbuf_release(&err);
+   }
+
+   if (backend_data->packed_refs_locked)
+   packed_refs_unlock(refs->packed_ref_store);
+
+   free(backend_data);
+
transaction->state = REF_TRANSACTION_CLOSED;
 }
 
@@ -2425,12 +2445,17 @@ static int files_transaction_prepare(struct ref_store *ref_store,
char *head_ref = NULL;
int head_type;
struct object_id head_oid;
+   struct files_transaction_backend_data *backend_data;
+   struct ref_transaction *packed_transaction = NULL;
 
assert(err);
 
if (!transaction->nr)
goto cleanup;
 
+   backend_data = xcalloc(1, sizeof(*backend_data));
+   transaction->backend_data = backend_data;
+
/*
 * Fail if a refname appears more than once in the
 * transaction. (If we end up splitting up any updates using
@@ -2497,6 +2522,41 @@ static int files_transaction_prepare(struct ref_store *ref_store,
  head_ref, &affected_refnames, err);
if (ret)
break;
+
+   if (update->flags & REF_DELETING &&
+   !(update->flags & REF_LOG_ONLY) &&
+   !(update->flags & REF_ISPRUNING)) {
+   /*
+* This reference has to be deleted from
+* packed-refs if it exists there.
+*/
+   if 

[PATCH 01/10] packed-backend: don't adjust the reference count on lock/unlock

2017-08-29 Thread Michael Haggerty
The old code incremented the packed ref cache reference count when
acquiring the packed-refs lock, and decremented the count when
releasing the lock. This is unnecessary because a locked packed-refs
file cannot be changed, so there is no reason for the cache to become
stale.

Moreover, the extra reference count causes a problem if we
intentionally clear the packed refs cache, as we sometimes need to do
if we change the cache in anticipation of writing a change to disk,
but then the write to disk fails. In that case, `packed_refs_unlock()`
would have no easy way to find the cache whose reference count it
needs to decrement.

This whole issue will soon become moot due to upcoming changes that
avoid changing the in-memory cache as part of updating the packed-refs
on disk, but this change makes that transition easier.
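
As a toy illustration of the bookkeeping being removed (a made-up
refcounted cache, not git's packed_ref_cache API):

  #include <stdlib.h>

  struct toy_cache {
          int refcount;
          /* ... cached contents would live here ... */
  };

  static void toy_release(struct toy_cache *c)
  {
          if (!--c->refcount)
                  free(c);
  }

  int main(void)
  {
          struct toy_cache *c = calloc(1, sizeof(*c));
          if (!c)
                  return 1;
          c->refcount = 1; /* the one reference owned by the ref store */

          /*
           * Old scheme: lock() bumped c->refcount and unlock() dropped it.
           * If the store cleared its cache pointer while the lock was held
           * (say, after a failed write), unlock() could no longer find the
           * object whose count it still owed a decrement.
           *
           * New scheme: lock() and unlock() leave the count alone, which is
           * safe because a locked packed-refs file cannot go stale.
           */

          toy_release(c); /* the store's own reference, dropped at shutdown */
          return 0;
  }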

Signed-off-by: Michael Haggerty 
---
 refs/packed-backend.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 412c85034f..7e348feac3 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -525,7 +525,6 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
"packed_refs_lock");
static int timeout_configured = 0;
static int timeout_value = 1000;
-   struct packed_ref_cache *packed_ref_cache;
 
if (!timeout_configured) {
git_config_get_int("core.packedrefstimeout", &timeout_value);
@@ -560,9 +559,7 @@ int packed_refs_lock(struct ref_store *ref_store, int flags, struct strbuf *err)
 */
validate_packed_ref_cache(refs);
 
-   packed_ref_cache = get_packed_ref_cache(refs);
-   /* Increment the reference count to prevent it from being freed: */
-   acquire_packed_ref_cache(packed_ref_cache);
+   get_packed_ref_cache(refs);
return 0;
 }
 
@@ -576,7 +573,6 @@ void packed_refs_unlock(struct ref_store *ref_store)
if (!is_lock_file_locked(&refs->lock))
die("BUG: packed_refs_unlock() called when not locked");
rollback_lock_file(&refs->lock);
-   release_packed_ref_cache(refs->cache);
 }
 
 int packed_refs_is_locked(struct ref_store *ref_store)
-- 
2.14.1



[PATCH 04/10] packed_delete_refs(): implement method

2017-08-29 Thread Michael Haggerty
Implement `packed_delete_refs()` using a reference transaction. This
means that `files_delete_refs()` can use `refs_delete_refs()` instead
of `repack_without_refs()` to delete any packed references, decreasing
the coupling between the classes.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c  |  2 +-
 refs/packed-backend.c | 45 -
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index fccbc24ac4..2c78f63494 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1157,7 +1157,7 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
if (packed_refs_lock(refs->packed_ref_store, 0, &err))
goto error;
 
-   if (repack_without_refs(refs->packed_ref_store, refnames, )) {
+   if (refs_delete_refs(refs->packed_ref_store, msg, refnames, flags)) {
packed_refs_unlock(refs->packed_ref_store);
goto error;
}
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index d19b3bfba5..83a088118f 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1082,7 +1082,50 @@ static int packed_initial_transaction_commit(struct ref_store *ref_store,
 static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
 struct string_list *refnames, unsigned int flags)
 {
-   die("BUG: not implemented yet");
+   struct packed_ref_store *refs =
+   packed_downcast(ref_store, REF_STORE_WRITE, "delete_refs");
+   struct strbuf err = STRBUF_INIT;
+   struct ref_transaction *transaction;
+   struct string_list_item *item;
+   int ret;
+
+   (void)refs; /* We need the check above, but don't use the variable */
+
+   if (!refnames->nr)
+   return 0;
+
+   /*
+* Since we don't check the references' old_oids, the
+* individual updates can't fail, so we can pack all of the
+* updates into a single transaction.
+*/
+
+   transaction = ref_store_transaction_begin(ref_store, &err);
+   if (!transaction)
+   return -1;
+
+   for_each_string_list_item(item, refnames) {
+   if (ref_transaction_delete(transaction, item->string, NULL,
+  flags, msg, &err)) {
+   warning(_("could not delete reference %s: %s"),
+   item->string, err.buf);
+   strbuf_reset(&err);
+   }
+   }
+
+   ret = ref_transaction_commit(transaction, &err);
+
+   if (ret) {
+   if (refnames->nr == 1)
+   error(_("could not delete reference %s: %s"),
+ refnames->items[0].string, err.buf);
+   else
+   error(_("could not delete references: %s"), err.buf);
+   }
+
+   ref_transaction_free(transaction);
+   strbuf_release(&err);
+   return ret;
 }
 
 static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
-- 
2.14.1



[PATCH 06/10] files_initial_transaction_commit(): use a transaction for packed refs

2017-08-29 Thread Michael Haggerty
Use a `packed_ref_store` transaction in the implementation of
`files_initial_transaction_commit()` rather than using internal
features of the packed ref store. This further decouples
`files_ref_store` from `packed_ref_store`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0fc44eaca5..29c7c78602 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2663,6 +2663,7 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
size_t i;
int ret = 0;
struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
+   struct ref_transaction *packed_transaction = NULL;
 
assert(err);
 
@@ -2695,6 +2696,12 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
 &affected_refnames))
die("BUG: initial ref transaction called with existing refs");
 
+   packed_transaction = ref_store_transaction_begin(refs->packed_ref_store, err);
+   if (!packed_transaction) {
+   ret = TRANSACTION_GENERIC_ERROR;
+   goto cleanup;
+   }
+
for (i = 0; i < transaction->nr; i++) {
struct ref_update *update = transaction->updates[i];
 
@@ -2707,6 +2714,15 @@ static int files_initial_transaction_commit(struct 
ref_store *ref_store,
ret = TRANSACTION_NAME_CONFLICT;
goto cleanup;
}
+
+   /*
+* Add a reference creation for this reference to the
+* packed-refs transaction:
+*/
+   ref_transaction_add_update(packed_transaction, update->refname,
+  update->flags & ~REF_HAVE_OLD,
+  update->new_oid.hash, update->old_oid.hash,
+  NULL);
}
 
if (packed_refs_lock(refs->packed_ref_store, 0, err)) {
@@ -2714,21 +2730,14 @@ static int files_initial_transaction_commit(struct ref_store *ref_store,
goto cleanup;
}
 
-   for (i = 0; i < transaction->nr; i++) {
-   struct ref_update *update = transaction->updates[i];
-
-   if ((update->flags & REF_HAVE_NEW) &&
-   !is_null_oid(&update->new_oid))
-   add_packed_ref(refs->packed_ref_store, update->refname,
-  &update->new_oid);
-   }
-
-   if (commit_packed_refs(refs->packed_ref_store, err)) {
+   if (initial_ref_transaction_commit(packed_transaction, err)) {
ret = TRANSACTION_GENERIC_ERROR;
goto cleanup;
}
 
 cleanup:
+   if (packed_transaction)
+   ref_transaction_free(packed_transaction);
packed_refs_unlock(refs->packed_ref_store);
transaction->state = REF_TRANSACTION_CLOSED;
string_list_clear(&affected_refnames, 0);
-- 
2.14.1



[PATCH 05/10] files_pack_refs(): use a reference transaction to write packed refs

2017-08-29 Thread Michael Haggerty
Now that the packed reference store supports transactions, we can use
a transaction to write the packed versions of references that we want
to pack. This decreases the coupling between `files_ref_store` and
`packed_ref_store`.

Signed-off-by: Michael Haggerty 
---
 refs/files-backend.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 2c78f63494..0fc44eaca5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1100,9 +1100,14 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
int ok;
struct ref_to_prune *refs_to_prune = NULL;
struct strbuf err = STRBUF_INIT;
+   struct ref_transaction *transaction;
 
packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
 
+   transaction = ref_store_transaction_begin(refs->packed_ref_store, &err);
+   if (!transaction)
+   return -1;
+
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
/*
@@ -1115,12 +1120,14 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
continue;
 
/*
-* Create an entry in the packed-refs cache equivalent
-* to the one from the loose ref cache, except that
-* we don't copy the peeled status, because we want it
-* to be re-peeled.
+* Add a reference creation for this reference to the
+* packed-refs transaction:
 */
-   add_packed_ref(refs->packed_ref_store, iter->refname, iter->oid);
+   if (ref_transaction_update(transaction, iter->refname,
+  iter->oid->hash, NULL,
+  REF_NODEREF, NULL, &err))
+   die("failure preparing to create packed reference %s: 
%s",
+   iter->refname, err.buf);
 
/* Schedule the loose reference for pruning if requested. */
if ((flags & PACK_REFS_PRUNE)) {
@@ -1134,8 +1141,11 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
if (ok != ITER_DONE)
die("error while iterating over references");
 
-   if (commit_packed_refs(refs->packed_ref_store, &err))
-   die("unable to overwrite old ref-pack file: %s", err.buf);
+   if (ref_transaction_commit(transaction, &err))
+   die("unable to write new packed-refs: %s", err.buf);
+
+   ref_transaction_free(transaction);
+
packed_refs_unlock(refs->packed_ref_store);
 
prune_refs(refs, refs_to_prune);
-- 
2.14.1



[PATCH 07/10] t1404: demonstrate two problems with reference transactions

2017-08-29 Thread Michael Haggerty
Currently, a loose reference is deleted even before locking the
`packed-refs` file, let alone deleting any packed version of the
reference. This leads to two problems, demonstrated by two new tests:

* While a reference is being deleted, other processes might see the
  old, packed value of the reference for a moment before the packed
  version is deleted. Normally this would be hard to observe, but we
  can prolong the window by locking the `packed-refs` file externally
  before running `update-ref`, then unlocking it before `update-ref`'s
  attempt to acquire the lock times out.

* If the `packed-refs` file is locked so long that `update-ref` fails
  to lock it, then the reference can be left permanently in the
  incorrect state described in the previous point.

In a moment, both problems will be fixed.

Signed-off-by: Michael Haggerty 
---
The first of the new tests is rather involved; it uses two background
processes plus a foreground process that polls the value of a
reference. But despite sensitivity to timing, I think it should be
robust even in this broken state. Once the functionality being tested
is fixed, this test should never produce false positives, though
really bad timing (e.g., if it takes more than a second for
`update-ref` to get going) could still lead to false negatives.

Each of the new tests takes about a second to run because they
simulate lock contention.

If anybody has suggestions for better ways to test these things,
please speak up :-)

 t/t1404-update-ref-errors.sh | 71 
 1 file changed, 71 insertions(+)

diff --git a/t/t1404-update-ref-errors.sh b/t/t1404-update-ref-errors.sh
index c34ece48f5..752f83c377 100755
--- a/t/t1404-update-ref-errors.sh
+++ b/t/t1404-update-ref-errors.sh
@@ -404,4 +404,75 @@ test_expect_success 'broken reference blocks indirect create' '
test_cmp expected output.err
 '
 
+test_expect_failure 'no bogus intermediate values during delete' '
+   prefix=refs/slow-transaction &&
+   # Set up a reference with differing loose and packed versions:
+   git update-ref $prefix/foo $C &&
+   git pack-refs --all &&
+   git update-ref $prefix/foo $D &&
+   git for-each-ref $prefix >unchanged &&
+   # Now try to update the reference, but hold the `packed-refs` lock
+   # for a while to see what happens while the process is blocked:
+   : >.git/packed-refs.lock &&
+   test_when_finished "rm -f .git/packed-refs.lock" &&
+   {
+   sleep 1 &&
+   rm -f .git/packed-refs.lock &
+   } &&
+   pid1=$! &&
+   {
+   # Note: the following command is intentionally run in the
+   # background. We extend the timeout so that `update-ref`
+   # tries to acquire the `packed-refs` lock longer than it
+   # takes the background process above to delete it:
+   git -c core.packedrefstimeout=2000 update-ref -d $prefix/foo &
+   } &&
+   pid2=$! &&
+   ok=true &&
+   while kill -0 $pid2 2>/dev/null
+   do
+   sha1=$(git rev-parse --verify --quiet $prefix/foo || echo undefined) &&
+   case "$sha1" in
+   $D)
+   # This is OK; it just means that nothing has happened yet.
+   : ;;
+   undefined)
+   # This is OK; it means the deletion was successful.
+   : ;;
+   $C)
+   # This value should never be seen. Probably the loose
+   # reference has been deleted but the packed reference
+   # is still there:
+   echo "$prefix/foo incorrectly observed to be C" &&
+   break
+   ;;
+   *)
+   # WTF?
+   echo "$prefix/foo unexpected value observed: $sha1" &&
+   break
+   ;;
+   esac
+   done >out &&
+   wait $pid1 &&
+   wait $pid2 &&
+   test_must_be_empty out &&
+   test_must_fail git rev-parse --verify --quiet $prefix/foo
+'
+
+test_expect_failure 'delete fails cleanly if packed-refs file is locked' '
+   prefix=refs/locked-packed-refs &&
+   # Set up a reference with differing loose and packed versions:
+   git update-ref $prefix/foo $C &&
+   git pack-refs --all &&
+   git update-ref $prefix/foo $D &&
+   git for-each-ref $prefix >unchanged &&
+   # Now try to delete it while the `packed-refs` lock is held:
+   : >.git/packed-refs.lock &&
+   test_when_finished "rm -f .git/packed-refs.lock" &&
+   test_must_fail git update-ref -d $prefix/foo >out 2>err &&
+   git for-each-ref $prefix >actual &&
+   grep "Unable to create $Q.*packed-refs.lock$Q: File exists" err &&
+   test_cmp unchanged actual
+'
+
 test_done
-- 
2.14.1



[PATCH 00/10] Implement transactions for the packed ref store

2017-08-29 Thread Michael Haggerty
This should be the second-to-last patch series in the quest for
mmapped packed-refs.

Before this patch series, there is still more coupling than necessary
between `files_ref_store` and `packed_ref_store`:

* `files_ref_store` adds packed references (e.g., during `git clone`
  and `git pack-refs`) by inserting them into the `packed_ref_store`'s
  internal `ref_cache`, then calling `commit_packed_refs()`. But
  `files_ref_store` shouldn't even *know* that `packed_ref_store` uses
  a `ref_cache`, let alone muck about in it.

* `files_ref_store` deletes packed references by calling
  `repack_without_refs()`.

Instead, implement reference transactions and `delete_refs()` for
`packed_ref_store`, and change `files_ref_store` to use these standard
methods rather than internal `packed_ref_store` methods.
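
In caller terms, "standard methods" means the usual begin/queue/commit/free
sequence against the sub-store. A rough stand-alone sketch (toy types and
functions standing in for ref_store_transaction_begin(),
ref_transaction_update(), ref_transaction_commit() and
ref_transaction_free(); only the calling pattern matters):

  #include <stdio.h>

  struct tx { int nr; };

  static struct tx *tx_begin(void)
  {
          static struct tx t;
          t.nr = 0;
          return &t;
  }
  static int tx_update(struct tx *t, const char *refname)
  {
          printf("queue update of %s\n", refname);
          t->nr++;
          return 0;
  }
  static int tx_commit(struct tx *t)
  {
          printf("commit %d update(s)\n", t->nr);
          return 0;
  }
  static void tx_free(struct tx *t)
  {
          t->nr = 0;
  }

  int main(void)
  {
          struct tx *t = tx_begin();

          /*
           * files_ref_store no longer reaches into the packed store's
           * ref_cache; it just queues updates through the generic
           * transaction interface and commits them.
           */
          if (tx_update(t, "refs/heads/master") || tx_commit(t))
                  fprintf(stderr, "transaction failed\n");
          tx_free(t);
          return 0;
  }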

This patch series finishes up the encapsulation of `packed_ref_store`.
At the end of the series, the outside world only interacts with it via
the refs API plus a couple of locking-related functions. That will
make it easy for the next patch series to change the internal workings
of `packed_ref_store` without affecting `files_ref_store`.

Along the way, we fix some longstanding problems with reference
updates. Quoting from patch [08/10]:

First, the old code didn't obtain the `packed-refs` lock until
`files_transaction_finish()`. This means that a failure to acquire the
`packed-refs` lock (e.g., due to contention with another process)
wasn't detected until it was too late (problems like this are supposed
to be detected in the "prepare" phase). The new code acquires the
`packed-refs` lock in `files_transaction_prepare()`, the same stage of
the processing when the loose reference locks are being acquired,
removing another reason why the "prepare" phase might succeed and the
"finish" phase might nevertheless fail.

Second, the old code deleted the loose version of a reference before
deleting any packed version of the same reference. This left a moment
when another process might think that the packed version of the
reference is current, which is incorrect. (Even worse, the packed
version of the reference can be arbitrarily old, and might even point
at an object that has since been garbage-collected.)

Third, if a reference deletion fails to acquire the `packed-refs` lock
altogether, then the old code might leave the repository in the
incorrect state (possibly corrupt) described in the previous
paragraph.

Now we activate the new "packed-refs" file (sans any references that
are being deleted) *before* deleting the corresponding loose
references. But we hold the "packed-refs" lock until after the loose
references have been finalized, thus preventing a simultaneous
"pack-refs" process from packing the loose version of the reference in
the time gap, which would otherwise defeat our attempt to delete it.

This patch series is also available as branch
`packed-ref-transactions` in my fork on GitHub [1]. A draft of the
patch series to change `packed_ref_store` to use mmap and get rid of
its `ref_cache` is available as branch `mmap-packed-refs` from the
same source.

Michael

[1] https://github.com/mhagger/git

Michael Haggerty (10):
  packed-backend: don't adjust the reference count on lock/unlock
  struct ref_transaction: add a place for backends to store data
  packed_ref_store: implement reference transactions
  packed_delete_refs(): implement method
  files_pack_refs(): use a reference transaction to write packed refs
  files_initial_transaction_commit(): use a transaction for packed refs
  t1404: demonstrate two problems with reference transactions
  files_ref_store: use a transaction to update packed refs
  packed-backend: rip out some now-unused code
  files_transaction_finish(): delete reflogs before references

 refs/files-backend.c | 200 ++-
 refs/packed-backend.c| 457 +--
 refs/packed-backend.h|  17 +-
 refs/refs-internal.h |   1 +
 t/t1404-update-ref-errors.sh |  71 +++
 5 files changed, 534 insertions(+), 212 deletions(-)

-- 
2.14.1



Re: [PATCH 2/3] merge-recursive: remove return value from get_files_dirs

2017-08-29 Thread Jeff King
On Mon, Aug 28, 2017 at 06:45:29PM -0400, Ben Peart wrote:

> > Since the debug output has been removed and the caller isn't
> > checking the return value there is no reason to keep calulating
> > and returning a value.
> > 
> 
> Did a quick search and you're right, nothing is using the return value. No
> reason to spend time calculating it.  Looks good.

This is actually one we can reasonably trust the compiler to check for
us. Though of course it doesn't hurt to look at the callers and make
sure that it's sane that they do not care about this return value (I
don't know merge-recursive.c very well, but the current return value
does seem kind of useless).
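
For instance (a toy example, not merge-recursive.c itself), once the
return type becomes void, a caller that still consumes the result simply
stops compiling:

  static void get_files_dirs_demo(void) /* was: static int ..., ending in "return n;" */
  {
          /* do the walking; nothing useful to report back */
  }

  int main(void)
  {
          get_files_dirs_demo();               /* fine */
          /* int n = get_files_dirs_demo(); */ /* would now be a compile error */
          return 0;
  }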

-Peff


Re: [PATCH 2/3] merge-recursive: remove return value from get_files_dirs

2017-08-29 Thread Jeff King
On Mon, Aug 28, 2017 at 02:28:28PM -0600, Kevin Willford wrote:

> The return value of the get_files_dirs call is never being used.
> Looking at the history of the file and it was originally only
> being used for debug output statements.  Also when
> read_tree_recursive return value is non zero it is changed to
> zero.  This leads me to believe that it doesn't matter if
> read_tree_recursive gets an error.

Or that the function is buggy. :)

I'm tempted to say that we should probably die() when
read_tree_recursive fails. This should only happen if we fail to parse
the tree, or if our callback (save_files_dirs here) returns failure, and
the latter looks like it never happens.
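
Roughly the shape of what I'm suggesting, with a stand-in walker rather
than the real read_tree_recursive() signature:

  #include <stdarg.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Simplified die(), same spirit as git's. */
  static void die(const char *fmt, ...)
  {
          va_list ap;
          va_start(ap, fmt);
          fprintf(stderr, "fatal: ");
          vfprintf(stderr, fmt, ap);
          fputc('\n', stderr);
          va_end(ap);
          exit(128);
  }

  /* Stand-in for read_tree_recursive(); non-zero means the tree could
   * not be parsed or the callback reported failure. */
  static int walk_tree(void)
  {
          return 0;
  }

  static void get_files_dirs_sketch(void)
  {
          if (walk_tree())
                  die("unable to read tree entries"); /* rather than silently returning */
  }

  int main(void)
  {
          get_files_dirs_sketch();
          return 0;
  }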

> Since the debug output has been removed and the caller isn't
> checking the return value there is no reason to keep calulating
> and returning a value.

Agreed, and I'm happy to see dead code go.

Minor nit: s/calulating/calculating/ in your commit message.

-Peff


Re: [PATCH 1/3] merge-recursive: fix memory leak

2017-08-29 Thread Jeff King
On Mon, Aug 28, 2017 at 02:28:27PM -0600, Kevin Willford wrote:

> In merge_trees if process_renames or process_entry returns less
> than zero, the method will just return and not free re_merge,
> re_head, or entries.
> 
> This change cleans up the allocated variables before returning
> to the caller.

Good catch. I suspect this function could stand to be refactored a bit.
For instance, pulling those inner bits of the conditional into a helper
would let us do:

  re_merge = get_renames(...);
  ... other setup ...

  clean = our_new_helper(re_merge, ...);

  string_clear(re_merge);
  ... other cleanup ...

  if (clean < 0)
return clean;

without having to resort to a goto. But certainly I don't mind this much
more minimal change, which fixes the actual functionality problem.

-Peff