Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-03 Thread Brad King
On 09/03/2013 12:43 AM, Michael Haggerty wrote:
 Hmmm, I see that you changed the signature of update_refs() to take an
 array of pointers.  My suggestion was unclear, but I didn't mean that
 the function signature had to be changed.
[snip]
 However, your approach is also fine.

Okay.  Thanks for reviewing!

-Brad
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Brad King
On 08/31/2013 02:19 PM, Michael Haggerty wrote:
 s/themeselves/themselves/

Fixed.

 +struct ref_update *u1 = (struct ref_update *)(r1);
 +struct ref_update *u2 = (struct ref_update *)(r2);
 
 If you declare u1 and u2 to be const struct ref_update * (i.e., add
 const), then you have const correctness and don't need the explicit
 casts.  (And the parentheses around r1 and r2 are superfluous in any case.)

Fixed.

 +ret = strcmp(u1-ref_name, u2-ref_name);
 
 Is there a need to compare more than ref_name?  If two entries are found
 with the same name, then ref_update_reject_duplicates() will error out

Junio mentioned possibility of auto-combining identical entries which would
need full ordering.  I think that can be added later so for now we can sort
only by ref name.  Thanks.

 +if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
 +break;
 
 The error handling code could be right here instead of the break
 statement, removing the need for the if conditional.

Fixed.

 +/* Allocate work space: */
 +updates = xmalloc(sizeof(struct ref_update) * n);
 
 It seems preferred here to write
 
   updates = xmalloc(sizeof(*updates) * n);
 
 as this will continue to work if the type of updates is ever changed.

Yes, thanks.

 Similarly for the next lines.
 
 +types = xmalloc(sizeof(int) * n);
 +locks = xmalloc(sizeof(struct ref_lock *) * n);
 +delnames = xmalloc(sizeof(const char *) * n);
 
 An alternative to managing separate arrays to hold types and locks would
 be to include the scratch space in struct ref_update and document it
 for internal use only; need not be initialized by caller.  On the one
 hand it's ugly to cruft up the interface with internal implementation
 details; on the other hand there is precedent for this sort of thing
 (e.g., ref_lock::force_write or lock_file::on_list) and it would
 simplify the code.

I think the goto cleanup reorganization simplifies the code enough
to not need this.  After changing updates to an array of pointers
it needs to be separate so we can sort.  Also delnames needs to be
a separate array to pass to repack_without_refs.

 +/* Copy, sort, and reject duplicate refs: */
 +memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
 +qsort(updates, n, sizeof(struct ref_update), ref_update_compare);
 
 You could save some space and memory shuffling (during memcpy() and
 qsort()) if you would declare updates to be an array of pointers to
 struct ref_update rather than an array of structs.  Sorting could then
 be done by moving pointers around instead of moving the structs.  This
 would also make it easier for update_refs() to pass information about
 the references back to its caller, should that ever be needed.

Good idea.  Changed in next revision.

 +ret |= update_ref_write(action,
 +updates[i].ref_name,
 +updates[i].new_sha1,
 +locks[i], onerr);
 +locks[i] = 0; /* freed by update_ref_write */
 +}
 +
 
 Hmmm, if one of the calls to update_ref_write() fails, would it be safer
 to abort the rest of the work (especially the reference deletions)?

Yes.  Since we already have the lock at this point something must be
going pretty wrong if this fails so it is best to abort altogether.

 +free(updates);
 +free(types);
 +free(locks);
 +free(delnames);
 +return ret;
 +}
 
 There's a lot of duplicated cleanup code in the function.  If you put a
 label before the final for loop, and if you initialize the locks array
 to zeros (e.g., by using xcalloc()), then the three exits could all
 share the same code ret = 1; goto cleanup;.

Done, thanks.

 +struct ref_update {
 
 Please document this structure, especially the relationship between
 have_old and old_sha1.

Done.  I also moved it to the top of the header just under ref_lock
so it can be used by other APIs later.

Thanks,
-Brad
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Brad King
On 09/01/2013 02:08 AM, Junio C Hamano wrote:
 Though the refs themeselves cannot be modified together in a single
 
 themselves.

Fixed.

 I notice that we are using an array of structures and letting qsort
 swap 50~64 bytes of data

Michael suggested this too, so fixed.

 Optionally we could silently dedup multiple identical updates and
 not fail it in ref-update-reject-duplicates.  But that does not have
 to be done until we find people's script would benefit from such a
 nicety.

We can always be less strict about input later, so I'd like to keep
the implementation simpler for now.

 By the way, unless there is a strong reason not to do so,
 post-increment i++ (and pre-decrement --i, if you use it) is the
 norm around here.

Okay, fixed in entire series.

 +locks[i] = 0; /* freed by update_ref_write */
 
 I think what is assigned here is a NULL pointer.

Yes, fixed.

Thanks,
-Brad
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Michael Haggerty
On 09/02/2013 07:20 PM, Brad King wrote:
 On 09/01/2013 02:08 AM, Junio C Hamano wrote:
 Though the refs themeselves cannot be modified together in a single

 themselves.
 
 Fixed.
 
 I notice that we are using an array of structures and letting qsort
 swap 50~64 bytes of data
 
 Michael suggested this too, so fixed.

Hmmm, I see that you changed the signature of update_refs() to take an
array of pointers.  My suggestion was unclear, but I didn't mean that
the function signature had to be changed.  Rather, I meant that *within*
the function, you could have created an array of pointers to the
structures in the input array and thereafter accessed it via the pointers:

int update_refs(const char *action, const struct ref_update *updates_orig,
int n, enum action_on_err onerr)
{
[...]
struct ref_update **updates;

[...]
updates = xcalloc(n, sizeof(*updates));
for (i = 0; i  n; i++)
updates[i] = updates_orig[i];
[...]
}

However, your approach is also fine.  It will typically involve more
malloc()s but smaller memcpy()s (i.e., via ALLOC_GROW()) at the caller,
and since usually the number of ref_updates being done at one time will
be limited anyway, I don't see a reason to prefer one version over the
other.

Thanks for making the change.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-01 Thread Junio C Hamano
Brad King brad.k...@kitware.com writes:

 Add 'struct ref_update' to encode the information needed to update or
 delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
 function 'update_refs' accepting an array of updates to perform.  First
 sort the input array to order locks consistently everywhere and reject
 multiple updates to the same ref.  Then acquire locks on all refs with
 verified old values.  Then update or delete all refs accordingly.  Fail
 if any one lock cannot be obtained or any one old value does not match.

OK.  The code releases the locks it acquired so far when it fails,
which is good.

 Though the refs themeselves cannot be modified together in a single

themselves.

 atomic transaction, this function does enable some useful semantics.
 For example, a caller may create a new branch starting from the head of
 another branch and rewind the original branch at the same time.  This
 transfers ownership of commits between branches without risk of losing
 commits added to the original branch by a concurrent process, or risk of
 a concurrent process creating the new branch first.

 +static int ref_update_compare(const void *r1, const void *r2)
 +{
 + struct ref_update *u1 = (struct ref_update *)(r1);
 + struct ref_update *u2 = (struct ref_update *)(r2);
 + int ret;

Let's have a blank line between the end of decls and the beginning
of the body here.

 + ret = strcmp(u1-ref_name, u2-ref_name);
 + if (ret)
 + return ret;
 + ret = hashcmp(u1-new_sha1, u2-new_sha1);
 + if (ret)
 + return ret;
 + ret = hashcmp(u1-old_sha1, u2-old_sha1);
 + if (ret)
 + return ret;
 + ret = u1-flags - u2-flags;
 + if (ret)
 + return ret;
 + return u1-have_old - u2-have_old;
 +}

I notice that we are using an array of structures and letting qsort
swap 50~64 bytes of data, instead of sorting an array of pointers,
each element of which points at a structure.  This may not matter
unless we are asked to update thousands at once, so I think it is OK
for now.

 +static int ref_update_reject_duplicates(struct ref_update *updates, int n,
 + enum action_on_err onerr)
 +{
 + int i;
 + for (i = 1; i  n; ++i)
 + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
 + break;

Optionally we could silently dedup multiple identical updates and
not fail it in ref-update-reject-duplicates.  But that does not have
to be done until we find people's script would benefit from such a
nicety.

By the way, unless there is a strong reason not to do so,
post-increment i++ (and pre-decrement --i, if you use it) is the
norm around here.  Especially in places like the third part of a
for(;;) loop where people are used to see i++, breaking the idiom
makes readers wonder if there is something else going on.

 + /* Perform updates first so live commits remain referenced: */
 + for (i = 0; i  n; ++i)
 + if (!is_null_sha1(updates[i].new_sha1)) {
 + ret |= update_ref_write(action,
 + updates[i].ref_name,
 + updates[i].new_sha1,
 + locks[i], onerr);
 + locks[i] = 0; /* freed by update_ref_write */

I think what is assigned here is a NULL pointer.

Will locally tweak while queuing.  Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-08-31 Thread Michael Haggerty
On 08/30/2013 08:12 PM, Brad King wrote:
 Add 'struct ref_update' to encode the information needed to update or
 delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
 function 'update_refs' accepting an array of updates to perform.  First
 sort the input array to order locks consistently everywhere and reject
 multiple updates to the same ref.  Then acquire locks on all refs with
 verified old values.  Then update or delete all refs accordingly.  Fail
 if any one lock cannot be obtained or any one old value does not match.
 
 Though the refs themeselves cannot be modified together in a single

s/themeselves/themselves/

 atomic transaction, this function does enable some useful semantics.
 For example, a caller may create a new branch starting from the head of
 another branch and rewind the original branch at the same time.  This
 transfers ownership of commits between branches without risk of losing
 commits added to the original branch by a concurrent process, or risk of
 a concurrent process creating the new branch first.
 
 Signed-off-by: Brad King brad.k...@kitware.com
 ---
  refs.c |  121 
 
  refs.h |   14 
  2 files changed, 135 insertions(+)
 
 diff --git a/refs.c b/refs.c
 index 3bcd26e..901a38a 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char 
 *refname,
   return update_ref_write(action, refname, sha1, lock, onerr);
  }
  
 +static int ref_update_compare(const void *r1, const void *r2)
 +{
 + struct ref_update *u1 = (struct ref_update *)(r1);
 + struct ref_update *u2 = (struct ref_update *)(r2);

If you declare u1 and u2 to be const struct ref_update * (i.e., add
const), then you have const correctness and don't need the explicit
casts.  (And the parentheses around r1 and r2 are superfluous in any case.)

 + int ret;

Style: we usually put a blank line between variable declarations and the
first line of code.

 + ret = strcmp(u1-ref_name, u2-ref_name);
 + if (ret)
 + return ret;
 + ret = hashcmp(u1-new_sha1, u2-new_sha1);
 + if (ret)
 + return ret;
 + ret = hashcmp(u1-old_sha1, u2-old_sha1);
 + if (ret)
 + return ret;
 + ret = u1-flags - u2-flags;
 + if (ret)
 + return ret;
 + return u1-have_old - u2-have_old;
 +}

Is there a need to compare more than ref_name?  If two entries are found
with the same name, then ref_update_reject_duplicates() will error out
anyway.  So the relative order among entries with the same name is
irrelevant.  I think it would be OK to return 0 for any entries with the
same ref_name, even if they differ in other fields.

 +
 +static int ref_update_reject_duplicates(struct ref_update *updates, int n,
 + enum action_on_err onerr)
 +{
 + int i;
 + for (i = 1; i  n; ++i)
 + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
 + break;

The error handling code could be right here instead of the break
statement, removing the need for the if conditional.

 + if (i  n) {
 + const char *str = Multiple updates for ref '%s' not allowed.;
 + switch (onerr) {
 + case MSG_ON_ERR: error(str, updates[i].ref_name); break;
 + case DIE_ON_ERR: die(str, updates[i].ref_name); break;
 + case QUIET_ON_ERR: break;
 + }
 + return 1;
 + }
 + return 0;
 +}
 +
 +int update_refs(const char *action, const struct ref_update *updates_orig,
 + int n, enum action_on_err onerr)
 +{
 + int ret = 0, delnum = 0, i;
 + struct ref_update *updates;
 + int *types;
 + struct ref_lock **locks;
 + const char **delnames;
 +
 + if (!updates_orig || !n)
 + return 0;
 +
 + /* Allocate work space: */
 + updates = xmalloc(sizeof(struct ref_update) * n);

It seems preferred here to write

updates = xmalloc(sizeof(*updates) * n);

as this will continue to work if the type of updates is ever changed.
Similarly for the next lines.

 + types = xmalloc(sizeof(int) * n);
 + locks = xmalloc(sizeof(struct ref_lock *) * n);
 + delnames = xmalloc(sizeof(const char *) * n);

An alternative to managing separate arrays to hold types and locks would
be to include the scratch space in struct ref_update and document it
for internal use only; need not be initialized by caller.  On the one
hand it's ugly to cruft up the interface with internal implementation
details; on the other hand there is precedent for this sort of thing
(e.g., ref_lock::force_write or lock_file::on_list) and it would
simplify the code.

 +
 + /* Copy, sort, and reject duplicate refs: */
 + memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
 + qsort(updates, n, sizeof(struct ref_update), ref_update_compare);

You could save some space and memory 

[PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-08-30 Thread Brad King
Add 'struct ref_update' to encode the information needed to update or
delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
function 'update_refs' accepting an array of updates to perform.  First
sort the input array to order locks consistently everywhere and reject
multiple updates to the same ref.  Then acquire locks on all refs with
verified old values.  Then update or delete all refs accordingly.  Fail
if any one lock cannot be obtained or any one old value does not match.

Though the refs themeselves cannot be modified together in a single
atomic transaction, this function does enable some useful semantics.
For example, a caller may create a new branch starting from the head of
another branch and rewind the original branch at the same time.  This
transfers ownership of commits between branches without risk of losing
commits added to the original branch by a concurrent process, or risk of
a concurrent process creating the new branch first.

Signed-off-by: Brad King brad.k...@kitware.com
---
 refs.c |  121 
 refs.h |   14 
 2 files changed, 135 insertions(+)

diff --git a/refs.c b/refs.c
index 3bcd26e..901a38a 100644
--- a/refs.c
+++ b/refs.c
@@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char *refname,
return update_ref_write(action, refname, sha1, lock, onerr);
 }
 
+static int ref_update_compare(const void *r1, const void *r2)
+{
+   struct ref_update *u1 = (struct ref_update *)(r1);
+   struct ref_update *u2 = (struct ref_update *)(r2);
+   int ret;
+   ret = strcmp(u1-ref_name, u2-ref_name);
+   if (ret)
+   return ret;
+   ret = hashcmp(u1-new_sha1, u2-new_sha1);
+   if (ret)
+   return ret;
+   ret = hashcmp(u1-old_sha1, u2-old_sha1);
+   if (ret)
+   return ret;
+   ret = u1-flags - u2-flags;
+   if (ret)
+   return ret;
+   return u1-have_old - u2-have_old;
+}
+
+static int ref_update_reject_duplicates(struct ref_update *updates, int n,
+   enum action_on_err onerr)
+{
+   int i;
+   for (i = 1; i  n; ++i)
+   if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
+   break;
+   if (i  n) {
+   const char *str = Multiple updates for ref '%s' not allowed.;
+   switch (onerr) {
+   case MSG_ON_ERR: error(str, updates[i].ref_name); break;
+   case DIE_ON_ERR: die(str, updates[i].ref_name); break;
+   case QUIET_ON_ERR: break;
+   }
+   return 1;
+   }
+   return 0;
+}
+
+int update_refs(const char *action, const struct ref_update *updates_orig,
+   int n, enum action_on_err onerr)
+{
+   int ret = 0, delnum = 0, i;
+   struct ref_update *updates;
+   int *types;
+   struct ref_lock **locks;
+   const char **delnames;
+
+   if (!updates_orig || !n)
+   return 0;
+
+   /* Allocate work space: */
+   updates = xmalloc(sizeof(struct ref_update) * n);
+   types = xmalloc(sizeof(int) * n);
+   locks = xmalloc(sizeof(struct ref_lock *) * n);
+   delnames = xmalloc(sizeof(const char *) * n);
+
+   /* Copy, sort, and reject duplicate refs: */
+   memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
+   qsort(updates, n, sizeof(struct ref_update), ref_update_compare);
+   if (ref_update_reject_duplicates(updates, n, onerr)) {
+   free(updates);
+   free(types);
+   free(locks);
+   free(delnames);
+   return 1;
+   }
+
+   /* Acquire all locks while verifying old values: */
+   for (i = 0; i  n; ++i) {
+   locks[i] = update_ref_lock(updates[i].ref_name,
+  (updates[i].have_old ?
+   updates[i].old_sha1 : NULL),
+  updates[i].flags,
+  types[i], onerr);
+   if (!locks[i])
+   break;
+   }
+
+   /* Abort if we did not get all locks: */
+   if (i  n) {
+   while (--i = 0)
+   unlock_ref(locks[i]);
+   free(updates);
+   free(types);
+   free(locks);
+   free(delnames);
+   return 1;
+   }
+
+   /* Perform updates first so live commits remain referenced: */
+   for (i = 0; i  n; ++i)
+   if (!is_null_sha1(updates[i].new_sha1)) {
+   ret |= update_ref_write(action,
+   updates[i].ref_name,
+   updates[i].new_sha1,
+   locks[i], onerr);
+   locks[i] = 0; /* freed by