Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-03 Thread Brad King
On 09/03/2013 12:43 AM, Michael Haggerty wrote:
> Hmmm, I see that you changed the signature of update_refs() to take an
> array of pointers.  My suggestion was unclear, but I didn't mean that
> the function signature had to be changed.
[snip]
> However, your approach is also fine.

Okay.  Thanks for reviewing!

-Brad
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Michael Haggerty
On 09/02/2013 07:20 PM, Brad King wrote:
> On 09/01/2013 02:08 AM, Junio C Hamano wrote:
>>> Though the refs themeselves cannot be modified together in a single
>>
>> "themselves".
> 
> Fixed.
> 
>> I notice that we are using an array of structures and letting qsort
>> swap 50~64 bytes of data
> 
> Michael suggested this too, so fixed.

Hmmm, I see that you changed the signature of update_refs() to take an
array of pointers.  My suggestion was unclear, but I didn't mean that
the function signature had to be changed.  Rather, I meant that *within*
the function, you could have created an array of pointers to the
structures in the input array and thereafter accessed it via the pointers:

int update_refs(const char *action, const struct ref_update *updates_orig,
int n, enum action_on_err onerr)
{
[...]
struct ref_update **updates;

[...]
updates = xcalloc(n, sizeof(*updates));
for (i = 0; i < n; i++)
updates[i] = &updates_orig[i];
[...]
}

However, your approach is also fine.  It will typically involve more
malloc()s but smaller memcpy()s (i.e., via ALLOC_GROW()) at the caller,
and since usually the number of ref_updates being done at one time will
be limited anyway, I don't see a reason to prefer one version over the
other.

Thanks for making the change.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Brad King
On 08/31/2013 02:19 PM, Michael Haggerty wrote:
> s/themeselves/themselves/

Fixed.

>> +struct ref_update *u1 = (struct ref_update *)(r1);
>> +struct ref_update *u2 = (struct ref_update *)(r2);
> 
> If you declare u1 and u2 to be "const struct ref_update *" (i.e., add
> "const"), then you have const correctness and don't need the explicit
> casts.  (And the parentheses around r1 and r2 are superfluous in any case.)

Fixed.

>> +ret = strcmp(u1->ref_name, u2->ref_name);
> 
> Is there a need to compare more than ref_name?  If two entries are found
> with the same name, then ref_update_reject_duplicates() will error out

Junio mentioned possibility of auto-combining identical entries which would
need full ordering.  I think that can be added later so for now we can sort
only by ref name.  Thanks.

>> +if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
>> +break;
> 
> The error handling code could be right here instead of the "break"
> statement, removing the need for the "if" conditional.

Fixed.

>> +/* Allocate work space: */
>> +updates = xmalloc(sizeof(struct ref_update) * n);
> 
> It seems preferred here to write
> 
>   updates = xmalloc(sizeof(*updates) * n);
> 
> as this will continue to work if the type of updates is ever changed.

Yes, thanks.

> Similarly for the next lines.
> 
>> +types = xmalloc(sizeof(int) * n);
>> +locks = xmalloc(sizeof(struct ref_lock *) * n);
>> +delnames = xmalloc(sizeof(const char *) * n);
> 
> An alternative to managing separate arrays to hold types and locks would
> be to include the scratch space in struct ref_update and document it
> "for internal use only; need not be initialized by caller".  On the one
> hand it's ugly to cruft up the "interface" with internal implementation
> details; on the other hand there is precedent for this sort of thing
> (e.g., ref_lock::force_write or lock_file::on_list) and it would
> simplify the code.

I think the "goto cleanup" reorganization simplifies the code enough
to not need this.  After changing "updates" to an array of pointers
it needs to be separate so we can sort.  Also "delnames" needs to be
a separate array to pass to repack_without_refs.

>> +/* Copy, sort, and reject duplicate refs: */
>> +memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
>> +qsort(updates, n, sizeof(struct ref_update), ref_update_compare);
> 
> You could save some space and memory shuffling (during memcpy() and
> qsort()) if you would declare "updates" to be an array of pointers to
> "struct ref_update" rather than an array of structs.  Sorting could then
> be done by moving pointers around instead of moving the structs.  This
> would also make it easier for update_refs() to pass information about
> the references back to its caller, should that ever be needed.

Good idea.  Changed in next revision.

>> +ret |= update_ref_write(action,
>> +updates[i].ref_name,
>> +updates[i].new_sha1,
>> +locks[i], onerr);
>> +locks[i] = 0; /* freed by update_ref_write */
>> +}
>> +
> 
> Hmmm, if one of the calls to update_ref_write() fails, would it be safer
> to abort the rest of the work (especially the reference deletions)?

Yes.  Since we already have the lock at this point something must be
going pretty wrong if this fails so it is best to abort altogether.

>> +free(updates);
>> +free(types);
>> +free(locks);
>> +free(delnames);
>> +return ret;
>> +}
> 
> There's a lot of duplicated cleanup code in the function.  If you put a
> label before the final for loop, and if you initialize the locks array
> to zeros (e.g., by using xcalloc()), then the three exits could all
> share the same code "ret = 1; goto cleanup;".

Done, thanks.

>> +struct ref_update {
> 
> Please document this structure, especially the relationship between
> have_old and old_sha1.

Done.  I also moved it to the top of the header just under ref_lock
so it can be used by other APIs later.

Thanks,
-Brad
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-09-02 Thread Brad King
On 09/01/2013 02:08 AM, Junio C Hamano wrote:
>> Though the refs themeselves cannot be modified together in a single
> 
> "themselves".

Fixed.

> I notice that we are using an array of structures and letting qsort
> swap 50~64 bytes of data

Michael suggested this too, so fixed.

> Optionally we could silently dedup multiple identical updates and
> not fail it in ref-update-reject-duplicates.  But that does not have
> to be done until we find people's script would benefit from such a
> nicety.

We can always be less strict about input later, so I'd like to keep
the implementation simpler for now.

> By the way, unless there is a strong reason not to do so,
> post-increment "i++" (and pre-decrement "--i", if you use it) is the
> norm around here.

Okay, fixed in entire series.

>> +locks[i] = 0; /* freed by update_ref_write */
> 
> I think what is assigned here is a NULL pointer.

Yes, fixed.

Thanks,
-Brad
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-08-31 Thread Junio C Hamano
Brad King  writes:

> Add 'struct ref_update' to encode the information needed to update or
> delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
> function 'update_refs' accepting an array of updates to perform.  First
> sort the input array to order locks consistently everywhere and reject
> multiple updates to the same ref.  Then acquire locks on all refs with
> verified old values.  Then update or delete all refs accordingly.  Fail
> if any one lock cannot be obtained or any one old value does not match.

OK.  The code releases the locks it acquired so far when it fails,
which is good.

> Though the refs themeselves cannot be modified together in a single

"themselves".

> atomic transaction, this function does enable some useful semantics.
> For example, a caller may create a new branch starting from the head of
> another branch and rewind the original branch at the same time.  This
> transfers ownership of commits between branches without risk of losing
> commits added to the original branch by a concurrent process, or risk of
> a concurrent process creating the new branch first.

> +static int ref_update_compare(const void *r1, const void *r2)
> +{
> + struct ref_update *u1 = (struct ref_update *)(r1);
> + struct ref_update *u2 = (struct ref_update *)(r2);
> + int ret;

Let's have a blank line between the end of decls and the beginning
of the body here.

> + ret = strcmp(u1->ref_name, u2->ref_name);
> + if (ret)
> + return ret;
> + ret = hashcmp(u1->new_sha1, u2->new_sha1);
> + if (ret)
> + return ret;
> + ret = hashcmp(u1->old_sha1, u2->old_sha1);
> + if (ret)
> + return ret;
> + ret = u1->flags - u2->flags;
> + if (ret)
> + return ret;
> + return u1->have_old - u2->have_old;
> +}

I notice that we are using an array of structures and letting qsort
swap 50~64 bytes of data, instead of sorting an array of pointers,
each element of which points at a structure.  This may not matter
unless we are asked to update thousands at once, so I think it is OK
for now.

> +static int ref_update_reject_duplicates(struct ref_update *updates, int n,
> + enum action_on_err onerr)
> +{
> + int i;
> + for (i = 1; i < n; ++i)
> + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
> + break;

Optionally we could silently dedup multiple identical updates and
not fail it in ref-update-reject-duplicates.  But that does not have
to be done until we find people's script would benefit from such a
nicety.

By the way, unless there is a strong reason not to do so,
post-increment "i++" (and pre-decrement "--i", if you use it) is the
norm around here.  Especially in places like the third part of a
for(;;) loop where people are used to see "i++", breaking the idiom
makes readers wonder if there is something else going on.

> + /* Perform updates first so live commits remain referenced: */
> + for (i = 0; i < n; ++i)
> + if (!is_null_sha1(updates[i].new_sha1)) {
> + ret |= update_ref_write(action,
> + updates[i].ref_name,
> + updates[i].new_sha1,
> + locks[i], onerr);
> + locks[i] = 0; /* freed by update_ref_write */

I think what is assigned here is a NULL pointer.

Will locally tweak while queuing.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-08-31 Thread Michael Haggerty
On 08/30/2013 08:12 PM, Brad King wrote:
> Add 'struct ref_update' to encode the information needed to update or
> delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
> function 'update_refs' accepting an array of updates to perform.  First
> sort the input array to order locks consistently everywhere and reject
> multiple updates to the same ref.  Then acquire locks on all refs with
> verified old values.  Then update or delete all refs accordingly.  Fail
> if any one lock cannot be obtained or any one old value does not match.
> 
> Though the refs themeselves cannot be modified together in a single

s/themeselves/themselves/

> atomic transaction, this function does enable some useful semantics.
> For example, a caller may create a new branch starting from the head of
> another branch and rewind the original branch at the same time.  This
> transfers ownership of commits between branches without risk of losing
> commits added to the original branch by a concurrent process, or risk of
> a concurrent process creating the new branch first.
> 
> Signed-off-by: Brad King 
> ---
>  refs.c |  121 
> 
>  refs.h |   14 
>  2 files changed, 135 insertions(+)
> 
> diff --git a/refs.c b/refs.c
> index 3bcd26e..901a38a 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char 
> *refname,
>   return update_ref_write(action, refname, sha1, lock, onerr);
>  }
>  
> +static int ref_update_compare(const void *r1, const void *r2)
> +{
> + struct ref_update *u1 = (struct ref_update *)(r1);
> + struct ref_update *u2 = (struct ref_update *)(r2);

If you declare u1 and u2 to be "const struct ref_update *" (i.e., add
"const"), then you have const correctness and don't need the explicit
casts.  (And the parentheses around r1 and r2 are superfluous in any case.)

> + int ret;

Style: we usually put a blank line between variable declarations and the
first line of code.

> + ret = strcmp(u1->ref_name, u2->ref_name);
> + if (ret)
> + return ret;
> + ret = hashcmp(u1->new_sha1, u2->new_sha1);
> + if (ret)
> + return ret;
> + ret = hashcmp(u1->old_sha1, u2->old_sha1);
> + if (ret)
> + return ret;
> + ret = u1->flags - u2->flags;
> + if (ret)
> + return ret;
> + return u1->have_old - u2->have_old;
> +}

Is there a need to compare more than ref_name?  If two entries are found
with the same name, then ref_update_reject_duplicates() will error out
anyway.  So the relative order among entries with the same name is
irrelevant.  I think it would be OK to return 0 for any entries with the
same ref_name, even if they differ in other fields.

> +
> +static int ref_update_reject_duplicates(struct ref_update *updates, int n,
> + enum action_on_err onerr)
> +{
> + int i;
> + for (i = 1; i < n; ++i)
> + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
> + break;

The error handling code could be right here instead of the "break"
statement, removing the need for the "if" conditional.

> + if (i < n) {
> + const char *str = "Multiple updates for ref '%s' not allowed.";
> + switch (onerr) {
> + case MSG_ON_ERR: error(str, updates[i].ref_name); break;
> + case DIE_ON_ERR: die(str, updates[i].ref_name); break;
> + case QUIET_ON_ERR: break;
> + }
> + return 1;
> + }
> + return 0;
> +}
> +
> +int update_refs(const char *action, const struct ref_update *updates_orig,
> + int n, enum action_on_err onerr)
> +{
> + int ret = 0, delnum = 0, i;
> + struct ref_update *updates;
> + int *types;
> + struct ref_lock **locks;
> + const char **delnames;
> +
> + if (!updates_orig || !n)
> + return 0;
> +
> + /* Allocate work space: */
> + updates = xmalloc(sizeof(struct ref_update) * n);

It seems preferred here to write

updates = xmalloc(sizeof(*updates) * n);

as this will continue to work if the type of updates is ever changed.
Similarly for the next lines.

> + types = xmalloc(sizeof(int) * n);
> + locks = xmalloc(sizeof(struct ref_lock *) * n);
> + delnames = xmalloc(sizeof(const char *) * n);

An alternative to managing separate arrays to hold types and locks would
be to include the scratch space in struct ref_update and document it
"for internal use only; need not be initialized by caller".  On the one
hand it's ugly to cruft up the "interface" with internal implementation
details; on the other hand there is precedent for this sort of thing
(e.g., ref_lock::force_write or lock_file::on_list) and it would
simplify the code.

> +
> + /* Copy, sort, and reject duplicate refs: */
> + memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
> + qsort(updates,

[PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates

2013-08-30 Thread Brad King
Add 'struct ref_update' to encode the information needed to update or
delete a ref (name, new sha1, optional old sha1, no-deref flag).  Add
function 'update_refs' accepting an array of updates to perform.  First
sort the input array to order locks consistently everywhere and reject
multiple updates to the same ref.  Then acquire locks on all refs with
verified old values.  Then update or delete all refs accordingly.  Fail
if any one lock cannot be obtained or any one old value does not match.

Though the refs themeselves cannot be modified together in a single
atomic transaction, this function does enable some useful semantics.
For example, a caller may create a new branch starting from the head of
another branch and rewind the original branch at the same time.  This
transfers ownership of commits between branches without risk of losing
commits added to the original branch by a concurrent process, or risk of
a concurrent process creating the new branch first.

Signed-off-by: Brad King 
---
 refs.c |  121 
 refs.h |   14 
 2 files changed, 135 insertions(+)

diff --git a/refs.c b/refs.c
index 3bcd26e..901a38a 100644
--- a/refs.c
+++ b/refs.c
@@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char *refname,
return update_ref_write(action, refname, sha1, lock, onerr);
 }
 
+static int ref_update_compare(const void *r1, const void *r2)
+{
+   struct ref_update *u1 = (struct ref_update *)(r1);
+   struct ref_update *u2 = (struct ref_update *)(r2);
+   int ret;
+   ret = strcmp(u1->ref_name, u2->ref_name);
+   if (ret)
+   return ret;
+   ret = hashcmp(u1->new_sha1, u2->new_sha1);
+   if (ret)
+   return ret;
+   ret = hashcmp(u1->old_sha1, u2->old_sha1);
+   if (ret)
+   return ret;
+   ret = u1->flags - u2->flags;
+   if (ret)
+   return ret;
+   return u1->have_old - u2->have_old;
+}
+
+static int ref_update_reject_duplicates(struct ref_update *updates, int n,
+   enum action_on_err onerr)
+{
+   int i;
+   for (i = 1; i < n; ++i)
+   if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name))
+   break;
+   if (i < n) {
+   const char *str = "Multiple updates for ref '%s' not allowed.";
+   switch (onerr) {
+   case MSG_ON_ERR: error(str, updates[i].ref_name); break;
+   case DIE_ON_ERR: die(str, updates[i].ref_name); break;
+   case QUIET_ON_ERR: break;
+   }
+   return 1;
+   }
+   return 0;
+}
+
+int update_refs(const char *action, const struct ref_update *updates_orig,
+   int n, enum action_on_err onerr)
+{
+   int ret = 0, delnum = 0, i;
+   struct ref_update *updates;
+   int *types;
+   struct ref_lock **locks;
+   const char **delnames;
+
+   if (!updates_orig || !n)
+   return 0;
+
+   /* Allocate work space: */
+   updates = xmalloc(sizeof(struct ref_update) * n);
+   types = xmalloc(sizeof(int) * n);
+   locks = xmalloc(sizeof(struct ref_lock *) * n);
+   delnames = xmalloc(sizeof(const char *) * n);
+
+   /* Copy, sort, and reject duplicate refs: */
+   memcpy(updates, updates_orig, sizeof(struct ref_update) * n);
+   qsort(updates, n, sizeof(struct ref_update), ref_update_compare);
+   if (ref_update_reject_duplicates(updates, n, onerr)) {
+   free(updates);
+   free(types);
+   free(locks);
+   free(delnames);
+   return 1;
+   }
+
+   /* Acquire all locks while verifying old values: */
+   for (i = 0; i < n; ++i) {
+   locks[i] = update_ref_lock(updates[i].ref_name,
+  (updates[i].have_old ?
+   updates[i].old_sha1 : NULL),
+  updates[i].flags,
+  &types[i], onerr);
+   if (!locks[i])
+   break;
+   }
+
+   /* Abort if we did not get all locks: */
+   if (i < n) {
+   while (--i >= 0)
+   unlock_ref(locks[i]);
+   free(updates);
+   free(types);
+   free(locks);
+   free(delnames);
+   return 1;
+   }
+
+   /* Perform updates first so live commits remain referenced: */
+   for (i = 0; i < n; ++i)
+   if (!is_null_sha1(updates[i].new_sha1)) {
+   ret |= update_ref_write(action,
+   updates[i].ref_name,
+   updates[i].new_sha1,
+   locks[i], onerr);
+   locks[i] = 0; /* freed by update_ref_wri