Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
On 09/03/2013 12:43 AM, Michael Haggerty wrote: > Hmmm, I see that you changed the signature of update_refs() to take an > array of pointers. My suggestion was unclear, but I didn't mean that > the function signature had to be changed. [snip] > However, your approach is also fine. Okay. Thanks for reviewing! -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
On 09/02/2013 07:20 PM, Brad King wrote: > On 09/01/2013 02:08 AM, Junio C Hamano wrote: >>> Though the refs themeselves cannot be modified together in a single >> >> "themselves". > > Fixed. > >> I notice that we are using an array of structures and letting qsort >> swap 50~64 bytes of data > > Michael suggested this too, so fixed. Hmmm, I see that you changed the signature of update_refs() to take an array of pointers. My suggestion was unclear, but I didn't mean that the function signature had to be changed. Rather, I meant that *within* the function, you could have created an array of pointers to the structures in the input array and thereafter accessed it via the pointers: int update_refs(const char *action, const struct ref_update *updates_orig, int n, enum action_on_err onerr) { [...] struct ref_update **updates; [...] updates = xcalloc(n, sizeof(*updates)); for (i = 0; i < n; i++) updates[i] = &updates_orig[i]; [...] } However, your approach is also fine. It will typically involve more malloc()s but smaller memcpy()s (i.e., via ALLOC_GROW()) at the caller, and since usually the number of ref_updates being done at one time will be limited anyway, I don't see a reason to prefer one version over the other. Thanks for making the change. Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
On 08/31/2013 02:19 PM, Michael Haggerty wrote: > s/themeselves/themselves/ Fixed. >> +struct ref_update *u1 = (struct ref_update *)(r1); >> +struct ref_update *u2 = (struct ref_update *)(r2); > > If you declare u1 and u2 to be "const struct ref_update *" (i.e., add > "const"), then you have const correctness and don't need the explicit > casts. (And the parentheses around r1 and r2 are superfluous in any case.) Fixed. >> +ret = strcmp(u1->ref_name, u2->ref_name); > > Is there a need to compare more than ref_name? If two entries are found > with the same name, then ref_update_reject_duplicates() will error out Junio mentioned possibility of auto-combining identical entries which would need full ordering. I think that can be added later so for now we can sort only by ref name. Thanks. >> +if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name)) >> +break; > > The error handling code could be right here instead of the "break" > statement, removing the need for the "if" conditional. Fixed. >> +/* Allocate work space: */ >> +updates = xmalloc(sizeof(struct ref_update) * n); > > It seems preferred here to write > > updates = xmalloc(sizeof(*updates) * n); > > as this will continue to work if the type of updates is ever changed. Yes, thanks. > Similarly for the next lines. > >> +types = xmalloc(sizeof(int) * n); >> +locks = xmalloc(sizeof(struct ref_lock *) * n); >> +delnames = xmalloc(sizeof(const char *) * n); > > An alternative to managing separate arrays to hold types and locks would > be to include the scratch space in struct ref_update and document it > "for internal use only; need not be initialized by caller". On the one > hand it's ugly to cruft up the "interface" with internal implementation > details; on the other hand there is precedent for this sort of thing > (e.g., ref_lock::force_write or lock_file::on_list) and it would > simplify the code. I think the "goto cleanup" reorganization simplifies the code enough to not need this. After changing "updates" to an array of pointers it needs to be separate so we can sort. Also "delnames" needs to be a separate array to pass to repack_without_refs. >> +/* Copy, sort, and reject duplicate refs: */ >> +memcpy(updates, updates_orig, sizeof(struct ref_update) * n); >> +qsort(updates, n, sizeof(struct ref_update), ref_update_compare); > > You could save some space and memory shuffling (during memcpy() and > qsort()) if you would declare "updates" to be an array of pointers to > "struct ref_update" rather than an array of structs. Sorting could then > be done by moving pointers around instead of moving the structs. This > would also make it easier for update_refs() to pass information about > the references back to its caller, should that ever be needed. Good idea. Changed in next revision. >> +ret |= update_ref_write(action, >> +updates[i].ref_name, >> +updates[i].new_sha1, >> +locks[i], onerr); >> +locks[i] = 0; /* freed by update_ref_write */ >> +} >> + > > Hmmm, if one of the calls to update_ref_write() fails, would it be safer > to abort the rest of the work (especially the reference deletions)? Yes. Since we already have the lock at this point something must be going pretty wrong if this fails so it is best to abort altogether. >> +free(updates); >> +free(types); >> +free(locks); >> +free(delnames); >> +return ret; >> +} > > There's a lot of duplicated cleanup code in the function. If you put a > label before the final for loop, and if you initialize the locks array > to zeros (e.g., by using xcalloc()), then the three exits could all > share the same code "ret = 1; goto cleanup;". Done, thanks. >> +struct ref_update { > > Please document this structure, especially the relationship between > have_old and old_sha1. Done. I also moved it to the top of the header just under ref_lock so it can be used by other APIs later. Thanks, -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
On 09/01/2013 02:08 AM, Junio C Hamano wrote: >> Though the refs themeselves cannot be modified together in a single > > "themselves". Fixed. > I notice that we are using an array of structures and letting qsort > swap 50~64 bytes of data Michael suggested this too, so fixed. > Optionally we could silently dedup multiple identical updates and > not fail it in ref-update-reject-duplicates. But that does not have > to be done until we find people's script would benefit from such a > nicety. We can always be less strict about input later, so I'd like to keep the implementation simpler for now. > By the way, unless there is a strong reason not to do so, > post-increment "i++" (and pre-decrement "--i", if you use it) is the > norm around here. Okay, fixed in entire series. >> +locks[i] = 0; /* freed by update_ref_write */ > > I think what is assigned here is a NULL pointer. Yes, fixed. Thanks, -Brad -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
Brad King writes: > Add 'struct ref_update' to encode the information needed to update or > delete a ref (name, new sha1, optional old sha1, no-deref flag). Add > function 'update_refs' accepting an array of updates to perform. First > sort the input array to order locks consistently everywhere and reject > multiple updates to the same ref. Then acquire locks on all refs with > verified old values. Then update or delete all refs accordingly. Fail > if any one lock cannot be obtained or any one old value does not match. OK. The code releases the locks it acquired so far when it fails, which is good. > Though the refs themeselves cannot be modified together in a single "themselves". > atomic transaction, this function does enable some useful semantics. > For example, a caller may create a new branch starting from the head of > another branch and rewind the original branch at the same time. This > transfers ownership of commits between branches without risk of losing > commits added to the original branch by a concurrent process, or risk of > a concurrent process creating the new branch first. > +static int ref_update_compare(const void *r1, const void *r2) > +{ > + struct ref_update *u1 = (struct ref_update *)(r1); > + struct ref_update *u2 = (struct ref_update *)(r2); > + int ret; Let's have a blank line between the end of decls and the beginning of the body here. > + ret = strcmp(u1->ref_name, u2->ref_name); > + if (ret) > + return ret; > + ret = hashcmp(u1->new_sha1, u2->new_sha1); > + if (ret) > + return ret; > + ret = hashcmp(u1->old_sha1, u2->old_sha1); > + if (ret) > + return ret; > + ret = u1->flags - u2->flags; > + if (ret) > + return ret; > + return u1->have_old - u2->have_old; > +} I notice that we are using an array of structures and letting qsort swap 50~64 bytes of data, instead of sorting an array of pointers, each element of which points at a structure. This may not matter unless we are asked to update thousands at once, so I think it is OK for now. > +static int ref_update_reject_duplicates(struct ref_update *updates, int n, > + enum action_on_err onerr) > +{ > + int i; > + for (i = 1; i < n; ++i) > + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name)) > + break; Optionally we could silently dedup multiple identical updates and not fail it in ref-update-reject-duplicates. But that does not have to be done until we find people's script would benefit from such a nicety. By the way, unless there is a strong reason not to do so, post-increment "i++" (and pre-decrement "--i", if you use it) is the norm around here. Especially in places like the third part of a for(;;) loop where people are used to see "i++", breaking the idiom makes readers wonder if there is something else going on. > + /* Perform updates first so live commits remain referenced: */ > + for (i = 0; i < n; ++i) > + if (!is_null_sha1(updates[i].new_sha1)) { > + ret |= update_ref_write(action, > + updates[i].ref_name, > + updates[i].new_sha1, > + locks[i], onerr); > + locks[i] = 0; /* freed by update_ref_write */ I think what is assigned here is a NULL pointer. Will locally tweak while queuing. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
On 08/30/2013 08:12 PM, Brad King wrote: > Add 'struct ref_update' to encode the information needed to update or > delete a ref (name, new sha1, optional old sha1, no-deref flag). Add > function 'update_refs' accepting an array of updates to perform. First > sort the input array to order locks consistently everywhere and reject > multiple updates to the same ref. Then acquire locks on all refs with > verified old values. Then update or delete all refs accordingly. Fail > if any one lock cannot be obtained or any one old value does not match. > > Though the refs themeselves cannot be modified together in a single s/themeselves/themselves/ > atomic transaction, this function does enable some useful semantics. > For example, a caller may create a new branch starting from the head of > another branch and rewind the original branch at the same time. This > transfers ownership of commits between branches without risk of losing > commits added to the original branch by a concurrent process, or risk of > a concurrent process creating the new branch first. > > Signed-off-by: Brad King > --- > refs.c | 121 > > refs.h | 14 > 2 files changed, 135 insertions(+) > > diff --git a/refs.c b/refs.c > index 3bcd26e..901a38a 100644 > --- a/refs.c > +++ b/refs.c > @@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char > *refname, > return update_ref_write(action, refname, sha1, lock, onerr); > } > > +static int ref_update_compare(const void *r1, const void *r2) > +{ > + struct ref_update *u1 = (struct ref_update *)(r1); > + struct ref_update *u2 = (struct ref_update *)(r2); If you declare u1 and u2 to be "const struct ref_update *" (i.e., add "const"), then you have const correctness and don't need the explicit casts. (And the parentheses around r1 and r2 are superfluous in any case.) > + int ret; Style: we usually put a blank line between variable declarations and the first line of code. > + ret = strcmp(u1->ref_name, u2->ref_name); > + if (ret) > + return ret; > + ret = hashcmp(u1->new_sha1, u2->new_sha1); > + if (ret) > + return ret; > + ret = hashcmp(u1->old_sha1, u2->old_sha1); > + if (ret) > + return ret; > + ret = u1->flags - u2->flags; > + if (ret) > + return ret; > + return u1->have_old - u2->have_old; > +} Is there a need to compare more than ref_name? If two entries are found with the same name, then ref_update_reject_duplicates() will error out anyway. So the relative order among entries with the same name is irrelevant. I think it would be OK to return 0 for any entries with the same ref_name, even if they differ in other fields. > + > +static int ref_update_reject_duplicates(struct ref_update *updates, int n, > + enum action_on_err onerr) > +{ > + int i; > + for (i = 1; i < n; ++i) > + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name)) > + break; The error handling code could be right here instead of the "break" statement, removing the need for the "if" conditional. > + if (i < n) { > + const char *str = "Multiple updates for ref '%s' not allowed."; > + switch (onerr) { > + case MSG_ON_ERR: error(str, updates[i].ref_name); break; > + case DIE_ON_ERR: die(str, updates[i].ref_name); break; > + case QUIET_ON_ERR: break; > + } > + return 1; > + } > + return 0; > +} > + > +int update_refs(const char *action, const struct ref_update *updates_orig, > + int n, enum action_on_err onerr) > +{ > + int ret = 0, delnum = 0, i; > + struct ref_update *updates; > + int *types; > + struct ref_lock **locks; > + const char **delnames; > + > + if (!updates_orig || !n) > + return 0; > + > + /* Allocate work space: */ > + updates = xmalloc(sizeof(struct ref_update) * n); It seems preferred here to write updates = xmalloc(sizeof(*updates) * n); as this will continue to work if the type of updates is ever changed. Similarly for the next lines. > + types = xmalloc(sizeof(int) * n); > + locks = xmalloc(sizeof(struct ref_lock *) * n); > + delnames = xmalloc(sizeof(const char *) * n); An alternative to managing separate arrays to hold types and locks would be to include the scratch space in struct ref_update and document it "for internal use only; need not be initialized by caller". On the one hand it's ugly to cruft up the "interface" with internal implementation details; on the other hand there is precedent for this sort of thing (e.g., ref_lock::force_write or lock_file::on_list) and it would simplify the code. > + > + /* Copy, sort, and reject duplicate refs: */ > + memcpy(updates, updates_orig, sizeof(struct ref_update) * n); > + qsort(updates,
[PATCH v2 6/8] refs: add update_refs for multiple simultaneous updates
Add 'struct ref_update' to encode the information needed to update or delete a ref (name, new sha1, optional old sha1, no-deref flag). Add function 'update_refs' accepting an array of updates to perform. First sort the input array to order locks consistently everywhere and reject multiple updates to the same ref. Then acquire locks on all refs with verified old values. Then update or delete all refs accordingly. Fail if any one lock cannot be obtained or any one old value does not match. Though the refs themeselves cannot be modified together in a single atomic transaction, this function does enable some useful semantics. For example, a caller may create a new branch starting from the head of another branch and rewind the original branch at the same time. This transfers ownership of commits between branches without risk of losing commits added to the original branch by a concurrent process, or risk of a concurrent process creating the new branch first. Signed-off-by: Brad King --- refs.c | 121 refs.h | 14 2 files changed, 135 insertions(+) diff --git a/refs.c b/refs.c index 3bcd26e..901a38a 100644 --- a/refs.c +++ b/refs.c @@ -3238,6 +3238,127 @@ int update_ref(const char *action, const char *refname, return update_ref_write(action, refname, sha1, lock, onerr); } +static int ref_update_compare(const void *r1, const void *r2) +{ + struct ref_update *u1 = (struct ref_update *)(r1); + struct ref_update *u2 = (struct ref_update *)(r2); + int ret; + ret = strcmp(u1->ref_name, u2->ref_name); + if (ret) + return ret; + ret = hashcmp(u1->new_sha1, u2->new_sha1); + if (ret) + return ret; + ret = hashcmp(u1->old_sha1, u2->old_sha1); + if (ret) + return ret; + ret = u1->flags - u2->flags; + if (ret) + return ret; + return u1->have_old - u2->have_old; +} + +static int ref_update_reject_duplicates(struct ref_update *updates, int n, + enum action_on_err onerr) +{ + int i; + for (i = 1; i < n; ++i) + if (!strcmp(updates[i - 1].ref_name, updates[i].ref_name)) + break; + if (i < n) { + const char *str = "Multiple updates for ref '%s' not allowed."; + switch (onerr) { + case MSG_ON_ERR: error(str, updates[i].ref_name); break; + case DIE_ON_ERR: die(str, updates[i].ref_name); break; + case QUIET_ON_ERR: break; + } + return 1; + } + return 0; +} + +int update_refs(const char *action, const struct ref_update *updates_orig, + int n, enum action_on_err onerr) +{ + int ret = 0, delnum = 0, i; + struct ref_update *updates; + int *types; + struct ref_lock **locks; + const char **delnames; + + if (!updates_orig || !n) + return 0; + + /* Allocate work space: */ + updates = xmalloc(sizeof(struct ref_update) * n); + types = xmalloc(sizeof(int) * n); + locks = xmalloc(sizeof(struct ref_lock *) * n); + delnames = xmalloc(sizeof(const char *) * n); + + /* Copy, sort, and reject duplicate refs: */ + memcpy(updates, updates_orig, sizeof(struct ref_update) * n); + qsort(updates, n, sizeof(struct ref_update), ref_update_compare); + if (ref_update_reject_duplicates(updates, n, onerr)) { + free(updates); + free(types); + free(locks); + free(delnames); + return 1; + } + + /* Acquire all locks while verifying old values: */ + for (i = 0; i < n; ++i) { + locks[i] = update_ref_lock(updates[i].ref_name, + (updates[i].have_old ? + updates[i].old_sha1 : NULL), + updates[i].flags, + &types[i], onerr); + if (!locks[i]) + break; + } + + /* Abort if we did not get all locks: */ + if (i < n) { + while (--i >= 0) + unlock_ref(locks[i]); + free(updates); + free(types); + free(locks); + free(delnames); + return 1; + } + + /* Perform updates first so live commits remain referenced: */ + for (i = 0; i < n; ++i) + if (!is_null_sha1(updates[i].new_sha1)) { + ret |= update_ref_write(action, + updates[i].ref_name, + updates[i].new_sha1, + locks[i], onerr); + locks[i] = 0; /* freed by update_ref_wri