Re: [RFC/PATCH] notes-util.c: replace git_config with git_config_get_string
Tanay Abhra tanay...@gmail.com writes: On 6/30/2014 7:04 PM, Karsten Blees wrote: Am 29.06.2014 13:01, schrieb Eric Sunshine: On Thu, Jun 26, 2014 at 4:19 AM, Tanay Abhra tanay...@gmail.com wrote: On 6/25/2014 1:24 PM, Eric Sunshine wrote: On Mon, Jun 23, 2014 at 6:41 AM, Tanay Abhra tanay...@gmail.com wrote: Use git_config_get_string instead of git_config to take advantage of the config hash-table api which provides a cleaner control flow. Signed-off-by: Tanay Abhra tanay...@gmail.com --- notes-utils.c | 31 +++ 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/notes-utils.c b/notes-utils.c index a0b1d7b..fdc9912 100644 --- a/notes-utils.c +++ b/notes-utils.c @@ -68,22 +68,23 @@ static combine_notes_fn parse_combine_notes_fn(const char *v) return NULL; } -static int notes_rewrite_config(const char *k, const char *v, void *cb) +static void notes_rewrite_config(struct notes_rewrite_cfg *c) { - struct notes_rewrite_cfg *c = cb; - if (starts_with(k, notes.rewrite.) !strcmp(k+14, c-cmd)) { - c-enabled = git_config_bool(k, v); - return 0; - } else if (!c-mode_from_env !strcmp(k, notes.rewritemode)) { + struct strbuf key = STRBUF_INIT; + const char *v; + strbuf_addf(key, notes.rewrite.%s, c-cmd); + + if (!git_config_get_string(key.buf, v)) + c-enabled = git_config_bool(key.buf, v); + + if (!c-mode_from_env !git_config_get_string(notes.rewritemode, v)) { if (!v) - return config_error_nonbool(k); + config_error_nonbool(notes.rewritemode); There's a behavior change here. In the original code, the callback function would return -1, which would cause the program to die() if the config.c:die_on_error flag was set. The new code merely emits an error. Is this change serious enough? Can I ignore it? IMO its better to Fail Fast than continue with some invalid config (which may lead to more severe errors such as data corruption / data loss). Noted but, what I am trying to do with the rewrite is emit an error and not set the value if the value found is a NULL. The only change is that program will not crash in this case and warn the user not set a NULL value for a non boolean key. This won't lead to severe errors as the value will not be set if found value is a NULL. The change probably makes sense, but as much as possible, keep refactoring patches and patches introducing a semantic change separate. It's much easier to review, and helps user digging history and finding one of the commits. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] wt-status: simplify building of summary limit argument
René Scharfe l@web.de writes: Signed-off-by: Rene Scharfe l@web.de --- wt-status.c | 4 +--- Both patches sound straightforward and good to me. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: move detection doesnt take filename into account
- Ursprungligt meddelande - Från: Elliot Wolk elliot.w...@gmail.com Till: git@vger.kernel.org Skickat: måndag, 30 jun 2014 8:38:18 Ämne: move detection doesnt take filename into account if you move two identical {e.g.: empty} files to two new locations in a single commit, the move detection picks them {seemingly?} arbitrarily. it should use a statistical algorithm to compare the filenames and pick a likely match. I think it does, but based on filename suffix. E.g. here is a rename of three empty files with a suffix. 3 files changed, 0 insertions(+), 0 deletions(-) rename 1.a = 2.a (100%) rename 1.b = 2.b (100%) rename 1.c = 2.c (100%) -- robin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix filter-branch to eliminate duplicate mapped parents
On Mon, Jun 30, 2014 at 10:20:27PM +0100, Charles Bailey wrote: From: Charles Bailey cbaile...@bloomberg.net This change ensure that duplicate parents are pruned before the parent filter and ensures that --prune-empty is idempotent, removing all empty non-merge commits in a singe pass. s/change ensure/change ensures/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v12 11/11] Documentation: add documentation for 'git interpret-trailers'
On Mon, Jun 30, 2014 at 1:57 PM, Jakub Narębski jna...@gmail.com wrote: Christian Couder wrote: + + +* Configure a 'sign' trailer with a command to automatically add a + 'Signed-off-by: ' with the author information only if there is no + 'Signed-off-by: ' already, and show how it works: ++ + +$ git config trailer.sign.key Signed-off-by: +$ git config trailer.sign.ifmissing add +$ git config trailer.sign.ifexists doNothing +$ git config trailer.sign.command 'echo $(git config user.name) $(git config user.email)' +$ git interpret-trailers EOF + EOF How to configure git-interpret-trailers command so that it follow current rules for DCO: * Signed-off-by: is always at bottom; we can have signoff+signoff+ack+signoff * Signed-off-by: can repeat itself with the same author; this denotes steps in coming up with current version of the patch. * but we shouldn't repeat the same signoff one after another Right now something like: signoff+signoff+ack+signoff is not supported. It could be, if someone implements more options for the trailer.token.where config variable. Right now the only options are after and before, but it could be possible to have end and start too. And maybe end should be the default instead of after. When I first worked on this series I was under the impression that people wanted to group all the trailers with the same token together. So we want to allow this: Signed-off-by: A U Thor aut...@example.com Signed-off-by: Joe R. Hacker j...@hacker.com Acked-by: D E Veloper develo...@example.com Signed-off-by: C O Mitter commit...@example.com but prevent this Signed-off-by: C O Mitter commit...@example.com Signed-off-by: C O Mitter commit...@example.com This can be prevented by using addIfDifferentNeighbor, for example like this: $ git config trailer.sign.ifexists addIfDifferentNeighbor So I think the full config for what you want would be something like: $ git config trailer.sign.key Signed-off-by: $ git config trailer.sign.ifmissing add $ git config trailer.sign.ifexists addIfDifferentNeighbor $ git config trailer.sign.where end $ git config trailer.sign.command 'echo $(git config user.name) $(git config user.email)' $ git config trailer.ack.key Acked-by: $ git config trailer.ack.ifmissing add $ git config trailer.ack.ifexists addIfDifferentNeighbor $ git config trailer.ack.where end Best, Christian. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: move detection doesnt take filename into account
interesting that it considers suffixes {only suffixes following periods?}. this is insufficient, in my opinion. with all other things being equal, it ought to find the closest match {using smith-waterman or some such algorithm}. as a real-world use case, i have a repository with empty files that mirrors the file structure of a directory containing large binary files. when i move a dir, it seems to select the files renamed at random. On 07/01/2014 05:16 AM, Robin Rosenberg wrote: - Ursprungligt meddelande - Från: Elliot Wolk elliot.w...@gmail.com Till: git@vger.kernel.org Skickat: måndag, 30 jun 2014 8:38:18 Ämne: move detection doesnt take filename into account if you move two identical {e.g.: empty} files to two new locations in a single commit, the move detection picks them {seemingly?} arbitrarily. it should use a statistical algorithm to compare the filenames and pick a likely match. I think it does, but based on filename suffix. E.g. here is a rename of three empty files with a suffix. 3 files changed, 0 insertions(+), 0 deletions(-) rename 1.a = 2.a (100%) rename 1.b = 2.b (100%) rename 1.c = 2.c (100%) -- robin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Jun 2014, #06; Thu, 26)
Johannes Sixt j...@kdbg.org writes: Am 27.06.2014 00:02, schrieb Junio C Hamano: Four mingw series are still in limbo--are they in good enough shape for Windows folks who wanted to upstream them? I've now tested the Unicode patches a bit, and I didn't notice a regression in my use-cases. The patches are good to go, IMHO. Thanks; let's merge them to 'next' and down to 'master' in a week or two, then. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: move detection doesnt take filename into account
Robin Rosenberg robin.rosenb...@dewire.com writes: I think it does, but based on filename suffix. E.g. here is a rename of three empty files with a suffix. 3 files changed, 0 insertions(+), 0 deletions(-) rename 1.a = 2.a (100%) rename 1.b = 2.b (100%) rename 1.c = 2.c (100%) This is not more than a chance. We tie-break rename source candidates that have the same content similarity score to a rename destination using name similarity, whose implementation has been diffcore-rename.c::basename_same(), which scores 1 if `basename $src` and `basename $dst` are the same and 0 otherwise, i.e. from 1.a to a/1.a is judged to be a better rename than from 1.a to a/2.a but otherwise there is nothing that favors rename from 1.a to 2.a over 1.a to 2.b. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Fix filter-branch to eliminate duplicate mapped parents
Charles Bailey char...@hashpling.org writes: I worked on this after discovering that --prune-empty often left some apparently empty commits that I was wasn't expecting it to leave and that running filter-branch --prune-empty in a loop would often do many passes where it was still pruning empty former merge commits. Good find. The test is a simple example of such a case. A non-ff merge of a commit that only changes a file that is to be pruned gets squashed into an empty non-merge commit that should be pruned. git-filter-branch.sh | 8 +++- t/t7003-filter-branch.sh | 11 +++ 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/git-filter-branch.sh b/git-filter-branch.sh index 86d6994..c5b82a8 100755 --- a/git-filter-branch.sh +++ b/git-filter-branch.sh @@ -332,7 +332,13 @@ while read commit parents; do parentstr= for parent in $parents; do for reparent in $(map $parent); do - parentstr=$parentstr -p $reparent + case $parentstr in + * -p $reparent*) + ;; + *) + parentstr=$parentstr -p $reparent + ;; + esac The case arms seem to be indented one level too deep; I'll fix it up locally when queuing, so no need to resend. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: move detection doesnt take filename into account
thanks for the info! then i suppose my bug is a petition to have name similarity instead use a different statistical matching algorithm. On 07/01/2014 10:57 AM, Junio C Hamano wrote: Robin Rosenberg robin.rosenb...@dewire.com writes: I think it does, but based on filename suffix. E.g. here is a rename of three empty files with a suffix. 3 files changed, 0 insertions(+), 0 deletions(-) rename 1.a = 2.a (100%) rename 1.b = 2.b (100%) rename 1.c = 2.c (100%) This is not more than a chance. We tie-break rename source candidates that have the same content similarity score to a rename destination using name similarity, whose implementation has been diffcore-rename.c::basename_same(), which scores 1 if `basename $src` and `basename $dst` are the same and 0 otherwise, i.e. from 1.a to a/1.a is judged to be a better rename than from 1.a to a/2.a but otherwise there is nothing that favors rename from 1.a to 2.a over 1.a to 2.b. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/8] paint_down_to_common: use prio_queue
Jeff King p...@peff.net writes: The downside is that our priority queue is not stable, which means that commits with the same timestamp may not come out in the order we put them in. You can see this in the test update in t6024. That test does a recursive merge across a set of commits that all have the same timestamp. For the virtual ancestor, the test currently ends up with blob like this: Temporary merge branch 1 Temporary merge branch 1 C === B Temporary merge branch 2 === A Temporary merge branch 2 but with this patch, the positions of B and A are swapped. This is probably fine, as the order is an internal implementation detail anyway (it would _not_ be fine if we were using a priority queue for git log traversal, which should show commits in parent order). Interesting that the queue is not stable, but the test can still rely on a fixed output. While I tend to agree that for the purpose of this code path, the order is an internal implementation detail, but I wonder if it would benefit us a lot if we taught prio-queue to be optionally more stable, which would allow us to use it in other code paths that care. If we really wanted to, I would imagine that we could keep the insertion counter in the elements of the queue to make the result stable (i.e. the void **array would become something like struct { int insertion_ctr; void *thing; } *array). I'm slightly hesitant because of the stability thing mentioned above. I _think_ it's probably fine. But we could also implement a stable_prio_queue on top of the existing prio_queue if we're concerned (and that may be something we want to do anyway, because git log would want that if it switched to a priority queue). Heh, I should have read the below-three-dashs commentary before commenting (I often start working from the commit messages in git log and then go back to the original thread). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Vote now for the New WSEAS President for 2014-2017 - WSEAS Presidency Elections for 2014-2017 --- See also mmctse.org
Vote now for the New WSEAS President for 2014-2017 - WSEAS Presidency Elections for 2014-2017 Past WSEAS Presidents http://www.wseas.org/cms.action?id=7692 - WSEAS Members (http://www.wseas.org/cms.action?id=7659 as of February 14, 2014) can vote from now and until July 10 for new President of the Society by sending email to: wseas-t...@wseas.org. The results will be announced on our page after July 10. The 4 candidates are: (click on each name to see their CV or visit: http://www.wseas.org/cms.action? id=7694) [ ] Prof. Dr. Ion Cocui [ ] Dr. Radoslav Bozov [ ] Prof. Nikos Mastorakis [ ] Assoc. Prof. Cornelia Aida Bulucea In order to vote, send your email to wseas-t...@wseas.org copy/pasting the above list and replacing [ ] with [+] for the candidate of your choice You can send only 1 vote and vote for only 1 candidate. Vote retraction/change is not accepted. Your WSEAS member ID number must be included in your email (all members have been sent a reminder email concerning their ID#) Reminder: During January and February 2014, all our members were sent the following email: If you want to be Candidate for the position of the President, you must send your CV to wseas-t...@wseas.org until March 10, 2014. As you know the World Scientific and Engineering Academy and Society (WSEAS) is a registered non-profit association with Registration Number 22324/15 March 1999 in Hellenic (Greek) Republic. Our Society has the following members (as of February 14, 2014):http://www.wseas.org/cms.action?id=7659 All members participate in electing a new president every three years. New elections have been announced to be held between July 1-10, 2014. If you want to be Candidate for the position of the President, you must send your CV to wseas-t...@wseas.org until March 10, 2014. The President of the Society is required to be physically present at the WSEAS Headquarters at least 3 days per week. Our members will receive instructions of how to vote by email. WSEAS is also a non-profit international research Academy, with several research projects having been carried out at WSEAS or in collaboration with WSEAS:http://www.wseas.org/cms.action?id=7660 Many Summer Schools and Seminars have also been organized by our Academy in the last 15 years: http://www.wseas.org/cms.action?id=7660 Several research collaborations have been successfully completed and through those research projects many students received their Ph.D. by our collaborating universities. For your voting you will need your ID Number Past WSEAS Presidents http://www.wseas.org/cms.action?id=7692 See also mmctse.org. Do you want to come as Invited Speaker or Plenary Speaker in www.mmctse.org ? - Should you want to unsubscribe, send an email to wseas-t...@wseas.org with Subject: BLOCK git@vger.kernel.org -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/8] add functions for memory-efficient bitmaps
Jeff King p...@peff.net writes: On Sun, Jun 29, 2014 at 03:41:37AM -0400, Eric Sunshine wrote: +static inline void bitset_set(unsigned char *bits, int n) +{ + bits[n / CHAR_BIT] |= 1 (n % CHAR_BIT); +} Is it intentional or an oversight that there is no way to clear a bit in the set? Intentional in the sense that I had no need for it in my series, and I didn't think about it. I doubt many callers would want it, since commit traversals tend to propagate bits through the graph, and then clean them up all at once. And the right way to clean up slabbed data like this is to just clear the slab. Of course somebody may use the code for something besides commit traversals. But I'd rather avoid adding dead code on the off chance that somebody uses it later (and then gets to find out whether it even works or not!). Another thing I noticed was that the definition of and the commentary on bitset_equal() and bitset_empty() sounded somewhat undecided. These functions take max that is deliberately named differently from num_bits (the width of the bitsets involved), inviting to use them for testing only earlier bits in the bitset as long as the caller understands the caveat, but the caveat requires that the partial bitset to test must be byte-aligned, which makes it not very useful in practice, which means we probably do not want them to be used for any max other than num_bits. They probably would want either: * be made to truly honor max num_bits case, by special casing the last byte that has max-th bit, to officially allow them to be used for partial bitset test; or * take num_bits, not max, to clarify that callers must use them only on the full bitset. In either case, there needs another item in the caller's responsibility list at the beginning of bitset.h: 4. Ensure that padding bits at the end of the bitset array are initialized to 0. In the description of bitset_sizeof(), the comment hints it by using xcalloc() in the example, but a careless user may be tempted to implement bitset_clr() and then do: int i; unsigned char *bits = malloc(bitset_sizeof(nr)); for (i = 0; i nr; i++) bitset_clr(bits, i); assert(bitset_empty(bits, nr)); and the implementation of bitset_empty(), even if we rename s/max/num_bits/, will choke if (nr % CHAR_BIT) and malloc() gave us non-zero bit in the padding. For the sake of simplicity, I am inclined to vote for not allowing their use on a partial-sub-bitset. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] git-merge-file: do not add LF at EOF while applying unrelated change
Johannes Schindelin johannes.schinde...@gmx.de writes: Hi Max, On Sun, 29 Jun 2014, Max Kirillov wrote: diff --git a/xdiff/xmerge.c b/xdiff/xmerge.c index 9e13b25..625198e 100644 --- a/xdiff/xmerge.c +++ b/xdiff/xmerge.c @@ -245,11 +245,11 @@ static int xdl_fill_merge_buffer(xdfenv_t *xe1, const char *name1, dest ? dest + size : NULL); /* Postimage from side #1 */ if (m-mode 1) -size += xdl_recs_copy(xe1, m-i1, m-chg1, 1, +size += xdl_recs_copy(xe1, m-i1, m-chg1, (m-mode 2), dest ? dest + size : NULL); /* Postimage from side #2 */ if (m-mode 2) -size += xdl_recs_copy(xe2, m-i2, m-chg2, 1, +size += xdl_recs_copy(xe2, m-i2, m-chg2, 0, dest ? dest + size : NULL); } else continue; Makes sense to me, especially with the nice explanation in the commit message. Having said that, here is my ACK for the current revision of the patch series ... Thanks, both. Queued. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: move detection doesnt take filename into account
Elliot Wolk elliot.w...@gmail.com writes: On 07/01/2014 10:57 AM, Junio C Hamano wrote: Robin Rosenberg robin.rosenb...@dewire.com writes: I think it does, but based on filename suffix. E.g. here is a rename of three empty files with a suffix. 3 files changed, 0 insertions(+), 0 deletions(-) rename 1.a = 2.a (100%) rename 1.b = 2.b (100%) rename 1.c = 2.c (100%) This is not more than a chance. We tie-break rename source candidates that have the same content similarity score to a rename destination using name similarity, whose implementation has been diffcore-rename.c::basename_same(), which scores 1 if `basename $src` and `basename $dst` are the same and 0 otherwise, i.e. from 1.a to a/1.a is judged to be a better rename than from 1.a to a/2.a but otherwise there is nothing that favors rename from 1.a to 2.a over 1.a to 2.b. thanks for the info! then i suppose my bug is a petition to have name similarity instead use a different statistical matching algorithm. [administrivia: please do not top-post on this list] I didn't think it through but my gut feeling is that we could change the name similarity score to be the length of the tail part that matches (e.g. 1.a to a/2.a that has the same two bytes at the tail is a better match than to a/2.b that does not share any tail, and to a/1.a that shares the three bytes at the tail is an even better match). Oh, and rename basename_same() to something else; currently it is only used as the name similarity, and after such a change, it will stay to be name similarity but will not be asking are basenames the same? anymore. Hint, hint... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/8] paint_down_to_common: use prio_queue
On Tue, Jul 01, 2014 at 09:23:21AM -0700, Junio C Hamano wrote: but with this patch, the positions of B and A are swapped. This is probably fine, as the order is an internal implementation detail anyway (it would _not_ be fine if we were using a priority queue for git log traversal, which should show commits in parent order). Interesting that the queue is not stable, but the test can still rely on a fixed output. I think it is deterministic for a particular sequence of inserts/pops, but not stable with respect to insertion order. While I tend to agree that for the purpose of this code path, the order is an internal implementation detail, but I wonder if it would benefit us a lot if we taught prio-queue to be optionally more stable, which would allow us to use it in other code paths that care. If we really wanted to, I would imagine that we could keep the insertion counter in the elements of the queue to make the result stable (i.e. the void **array would become something like struct { int insertion_ctr; void *thing; } *array). Yeah, I think the reasons to be stable are: 1. To be on the safe side for operations like this where it _shouldn't_ matter, but perhaps there are hidden dependencies we don't know of. 2. To make it easier for later callers to use prio-queue for cases where it does matter (and I think git log is one of these). If we can do it without a big performance loss (and I don't see any reason it should be any worse than a slight bump to the constant-factor of the logarithmic operations), it probably makes sense to. I'll take a look at it (in fact, I already implemented something like it once long ago in the thread I linked to earlier). My sense of taste says it should be a stable_prio_queue implemented on top of prio_queue (i.e., storing pointers to the struct you mention above). That means you can still use the unstable one if you want the (presumably minor) performance benefit, and it keeps the logic nice and tidy. But given that we have implemented prio_queue using void pointers, I think it would introduce an extra pointer per item and an extra layer of indirection on each access. So maybe it is better to just build it in. The low-cost alternative is to implement prio_queue to hold items of arbitrary size. I'm not sure if that is the worth the complexity and maintenance cost. Heh, I should have read the below-three-dashs commentary before commenting (I often start working from the commit messages in git log and then go back to the original thread). I always wonder how people read those. I tend to write them as if people have (just) read the commit message, but not yet read the patch. -Peff PS Thanks for your earlier comments on the actual commit-slab painting algorithm. Responding to those is taking more thinking, and I haven't gotten to it yet, but it's on my agenda. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/8] add functions for memory-efficient bitmaps
On Tue, Jul 01, 2014 at 09:57:13AM -0700, Junio C Hamano wrote: Another thing I noticed was that the definition of and the commentary on bitset_equal() and bitset_empty() sounded somewhat undecided. These functions take max that is deliberately named differently from num_bits (the width of the bitsets involved), inviting to use them for testing only earlier bits in the bitset as long as the caller understands the caveat, but the caveat requires that the partial bitset to test must be byte-aligned, which makes it not very useful in practice, which means we probably do not want them to be used for any max other than num_bits. Yeah, I added that comment because I found max to be confusing, but couldn't think of a better name. I'm not sure why num_bits did not occur to me, as that makes it completely obvious. * take num_bits, not max, to clarify that callers must use them only on the full bitset. This seems like the right solution to me. Handling partially aligned bytes adds to the complexity and may hurt performance (in fact, I think bitset_equal could actually just call memcmp, which I should fix). That's fine if callers care about that feature, but I actually don't anticipate any that do. By the way, I chose unsigned char as the storage format somewhat arbitrarily. Performance might be better with unsigned int or even unsigned long. It means potentially wasting more space, but not more than one word (minus a byte) per commit (so about 3MB on linux.git). I'll try to do some timings to see if it's worth doing. In either case, there needs another item in the caller's responsibility list at the beginning of bitset.h: 4. Ensure that padding bits at the end of the bitset array are initialized to 0. Agreed. That is definitely a requirement I had in mind, but I didn't think to write it down. I'll fix both points in the re-roll. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 08/16] use skip_prefix to avoid magic numbers
On Mon, Jun 23, 2014 at 02:44:23PM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: diff --git a/connect.c b/connect.c index 94a6650..37ff018 100644 --- a/connect.c +++ b/connect.c @@ -140,12 +141,12 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len, if (!len) break; - if (len 4 starts_with(buffer, ERR )) - die(remote error: %s, buffer + 4); + if (len 4 skip_prefix(buffer, ERR , arg)) + die(remote error: %s, arg); Makes one wonder if we should do something special to a line with only ERR and nothing else on it, which the other end may have meant us to give a blank line to make the output more readable. I don't think that would buy us much. We have always accepted only a single ERR line and died immediately. So any changes of that nature would have to be made in the client, and then servers would have to wait N time units before it was safe to start using the feature (otherwise old clients just get the blank line!). I also don't think blank lines by themselves are useful. You'd want them in addition to being able to handle multiple lines. So a nicer fix is more along the lines of accept multiple ERR lines, including blank lines, followed by a terminating line (ERRDONE or something). Then servers can do: ERR unable to access foo.git: Printer on fire ERR ERR You may have misspelled the repository name. Did you mean: ERR ERR foobar.git ERRDONE Old clients would see the first line and die. Newer clients would print the helpful hint. Servers would just need to make sure that the first line stands on its own to cover both cases. A fix, if one turns out to be needed, is outside the scope of this patch, though, I think. Yeah, definitely a separate topic. It is not something I think anybody has asked for, but I can imagine a site like GitHub making use of it (we already show custom errors for http, but there's no room beyond the single ERR line). And teaching the clients now expands the options for servers later. So it might be worth doing just as a potential feature. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Prevent space after directories in tcsh completion
Hello script works beautifully except for a small thing: reporoot ls folder/ folder_file.txt reproot git commit foTAB completes to git commit folderSPACE without presenting the completion options (git diff foTAB completes as expected to git diff folder and gives the 2 completion options) is that easily fixable too? -- View this message in context: http://git.661346.n2.nabble.com/PATCH-Prevent-space-after-directories-in-tcsh-completion-tp757p7614312.html Sent from the git mailing list archive at Nabble.com. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/8] string-list: add pos to iterator callback
Jeff King p...@peff.net writes: When we are running a string-list foreach or filter, the callback function sees only the string_list_item, along with a void* data pointer provided by the caller. This is sufficient for most purposes. However, it can also be useful to know the position of the item within the string (for example, if the data pointer s/the string/-list/ (or s//_list/). points to a secondary array in which each element corresponds to part of the string list). We can help this use case by providing the position to each callback. Signed-off-by: Jeff King p...@peff.net --- The diff here is noisy, and I expect in the long run that the one caller I add to builtin/tag.c later in the series will eventually stop using string_list entirely (in favor of a custom struct), which may leave us with no callers that actually use the new field. I do think the logic above is sound, though, and it's a potentially useful thing. There may be other sites that avoid the for_each wrapper in favor of iterating themselves simply _because_ they needed to know the position (I would just do the same here, except that my new caller wants to use filter_string_list, which is a little more complicated). While I can buy that some callers would want to learn the pos in the list, I am not sure if this is a good direction to go. Primarily, I am having trouble sifting between want and need. How often do callers want to do this? If only narrow specialized callers want this, is it unreasonable to ask them to add a int ctr in their cb_data and use pos = ((struct theirs *)cb_data)-ctr++ in their callback instead? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] sha1_file: avoid overrunning alternate object base string
While checking if a new alternate object database is a duplicate make sure that old and new base paths have the same length before comparing them with memcmp. This avoids overrunning the buffer of the existing entry if the new one is longer and it stops rejecting foobar/ after foo/ was already added. Signed-off-by: Rene Scharfe l...@web.de --- sha1_file.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 8adab14..b7ad6c1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -315,7 +315,8 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int * thing twice, or object directory itself. */ for (alt = alt_odb_list; alt; alt = alt-next) { - if (!memcmp(ent-base, alt-base, pfxlen)) { + if (pfxlen == alt-name - alt-base - 1 + !memcmp(ent-base, alt-base, pfxlen)) { free(ent); return -1; } -- 2.0.0 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
Junio C Hamano gits...@pobox.com writes: If you are trying to do branch --with $commit, what you would want is not exactly paint-down-to-common(all branches, $commit), but something that paints down to $commit from all branches, with an optimization that cuts off the traversal at a commit that is reachable from $commit. If a traversal from branch B reached such a commit that is reachable from $commit, you can stop the traversal because propagating the bit for B further down to its parents will not reach the $commit you are interested in. I forgot about the other direction, i.e. branch --merged $commit. Since I do git branch --no-merged pu fairly often, I care about its performance ;-) We paint $commit as Left, and tips of all the branches as Right. We try to paint down from $commit but the traversal cannot terminate if it reaches a commit that is reachable from one Right ref---we may find that we can reach more Right refs by following its ancestor. We can stop when we paint Right bits fully, of course, but I wonder if we can do better than that. Suppose we just painted a commit with L and Rn bit (i.e. the commit is a common ancestor of the $commit and one branch). We know that traversing down from there will never reach the tip of the branch whose color is Rn (otherwise we will have a cycle from that commit back to the tip of the branch), so if the reason we are continuing the traversal is to prove that the tip of the branch Rn is reachable (or unreachable) from $commit, there is no reason to continue digging from there. Can we exploit that to terminate the traversal earlier? When we encounter a new commit that is painted with L bit and some but not necessarily all R bits, we propagate the bits and check the R bits. Some of the commits in Right set that correspond to R bits may have been painted in L (i.e. we found branches that are merged to $commit), and digging further from this commit will not give us any new information. Other commits are not painted in L (i.e. we do not yet know if $commit merges these branches), so we may need to keep digging. So perhaps we can stop if a commit is painted in L and also has all the R bits that correspond to Right commits that are not yet proven reachable from $commit (i.e. not painted in L)? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/8] string-list: add pos to iterator callback
On Tue, Jul 01, 2014 at 10:45:19AM -0700, Junio C Hamano wrote: Jeff King p...@peff.net writes: When we are running a string-list foreach or filter, the callback function sees only the string_list_item, along with a void* data pointer provided by the caller. This is sufficient for most purposes. However, it can also be useful to know the position of the item within the string (for example, if the data pointer s/the string/-list/ (or s//_list/). Thanks, yeah. While I can buy that some callers would want to learn the pos in the list, I am not sure if this is a good direction to go. Primarily, I am having trouble sifting between want and need. How often do callers want to do this? If only narrow specialized callers want this, is it unreasonable to ask them to add a int ctr in their cb_data and use pos = ((struct theirs *)cb_data)-ctr++ in their callback instead? Not all that often, I suppose (I only add one caller in this series). I just found it a little hack-ish to force callers to keep their own counter when we already have it (especially because it is not too hard to get it wrong, for example by forgetting to increment the counter in one code path of the callback). Here's what the caller would look like without pos (done as a patch on top of the series, so pos is still there, but no longer used). diff --git a/builtin/tag.c b/builtin/tag.c index f17192c..dc6f105 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -151,14 +151,20 @@ static int util_is_not_null(struct string_list_item *it, int pos, void *data) return !!it-util; } -static int matches_contains(struct string_list_item *it, int pos, void *data) +struct contains_callback_data { + int ctr; + unsigned char *contains; +}; + +static int matches_contains(struct string_list_item *it, int pos, void *vdata) { - unsigned char *contains = data; - return contains[pos]; + struct contains_callback_data *data = vdata; + return data-contains[data-ctr++]; } static void limit_by_contains(struct string_list *tags, struct commit_list *with) { + struct contains_callback_data cbdata; struct commit_list *tag_commits = NULL, **tail = tag_commits; unsigned char *result; int i; @@ -180,7 +186,10 @@ static void limit_by_contains(struct string_list *tags, struct commit_list *with result = xmalloc(tags-nr); commit_contains(with, tag_commits, NULL, result); - filter_string_list(tags, 1, matches_contains, result); + + cbdata.ctr = 0; + cbdata.contains = result; + filter_string_list(tags, 1, matches_contains, cbdata); free(result); free_commit_list(tag_commits); So I think it's a small change in a lot of places rather than a kind of ugly change in one spot. All that being said, I think the real issue here is that I want more than a string list (I am already using the util field for the sha1, but this code would be potentially simpler if I could also store the commit object). In the long run I hope to factor out a ref-listing API that can be used by tag, branch, and for-each-ref. And then this string-list code would go away in favor of a more expansive struct. So it may not be worth worrying about keeping it too clean. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/8] commit: provide a fast multi-tip contains function
Junio C Hamano gits...@pobox.com writes: I forgot about the other direction, i.e. branch --merged $commit. Since I do git branch --no-merged pu fairly often, I care about its performance ;-) We paint $commit as Left, and tips of all the branches as Right. We try to paint down from $commit but the traversal cannot terminate if it reaches a commit that is reachable from one Right ref---we may find that we can reach more Right refs by following its ancestor. We can stop when we paint Right bits fully, of course, but I wonder if we can do better than that. Suppose we just painted a commit with L and Rn bit (i.e. the commit is a common ancestor of the $commit and one branch). We know that traversing down from there will never reach the tip of the branch whose color is Rn (otherwise we will have a cycle from that commit back to the tip of the branch), so if the reason we are continuing the traversal is to prove that the tip of the branch Rn is reachable (or unreachable) from $commit, there is no reason to continue digging from there. Can we exploit that to terminate the traversal earlier? I forgot to mention this case because with branch --no-merged pu it never happens, but another clue we can terminate traversal earier with is when $commit is found to be an ancestor of some Right commits. Then we can start ignoring Rn bits for these Right commits that can reach the Left one, as we know they can never be reached from the Left. That is, the last sentence in the paragraph ... When we encounter a new commit that is painted with L bit and some but not necessarily all R bits, we propagate the bits and check the R bits. Some of the commits in Right set that correspond to R bits may have been painted in L (i.e. we found branches that are merged to $commit), and digging further from this commit will not give us any new information. Other commits are not painted in L (i.e. we do not yet know if $commit merges these branches), so we may need to keep digging. So perhaps we can stop if a commit is painted in L and also has all the R bits that correspond to Right commits that are not yet proven reachable from $commit (i.e. not painted in L)? ... will be more like ignore Rn bits for Right commits that are painted in L (i.e. they are reachable from L) or the Left commit is painted with (i.e. they reach L). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..7c60675 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ test_shallow_cache_tree The chainis broken here. Does the test now pass, because git tag is added ? The tag does not cause the cache-tree to be created, so git tag does not cause the test to pass. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] cache-tree: Create/update cache-tree on checkout
When git checkout checks out a branch, create or update the cache-tree so that subsequent operations are faster. Signed-off-by: David Turner dtur...@twitter.com --- builtin/checkout.c| 8 cache-tree.c | 5 +++-- t/t0090-cache-tree.sh | 19 --- 3 files changed, 27 insertions(+), 5 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 07cf555..a023a86 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts *opts, } } + if (!active_cache_tree) + active_cache_tree = cache_tree(); + + if (!cache_tree_fully_valid(active_cache_tree)) + cache_tree_update(active_cache_tree, + (const struct cache_entry * const *)active_cache, + active_nr, 0); + if (write_cache(newfd, active_cache, active_nr) || commit_locked_index(lock_file)) die(_(unable to write new index file)); diff --git a/cache-tree.c b/cache-tree.c index 7fa524a..28d2356 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -612,9 +612,10 @@ int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix) cache_tree_find(active_cache_tree, prefix); if (!subtree) return WRITE_TREE_PREFIX_ERROR; - hashcpy(sha1, subtree-sha1); + if (sha1) + hashcpy(sha1, subtree-sha1); } - else + else if (sha1) hashcpy(sha1, active_cache_tree-sha1); if (0 = newfd) diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..98fb1ab 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -44,14 +44,14 @@ test_expect_success 'read-tree HEAD establishes cache-tree' ' test_expect_success 'git-add invalidates cache-tree' ' test_when_finished git reset --hard; git read-tree HEAD - echo I changed this file foo + echo I changed this file foo git add foo test_invalid_cache_tree ' test_expect_success 'update-index invalidates cache-tree' ' test_when_finished git reset --hard; git read-tree HEAD - echo I changed this file foo + echo I changed this file foo git update-index --add foo test_invalid_cache_tree ' @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ test_shallow_cache_tree ' +test_expect_success 'checkout -b gives cache-tree' ' + git checkout current + git checkout -b prev HEAD^ + test_shallow_cache_tree +' + +test_expect_success 'checkout -B gives cache-tree' ' + git checkout current + git checkout -B prev HEAD^ + test_shallow_cache_tree +' + test_done -- 2.0.0.390.gcb682f8 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 3/3] cache-tree: Write index with updated cache-tree after commit
During the commit process, the cache-tree is updated. We need to write this updated cache-tree so that it's ready for subsequent commands. Add test code which demonstrates that git commit now writes the cache tree. Also demonstrate that cache-tree invalidation is correct. Signed-off-by: David Turner dtur...@twitter.com --- builtin/commit.c | 15 ++-- t/t0090-cache-tree.sh | 63 --- 2 files changed, 67 insertions(+), 11 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 9cfef6c..dbd4f4b 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, discard_cache(); read_cache_from(index_lock.filename); + if (update_main_cache_tree(WRITE_TREE_SILENT) = 0) + write_cache(fd, active_cache, active_nr); commit_style = COMMIT_NORMAL; return index_lock.filename; @@ -383,14 +385,10 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, if (!only !pathspec.nr) { fd = hold_locked_index(index_lock, 1); refresh_cache_or_die(refresh_flags); - if (active_cache_changed) { - update_main_cache_tree(WRITE_TREE_SILENT); - if (write_cache(fd, active_cache, active_nr) || - commit_locked_index(index_lock)) - die(_(unable to write new_index file)); - } else { - rollback_lock_file(index_lock); - } + update_main_cache_tree(WRITE_TREE_SILENT); + if (write_cache(fd, active_cache, active_nr) || + commit_locked_index(index_lock)) + die(_(unable to write new_index file)); commit_style = COMMIT_AS_IS; return get_index_file(); } @@ -435,6 +433,7 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, fd = hold_locked_index(index_lock, 1); add_remove_files(partial); refresh_cache(REFRESH_QUIET); + update_main_cache_tree(WRITE_TREE_SILENT); if (write_cache(fd, active_cache, active_nr) || close_lock_file(index_lock)) die(_(unable to write new_index file)); diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 8437c5f..625157e 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -13,11 +13,35 @@ cmp_cache_tree () { test_cmp $1 filtered } +grep_nonmatch_ok () { +grep $@ +test $? = 2 return 1 +return 0 +} + # We don't bother with actually checking the SHA1: # test-dump-cache-tree already verifies that all existing data is # correct. +generate_expected_cache_tree () { + dir=$1${1:+/} + parent=$2 + # ls-files might have foo/bar, foo/bar/baz, and foo/bar/quux + # We want to count only foo because it's the only direct child + subtrees=$(git ls-files|grep /|cut -d / -f 1|uniq) + subtree_count=$(echo $subtrees|grep_nonmatch_ok -c .) + entries=$(git ls-files|wc -l) + printf SHA $dir (%d entries, %d subtrees)\n $entries $subtree_count + for subtree in $subtrees + do + cd $subtree + generate_expected_cache_tree $dir$subtree $dir || return 1 + cd .. + done + dir=$parent +} + test_shallow_cache_tree () { - printf SHA (%d entries, 0 subtrees)\n $(git ls-files|wc -l) expect + generate_expected_cache_tree expect cmp_cache_tree expect } @@ -35,7 +59,7 @@ test_no_cache_tree () { cmp_cache_tree expect } -test_expect_failure 'initial commit has cache-tree' ' +test_expect_success 'initial commit has cache-tree' ' test_commit foo test_shallow_cache_tree ' @@ -60,6 +84,18 @@ test_expect_success 'git-add in subdir invalidates cache-tree' ' test_invalid_cache_tree ' +cat before \EOF +SHA (3 entries, 2 subtrees) +SHA dir1/ (1 entries, 0 subtrees) +SHA dir2/ (1 entries, 0 subtrees) +EOF + +cat expect \EOF +invalid (2 subtrees) +invalid dir1/ (0 subtrees) +SHA dir2/ (1 entries, 0 subtrees) +EOF + test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' ' git tag no-children test_when_finished git reset --hard no-children; git read-tree HEAD @@ -67,8 +103,10 @@ test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' ' test_commit dir1/a test_commit dir2/b echo I changed this file dir1/a + cmp_cache_tree before + echo I changed this file dir1/a git add dir1/a - test_invalid_cache_tree dir1/ + cmp_cache_tree expect ' test_expect_success 'update-index invalidates
[PATCH 2/3] test-dump-cache-tree: Improve output format and exit code
Make test-dump-cache-tree more useful for testing. Do not treat known invalid trees as errors (and do not produce non-zero exit code), because we can fall back to the non-cache-tree codepath. Signed-off-by: David Turner dtur...@twitter.com --- t/t0090-cache-tree.sh | 28 +--- test-dump-cache-tree.c | 24 2 files changed, 37 insertions(+), 15 deletions(-) diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 98fb1ab..8437c5f 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -21,10 +21,13 @@ test_shallow_cache_tree () { cmp_cache_tree expect } +# Test that the cache-tree for a given directory is invalid. +# If no directory is given, check that the root is invalid test_invalid_cache_tree () { - echo invalid (0 subtrees) expect - printf SHA #(ref) (%d entries, 0 subtrees)\n $(git ls-files|wc -l) expect - cmp_cache_tree expect + test-dump-cache-tree actual + sed -e s/$_x40/SHA/ -e s/[0-9]* subtrees//g actual filtered + expect=$(printf invalid $1 ()\n) + fgrep $expect filtered } test_no_cache_tree () { @@ -49,6 +52,25 @@ test_expect_success 'git-add invalidates cache-tree' ' test_invalid_cache_tree ' +test_expect_success 'git-add in subdir invalidates cache-tree' ' + test_when_finished git reset --hard; git read-tree HEAD + mkdir dirx + echo I changed this file dirx/foo + git add dirx/foo + test_invalid_cache_tree +' + +test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' ' + git tag no-children + test_when_finished git reset --hard no-children; git read-tree HEAD + mkdir dir1 dir2 + test_commit dir1/a + test_commit dir2/b + echo I changed this file dir1/a + git add dir1/a + test_invalid_cache_tree dir1/ +' + test_expect_success 'update-index invalidates cache-tree' ' test_when_finished git reset --hard; git read-tree HEAD echo I changed this file foo diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c index 47eab97..ad42ca1 100644 --- a/test-dump-cache-tree.c +++ b/test-dump-cache-tree.c @@ -6,12 +6,12 @@ static void dump_one(struct cache_tree *it, const char *pfx, const char *x) { if (it-entry_count 0) - printf(%-40s %s%s (%d subtrees)\n, - invalid, x, pfx, it-subtree_nr); + printf(%-40s %s (%d subtrees)%s\n, + invalid, pfx, it-subtree_nr, x); else - printf(%s %s%s (%d entries, %d subtrees)\n, - sha1_to_hex(it-sha1), x, pfx, - it-entry_count, it-subtree_nr); + printf(%s %s (%d entries, %d subtrees)%s\n, + sha1_to_hex(it-sha1), pfx, + it-entry_count, it-subtree_nr, x); } static int dump_cache_tree(struct cache_tree *it, @@ -25,19 +25,19 @@ static int dump_cache_tree(struct cache_tree *it, /* missing in either */ return 0; - if (it-entry_count 0) { + if (it-entry_count 0) + /* invalid */ dump_one(it, pfx, ); - dump_one(ref, pfx, #(ref) ); - if (it-subtree_nr != ref-subtree_nr) - errs = 1; - } else { - dump_one(it, pfx, ); if (hashcmp(it-sha1, ref-sha1) || ref-entry_count != it-entry_count || ref-subtree_nr != it-subtree_nr) { - dump_one(ref, pfx, #(ref) ); + /* claims to be valid but is lying */ + dump_one(ref, pfx, #(error)); errs = 1; + } else { + /* is actually valid */ + dump_one(it, pfx, ); } } -- 2.0.0.390.gcb682f8 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] Add --force-seen option to git-fast-export
I've been playing with git-stitch-repo and similar tools all of which seem to use the fast export to get their input data with. I have need to provide these tools with a hints file so they can be given the extra metadata about parents - to do this they need to be able to decode marks-commit id's reliably. git-stitch-repo reads it input direct from git-fast-export in a pipe, so I could change that (by reading from saved output files) or modify git to have a mode to ensure that the mark/commit id map could be made stable. I've chosen the later (mainly because it occurred to me first - but it does seem the simpler to code ) There are no tests yet, and I'm not entirely sure I like the new options name. Comments welcome. -- Patch follows - Allow the use of --import/export-marks options to guarantee stability of the marks assigned to commits for tools that use the fast-export format to manipulate history. Signed-off-by: Roger Gammans rgamm...@gammascience.co.uk --- Documentation/git-fast-export.txt | 5 + builtin/fast-export.c | 15 --- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt index 221506b..10f58fa 100644 --- a/Documentation/git-fast-export.txt +++ b/Documentation/git-fast-export.txt @@ -81,6 +81,11 @@ If the backend uses a similar \--import-marks file, this allows for incremental bidirectional exporting of the repository by keeping the marks the same across runs. +--force-seen:: + Only meaningful with \--import-marks . This prevents the suppression + of commits listed the \--import-marks file from being exported again. + The allows multiple runs with a guaranteed marks/commit id stability + --fake-missing-tagger:: Some old repositories have tags without a tagger. The fast-import protocol was pretty strict about that, and did not diff --git a/builtin/fast-export.c b/builtin/fast-export.c index ef44816..06f0366 100644 --- a/builtin/fast-export.c +++ b/builtin/fast-export.c @@ -29,6 +29,7 @@ static enum { ABORT, VERBATIM, WARN, WARN_STRIP, STRIP } signed_tag_mode = ABORT static enum { ERROR, DROP, REWRITE } tag_of_filtered_mode = ERROR; static int fake_missing_tagger; static int use_done_feature; +static int force_seen; static int no_data; static int full_tree; static struct string_list extra_refs = STRING_LIST_INIT_NODUP; @@ -324,13 +325,18 @@ static void handle_commit(struct commit *commit, struct rev_info *rev) if (!S_ISGITLINK(diff_queued_diff.queue[i]-two-mode)) export_blob(diff_queued_diff.queue[i]-two-sha1); - mark_next_object(commit-object); + uint32_t mark = get_object_mark(commit-object); + if (! mark) { + mark_next_object(commit-object); + mark = last_idnum; + } + if (!is_encoding_utf8(encoding)) reencoded = reencode_string(message, UTF-8, encoding); if (!commit-parents) printf(reset %s\n, (const char*)commit-util); printf(commit %s\nmark :%PRIu32\n%.*s\n%.*s\ndata %u\n%s, - (const char *)commit-util, last_idnum, + (const char *)commit-util, mark, (int)(author_end - author), author, (int)(committer_end - committer), committer, (unsigned)(reencoded @@ -668,7 +674,8 @@ static void import_marks(char *input_file) mark_object(object, mark); - object-flags |= SHOWN; + if (! force_seen ) + object-flags |= SHOWN; } fclose(f); } @@ -713,6 +720,8 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix) N_(Output full tree for each commit)), OPT_BOOL(0, use-done-feature, use_done_feature, N_(Use the done feature to terminate the stream)), + OPT_BOOL(0, force-seen, force_seen, +N_(Output commit and blobs even if in an imported marks-file)), OPT_BOOL(0, no-data, no_data, N_(Skip output of blob data)), OPT_STRING_LIST(0, refspec, refspecs_list, N_(refspec), N_(Apply refspec to exported refs)), -- 1.8.5.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner dtur...@twopensource.com writes: When git checkout checks out a branch, create or update the cache-tree so that subsequent operations are faster. Signed-off-by: David Turner dtur...@twitter.com --- builtin/checkout.c| 8 cache-tree.c | 5 +++-- t/t0090-cache-tree.sh | 15 ++- 3 files changed, 25 insertions(+), 3 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 07cf555..a023a86 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts *opts, } } + if (!active_cache_tree) + active_cache_tree = cache_tree(); + + if (!cache_tree_fully_valid(active_cache_tree)) + cache_tree_update(active_cache_tree, + (const struct cache_entry * const *)active_cache, + active_nr, 0); + This looks much better than the previous round, but it still does allow verify_cache() to throw noises against unmerged entries in the cache, as WRITE_TREE_SILENT flag is not passed down, no? $ git checkout master^0 $ git am $this_message $ make $ edit builtin/checkout.c ;# make changes to the above lines $ ./git checkout -m master^0 x builtin/checkout.c: unmerged (972c8a7b28f16f88475268f9a714048c228e69db) x builtin/checkout.c: unmerged (f1dc56e55f7b2200412142b10517458ccfda2952) x builtin/checkout.c: unmerged (3b9753ba8c19e7dfe6e922f30eb85c83a92a4596) M builtin/checkout.c Warning: you are leaving 1 commit behind, not connected to any of your branches: 25fab54 cache-tree: Create/update cache-tree on checkout Switched to branch 'master' Passing WRITE_TREE_SILENT in the flags parameter will get rid of the conflict notice output from the above. The user is not interested in writing a brand new tree object at all in this case, so it feels wrong to actually let the call chain go down to update_one() and create new tree objects. Side note. And passing WRITE_TREE_DRY_RUN is not a good solution either, because a later write_cache_as_tree() will not create the necessary tree object once you stuff a tree object name in the cache-tree. What we want in this code path is a way to repair a sub cache_tree if it can be repaired without creating a new tree object and otherwise leave that part invalid. The existing cache-tree framework is not prepared to do that kind of thing. It wants to start from the bottom and percolate things up, computing levels nearer to the top-level only after it fully created the trees for deeper levels, because it is meant to be used only when we really want to write out trees. We may want to reuse update_one() but I am not convinced that doing an equivalent of write-tree when you switch branches is the right approach in the first place. You will eventually write it out as a tree, and having a relatively undamaged cache-tree will help you when you do so, but spending the cycles necessary to compute a fully populated cache-tree, only to let it degrade over time until you are ready to write it out as a tree, somehow sounds like asking for a duplicated work upfront. By the way, you seem to have touched write_cache_as_tree() in the same patch, but I am not sure what makes the change necessary. I do not see a new caller to it that passes a NULL to its sha1 parameter. diff --git a/cache-tree.c b/cache-tree.c index 7fa524a..28d2356 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -612,9 +612,10 @@ int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix) cache_tree_find(active_cache_tree, prefix); if (!subtree) return WRITE_TREE_PREFIX_ERROR; - hashcpy(sha1, subtree-sha1); + if (sha1) + hashcpy(sha1, subtree-sha1); } - else + else if (sha1) hashcpy(sha1, active_cache_tree-sha1); if (0 = newfd) diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..7c60675 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ test_shallow_cache_tree ' +test_expect_success 'checkout -b gives cache-tree' ' + git checkout current + git checkout -b prev HEAD^ + test_shallow_cache_tree +' + +test_expect_success 'checkout -B gives cache-tree' ' + git checkout current + git checkout -B prev HEAD^ + test_shallow_cache_tree +' + test_done -- To unsubscribe from this list: send the line unsubscribe
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner dtur...@twopensource.com writes: On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..7c60675 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' +git tag current git checkout HEAD^ test_shallow_cache_tree The chainis broken here. Does the test now pass, because git tag is added ? The tag does not cause the cache-tree to be created, so git tag does not cause the test to pass. That does not explain why it is a good idea to declare success of this test if this new git tag current fails here for whatever reason (e.g. somebody updated git tag for a reason that is completely unrelated to cache-tree and made it segfault without creating the current tag). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
David Turner dtur...@twopensource.com writes: When git checkout checks out a branch, create or update the cache-tree so that subsequent operations are faster. Signed-off-by: David Turner dtur...@twitter.com --- Could you number your patches e.g. [PATCH v4 1/3] and/or summarize below the three-dash line what got updated since the last round? The readers can guess without when one is sending a reroll every other day or less frequently, but with rerolls more often than that, it gets unwieldy to check which points raised during the review have been addressed. Thanks. builtin/checkout.c| 8 cache-tree.c | 5 +++-- t/t0090-cache-tree.sh | 19 --- 3 files changed, 27 insertions(+), 5 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 07cf555..a023a86 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts *opts, } } + if (!active_cache_tree) + active_cache_tree = cache_tree(); + + if (!cache_tree_fully_valid(active_cache_tree)) + cache_tree_update(active_cache_tree, + (const struct cache_entry * const *)active_cache, + active_nr, 0); + if (write_cache(newfd, active_cache, active_nr) || commit_locked_index(lock_file)) die(_(unable to write new index file)); diff --git a/cache-tree.c b/cache-tree.c index 7fa524a..28d2356 100644 --- a/cache-tree.c +++ b/cache-tree.c @@ -612,9 +612,10 @@ int write_cache_as_tree(unsigned char *sha1, int flags, const char *prefix) cache_tree_find(active_cache_tree, prefix); if (!subtree) return WRITE_TREE_PREFIX_ERROR; - hashcpy(sha1, subtree-sha1); + if (sha1) + hashcpy(sha1, subtree-sha1); } - else + else if (sha1) hashcpy(sha1, active_cache_tree-sha1); if (0 = newfd) diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..98fb1ab 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -44,14 +44,14 @@ test_expect_success 'read-tree HEAD establishes cache-tree' ' test_expect_success 'git-add invalidates cache-tree' ' test_when_finished git reset --hard; git read-tree HEAD - echo I changed this file foo + echo I changed this file foo git add foo test_invalid_cache_tree ' test_expect_success 'update-index invalidates cache-tree' ' test_when_finished git reset --hard; git read-tree HEAD - echo I changed this file foo + echo I changed this file foo git update-index --add foo test_invalid_cache_tree ' @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ test_shallow_cache_tree ' +test_expect_success 'checkout -b gives cache-tree' ' + git checkout current + git checkout -b prev HEAD^ + test_shallow_cache_tree +' + +test_expect_success 'checkout -B gives cache-tree' ' + git checkout current + git checkout -B prev HEAD^ + test_shallow_cache_tree +' + test_done -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 02/14] submodules: Add the lib-submodule-update.sh test library
Add this test library to simplify covering all combinations of submodule update scenarios without having to add those to a test of each work tree manipulating command over and over again. The functions test_submodule_switch() and test_submodule_forced_switch() are intended to be called from a test script with a single argument. This argument is either a work tree manipulating command (including any command line options) or a function (when more than a single git command is needed to switch work trees from the current HEAD to another commit). This command (or function) is passed a target branch as argument. The two new functions check that each submodule transition is handled as expected, which currently means that submodule work trees are not affected until git submodule update is called. The forced variant is for commands using their '-f' or '--hard' option and expects them to overwrite local modifications as a result. Each of these two functions contains 14 tests_expect_* calls. Calling one of these test functions the first time creates a repository named submodule_update_repo. At first it contains two files, then a single submodule is added in another commit followed by commits covering all relevant submodule modifications. This repository is newly cloned into the submodule_update for each test_expect_* to avoid interference between different parts of the test functions (some to-be-tested commands also manipulate refs along with the work tree, e.g. git reset). Follow-up commits will then call these two test functions for all work tree manipulating commands (with a combination of all their options relevant to what they do with the work tree) making sure they work as expected. Later this test library will be extended to cover merges resulting in conflicts too. Also it is intended to be easily extendable for the recursive update functionality, where even more combinations of submodule modifications have to be tested for. This version documents two bugs in current Git with expected failures: *) When a submodule is replaced with a tracked file of the same name the submodule work tree including any local modifications (and even the whole history if it uses a .git directory instead of a gitfile!) is silently removed. *) Forced work tree updates happily manipulate files in the directory of a submodule that has just been removed in the superproject (but is of course still present in the work tree due to the way submodules are currently handled). This becomes dangerous when files in the submodule directory are overwritten by files from the new superproject commit, as any modifications to the submodule files will be lost) and is expected to also destroy history in the - admittedly unlikely case - the new commit adds a file named .git to the submodule directory. Signed-off-by: Jens Lehmann jens.lehm...@web.de --- Am 20.06.2014 19:31, schrieb Junio C Hamano: Jens Lehmann jens.lehm...@web.de writes: +rm -rf sub1 +git checkout -f $1 +git status -u -s actual +test_must_be_empty actual +sha1=$(git rev-parse HEAD:sub1 || true) $ xx=; xx=$(git rev-parse HEAD:no-such-path || true) ; echo $? ; echo $xx fatal: Path 'no-such-path' does not exist in 'HEAD' 0 HEAD:no-such-path Perhaps you want --verify (or --revs-only) there, i.e. sha1=$(git rev-parse --verify HEAD:sub1 || :) or sha1=$(git rev-parse --revs-only HEAD:sub1) +if test -n $sha1 ... +sha1=$(git rev-parse $commit:$submodule) And here too. Thanks, I squashed your SQUASH??? commit (currently in pu) into this version. t/lib-submodule-update.sh | 632 ++ 1 file changed, 632 insertions(+) create mode 100755 t/lib-submodule-update.sh diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh new file mode 100755 index 000..ceeaaeb --- /dev/null +++ b/t/lib-submodule-update.sh @@ -0,0 +1,632 @@ +# Create a submodule layout used for all tests below. +# +# The following use cases are covered: +# - New submodule (no_submodule = add_sub1) +# - Removed submodule (add_sub1 = remove_sub1) +# - Updated submodule (add_sub1 = modify_sub1) +# - Submodule updated to invalid commit (add_sub1 = invalid_sub1) +# - Submodule updated from invalid commit (invalid_sub1 = valid_sub1) +# - Submodule replaced by tracked files in directory (add_sub1 = +# replace_sub1_with_directory) +# - Directory containing tracked files replaced by submodule +# (replace_sub1_with_directory = replace_directory_with_sub1) +# - Submodule replaced by tracked file with the same name (add_sub1 = +# replace_sub1_with_file) +# - Tracked file replaced by submodule (replace_sub1_with_file = +# replace_file_with_sub1) +# +# --O-O +# / ^ replace_directory_with_sub1 +# /
Re: [PATCH 2/3] test-dump-cache-tree: Improve output format and exit code
David Turner dtur...@twopensource.com writes: Make test-dump-cache-tree more useful for testing. Do not treat known invalid trees as errors (and do not produce non-zero exit code), because we can fall back to the non-cache-tree codepath. Under-explained. more useful is subjective so I won't complain about it being not explained enough, but I cannot quite parse and understand the second sentence. It is not we treat known invalid trees as errors. I think what you meant is we insist that a cache-tree entry, even if it is an invalidated one, must record the correct number of subtrees and signal errors otherwise, which is wrong. I honestly cannot guess what you meant to say by because we can diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c index 47eab97..ad42ca1 100644 --- a/test-dump-cache-tree.c +++ b/test-dump-cache-tree.c @@ -6,12 +6,12 @@ static void dump_one(struct cache_tree *it, const char *pfx, const char *x) { if (it-entry_count 0) - printf(%-40s %s%s (%d subtrees)\n, -invalid, x, pfx, it-subtree_nr); + printf(%-40s %s (%d subtrees)%s\n, +invalid, pfx, it-subtree_nr, x); else - printf(%s %s%s (%d entries, %d subtrees)\n, -sha1_to_hex(it-sha1), x, pfx, -it-entry_count, it-subtree_nr); + printf(%s %s (%d entries, %d subtrees)%s\n, +sha1_to_hex(it-sha1), pfx, +it-entry_count, it-subtree_nr, x); I am guessing this is related to the more useful, but I cannot offhand tell what this output shuffling is about. It would be better to illustrate in the log message to support the more useful claim by showing how improved/readable the output got with this change. } static int dump_cache_tree(struct cache_tree *it, @@ -25,19 +25,19 @@ static int dump_cache_tree(struct cache_tree *it, /* missing in either */ return 0; - if (it-entry_count 0) { + if (it-entry_count 0) + /* invalid */ dump_one(it, pfx, ); - dump_one(ref, pfx, #(ref) ); Unfortunately this is not quite what we would want; this #(ref) output is so that we can see what tree object we should be referring to, while debugging, if this entry were not invalidated; losing that does not Improve output---it loses information from the output. - if (it-subtree_nr != ref-subtree_nr) - errs = 1; I am guessing that this is the change you did not explain quite enough with do not treat ... as errors. This code expects that even an invalidated cache-tree entry knows how many subtrees it has, which is unreasonable. I do not recall offhand if we used to have some code to ensure that such an invariant holds or not, but when invalidating a directory (say t/) by adding a new subdirectory and a new file in it (e.g. t/dir/file) to the index, we do not do anything other than to invalidate t/ and t/dir/, and I do not think the codepath recounts the number of subdirectories to adjust subtree_nr in any way to match the reality, so removal of this check is the right thing to do. - } else { - dump_one(it, pfx, ); if (hashcmp(it-sha1, ref-sha1) || ref-entry_count != it-entry_count || ref-subtree_nr != it-subtree_nr) { - dump_one(ref, pfx, #(ref) ); + /* claims to be valid but is lying */ + dump_one(ref, pfx, #(error)); errs = 1; + } else { + /* is actually valid */ + dump_one(it, pfx, ); } } -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout
On Tue, 2014-07-01 at 14:03 -0700, Junio C Hamano wrote: David Turner dtur...@twopensource.com writes: On Tue, 2014-07-01 at 06:16 +0200, Torsten Bögershausen wrote: diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh index 6c33e28..7c60675 100755 --- a/t/t0090-cache-tree.sh +++ b/t/t0090-cache-tree.sh @@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' ' test_shallow_cache_tree ' -test_expect_failure 'checkout gives cache-tree' ' +test_expect_success 'checkout gives cache-tree' ' + git tag current git checkout HEAD^ test_shallow_cache_tree The chainis broken here. Does the test now pass, because git tag is added ? The tag does not cause the cache-tree to be created, so git tag does not cause the test to pass. That does not explain why it is a good idea to declare success of this test if this new git tag current fails here for whatever reason (e.g. somebody updated git tag for a reason that is completely unrelated to cache-tree and made it segfault without creating the current tag). Indeed; that's why the latest version includes . -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] cache-tree: Write index with updated cache-tree after commit
David Turner dtur...@twopensource.com writes: During the commit process, the cache-tree is updated. We need to write this updated cache-tree so that it's ready for subsequent commands. Add test code which demonstrates that git commit now writes the cache tree. Also demonstrate that cache-tree invalidation is correct. Signed-off-by: David Turner dtur...@twitter.com --- builtin/commit.c | 15 ++-- t/t0090-cache-tree.sh | 63 --- 2 files changed, 67 insertions(+), 11 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 9cfef6c..dbd4f4b 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, discard_cache(); read_cache_from(index_lock.filename); + if (update_main_cache_tree(WRITE_TREE_SILENT) = 0) + write_cache(fd, active_cache, active_nr); OK, interactive-add may leave the cache-tree not quite populated; we are going to write out a tree from the cache so we need to update the in-core cache tree anyway, so calling update-main-cache-tree here would not hurt (it will speed up the write-cache-as-tree we will eventually call). We might want to see if we are really changing anything, though. What happens if the interactive-add gave us an index with fully valid cache-tree? Is that rare enough not to matter (not a rhetorical question)? @@ -383,14 +385,10 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, if (!only !pathspec.nr) { fd = hold_locked_index(index_lock, 1); refresh_cache_or_die(refresh_flags); - if (active_cache_changed) { - update_main_cache_tree(WRITE_TREE_SILENT); - if (write_cache(fd, active_cache, active_nr) || - commit_locked_index(index_lock)) - die(_(unable to write new_index file)); - } else { - rollback_lock_file(index_lock); - } + update_main_cache_tree(WRITE_TREE_SILENT); + if (write_cache(fd, active_cache, active_nr) || + commit_locked_index(index_lock)) + die(_(unable to write new_index file)); How about doing this part like the following instead, so that we can avoid the overhead of uselessly rewriting the index file when we do not have to? diff --git a/builtin/commit.c b/builtin/commit.c index 5e2221c..7d730a5 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -383,8 +383,11 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, if (!only !pathspec.nr) { fd = hold_locked_index(index_lock, 1); refresh_cache_or_die(refresh_flags); - if (active_cache_changed) { + if (active_cache_changed || !cache_tree_fully_valid(active_cache_tree)) { update_main_cache_tree(WRITE_TREE_SILENT); + active_cache_changed = 1; + } + if (active_cache_changed) { if (write_cache(fd, active_cache, active_nr) || commit_locked_index(index_lock)) die(_(unable to write new_index file)); It makes me wonder if we should teach update_main_cache_tree() to somehow smudge active_cache_changed bit as necessary. Then we do not have to make the call to update-main-cache-tree conditional. @@ -435,6 +433,7 @@ static char *prepare_index(int argc, const char **argv, const char *prefix, fd = hold_locked_index(index_lock, 1); add_remove_files(partial); refresh_cache(REFRESH_QUIET); + update_main_cache_tree(WRITE_TREE_SILENT); if (write_cache(fd, active_cache, active_nr) || close_lock_file(index_lock)) die(_(unable to write new_index file)); This is the index that will be used after we create the commit (which will be created from a temporary index that will be discarded immediately after we create the commit). As we _know_ we are changing something in this code path by calling add_remote_files(), it is fine to call update-main-cache-tree here unconditionally. I didn't notice it when I gave the previous review comment but while reviewing this round, we already do the cache-tree population for commit -a in this function, which suggests me that it is the right place to do these changes. Modulo minor niggles, I like this iteration much better than the previous one. Thanks for working on this. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 00/16] add performance tracing facility
...slowly turning into a full-fledged trace overhaul... When working on the documentation, I discovered GIT_TRACE_PACK_ACCESS, which uses a completely separate trace implementation that is 3 times faster by keeping the file open...so this round includes a performance patch that brings the trace API up to speed. Changes since v6: [04-06]: New. [07-08]: Separated the 'disable for test' aspect into a separate patch. GIT_TRACE_BARE=1 is set in test-lib.sh rather than individual tests. Moved static trace_bare variable from global scope into prepare_trace_line(). [09]:Cast timeval.tv_usec to long to prevent compiler warning. [11,13]: Factor out '#define TRACE_CONTEXT __FILE__' so that it can be easily changed to __FUNCTION__. [14]:Added GIT_TRACE_PERFORMANCE to Documentation/git.txt [15-16]: New. [01-03,10,12] are unchanged save resolving conflicts. Karsten Blees (16): trace: move trace declarations from cache.h to new trace.h trace: consistently name the format parameter trace: remove redundant printf format attribute trace: improve trace performance Documentation/git.txt: improve documentation of 'GIT_TRACE*' variables sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace API trace: add infrastructure to augment trace output with additional info trace: disable additional trace output for unit tests trace: add current timestamp to all trace output trace: move code around, in preparation to file:line output trace: add 'file:line' to all trace output trace: add high resolution timer function to debug performance issues trace: add trace_performance facility to debug performance issues git: add performance tracing for git's main() function to debug scripts wt-status: simplify performance measurement by using getnanotime() progress: simplify performance measurement by using getnanotime() Documentation/git.txt | 59 +--- Makefile | 7 + builtin/receive-pack.c | 2 +- cache.h| 13 +- commit.h | 1 + config.mak.uname | 1 + git-compat-util.h | 4 + git.c | 2 + pkt-line.c | 8 +- progress.c | 71 +- sha1_file.c| 30 +--- shallow.c | 10 +- t/test-lib.sh | 4 + trace.c| 369 - trace.h| 105 ++ wt-status.c| 14 +- 16 files changed, 524 insertions(+), 176 deletions(-) create mode 100644 trace.h -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 02/16] trace: consistently name the format parameter
The format parameter to trace_printf functions is sometimes abbreviated 'fmt'. Rename to 'format' everywhere (consistent with POSIX' printf specification). Signed-off-by: Karsten Blees bl...@dcon.de Signed-off-by: Junio C Hamano gits...@pobox.com --- trace.c | 22 +++--- trace.h | 2 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/trace.c b/trace.c index 08180a9..37a7fa9 100644 --- a/trace.c +++ b/trace.c @@ -62,7 +62,7 @@ static int get_trace_fd(const char *key, int *need_close) static const char err_msg[] = Could not trace into fd given by GIT_TRACE environment variable; -static void trace_vprintf(const char *key, const char *fmt, va_list ap) +static void trace_vprintf(const char *key, const char *format, va_list ap) { struct strbuf buf = STRBUF_INIT; @@ -70,25 +70,25 @@ static void trace_vprintf(const char *key, const char *fmt, va_list ap) return; set_try_to_free_routine(NULL); /* is never reset */ - strbuf_vaddf(buf, fmt, ap); + strbuf_vaddf(buf, format, ap); trace_strbuf(key, buf); strbuf_release(buf); } __attribute__((format (printf, 2, 3))) -void trace_printf_key(const char *key, const char *fmt, ...) +void trace_printf_key(const char *key, const char *format, ...) { va_list ap; - va_start(ap, fmt); - trace_vprintf(key, fmt, ap); + va_start(ap, format); + trace_vprintf(key, format, ap); va_end(ap); } -void trace_printf(const char *fmt, ...) +void trace_printf(const char *format, ...) { va_list ap; - va_start(ap, fmt); - trace_vprintf(GIT_TRACE, fmt, ap); + va_start(ap, format); + trace_vprintf(GIT_TRACE, format, ap); va_end(ap); } @@ -106,7 +106,7 @@ void trace_strbuf(const char *key, const struct strbuf *buf) close(fd); } -void trace_argv_printf(const char **argv, const char *fmt, ...) +void trace_argv_printf(const char **argv, const char *format, ...) { struct strbuf buf = STRBUF_INIT; va_list ap; @@ -117,8 +117,8 @@ void trace_argv_printf(const char **argv, const char *fmt, ...) return; set_try_to_free_routine(NULL); /* is never reset */ - va_start(ap, fmt); - strbuf_vaddf(buf, fmt, ap); + va_start(ap, format); + strbuf_vaddf(buf, format, ap); va_end(ap); sq_quote_argv(buf, argv, 0); diff --git a/trace.h b/trace.h index 6cc4541..8fea50b 100644 --- a/trace.h +++ b/trace.h @@ -11,7 +11,7 @@ extern void trace_argv_printf(const char **argv, const char *format, ...); extern void trace_repo_setup(const char *prefix); extern int trace_want(const char *key); __attribute__((format (printf, 2, 3))) -extern void trace_printf_key(const char *key, const char *fmt, ...); +extern void trace_printf_key(const char *key, const char *format, ...); extern void trace_strbuf(const char *key, const struct strbuf *buf); #endif /* TRACE_H */ -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 01/16] trace: move trace declarations from cache.h to new trace.h
Also include direct dependencies (strbuf.h and git-compat-util.h for __attribute__) so that trace.h can be used independently of cache.h, e.g. in test programs. Signed-off-by: Karsten Blees bl...@dcon.de Signed-off-by: Junio C Hamano gits...@pobox.com --- cache.h | 13 ++--- trace.h | 17 + 2 files changed, 19 insertions(+), 11 deletions(-) create mode 100644 trace.h diff --git a/cache.h b/cache.h index cbe1935..466f6b3 100644 --- a/cache.h +++ b/cache.h @@ -7,6 +7,7 @@ #include advice.h #include gettext.h #include convert.h +#include trace.h #include SHA1_HEADER #ifndef git_SHA_CTX @@ -1377,17 +1378,7 @@ extern void *alloc_tag_node(void); extern void *alloc_object_node(void); extern void alloc_report(void); -/* trace.c */ -__attribute__((format (printf, 1, 2))) -extern void trace_printf(const char *format, ...); -__attribute__((format (printf, 2, 3))) -extern void trace_argv_printf(const char **argv, const char *format, ...); -extern void trace_repo_setup(const char *prefix); -extern int trace_want(const char *key); -__attribute__((format (printf, 2, 3))) -extern void trace_printf_key(const char *key, const char *fmt, ...); -extern void trace_strbuf(const char *key, const struct strbuf *buf); - +/* pkt-line.c */ void packet_trace_identity(const char *prog); /* add */ diff --git a/trace.h b/trace.h new file mode 100644 index 000..6cc4541 --- /dev/null +++ b/trace.h @@ -0,0 +1,17 @@ +#ifndef TRACE_H +#define TRACE_H + +#include git-compat-util.h +#include strbuf.h + +__attribute__((format (printf, 1, 2))) +extern void trace_printf(const char *format, ...); +__attribute__((format (printf, 2, 3))) +extern void trace_argv_printf(const char **argv, const char *format, ...); +extern void trace_repo_setup(const char *prefix); +extern int trace_want(const char *key); +__attribute__((format (printf, 2, 3))) +extern void trace_printf_key(const char *key, const char *fmt, ...); +extern void trace_strbuf(const char *key, const struct strbuf *buf); + +#endif /* TRACE_H */ -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 03/16] trace: remove redundant printf format attribute
trace_printf_key() is the only non-static function that duplicates the printf format attribute in the .c file, remove it for consistency. Signed-off-by: Karsten Blees bl...@dcon.de Signed-off-by: Junio C Hamano gits...@pobox.com --- trace.c | 1 - 1 file changed, 1 deletion(-) diff --git a/trace.c b/trace.c index 37a7fa9..3e31558 100644 --- a/trace.c +++ b/trace.c @@ -75,7 +75,6 @@ static void trace_vprintf(const char *key, const char *format, va_list ap) strbuf_release(buf); } -__attribute__((format (printf, 2, 3))) void trace_printf_key(const char *key, const char *format, ...) { va_list ap; -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 04/16] trace: improve trace performance
The trace API currently rechecks the environment variable and reopens the trace file on every API call. This has the ugly side effect that errors (e.g. file cannot be opened, or the user specified a relative path) are also reported on every call. Performance can be improved by about factor three by remembering the environment state and keeping the file open. Replace the 'const char *key' parameter in the API with a pointer to a 'struct trace_key' that bundles the environment variable name with additional, trace-internal state. Change the call sites of these APIs to use a static 'struct trace_key' instead of a string constant. In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct trace_key'. Add a 'trace_disable()' API, so that packet_trace() can cleanly disable tracing when it encounters packed data (instead of using unsetenv()). Signed-off-by: Karsten Blees bl...@dcon.de --- builtin/receive-pack.c | 2 +- commit.h | 1 + pkt-line.c | 8 ++-- shallow.c | 10 ++--- trace.c| 100 ++--- trace.h| 16 ++-- 6 files changed, 78 insertions(+), 59 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index c323081..1451050 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -438,7 +438,7 @@ static int update_shallow_ref(struct command *cmd, struct shallow_info *si) uint32_t mask = 1 (cmd-index % 32); int i; - trace_printf_key(GIT_TRACE_SHALLOW, + trace_printf_key(trace_shallow, shallow: update_shallow_ref %s\n, cmd-ref_name); for (i = 0; i si-shallow-nr; i++) if (si-used_shallow[i] diff --git a/commit.h b/commit.h index a9f177b..08ef643 100644 --- a/commit.h +++ b/commit.h @@ -235,6 +235,7 @@ extern void assign_shallow_commits_to_refs(struct shallow_info *info, int *ref_status); extern int delayed_reachability_test(struct shallow_info *si, int c); extern void prune_shallow(int show_only); +extern struct trace_key trace_shallow; int is_descendant_of(struct commit *, struct commit_list *); int in_merge_bases(struct commit *, struct commit *); diff --git a/pkt-line.c b/pkt-line.c index bc63b3b..8bc89b1 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -3,7 +3,7 @@ char packet_buffer[LARGE_PACKET_MAX]; static const char *packet_trace_prefix = git; -static const char trace_key[] = GIT_TRACE_PACKET; +static struct trace_key trace_packet = TRACE_KEY_INIT(PACKET); void packet_trace_identity(const char *prog) { @@ -15,7 +15,7 @@ static void packet_trace(const char *buf, unsigned int len, int write) int i; struct strbuf out; - if (!trace_want(trace_key)) + if (!trace_want(trace_packet)) return; /* +32 is just a guess for header + quoting */ @@ -27,7 +27,7 @@ static void packet_trace(const char *buf, unsigned int len, int write) if ((len = 4 starts_with(buf, PACK)) || (len = 5 starts_with(buf+1, PACK))) { strbuf_addstr(out, PACK ...); - unsetenv(trace_key); + trace_disable(trace_packet); } else { /* XXX we should really handle printable utf8 */ @@ -43,7 +43,7 @@ static void packet_trace(const char *buf, unsigned int len, int write) } strbuf_addch(out, '\n'); - trace_strbuf(trace_key, out); + trace_strbuf(trace_packet, out); strbuf_release(out); } diff --git a/shallow.c b/shallow.c index 0b267b6..de07709 100644 --- a/shallow.c +++ b/shallow.c @@ -325,7 +325,7 @@ void prune_shallow(int show_only) strbuf_release(sb); } -#define TRACE_KEY GIT_TRACE_SHALLOW +struct trace_key trace_shallow = TRACE_KEY_INIT(SHALLOW); /* * Step 1, split sender shallow commits into ours and theirs @@ -334,7 +334,7 @@ void prune_shallow(int show_only) void prepare_shallow_info(struct shallow_info *info, struct sha1_array *sa) { int i; - trace_printf_key(TRACE_KEY, shallow: prepare_shallow_info\n); + trace_printf_key(trace_shallow, shallow: prepare_shallow_info\n); memset(info, 0, sizeof(*info)); info-shallow = sa; if (!sa) @@ -365,7 +365,7 @@ void remove_nonexistent_theirs_shallow(struct shallow_info *info) { unsigned char (*sha1)[20] = info-shallow-sha1; int i, dst; - trace_printf_key(TRACE_KEY, shallow: remove_nonexistent_theirs_shallow\n); + trace_printf_key(trace_shallow, shallow: remove_nonexistent_theirs_shallow\n); for (i = dst = 0; i info-nr_theirs; i++) { if (i != dst) info-theirs[dst] = info-theirs[i]; @@ -516,7 +516,7 @@ void assign_shallow_commits_to_refs(struct shallow_info *info, int *shallow, nr_shallow = 0; struct paint_info pi; - trace_printf_key(TRACE_KEY,
Re: [PATCH 3/3] cache-tree: Write index with updated cache-tree after commit
On Tue, 2014-07-01 at 15:45 -0700, Junio C Hamano wrote: I didn't notice it when I gave the previous review comment but while reviewing this round, we already do the cache-tree population for commit -a in this function, which suggests me that it is the right place to do these changes. Modulo minor niggles, I like this iteration much better than the previous one. Thanks for working on this. Thanks for all your comments. I'll send out a v4 when I can, but I probably won't have more time to work on this until next week. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 05/16] Documentation/git.txt: improve documentation of 'GIT_TRACE*' variables
Separate GIT_TRACE description into what it prints and how to configure where trace output is printed to. Change other GIT_TRACE_* descriptions to refer to GIT_TRACE. Add descriptions for GIT_TRACE_SETUP and GIT_TRACE_SHALLOW. Signed-off-by: Karsten Blees bl...@dcon.de --- Documentation/git.txt | 50 ++ 1 file changed, 34 insertions(+), 16 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 3bd68b0..75633e6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -904,18 +904,25 @@ for further details. based on whether stdout appears to be redirected to a file or not. 'GIT_TRACE':: - If this variable is set to 1, 2 or true (comparison - is case insensitive), Git will print `trace:` messages on - stderr telling about alias expansion, built-in command - execution and external command execution. - If this variable is set to an integer value greater than 1 - and lower than 10 (strictly) then Git will interpret this - value as an open file descriptor and will try to write the - trace messages into this file descriptor. - Alternatively, if this variable is set to an absolute path - (starting with a '/' character), Git will interpret this - as a file path and will try to write the trace messages - into it. + Enables general trace messages, e.g. alias expansion, built-in + command execution and external command execution. ++ +If this variable is set to 1, 2 or true (comparison +is case insensitive), trace messages will be printed to +stderr. ++ +If the variable is set to an integer value greater than 2 +and lower than 10 (strictly) then Git will interpret this +value as an open file descriptor and will try to write the +trace messages into this file descriptor. ++ +Alternatively, if the variable is set to an absolute path +(starting with a '/' character), Git will interpret this +as a file path and will try to write the trace messages +into it. ++ +Unsetting the variable, or setting it to empty, 0 or +false (case insensitive) disables trace messages. 'GIT_TRACE_PACK_ACCESS':: If this variable is set to a path, a file will be created at @@ -925,10 +932,21 @@ for further details. pack-related performance problems. 'GIT_TRACE_PACKET':: - If this variable is set, it shows a trace of all packets - coming in or out of a given program. This can help with - debugging object negotiation or other protocol issues. Tracing - is turned off at a packet starting with PACK. + Enables trace messages for all packets coming in or out of a + given program. This can help with debugging object negotiation + or other protocol issues. Tracing is turned off at a packet + starting with PACK. + See 'GIT_TRACE' for available trace output options. + +'GIT_TRACE_SETUP':: + Enables trace messages printing the .git, working tree and current + working directory after Git has completed its setup phase. + See 'GIT_TRACE' for available trace output options. + +'GIT_TRACE_SHALLOW':: + Enables trace messages that can help debugging fetching / + cloning of shallow repositories. + See 'GIT_TRACE' for available trace output options. GIT_LITERAL_PATHSPECS:: Setting this variable to `1` will cause Git to treat all -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 06/16] sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace API
This changes GIT_TRACE_PACK_ACCESS functionality as follows: * supports the same options as GIT_TRACE (e.g. printing to stderr) * no longer supports relative paths * appends to the trace file rather than overwriting Signed-off-by: Karsten Blees bl...@dcon.de --- Documentation/git.txt | 4 ++-- sha1_file.c | 30 -- 2 files changed, 6 insertions(+), 28 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 75633e6..9d073f6 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -925,11 +925,11 @@ Unsetting the variable, or setting it to empty, 0 or false (case insensitive) disables trace messages. 'GIT_TRACE_PACK_ACCESS':: - If this variable is set to a path, a file will be created at - the given path logging all accesses to any packs. For each + Enables trace messages for all accesses to any packs. For each access, the pack file name and an offset in the pack is recorded. This may be helpful for troubleshooting some pack-related performance problems. + See 'GIT_TRACE' for available trace output options. 'GIT_TRACE_PACKET':: Enables trace messages for all packets coming in or out of a diff --git a/sha1_file.c b/sha1_file.c index 34d527f..7a110b5 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -36,9 +36,6 @@ static inline uintmax_t sz_fmt(size_t s) { return s; } const unsigned char null_sha1[20]; -static const char *no_log_pack_access = no_log_pack_access; -static const char *log_pack_access; - /* * This is meant to hold a *small* number of objects that you would * want read_sha1_file() to be able to return, but yet you do not want @@ -2085,27 +2082,9 @@ static void *read_object(const unsigned char *sha1, enum object_type *type, static void write_pack_access_log(struct packed_git *p, off_t obj_offset) { - static FILE *log_file; - - if (!log_pack_access) - log_pack_access = getenv(GIT_TRACE_PACK_ACCESS); - if (!log_pack_access) - log_pack_access = no_log_pack_access; - if (log_pack_access == no_log_pack_access) - return; - - if (!log_file) { - log_file = fopen(log_pack_access, w); - if (!log_file) { - error(cannot open pack access log '%s' for writing: %s, - log_pack_access, strerror(errno)); - log_pack_access = no_log_pack_access; - return; - } - } - fprintf(log_file, %s %PRIuMAX\n, - p-pack_name, (uintmax_t)obj_offset); - fflush(log_file); + static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS); + trace_printf_key(pack_access, %s %PRIuMAX\n, +p-pack_name, (uintmax_t)obj_offset); } int do_check_packed_object_crc; @@ -2130,8 +2109,7 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset, int delta_stack_nr = 0, delta_stack_alloc = UNPACK_ENTRY_STACK_PREALLOC; int base_from_cache = 0; - if (log_pack_access != no_log_pack_access) - write_pack_access_log(p, obj_offset); + write_pack_access_log(p, obj_offset); /* PHASE 1: drill down to the innermost base object */ for (;;) { -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 07/16] trace: add infrastructure to augment trace output with additional info
To be able to add a common prefix or suffix to all trace output (e.g. a timestamp or file:line of the caller), factor out common setup and cleanup tasks of the trace* functions. When adding a common prefix, it makes sense that the output of each trace call starts on a new line. Add '\n' in case the caller forgot. Note that this explicitly limits trace output to line-by-line, it is no longer possible to trace-print just part of a line. Until now, this was just an implicit assumption (trace-printing part of a line worked, but messed up the trace file if multiple threads or processes were involved). Thread-safety / inter-process-safety is also the reason why we need to do the prefixing and suffixing in memory rather than issuing multiple write() calls. Write_or_whine_pipe() / xwrite() is atomic unless the size exceeds MAX_IO_SIZE (8MB, see wrapper.c). In case of trace_strbuf, this costs an additional string copy (which should be irrelevant for performance in light of actual file IO). While we're at it, rename trace_strbuf's 'buf' argument, which suggests that the function is modifying the buffer. Trace_strbuf() currently is the only trace API that can print arbitrary binary data (without barfing on '%' or stopping at '\0'), so 'data' seems more appropriate. Signed-off-by: Karsten Blees bl...@dcon.de --- trace.c | 47 +-- trace.h | 2 +- 2 files changed, 34 insertions(+), 15 deletions(-) diff --git a/trace.c b/trace.c index 8662b79..3d02bcc 100644 --- a/trace.c +++ b/trace.c @@ -85,17 +85,37 @@ void trace_disable(struct trace_key *key) static const char err_msg[] = Could not trace into fd given by GIT_TRACE environment variable; +static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) +{ + if (!trace_want(key)) + return 0; + + set_try_to_free_routine(NULL); /* is never reset */ + + /* add line prefix here */ + + return 1; +} + +static void print_trace_line(struct trace_key *key, struct strbuf *buf) +{ + /* append newline if missing */ + if (buf-len buf-buf[buf-len - 1] != '\n') + strbuf_addch(buf, '\n'); + + write_or_whine_pipe(get_trace_fd(key), buf-buf, buf-len, err_msg); + strbuf_release(buf); +} + static void trace_vprintf(struct trace_key *key, const char *format, va_list ap) { struct strbuf buf = STRBUF_INIT; - if (!trace_want(key)) + if (!prepare_trace_line(key, buf)) return; - set_try_to_free_routine(NULL); /* is never reset */ strbuf_vaddf(buf, format, ap); - trace_strbuf(key, buf); - strbuf_release(buf); + print_trace_line(key, buf); } void trace_printf_key(struct trace_key *key, const char *format, ...) @@ -114,32 +134,31 @@ void trace_printf(const char *format, ...) va_end(ap); } -void trace_strbuf(struct trace_key *key, const struct strbuf *buf) +void trace_strbuf(struct trace_key *key, const struct strbuf *data) { - int fd = get_trace_fd(key); - if (!fd) + struct strbuf buf = STRBUF_INIT; + + if (!prepare_trace_line(key, buf)) return; - write_or_whine_pipe(fd, buf-buf, buf-len, err_msg); + strbuf_addbuf(buf, data); + print_trace_line(key, buf); } void trace_argv_printf(const char **argv, const char *format, ...) { struct strbuf buf = STRBUF_INIT; va_list ap; - int fd = get_trace_fd(NULL); - if (!fd) + + if (!prepare_trace_line(NULL, buf)) return; - set_try_to_free_routine(NULL); /* is never reset */ va_start(ap, format); strbuf_vaddf(buf, format, ap); va_end(ap); sq_quote_argv(buf, argv, 0); - strbuf_addch(buf, '\n'); - write_or_whine_pipe(fd, buf.buf, buf.len, err_msg); - strbuf_release(buf); + print_trace_line(NULL, buf); } static const char *quote_crnl(const char *path) diff --git a/trace.h b/trace.h index fbfd9f4..e69455f 100644 --- a/trace.h +++ b/trace.h @@ -22,6 +22,6 @@ extern int trace_want(struct trace_key *key); extern void trace_disable(struct trace_key *key); __attribute__((format (printf, 2, 3))) extern void trace_printf_key(struct trace_key *key, const char *format, ...); -extern void trace_strbuf(struct trace_key *key, const struct strbuf *buf); +extern void trace_strbuf(struct trace_key *key, const struct strbuf *data); #endif /* TRACE_H */ -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 08/16] trace: disable additional trace output for unit tests
Some unit-tests use trace output to verify internal state, and unstable output such as timestamps and line numbers are not useful there. Disable additional trace output if GIT_TRACE_BARE is set. Signed-off-by: Karsten Blees bl...@dcon.de --- t/test-lib.sh | 4 trace.c | 6 ++ 2 files changed, 10 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 81394c8..e37da5a 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -109,6 +109,10 @@ export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME export EDITOR +# Tests using GIT_TRACE typically don't want timestamp file:line output +GIT_TRACE_BARE=1 +export GIT_TRACE_BARE + if test -n ${TEST_GIT_INDEX_VERSION:+isset} then GIT_INDEX_VERSION=$TEST_GIT_INDEX_VERSION diff --git a/trace.c b/trace.c index 3d02bcc..a194b16 100644 --- a/trace.c +++ b/trace.c @@ -87,11 +87,17 @@ static const char err_msg[] = Could not trace into fd given by static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) { + static struct trace_key trace_bare = TRACE_KEY_INIT(BARE); + if (!trace_want(key)) return 0; set_try_to_free_routine(NULL); /* is never reset */ + /* unit tests may want to disable additional trace output */ + if (trace_want(trace_bare)) + return 1; + /* add line prefix here */ return 1; -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 09/16] trace: add current timestamp to all trace output
This is useful to tell apart trace output of separate test runs. It can also be used for basic, coarse-grained performance analysis. Note that the accuracy is tainted by writing to the trace file, and you have to calculate the deltas yourself (which is next to impossible if multiple threads or processes are involved). Signed-off-by: Karsten Blees bl...@dcon.de --- trace.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/trace.c b/trace.c index a194b16..18e5d93 100644 --- a/trace.c +++ b/trace.c @@ -88,6 +88,9 @@ static const char err_msg[] = Could not trace into fd given by static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) { static struct trace_key trace_bare = TRACE_KEY_INIT(BARE); + struct timeval tv; + struct tm tm; + time_t secs; if (!trace_want(key)) return 0; @@ -98,7 +101,12 @@ static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) if (trace_want(trace_bare)) return 1; - /* add line prefix here */ + /* print current timestamp */ + gettimeofday(tv, NULL); + secs = tv.tv_sec; + localtime_r(secs, tm); + strbuf_addf(buf, %02d:%02d:%02d.%06ld , tm.tm_hour, tm.tm_min, + tm.tm_sec, (long) tv.tv_usec); return 1; } -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 11/16] trace: add 'file:line' to all trace output
This is useful to see where trace output came from. Add 'const char *file, int line' parameters to the printing functions and rename them to *_fl. Add trace_printf* and trace_strbuf macros resolving to the *_fl functions and let the preprocessor fill in __FILE__ and __LINE__. As the trace_printf* functions take a variable number of arguments, this requires variadic macros (i.e. '#define foo(...) foo_impl(__VA_ARGS__)'. Though part of C99, it is unclear whether older compilers support this. Thus keep the old functions and only enable variadic macros for GNUC and MSVC 2005+ (_MSC_VER 1400). This has the nice side effect that the old C-style declarations serve as documentation how the macros are to be used. Print 'file:line ' as prefix to each trace line. Align the remaining trace output at column 40 to accommodate 18 char file names + 4 digit line number (currently there are 30 *.c files of length 18 and just 11 of 19). Trace output from longer source files (e.g. builtin/receive-pack.c) will not be aligned. Signed-off-by: Karsten Blees bl...@dcon.de --- git-compat-util.h | 4 trace.c | 72 +-- trace.h | 54 + 3 files changed, 118 insertions(+), 12 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index b6f03b3..4dd065d 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -704,6 +704,10 @@ void git_qsort(void *base, size_t nmemb, size_t size, #endif #endif +#if defined(__GNUC__) || (_MSC_VER = 1400) +#define HAVE_VARIADIC_MACROS 1 +#endif + /* * Preserves errno, prints a message, but gives no warning for ENOENT. * Always returns the return value of unlink(2). diff --git a/trace.c b/trace.c index e8ce619..f013958 100644 --- a/trace.c +++ b/trace.c @@ -85,7 +85,8 @@ void trace_disable(struct trace_key *key) static const char err_msg[] = Could not trace into fd given by GIT_TRACE environment variable; -static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) +static int prepare_trace_line(const char *file, int line, + struct trace_key *key, struct strbuf *buf) { static struct trace_key trace_bare = TRACE_KEY_INIT(BARE); struct timeval tv; @@ -108,6 +109,14 @@ static int prepare_trace_line(struct trace_key *key, struct strbuf *buf) strbuf_addf(buf, %02d:%02d:%02d.%06ld , tm.tm_hour, tm.tm_min, tm.tm_sec, (long) tv.tv_usec); +#ifdef HAVE_VARIADIC_MACROS + /* print file:line */ + strbuf_addf(buf, %s:%d , file, line); + /* align trace output (column 40 catches most files names in git) */ + while (buf-len 40) + strbuf_addch(buf, ' '); +#endif + return 1; } @@ -121,49 +130,52 @@ static void print_trace_line(struct trace_key *key, struct strbuf *buf) strbuf_release(buf); } -static void trace_vprintf(struct trace_key *key, const char *format, va_list ap) +static void trace_vprintf_fl(const char *file, int line, struct trace_key *key, +const char *format, va_list ap) { struct strbuf buf = STRBUF_INIT; - if (!prepare_trace_line(key, buf)) + if (!prepare_trace_line(file, line, key, buf)) return; strbuf_vaddf(buf, format, ap); print_trace_line(key, buf); } -void trace_argv_printf(const char **argv, const char *format, ...) +static void trace_argv_vprintf_fl(const char *file, int line, + const char **argv, const char *format, + va_list ap) { struct strbuf buf = STRBUF_INIT; - va_list ap; - if (!prepare_trace_line(NULL, buf)) + if (!prepare_trace_line(file, line, NULL, buf)) return; - va_start(ap, format); strbuf_vaddf(buf, format, ap); - va_end(ap); sq_quote_argv(buf, argv, 0); print_trace_line(NULL, buf); } -void trace_strbuf(struct trace_key *key, const struct strbuf *data) +void trace_strbuf_fl(const char *file, int line, struct trace_key *key, +const struct strbuf *data) { struct strbuf buf = STRBUF_INIT; - if (!prepare_trace_line(key, buf)) + if (!prepare_trace_line(file, line, key, buf)) return; strbuf_addbuf(buf, data); print_trace_line(key, buf); } +#ifndef HAVE_VARIADIC_MACROS + void trace_printf(const char *format, ...) { va_list ap; va_start(ap, format); - trace_vprintf(NULL, format, ap); + trace_vprintf_fl(NULL, 0, NULL, format, ap); va_end(ap); } @@ -171,10 +183,46 @@ void trace_printf_key(struct trace_key *key, const char *format, ...) { va_list ap; va_start(ap, format); - trace_vprintf(key, format, ap); + trace_vprintf_fl(NULL, 0, key, format, ap); + va_end(ap); +} + +void trace_argv_printf(const char
[PATCH v7 10/16] trace: move code around, in preparation to file:line output
No functional changes, just move stuff around so that the next patch isn't that ugly... Signed-off-by: Karsten Blees bl...@dcon.de --- trace.c | 36 ++-- trace.h | 12 2 files changed, 26 insertions(+), 22 deletions(-) diff --git a/trace.c b/trace.c index 18e5d93..e8ce619 100644 --- a/trace.c +++ b/trace.c @@ -132,20 +132,20 @@ static void trace_vprintf(struct trace_key *key, const char *format, va_list ap) print_trace_line(key, buf); } -void trace_printf_key(struct trace_key *key, const char *format, ...) +void trace_argv_printf(const char **argv, const char *format, ...) { + struct strbuf buf = STRBUF_INIT; va_list ap; - va_start(ap, format); - trace_vprintf(key, format, ap); - va_end(ap); -} -void trace_printf(const char *format, ...) -{ - va_list ap; + if (!prepare_trace_line(NULL, buf)) + return; + va_start(ap, format); - trace_vprintf(NULL, format, ap); + strbuf_vaddf(buf, format, ap); va_end(ap); + + sq_quote_argv(buf, argv, 0); + print_trace_line(NULL, buf); } void trace_strbuf(struct trace_key *key, const struct strbuf *data) @@ -159,20 +159,20 @@ void trace_strbuf(struct trace_key *key, const struct strbuf *data) print_trace_line(key, buf); } -void trace_argv_printf(const char **argv, const char *format, ...) +void trace_printf(const char *format, ...) { - struct strbuf buf = STRBUF_INIT; va_list ap; - - if (!prepare_trace_line(NULL, buf)) - return; - va_start(ap, format); - strbuf_vaddf(buf, format, ap); + trace_vprintf(NULL, format, ap); va_end(ap); +} - sq_quote_argv(buf, argv, 0); - print_trace_line(NULL, buf); +void trace_printf_key(struct trace_key *key, const char *format, ...) +{ + va_list ap; + va_start(ap, format); + trace_vprintf(key, format, ap); + va_end(ap); } static const char *quote_crnl(const char *path) diff --git a/trace.h b/trace.h index e69455f..0c98da2 100644 --- a/trace.h +++ b/trace.h @@ -13,15 +13,19 @@ struct trace_key { #define TRACE_KEY_INIT(name) { GIT_TRACE_ #name } -__attribute__((format (printf, 1, 2))) -extern void trace_printf(const char *format, ...); -__attribute__((format (printf, 2, 3))) -extern void trace_argv_printf(const char **argv, const char *format, ...); extern void trace_repo_setup(const char *prefix); extern int trace_want(struct trace_key *key); extern void trace_disable(struct trace_key *key); + +__attribute__((format (printf, 1, 2))) +extern void trace_printf(const char *format, ...); + __attribute__((format (printf, 2, 3))) extern void trace_printf_key(struct trace_key *key, const char *format, ...); + +__attribute__((format (printf, 2, 3))) +extern void trace_argv_printf(const char **argv, const char *format, ...); + extern void trace_strbuf(struct trace_key *key, const struct strbuf *data); #endif /* TRACE_H */ -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 12/16] trace: add high resolution timer function to debug performance issues
Add a getnanotime() function that returns nanoseconds since 01/01/1970 as unsigned 64-bit integer (i.e. overflows in july 2554). This is easier to work with than e.g. struct timeval or struct timespec. Basing the timer on the epoch allows using the results with other time-related APIs. To simplify adaption to different platforms, split the implementation into a common getnanotime() and a platform-specific highres_nanos() function. The common getnanotime() function handles errors, falling back to gettimeofday() if highres_nanos() isn't implemented or doesn't work. getnanotime() is also responsible for normalizing to the epoch. The offset to the system clock is calculated only once on initialization, i.e. manually setting the system clock has no impact on the timer (except if the fallback gettimeofday() is in use). Git processes are typically short lived, so we don't need to handle clock drift. The highres_nanos() function returns monotonically increasing nanoseconds relative to some arbitrary point in time (e.g. system boot), or 0 on failure. Providing platform-specific implementations should be relatively easy, e.g. adapting to clock_gettime() as defined by the POSIX realtime extensions is seven lines of code. This version includes highres_nanos() implementations for: * Linux: using clock_gettime(CLOCK_MONOTONIC) * Windows: using QueryPerformanceCounter() Todo: * enable clock_gettime() on more platforms * add Mac OSX version, e.g. using mach_absolute_time + mach_timebase_info Signed-off-by: Karsten Blees bl...@dcon.de Signed-off-by: Junio C Hamano gits...@pobox.com --- Makefile | 7 + config.mak.uname | 1 + trace.c | 82 trace.h | 1 + 4 files changed, 91 insertions(+) diff --git a/Makefile b/Makefile index 07ea105..80f4390 100644 --- a/Makefile +++ b/Makefile @@ -340,6 +340,8 @@ all:: # # Define GMTIME_UNRELIABLE_ERRORS if your gmtime() function does not # return NULL when it receives a bogus time_t. +# +# Define HAVE_CLOCK_GETTIME if your platform has clock_gettime in librt. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1497,6 +1499,11 @@ ifdef GMTIME_UNRELIABLE_ERRORS BASIC_CFLAGS += -DGMTIME_UNRELIABLE_ERRORS endif +ifdef HAVE_CLOCK_GETTIME + BASIC_CFLAGS += -DHAVE_CLOCK_GETTIME + EXTLIBS += -lrt +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/config.mak.uname b/config.mak.uname index 1ae675b..dad2618 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -34,6 +34,7 @@ ifeq ($(uname_S),Linux) HAVE_PATHS_H = YesPlease LIBC_CONTAINS_LIBINTL = YesPlease HAVE_DEV_TTY = YesPlease + HAVE_CLOCK_GETTIME = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/trace.c b/trace.c index f013958..b9d7272 100644 --- a/trace.c +++ b/trace.c @@ -275,3 +275,85 @@ int trace_want(struct trace_key *key) { return !!get_trace_fd(key); } + +#ifdef HAVE_CLOCK_GETTIME + +static inline uint64_t highres_nanos(void) +{ + struct timespec ts; + if (clock_gettime(CLOCK_MONOTONIC, ts)) + return 0; + return (uint64_t) ts.tv_sec * 10 + ts.tv_nsec; +} + +#elif defined (GIT_WINDOWS_NATIVE) + +static inline uint64_t highres_nanos(void) +{ + static uint64_t high_ns, scaled_low_ns; + static int scale; + LARGE_INTEGER cnt; + + if (!scale) { + if (!QueryPerformanceFrequency(cnt)) + return 0; + + /* high_ns = number of ns per cnt.HighPart */ + high_ns = (10LL 32) / (uint64_t) cnt.QuadPart; + + /* +* Number of ns per cnt.LowPart is 10^9 / frequency (or +* high_ns 32). For maximum precision, we scale this factor +* so that it just fits within 32 bit (i.e. won't overflow if +* multiplied with cnt.LowPart). +*/ + scaled_low_ns = high_ns; + scale = 32; + while (scaled_low_ns = 0x1LL) { + scaled_low_ns = 1; + scale--; + } + } + + /* if QPF worked on initialization, we expect QPC to work as well */ + QueryPerformanceCounter(cnt); + + return (high_ns * cnt.HighPart) + + ((scaled_low_ns * cnt.LowPart) scale); +} + +#else +# define highres_nanos() 0 +#endif + +static inline uint64_t gettimeofday_nanos(void) +{ + struct timeval tv; + gettimeofday(tv, NULL); + return (uint64_t) tv.tv_sec * 10 + tv.tv_usec * 1000; +} + +/* + * Returns nanoseconds since the epoch (01/01/1970), for performance tracing + * (i.e. favoring high precision over wall clock time accuracy). + */ +inline uint64_t getnanotime(void) +{ + static uint64_t offset; + if (offset 1) { + /* initialization
[PATCH v7 13/16] trace: add trace_performance facility to debug performance issues
Add trace_performance and trace_performance_since macros that print a duration and an optional printf-formatted text to the file specified in environment variable GIT_TRACE_PERFORMANCE. These macros, in conjunction with getnanotime(), are intended to simplify performance measurements from within the application (i.e. profiling via manual instrumentation, rather than using an external profiling tool). Unless enabled via GIT_TRACE_PERFORMANCE, these macros have no noticeable impact on performance, so that test code for well known time killers may be shipped in release builds. Alternatively, a developer could provide an additional performance patch (not meant for master) that allows reviewers to reproduce performance tests more easily, e.g. on other platforms or using their own repositories. Usage examples: Simple use case (measure one code section): uint64_t start = getnanotime(); /* code section to measure */ trace_performance_since(start, foobar); Complex use case (measure repetitive code sections): uint64_t t = 0; for (;;) { /* ignore */ t -= getnanotime(); /* code section to measure */ t += getnanotime(); /* ignore */ } trace_performance(t, frotz); Signed-off-by: Karsten Blees bl...@dcon.de --- trace.c | 47 +++ trace.h | 18 ++ 2 files changed, 65 insertions(+) diff --git a/trace.c b/trace.c index b9d7272..af64dbb 100644 --- a/trace.c +++ b/trace.c @@ -169,6 +169,27 @@ void trace_strbuf_fl(const char *file, int line, struct trace_key *key, print_trace_line(key, buf); } +static struct trace_key trace_perf_key = TRACE_KEY_INIT(PERFORMANCE); + +static void trace_performance_vprintf_fl(const char *file, int line, +uint64_t nanos, const char *format, +va_list ap) +{ + struct strbuf buf = STRBUF_INIT; + + if (!prepare_trace_line(file, line, trace_perf_key, buf)) + return; + + strbuf_addf(buf, performance: %.9f s, (double) nanos / 10); + + if (format *format) { + strbuf_addstr(buf, : ); + strbuf_vaddf(buf, format, ap); + } + + print_trace_line(trace_perf_key, buf); +} + #ifndef HAVE_VARIADIC_MACROS void trace_printf(const char *format, ...) @@ -200,6 +221,23 @@ void trace_strbuf(const char *key, const struct strbuf *data) trace_strbuf_fl(NULL, 0, key, data); } +void trace_performance(uint64_t nanos, const char *format, ...) +{ + va_list ap; + va_start(ap, format); + trace_performance_vprintf_fl(NULL, 0, nanos, format, ap); + va_end(ap); +} + +void trace_performance_since(uint64_t start, const char *format, ...) +{ + va_list ap; + va_start(ap, format); + trace_performance_vprintf_fl(NULL, 0, getnanotime() - start, +format, ap); + va_end(ap); +} + #else void trace_printf_key_fl(const char *file, int line, struct trace_key *key, @@ -220,6 +258,15 @@ void trace_argv_printf_fl(const char *file, int line, const char **argv, va_end(ap); } +void trace_performance_fl(const char *file, int line, uint64_t nanos, + const char *format, ...) +{ + va_list ap; + va_start(ap, format); + trace_performance_vprintf_fl(file, line, nanos, format, ap); + va_end(ap); +} + #endif /* HAVE_VARIADIC_MACROS */ diff --git a/trace.h b/trace.h index 6ad29dd..92d0fa6 100644 --- a/trace.h +++ b/trace.h @@ -31,6 +31,14 @@ extern void trace_argv_printf(const char **argv, const char *format, ...); extern void trace_strbuf(struct trace_key *key, const struct strbuf *data); +/* Prints elapsed time (in nanoseconds) if GIT_TRACE_PERFORMANCE is enabled. */ +__attribute__((format (printf, 2, 3))) +extern void trace_performance(uint64_t nanos, const char *format, ...); + +/* Prints elapsed time since 'start' if GIT_TRACE_PERFORMANCE is enabled. */ +__attribute__((format (printf, 2, 3))) +extern void trace_performance_since(uint64_t start, const char *format, ...); + #else /* @@ -71,6 +79,13 @@ extern void trace_strbuf(struct trace_key *key, const struct strbuf *data); #define trace_strbuf(key, data) \ trace_strbuf_fl(TRACE_CONTEXT, __LINE__, key, data) +#define trace_performance(nanos, ...) \ + trace_performance_fl(TRACE_CONTEXT, __LINE__, nanos, __VA_ARGS__) + +#define trace_performance_since(start, ...) \ + trace_performance_fl(TRACE_CONTEXT, __LINE__, getnanotime() - (start), \ +__VA_ARGS__) + /* backend functions, use non-*fl macros instead */ __attribute__((format (printf, 4, 5))) extern void trace_printf_key_fl(const char *file, int line, struct trace_key *key, @@ -80,6 +95,9 @@ extern void trace_argv_printf_fl(const char *file, int line, const char **argv, const char *format, ...); extern void
[PATCH v7 14/16] git: add performance tracing for git's main() function to debug scripts
Use trace_performance to measure and print execution time and command line arguments of the entire main() function. In constrast to the shell's 'time' utility, which measures total time of the parent process, this logs all involved git commands recursively. This is particularly useful to debug performance issues of scripted commands (i.e. which git commands were called with which parameters, and how long did they execute). Due to git's deliberate use of exit(), the implementation uses an atexit routine rather than just adding trace_performance_since() at the end of main(). Usage example: GIT_TRACE_PERFORMANCE=~/git-trace.log git stash list Creates a log file like this: 23:57:38.638765 trace.c:405 performance: 0.000310107 s: git command: 'git' 'rev-parse' '--git-dir' 23:57:38.644387 trace.c:405 performance: 0.000261759 s: git command: 'git' 'rev-parse' '--show-toplevel' 23:57:38.646207 trace.c:405 performance: 0.000304468 s: git command: 'git' 'config' '--get-colorbool' 'color.interactive' 23:57:38.648491 trace.c:405 performance: 0.000241667 s: git command: 'git' 'config' '--get-color' 'color.interactive.help' 'red bold' 23:57:38.650465 trace.c:405 performance: 0.000243063 s: git command: 'git' 'config' '--get-color' '' 'reset' 23:57:38.654850 trace.c:405 performance: 0.025126313 s: git command: 'git' 'stash' 'list' Signed-off-by: Karsten Blees bl...@dcon.de --- Documentation/git.txt | 5 + git.c | 2 ++ trace.c | 22 ++ trace.h | 1 + 4 files changed, 30 insertions(+) diff --git a/Documentation/git.txt b/Documentation/git.txt index 9d073f6..fcb8afa 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -938,6 +938,11 @@ Unsetting the variable, or setting it to empty, 0 or starting with PACK. See 'GIT_TRACE' for available trace output options. +'GIT_TRACE_PERFORMANCE':: + Enables performance related trace messages, e.g. total execution + time of each Git command. + See 'GIT_TRACE' for available trace output options. + 'GIT_TRACE_SETUP':: Enables trace messages printing the .git, working tree and current working directory after Git has completed its setup phase. diff --git a/git.c b/git.c index 7780572..d4daeb8 100644 --- a/git.c +++ b/git.c @@ -568,6 +568,8 @@ int main(int argc, char **av) git_setup_gettext(); + trace_command_performance(argv); + /* * git- is the same as git , but we obviously: * diff --git a/trace.c b/trace.c index af64dbb..e583dc6 100644 --- a/trace.c +++ b/trace.c @@ -404,3 +404,25 @@ inline uint64_t getnanotime(void) return now; } } + +static uint64_t command_start_time; +static struct strbuf command_line = STRBUF_INIT; + +static void print_command_performance_atexit(void) +{ + trace_performance_since(command_start_time, git command:%s, + command_line.buf); +} + +void trace_command_performance(const char **argv) +{ + if (!trace_want(trace_perf_key)) + return; + + if (!command_start_time) + atexit(print_command_performance_atexit); + + strbuf_reset(command_line); + sq_quote_argv(command_line, argv, 0); + command_start_time = getnanotime(); +} diff --git a/trace.h b/trace.h index 92d0fa6..74d7396 100644 --- a/trace.h +++ b/trace.h @@ -17,6 +17,7 @@ extern void trace_repo_setup(const char *prefix); extern int trace_want(struct trace_key *key); extern void trace_disable(struct trace_key *key); extern uint64_t getnanotime(void); +extern void trace_command_performance(const char **argv); #ifndef HAVE_VARIADIC_MACROS -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 15/16] wt-status: simplify performance measurement by using getnanotime()
Calculating duration from a single uint64_t is simpler than from a struct timeval. Change performance measurement for 'advice.statusuoption' from gettimeofday() to getnanotime(). Also initialize t_begin to prevent uninitialized variable warning. Signed-off-by: Karsten Blees bl...@dcon.de --- wt-status.c | 14 +++--- 1 file changed, 3 insertions(+), 11 deletions(-) diff --git a/wt-status.c b/wt-status.c index 318a191..dfdc018 100644 --- a/wt-status.c +++ b/wt-status.c @@ -574,14 +574,11 @@ static void wt_status_collect_untracked(struct wt_status *s) { int i; struct dir_struct dir; - struct timeval t_begin; + uint64_t t_begin = getnanotime(); if (!s-show_untracked_files) return; - if (advice_status_u_option) - gettimeofday(t_begin, NULL); - memset(dir, 0, sizeof(dir)); if (s-show_untracked_files != SHOW_ALL_UNTRACKED_FILES) dir.flags |= @@ -612,13 +609,8 @@ static void wt_status_collect_untracked(struct wt_status *s) free(dir.ignored); clear_directory(dir); - if (advice_status_u_option) { - struct timeval t_end; - gettimeofday(t_end, NULL); - s-untracked_in_ms = - (uint64_t)t_end.tv_sec * 1000 + t_end.tv_usec / 1000 - - ((uint64_t)t_begin.tv_sec * 1000 + t_begin.tv_usec / 1000); - } + if (advice_status_u_option) + s-untracked_in_ms = (getnanotime() - t_begin) / 100; } void wt_status_collect(struct wt_status *s) -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 16/16] progress: simplify performance measurement by using getnanotime()
Calculating duration from a single uint64_t is simpler than from a struct timeval. Change throughput measurement from gettimeofday() to getnanotime(). Also calculate misec only if needed, and change integer division to integer multiplication + shift, which should be slightly faster. Signed-off-by: Karsten Blees bl...@dcon.de --- progress.c | 71 +++--- 1 file changed, 36 insertions(+), 35 deletions(-) diff --git a/progress.c b/progress.c index 261314e..412e6b1 100644 --- a/progress.c +++ b/progress.c @@ -12,13 +12,14 @@ #include gettext.h #include progress.h #include strbuf.h +#include trace.h #define TP_IDX_MAX 8 struct throughput { off_t curr_total; off_t prev_total; - struct timeval prev_tv; + uint64_t prev_ns; unsigned int avg_bytes; unsigned int avg_misecs; unsigned int last_bytes[TP_IDX_MAX]; @@ -127,65 +128,65 @@ static void throughput_string(struct strbuf *buf, off_t total, void display_throughput(struct progress *progress, off_t total) { struct throughput *tp; - struct timeval tv; - unsigned int misecs; + uint64_t now_ns; + unsigned int misecs, count, rate; + struct strbuf buf = STRBUF_INIT; if (!progress) return; tp = progress-throughput; - gettimeofday(tv, NULL); + now_ns = getnanotime(); if (!tp) { progress-throughput = tp = calloc(1, sizeof(*tp)); if (tp) { tp-prev_total = tp-curr_total = total; - tp-prev_tv = tv; + tp-prev_ns = now_ns; } return; } tp-curr_total = total; + /* only update throughput every 0.5 s */ + if (now_ns - tp-prev_ns = 5) + return; + /* -* We have x = bytes and y = microsecs. We want z = KiB/s: +* We have x = bytes and y = nanosecs. We want z = KiB/s: * -* z = (x / 1024) / (y / 100) -* z = x / y * 100 / 1024 -* z = x / (y * 1024 / 100) +* z = (x / 1024) / (y / 10) +* z = x / y * 10 / 1024 +* z = x / (y * 1024 / 10) * z = x / y' * * To simplify things we'll keep track of misecs, or 1024th of a sec * obtained with: * -* y' = y * 1024 / 100 -* y' = y / (100 / 1024) -* y' = y / 977 +* y' = y * 1024 / 10 +* y' = y * (2^10 / 2^42) * (2^42 / 10) +* y' = y / 2^32 * 4398 +* y' = (y * 4398) 32 */ - misecs = (tv.tv_sec - tp-prev_tv.tv_sec) * 1024; - misecs += (int)(tv.tv_usec - tp-prev_tv.tv_usec) / 977; + misecs = ((now_ns - tp-prev_ns) * 4398) 32; - if (misecs 512) { - struct strbuf buf = STRBUF_INIT; - unsigned int count, rate; + count = total - tp-prev_total; + tp-prev_total = total; + tp-prev_ns = now_ns; + tp-avg_bytes += count; + tp-avg_misecs += misecs; + rate = tp-avg_bytes / tp-avg_misecs; + tp-avg_bytes -= tp-last_bytes[tp-idx]; + tp-avg_misecs -= tp-last_misecs[tp-idx]; + tp-last_bytes[tp-idx] = count; + tp-last_misecs[tp-idx] = misecs; + tp-idx = (tp-idx + 1) % TP_IDX_MAX; - count = total - tp-prev_total; - tp-prev_total = total; - tp-prev_tv = tv; - tp-avg_bytes += count; - tp-avg_misecs += misecs; - rate = tp-avg_bytes / tp-avg_misecs; - tp-avg_bytes -= tp-last_bytes[tp-idx]; - tp-avg_misecs -= tp-last_misecs[tp-idx]; - tp-last_bytes[tp-idx] = count; - tp-last_misecs[tp-idx] = misecs; - tp-idx = (tp-idx + 1) % TP_IDX_MAX; - - throughput_string(buf, total, rate); - strncpy(tp-display, buf.buf, sizeof(tp-display)); - strbuf_release(buf); - if (progress-last_value != -1 progress_update) - display(progress, progress-last_value, NULL); - } + throughput_string(buf, total, rate); + strncpy(tp-display, buf.buf, sizeof(tp-display)); + strbuf_release(buf); + if (progress-last_value != -1 progress_update) + display(progress, progress-last_value, NULL); } int display_progress(struct progress *progress, unsigned n) -- 2.0.0.406.ge74f8ff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] git-merge-file: do not add LF at EOF while applying unrelated change
On Mon, Jun 30, 2014 at 04:55:10PM +0200, Johannes Schindelin wrote: I just wish the tests were a little easier to understand... What could be improved with them? Having said that, here is my ACK for the current revision of the patch series Thanks. By the way, for \r\n eol it did even worse, adding just \n. And I guess it still adds just \n for union merge. Should file merge consider the core.eol? I think it should, and for the conflict markers also, it looks ugly when whole file has \r\n but the conflict markers have \n. But then git-merge-file could not be used outside of repository, I guess. In general, I wish file merging (and diffing) were more tolerant of the line endings in input. Because in windows environment, when people have different core.autocrlf, it becomes quite frustrating to always get conflicts and changes. -- Max -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html