Re: Weird revision walk behaviour
On 31/05/2018 08:43, Jeff King wrote: If there are zero parents (neither relevant nor irrelevant), is it still TREESAME? I would say in theory yes. Not sure - I think roots are such a special case that TREESAME effectively doesn't matter. We always test for roots first. So what I was proposing would be to rewrite the parents to the empty set. That feels a bit radical - I believe we need to retain (some) parent information for modes that show it (eg the dangling unfilled circles in gitk). And making it a root I think could cause other problems with making it look like we have a disjoint history. I believe the next simplification step may be trying to follow down to the common root. What next here? It looks like we have a proposed solution. Do you want to try to work up a set of tests based on what you wrote earlier? I was hoping Gábor would carry on, as he's made a start... I was just planning to back-seat drive. I'd also love to hear from Junio as the expert in this area, but I think he's been a bit busy with maintainer stuff recently. So maybe I should just be patient. :) Likewise - I have been quite deep into this, but it was a quite short window of investigation a long time ago, and I've not looked at it since. Would like input from someone with more active knowledge. Kevin
Re: Weird revision walk behaviour
On 30/05/2018 00:04, Jeff King wrote: Do we even need to do the parent rewriting here? By definition those parents aren't interesting, and we're TREESAME to whatever is in treesame_parents. So conceptually it seems like we just need a flag "I found a treesame parent", but we only convert that into a TREESAME flag if there are no relevant parents. I think it's necessary to make the rules consistent. To mark the commit as TREESAME here when it's not TREESAME to all its parents would be inconsistent with the definition of the TREESAME flag used everywhere else: * Original definition: "A commit is TREESAME if it is treesame to any parent" * d0af66 definition: "A commit is TREESAME if it is treesame to all parents" * Current 4d8266 definition: "A commit is TREESAME if it is treesame to all relevant parents; if no relevant parents then if it is treesame to all (irrelevant) parents." The current problem is that the node is not marked TREESAME, but that's consistent with the definition. I think we do have to rewrite the commit so it is TREESAME as per the definition. Not flag it as TREESAME in violation of it. It's possible you *could* get away with just flagging, because we never recompute the TREESAME flag in simple history mode. But it would be a cheat, and it may have other side effects. It means this node would remain a special rare case for others to trip up on later. And I don't think it simplifies the scan. Remembering "pointer-to-first-treesame-parent" (not a list) for the rewrite is no more complex than remembering "bool-there-was-a-treesame-parent". (A bool is what earlier code did - it worked for the original TREESAME definition. My patch series dropped that bool without replacement - missing this all-irrelevant case). In the simple history mode, the assumption is we're "simplifying away merges up-front" here; we won't (and can't) rewrite parents later in a way that needs to recompute TREESAME. In the initial scan when all parents are relevant and we matched one, the commit became TREESAME as per the new definition immediately because of the rewrite. This applies the equivalent rewrite when no relevant parents, consistent with the general concept, and without changing the TREESAME definition. Kevin
Re: Weird revision walk behaviour
On 29/05/2018 01:06, SZEDER Gábor wrote: So, without investing nearly enough time to understand what is going on, I massaged the above diffs into this: Cool. + treesame_parents = xmalloc(sizeof(*treesame_parents)); There's no need to actually record a list here. This is just for the simple history. We are only interested into becoming a non-merge to 1 treesame parent, so I think we just need to record a pointer to the first one we see, just as this would exit immediately for the first relevant one. For the full-history case, we already have the full "which parents are treesame" recording mechanism just above, but it only kicks in for merge commits and only when settings require it. Adding a malloc here would be significant machinery overhead. FWIW, the test suite passes with the above patch applied. I doubt there's an existing case like this anywhere in the revision test suite :) . And this patch is focused enough that it *should* only be changing the behaviour of this very specific case. As such, it does feel a little like a kludge, but I think it's fine because it's aligning the simple-history analysis with the "analyse relevant parents if any, else analyse irrelevant" rule of the full-history. And here is the small PoC test case to illustrate the issue, which fails without but succeeds with the above patch. Eventually it should be part of 't6012-rev-list-simplify.sh', of course, but I haven't looked into that yet. It may be there's enough criss-crossy history to test here to merit breaking out into a second test series. +# B---M2 master +# / \ / +# A X +# \ / \ +# C---M1 b2 +# +# and modify 'file' in commits 'A' and 'B', so one of 'M1's parents +# ('B') is TREESAME wrt. 'file'. I guess we'll be wanting test cases for A..B2, B..B2 and C..B2, and some where the the base is "some other child of B or C". "B..B2" is no longer a pure set subtraction for simplification as B is UNINTERESTING (ie not in the set) but RELEVANT (because you named it as a bottom commit), so B..B2 actually still leaves M1 with 2 relevant parents. You'd want test cases covering B relevant+C irrelevant and B irrelevant+C relevant, which means subtracting them without naming them - so name a child of one. And then we need to think about whether we want it displayed in each of the other modes for each of those queries... Kevin
Re: Weird revision walk behaviour
On 24/05/2018 23:26, Kevin Bracey wrote: On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote: $ git log --oneline master..ba95710a3b -- ci/ ea44c0a594 Merge branch 'bw/protocol-v2' into jt/partial-clone-proto-v2 In this case, we're hitting a merge commit which is not on master, but it has two parents which both are. Which, IIRC, means the merge commit is INTERESTING with two UNINTERESTING parents; and we are TREESAME to only one of them. The commit changing the logic of TREESAME you identified believes that those TREESAME changes for merges which were intended to improve fuller history modes shouldn't affect the simple history "because partially TREESAME merges are turned into normal commits". Clearly that didn't happen here. Haven't currently got a development environment set up here, but I've been looking at the code.Here's a proposal, untested, as a potential starting point if anyone wants to consider a proper patch. The simplify_history first-scan logic never actually turned merges into simple commits unless they were TREESAME to a relevant/interesting parent. Anything where the TREESAME parent was UNINTERESTING was retained as a merge, but had its TREESAME flag set, and that permitted later simplification. With the redefinition of the TREESAME flag, this merge commit is no longer TREESAME, and as the decoration logic to refine TREESAME isn't active for simplify_history, it doesn't get cleaned up (even if it would be in full history?) I think the answer may be to add an extra post-process step on the initial loop to handle this special case. Something like: case REV_TREE_SAME: if (!revs->simplify_history || !relevant_commit(p)) { /* Even if a merge with an uninteresting * side branch brought the entire change * we are interested in, we do not want * to lose the other branches of this * merge, so we just keep going. */ if (ts) ts->treesame[nth_parent] = 1; + /* But we note it for potential later simplification */ + if (!treesame_parent) + treesame_parent = p; continue; } ... After loop: + if (relevant_parents == 0 && revs->simplify_history && treesame_parent) { + treesame_parent->next = NULL;// Repeats code from loop - share somehow? + commit->parents = treesame_parent; + commit->object.flags |= TREESAME; + return; + } /* * TREESAME is straightforward for single-parent commits. For merge The other option would be to take off the " || !relevant_commit(p)" test, but I'm assuming that is still needed for other cases. Kevin
Re: Weird revision walk behaviour
On 23/05/2018 20:35, Jeff King wrote: On Wed, May 23, 2018 at 01:32:46PM -0400, Jeff King wrote: On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote: $ git log --oneline master..ba95710a3b -- ci/ ea44c0a594 Merge branch 'bw/protocol-v2' into jt/partial-clone-proto-v2 I keep some older builds around, and it does not reproduce with v1.6.6.3 (that's my usual goto for "old"). Bisecting turns up d0af663e42 (revision.c: Make --full-history consider more merges, 2013-05-16). It looks like an unintended change (the commit message claims that the non-full-history case shouldn't be affected). There's more discussion in the thread at: https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/ I haven't absorbed it all yet, but I'm adding Junio to the cc. In this case, we're hitting a merge commit which is not on master, but it has two parents which both are. Which, IIRC, means the merge commit is INTERESTING with two UNINTERESTING parents; and we are TREESAME to only one of them. The commit changing the logic of TREESAME you identified believes that those TREESAME changes for merges which were intended to improve fuller history modes shouldn't affect the simple history "because partially TREESAME merges are turned into normal commits". Clearly that didn't happen here. I think we need to look at why that isn't happening, and if it can be made to happen. The problem is that this commit is effectively the base of the graph - it's got a double-connection to the UNINTERESTING set, and maybe that prevented the simple history "follow 1 TREESAME" logic from kicking in. Maybe it won't follow 1 TREESAME to UNINTERESTING. I know there were quite a few changes later in the series to try to reconcile the simple and full history, for the cases where the simple history takes a weird path because of its love of TREESAME parents, hiding evil merges. But I believe the simple history behaviour was supposed to remain as-is - take first TREESAME always. Kevin
Re: Weird revision walk behaviour
On 23/05/2018 20:35, Jeff King wrote: There's more discussion in the thread at: https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/ I haven't absorbed it all yet, but I'm adding Junio to the cc. Just to ack that I've seen the discussion, but I can't identify the code's reasoning at the moment. My recollection is that I accepted while coming up with the algorithm that it might err slightly on the side of false positives in the display - there were some merge cases I was unable to fully distinguish whether or not the merge had lost a change it shouldn't have done, and if I was uncertain I'd rather show it than not. The first commit was not originally intended to alter behaviour for anything other than --full-history, but later in the chain there was specific consideration into tracking the path to the specified "bottom" commit. It may be that's part of what's happening here. Kevin
Re: [PATCH 1/3] add QSORT
On 04/10/2016 23:31, René Scharfe wrote: So let's summarize; here's the effect of a raw qsort(3) call: array == NULL nmemb bug QSORT following NULL check - - --- - 0 0 no qsort is skipped 0 >0 no qsort is skipped 1 0 no qsort is skipped (bad!) ** 1 >0 yes qsort is skipped ** Right - row 3 may not be a bug from the point of view of your internals, but it means you violate the API of qsort.Therefore a fix is required. With the micro-optimization removed (nmemb > 0) the matrix gets simpler: array == NULL nmemb bug QSORT following NULL check - - --- - 0 0 no noop is executed 0 >0 no qsort is skipped 1 0 no noop is executed 1 >0 yes qsort is skipped ** And with your NULL check (array != NULL) we'd get: array == NULL nmemb bug QSORT following NULL check - - --- - 0 0 no qsort reuses check result 0 >0 no qsort reuses check result 1 0 no noop reuses check result 1 >0 yes noop reuses check result Did I get it right? AFAICS all variants (except raw qsort) are safe -- no useful NULL checks are removed, and buggy code should be noticed by segfaults in code accessing the sorted array. I think your tables are correct. But I disagree that you could ever call invoking the "" lines safe. Unless you have documentation on what limit GCC (and your other compilers) are prepared to put on the undefined behaviour of violating that "non-null" constraint. Up to now dereferencing a null pointer has been implicitly (or explicitly?) defined as simply generating SIGSEGV. And that has naturally extended into NULL passed to library implementations. But that's no longer true - it seems bets are somewhat off. But, as long as you are confident you never invoke that line without a program bug - ie an API precondition of your own QSORT is that NULL is legal iff nmemb is zero, then I guess it's fine. Behaviour is defined, as long as you don't violate your internal preconditions. Kevin
Re: [PATCH 1/3] add QSORT
On 04/10/2016 01:00, René Scharfe wrote: Am 03.10.2016 um 19:09 schrieb Kevin Bracey: As such, NULL checks can still be elided even with your change. If you effectively change your example to: if (nmemb > 1) qsort(array, nmemb, size, cmp); if (!array) printf("array is NULL\n"); array may only be checked for NULL if nmemb <= 1. You can see GCC doing that in the compiler explorer - it effectively turns that into "else if". We don't support array == NULL together with nmemb > 1, so a segfault is to be expected in such cases, and thus NULL checks can be removed safely. Possibly true in practice. But technically wrong by the C standard - behaviour is undefined if the qsort pointer is invalid. You can't formally expect the defined behaviour of a segfault when sending NULL into qsort. (Hell, maybe the qsort has its own NULL check and silently returns! cf printf - some printfs will segfault when passed NULL, some print "(null)"). I've worked on systems that don't fault reads to NULL, only writes, so those might not segfault there, if NULL appeared sorted... And obviously there's the language lawyer favourite possibility of the call causing nasal flying monkeys or whatever. So if it's not a program error for array to be NULL and nmemb to be zero in your code, and you want a diagnostic for array=NULL, nmemb non-zero, I think you should put that diagnostic into sane_qsort as an assert or something, not rely on qsort's undefined behaviour being a segfault. sane_qsort(blah) { if (nmemb >= 1) { assert(array); qsort(array, nmemb, ...); } } Can't invoke undefined behaviour from NULL without triggering the assert. (Could still have other invalid pointers, of course). Usually I am on the side of "no NULL checks", as I make the assumption that we will get a segfault as soon as NULL pointers are used, and those are generally easy to diagnose. But seeing a compiler invoking this sort of new trickery due to invoking undefined behaviour is making me more nervous about doing so... To make that check really work, you have to do: if (array) qsort(array, nmemb, size, cmp); else printf("array is NULL\n"); So maybe your "sane_qsort" should be checking array, not nmemb. It would be safe, but arguably too much so, because non-empty arrays with NULL wouldn't segfault anymore, and thus become harder to identify as the programming errors they are. Well, you get the print. Although I guess you're worrying about the second if being real code, not a debugging check. I must say, this is quite a courageous new optimisation from GCC. It strikes me as finding a language lawyer loophole that seems to have been intended for something else (mapping library functions directly onto CISCy CPU intrinsics), and using it to invent a whole new optimisation that seems more likely to trigger bugs than optimise any significant amount of code in a desirable way. Doubly weird as there's no (standard) language support for this. I don't know how you'd define "my_qsort" that triggered the same optimisations. I've seen similar library-knowledge-without-any-way-to-reproduce-in-user-code optimisations like "malloc returns a new pointer that doesn't alias with anything existing" (and no way to reproduce the optimisation with my_malloc_wrapper). But those seemed to have a clear performance benefit, without any obvious traps. Doubtful about this one. Kevin
Re: [PATCH 1/3] add QSORT
On 01/10/2016 19:19, René Scharfe wrote: It's hard to imagine an implementation of qsort(3) that can't handle zero elements. QSORT's safety feature is that it prevents the compiler from removing NULL checks for the array pointer. E.g. the last two lines in the following example could be optimized away: qsort(ptr, n, sizeof(*ptr), fn); if (!ptr) do_stuff(); You can see that on https://godbolt.org/g/JwS99b -- an awesome website for exploring compilation results for small snippets, by the way. This optimization is dangerous when combined with the convention of using a NULL pointer for empty arrays. Diagnosing an affected NULL check is probably quite hard -- it's right there in the code after all and not all compilers remove it. Hang on, hang on. This is either a compiler bug, or you're wrong on your assumption about the specification of qsort. Either way, the extra layer of indirection is not proper protection. The unwanted compiler optimisation you're inadvertently triggering could still be triggered through the inline. Now, looking at the C standard, this isn't actually clear to me. The standard says that if you call qsort with nmemb zero, the pointer still has to be "valid". Not totally clear to me if NULL is valid, by their definition in C99 7.1.4. Googling hasn't given me a concrete answer. The compiler seems to think that NULL wouldn't be valid, so because you've called qsort on it, you've invoked undefined behaviour if it's NULL, so it's free to elide the NULL check. Kevin
Re: [PATCH 1/3] add QSORT
On 01/10/2016 19:19, René Scharfe wrote: It's hard to imagine an implementation of qsort(3) that can't handle zero elements. QSORT's safety feature is that it prevents the compiler from removing NULL checks for the array pointer. E.g. the last two lines in the following example could be optimized away: qsort(ptr, n, sizeof(*ptr), fn); if (!ptr) do_stuff(); You can see that on https://godbolt.org/g/JwS99b -- an awesome website for exploring compilation results for small snippets, by the way. Ah, second attempt. Originally misread the original code, and didn't understand what it was doing. I get it now. A nasty trap I hadn't been aware of - I was under the impression NULL + zero length was generally legal, but the C standard does indeed not give you a specific out for NULL to library functions in that case. As such, NULL checks can still be elided even with your change. If you effectively change your example to: if (nmemb > 1) qsort(array, nmemb, size, cmp); if (!array) printf("array is NULL\n"); array may only be checked for NULL if nmemb <= 1. You can see GCC doing that in the compiler explorer - it effectively turns that into "else if". To make that check really work, you have to do: if (array) qsort(array, nmemb, size, cmp); else printf("array is NULL\n"); So maybe your "sane_qsort" should be checking array, not nmemb. Kevin
Re: Reset by checkout?
On 07/06/2014 17:52, Philip Oakley wrote: Just to say there has been a similar confusion about 'git reset' reported on the Git Users group for the case of reset with added (staged), but uncommitted changes being wiped out, which simlarly reports on the difficulty of explaining some of the conditions especially when some are wrong ;-) https://groups.google.com/forum/#!topic/git-users/27_FxIV_100 I'm coming around to the view that git reset mode should be (almost) demoted to plumbing, leaving only the reset file that reverses add file as everyday Porcelain. I think reset --keep and --merge were a step in the wrong direction, at least for the Porcelain - trying to make reset mode more useful, rather than less necessary. Normal users shouldn't be needing to touch these hard-to-explain-and-slightly-dangerous commands. The addition of --abort to merge and other commands was much more solid. They helped a lot, and I think we should follow that model by adding --undo to various commands. That would mop up all the common resets, in conjunction with Atsushi's proposed checkout -u alternative to -B, which I quite like. Main few: commit --undo = reset --soft HEAD^ merge --undo = reset --keep HEAD^ rebase --undo = reset --keep ORIG_HEAD [bug report: rebase -p doesn't set ORIG_HEAD reliably] pull --undo = merge/rebase --undo depending on rebase settings [could we go nuts and undo the fetch too?] Bonus: commit --amend --undo: reset --soft HEAD@{1} The undos can also have a bit of extra veneer that checks the log/reflog for whether it matches the proposed undo, and also checks the upstream to see if the thing being undone is already public. Given those, I honestly don't think I'd ever need to explain git reset mode to anyone again. Which would be nice... (Note I propose no --mixed equivalent for the commit undos, but it's easy enough to follow the commit --undo with a normal git reset. I'd rather re-document the normal git reset under commit --undo than add and document yet another option). Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Reset by checkout?
On 03/06/2014 00:54, Junio C Hamano wrote: Not that I can think of a better way to update these descriptions, and not that I am opposing to update these descriptions to make it easier for new people to learn, but I am not sure if these treat ORIG_HEAD and the changes since that commit as separate entities is a good approach to do so. Somewhat frustrated, not by your patch but by being unable to suggest a better way X-. I know. I started off myself knowing what I meant to say, and then got bogged down somewhat trying to be detailed enough for a full explanation. I think it's just inherently very hard for anyone to visualise what these do in the /general/ case. This is one of those commands where the structure of a man page gets in the way. We have to give a summary of what the mode options /do/, but that's not what people want to know. They want to know what they're /for/. (And, to some extent, reset, like checkout, is two separate commands. One being the path manipulator, the other being the HEAD manipulator. Just bogs us down further). I think these are the most important HEAD resets, covering 95%+ of uses: git reset --soft HEAD~n git reset HEAD~n git reset --keep HEAD~n git reset --keep ORIG_HEAD git reset --keep @{n} git reset --keep some other arbitary place (and possibly git reset --merge although I think this should be fully covered by git xxx --abort - maybe a couple of those missing like git stash pop/apply --abort?) Anything more than those, I think, are pretty far-fetched. I can't 100% grok --soft/--mixed onto a different branch, for example. (But at least we do define those cases in the A/B/C/D discussion section for the real geeks.) Maybe we just need to tighten up the EXAMPLES section? Give it easy-to-locate path/--soft/--mixed/--keep subheadings, covering all those common use cases (in clean trees...), including a before/after git status views. Then normal users could skip the top technical section waffling about indexes and go straight there instead. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Reset by checkout?
On 01/06/2014 07:26, Atsushi Nakagawa wrote: Kevin Bracey ke...@bracey.fi wrote: The original git reset --hard used to be a pretty top-level command. It was used for aborting merges in particular. But I think it now stands out as being one of the only really dangerous porcelain commands, and I can't think of any real workflow it's still useful for. My thoughts exactly. I think the 'reset --soft/--mixed/--hard' pattern is so ingrained, that many people just don't realize there's a safer alternative. (I've heard work mates on more than one occasion recommending 'reset --hard' as the go-to command for discarding commits.) I believe this is likely because many third party GUI tools just don't support 'reset --keep', and these tools present a Reset... dialog with the de facto Soft/Mixed/Hard options. (Even 'gitk' does this.) True on the GUI - hard really needs demotion. It would help if the documentation explained better straight off what the different reset modes are intended /for/ in a more practical way, rather than the technical jargon. There is the EXAMPLES section, but I think the problem is that it's not clearly laid out by mode, meaning people checking to see what git reset can do are inclined to go first to the --xxx mode list in DESCRIPTION, and stop there, baffled, probably not finding any example for that mode. Maybe the examples should have clearer --option subheadings? (And all the existing examples for --hard should really suggest --keep instead). But given that the DISCUSSION section now has the full internal details on what exactly each mode does in every state, and now that we have more than the simple soft/mixed/hard to deal with, I think the main DESCRIPTION could be simplified for end users. Most useful for visualisation, I feel, would just showing what git status will look like afterwards, primarily from the point of view of a backwards reset to HEAD~n. In particular, normal users don't think in terms of the absolute contents of the index, but rather in terms of diffs. Maybe something like this: All modes move the current branch pointer so that HEAD now points to the specified commit. ORIG_HEAD is set to the original HEAD location. The modes differ in what happens to the contents of ORIG_HEAD, that are no longer on the reset branch; and also what happens to your not-yet-committed changes. --soft Retains the contents of ORIG_HEAD in the index+work area, leaving the difference as changes to be committed. git reset --soft HEAD~1 would be the first step if you want to remove the last commit, but intend to recommit most or all of its changes. git status after reset --soft shows: To be committed: Changes in ORIG_HEAD relative to HEAD (+Any initial staged changes) Not staged: (Any initial unstaged changes) --mixed (default) Retains the contents of ORIG_HEAD in the work area, leaving the difference as unstaged changes. git reset HEAD~1 would be the first step if you want to remove the last commit, and think again from scratch about which of its changes should be committed. git status after reset --mixed shows: Not staged: Changes in ORIG_HEAD relative to HEAD (+Any initial staged changes) (+Any initial unstaged changes) --keep The contents of ORIG_HEAD are dropped, leaving the work area and index containing the new HEAD; your uncommitted changes to unaffected files are retained. If you have uncommitted changes to any files that differ in the proposed new HEAD, the operation is refused; you would need to git stash first. git reset --keep HEAD~1 can be used to totally remove the last commit. (This removal can itself be undone with another git reset --keep ORIG_HEAD, or git reset --keep branch@{n} - see git-reflog(1)). git reset --keep is a safe alternative to --hard, and is roughly equivalent to git checkout -B current-branch-name. git status after reset --keep shows: Not staged (Any initial staged changes) [should these be left staged, as per git checkout?] (+Any initial unstaged changes) --hard All other changes are dropped, and the work area and index are forcibly reset to the new HEAD. Note that this is dangerous if used carelessly: ALL uncommitted changes to ALL tracked files will be lost, even if you were only trying to drop an unrelated commit that didn't touch those files. Older documentation often recommends git reset --hard to undo commits; the newer --keep option is a much better alternative in almost all cases. git status after reset --hard shows: Work area clean (or untracked files present) --merge Performs the operation of git merge --abort, intended for use during a merge resolution - see git-merge(1) for more information. This form is not normally used directly. [Not really true? Still the best command to abort git checkout --merge/git stash pop|apply? Do those need --abort?] If people just forgot about '--hard
Re: Reset by checkout?
On 31/05/2014 08:46, Atsushi Nakagawa wrote: `git checkout -B current-branch-name tree-ish` This is such an useful notion that I can fathom why there isn't a better, first-tier, alternative.q I'm 100% in agreement. Reset current branch to X is an extremely common operation, and I use this all the time. But having to actually name the current branch is silly, and like you, I'm prone to swapping the parameters. I guess in theory using checkout allows fancier extra options like --merge and --patch, but I don't think I've ever used those with checkout, let alone this mode, where I really do just want a reset, with safety checks. The original git reset --hard used to be a pretty top-level command. It was used for aborting merges in particular. But I think it now stands out as being one of the only really dangerous porcelain commands, and I can't think of any real workflow it's still useful for. Maybe it could now be modified to warn and require -f to overwrite anything in the working tree? While digging into this, it seems git reset --keep is actually pretty close to git checkout -B current branch. It certainly won't lose your workspace file, but unlike checkout it /does /forget what you've staged, which could be annoying. Maybe that could be modified to keep the index too? (I like your alias.become - might try that). Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Unicode: update of combining code points
On 16/04/2014 22:58, Torsten Bögershausen wrote: Excellent, thanks for the pointers. Running the script below shows that 0X00AD SOFT HYPHEN should have zero length (and some others too). I wonder if that is really the case, and which one of the last 2 lines in the script is the right one. What does this mean for us: CfFormat a format control character Maybe dig back through the Git logs to check the original logic, but the comments suggest that Cf characters have been viewed as zero-width. That makes sense - they're usually markers indicating things like bidirectional text flow, so won't be taking space. (Although they may be causing even more extreme layout effects...) Soft-hyphen is noted as an explicit exception to the rule in the utf8.c comments. As of Unicode 4.0, it's supposed to be a character indicating a point where a hyphen could be placed if a line-wrap occurs, and if that wrap happens, then it can actually take up 1 space, otherwise not. So its width could be either 0 or 1, depending. Or, quite likely, the terminal doesn't treat it specially, and it always just looks like a hyphen... Thus we err on the safe side and give it width 1. See http://en.wikipedia.org/wiki/Soft_hyphen for background. The comments suggest adding -00AD +1160-11FF to the uniset command line for that tweak and for composing Hangul. (The +200B tweak isn't necessary any more - Zero-Width Space U+200B became Cf officially in Unicode 4.0.1: http://en.wikipedia.org/wiki/Zero-width_space http://www.unicode.org/review/resolved-pri.html#pri21 ) All of this is only really an approximation - a best-effort attempt to figure out the width of a string without any actual communication with the display device. So it'll never be perfect. The choice between double and single width in particular will often be unpredictable, unless you had deeper locale knowledge. Actually, while doing this, I've realised that this was originally Markus Kuhn's implementation, and that is acknowledged at the top of the file: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c Good, because he knows what he's doing. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] utf8.c: partially update to version 6.3
On 17/04/2014 14:47, Torsten Bögershausen wrote: ./uniset/uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B c 200B isn't a special case any more, as its database properties have been changed, so you can slightly simplify this command (both in the commit message and the comments). And while you're here I think it's probably worth updating the width=2 test now too, to make this a complete update to Unicode 6.3. I can see it needs a few extensions (eg A960-A97C, FE10-FE19, 1F200-1F251). Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Unicode: update of combining code points
On 16/04/2014 07:48, Torsten Bögershausen wrote: On 15.04.14 21:10, Peter Krefting wrote: Torsten Bögershausen: diff --git a/utf8.c b/utf8.c index a831d50..77c28d4 100644 --- a/utf8.c +++ b/utf8.c Is there a script that generates this code from the Unicode database files, or did you hand-update it? Some of the code points which have 0 length on the display are called combining, others are called vowels or accents. E.g. 5BF is not marked any of them, but if you look at the glyph, it should be combining (please correct me if that is wrong). Indeed it is combining (more specifically it has General Category Nonspacing_Mark = Mn). If I could have found a file which indicates for each code point, what it is, I could write a script. The most complete and machine-readable data are in these files: http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt The general categories can also be seen more legibly in: http://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedGeneralCategory.txt For docs, see: http://www.unicode.org/reports/tr44/ http://www.unicode.org/reports/tr11/ http://www.unicode.org/ucd/ The existing utf8.c comments describe the attributes being selected from the tables (general categories Cf,Mn,Me, East Asian Width W, F). And they suggest that the combining character table was originally auto-generated from UnicodeData.txt with a uniset tool. Presumably this? https://github.com/depp/uniset The fullwidth-checking code looks like it was done by hand, although apparently uniset can process EastAsianWidth.txt. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: breakage in revision traversal with pathspec
On 11/09/2013 01:23, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: On 10/09/2013 20:19, Junio C Hamano wrote: This command $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl reports that a merge 766f0f8ef7 (which did not touch the specified path at all) touches it. Bisecting points at d0af663e (revision.c: Make --full-history consider more merges, 2013-05-16). That merge appearing *with* --full-history would seem like correct behaviour to me. Or at least it's what I intended. ... But it shouldn't appear if the user does not ask for --full-history. Well, there is a functioning semi-work-around for now: avoid difficult non-linear questions like v1.8.3.1..v1.8.4. A question like v1.8.3..v1.8.4 is a lot easier to visualise, and it does already omit the merge. On reflection I'm not sure what we should for the simple history view of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't get a chance to reconsider the merge as being zero-parent, and we do have this little section of graph to traverse at the bottom: 1.8.3 oxxxx---x--- (x = included, o = excluded, *=!treesame) / /* o--x--x--x--x In effect, we do have a linear section of history to follow, and the file does change in the middle of that line. It may be quite hard to come up with a solid rule to hide the merge that doesn't go wrong somewhere else. The current rules for this are 1) if identical to any on-graph parent, follow that one, and rewrite the merge as a non-merge. We currently do not follow to an identical off-graph parent. This long-standing comment in try_to_simplify_commit applies: Even if a merge with an uninteresting side branch brought the entire change we are interested in, we do not want to lose the other branches of this merge, so we just keep going. For this query, the mainline link to 1.8.3 is the uninteresting side branch! If you do specify v1.8.3..v1.8.4, then v1.8.3 becomes on-graph thanks to other new rules, and this rule does kick in, hiding the merge. 2) If rule 1 doesn't activate, and it remains as a merge, hide it if treesame to all on-graph parents. Previously this rule was hide if treesame to any parent, and so that would have hidden the merge. Now, when I changed rule 2, I did not think this would affect the default log. See my commit message: Now redefine a commit's TREESAME flag to be true only if a commit is TREESAME to _all_ of its [later: on-graph] parent. This doesn't affect ... the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits)... Whoops - partially TREESAME merges are not always turned into normal commits. Maybe the fix is to define TREESAME differently for simplify_history - to use the old definition of identical to any parent in that case. I'm not sure that's right though. I currently feel instinctively more disposed to dropping the older don't follow off-graph identical parents rule. Let the default history go straight to v1.8.3 even though it goes off the graph, stopping us traversing the topic branch. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: breakage in revision traversal with pathspec
On 11/09/2013 21:24, Jonathan Nieder wrote: Kevin Bracey wrote: On reflection I'm not sure what we should for the simple history view of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't get a chance to reconsider the merge as being zero-parent, and we do have this little section of graph to traverse at the bottom: 1.8.3 oxxxx---x--- (x = included, o = excluded, *=!treesame) / /* o--x--x--x--x [...] 1) if identical to any on-graph parent, follow that one, and rewrite the merge as a non-merge. We currently do not follow to an identical off-graph parent. This long-standing comment in try_to_simplify_commit applies: Even if a merge with an uninteresting side branch brought the entire change we are interested in, we do not want to lose the other branches of this merge, so we just keep going. [...] I currently feel instinctively more disposed to dropping the older don't follow off-graph identical parents rule. Let the default history go straight to v1.8.3 even though it goes off the graph, stopping us traversing the topic branch. Thanks for this analysis. Interesting. The rule (1) comes from v1.3.0-rc1~13^2~6: ... I think you're right that dropping the don't follow off-graph treesame parents rule would be a sensible change. The usual point of the follow the treesame parent rule is to avoid drawing undue attention to merges of ancient history where some of the parents are side-branches with an old version of the files being tracked and did not actually change those files. That rationale applies just as much for a merge on top of an UNINTERESTING rev as any other merge. I agree about the rationale still applying - why not follow off-graph, unless you're doing --ancestry-path? (Fortunately ancestry_path already disables simplify_history). That makes more sense if you try to ignore the misleading comment. In a typical v1..v3 range, the temporal limiting means that it's paths to the mainline that will tend to be marked UNINTERESTING, not to the topic branches... But I can imagine going off graph it may previously have tripped up other parts of the code. It could be that this Git 1.3.0 rule ended up covering over some of the older merge hiding logic flakiness. Maybe it's no longer necessary. I'll do some experiments. Now, one bit of news - I have just figured out why gitk is behaving differently. It transforms .. before it reaches git. To see the effect at the command line: git log v1.8.3..v.1.8.4 hides the merge, but git log ^v1.8.3 v1.8.4 shows it. Whoops. A new example of a dotty shorthand not being exactly equivalent. In the .. case the v1.8.3 tag gets peeled before being sent to add_rev_cmdline , and the mark bottom commits logic works. But in the ^ case, the v1.8.3 doesn't get peeled. Junio - any thoughts on the correct place to fix that? (And gitk actually does ^tag-sha, just to be odd, so that needs to be handled too). Should these things be peeled in revs-cmdline or not? We should be consistent. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] am: replace uses of --resolved with --continue
git am was previously modified to provide --continue for consistency with rebase, merge etc, and the documentation changed to showing --continue as the primary form. Complete the work by replacing remaining uses of --resolved by --continue, most notably in suggested command reminders. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/git-am.txt | 4 ++-- Documentation/user-manual.txt | 2 +- git-am.sh | 8 t/t7512-status-help.sh| 4 ++-- wt-status.c | 2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt index 5bbe7b6..54d8461 100644 --- a/Documentation/git-am.txt +++ b/Documentation/git-am.txt @@ -132,7 +132,7 @@ default. You can use `--no-utf8` to override this. --resolvemsg=msg:: When a patch failure occurs, msg will be printed to the screen before exiting. This overrides the - standard message informing you to use `--resolved` + standard message informing you to use `--continue` or `--skip` to handle the failure. This is solely for internal use between 'git rebase' and 'git am'. @@ -176,7 +176,7 @@ aborts in the middle. You can recover from this in one of two ways: . hand resolve the conflict in the working directory, and update the index file to bring it into a state that the patch should - have produced. Then run the command with the '--resolved' option. + have produced. Then run the command with the '--continue' option. The command refuses to process new mailboxes until the current operation is finished, so if you decide to start over from scratch, diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt index e831cc2..8218cf9 100644 --- a/Documentation/user-manual.txt +++ b/Documentation/user-manual.txt @@ -1835,7 +1835,7 @@ Once the index is updated with the results of the conflict resolution, instead of creating a new commit, just run - -$ git am --resolved +$ git am --continue - and Git will create the commit for you and continue applying the diff --git a/git-am.sh b/git-am.sh index 9f44509..7ea40fe 100755 --- a/git-am.sh +++ b/git-am.sh @@ -6,7 +6,7 @@ SUBDIRECTORY_OK=Yes OPTIONS_KEEPDASHDASH= OPTIONS_SPEC=\ git am [options] [(mbox|Maildir)...] -git am [options] (--resolved | --skip | --abort) +git am [options] (--continue | --skip | --abort) -- i,interactive run interactively b,binary* (historical option -- no-op) @@ -102,7 +102,7 @@ stop_here_user_resolve () { printf '%s\n' $resolvemsg stop_here $1 fi -eval_gettextln When you have resolved this problem, run \\$cmdline --resolved\. +eval_gettextln When you have resolved this problem, run \\$cmdline --continue\. If you prefer to skip this patch, run \\$cmdline --skip\ instead. To restore the original branch and stop patching, run \\$cmdline --abort\. @@ -523,7 +523,7 @@ Use \git am --abort\ to remove it.) esac fi - # Make sure we are not given --skip, --resolved, nor --abort + # Make sure we are not given --skip, --continue, nor --abort test $skip$resolved$abort = || die $(gettext Resolve operation not in progress, we are not resuming.) @@ -670,7 +670,7 @@ do # - patch is the patch body. # # When we are resuming, these files are either already prepared - # by the user, or the user can tell us to do so by --resolved flag. + # by the user, or the user can tell us to do so by --continue flag. case $resume in '') if test -f $dotest/rebasing diff --git a/t/t7512-status-help.sh b/t/t7512-status-help.sh index 4f09bec..bd8aab0 100755 --- a/t/t7512-status-help.sh +++ b/t/t7512-status-help.sh @@ -510,7 +510,7 @@ test_expect_success 'status in an am session: file already exists' ' cat expected -\EOF # On branch am_already_exists # You are in the middle of an am session. - # (fix conflicts and then run git am --resolved) + # (fix conflicts and then run git am --continue) # (use git am --skip to skip this patch) # (use git am --abort to restore the original branch) # @@ -532,7 +532,7 @@ test_expect_success 'status in an am session: file does not exist' ' cat expected -\EOF # On branch am_not_exists # You are in the middle of an am session. - # (fix conflicts and then run git am --resolved) + # (fix conflicts and then run git am --continue) # (use git am --skip to skip this patch) # (use git am --abort to restore the original branch) # diff --git a/wt-status.c b/wt-status.c index 438a40d..b191c65 100644 --- a/wt-status.c +++ b/wt-status.c @@ -826,7 +826,7 @@ static void show_am_in_progress(struct
[PATCH] Documentation: Move git diff blob blob
The section describing git diff blob blob had been placed in a position that disrupted the statement This is synonymous to the previous form. Reorder to place this form after all the commit-using forms, and the note applying to them. Also mention this form in the initial description paragraph. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/git-diff.txt | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt index a7b4620..78d6d50 100644 --- a/Documentation/git-diff.txt +++ b/Documentation/git-diff.txt @@ -18,8 +18,8 @@ SYNOPSIS DESCRIPTION --- Show changes between the working tree and the index or a tree, changes -between the index and a tree, changes between two trees, or changes -between two files on disk. +between the index and a tree, changes between two trees, changes between +two blob objects, or changes between two files on disk. 'git diff' [--options] [--] [path...]:: @@ -56,11 +56,6 @@ directories. This behavior can be forced by --no-index. This is to view the changes between two arbitrary commit. -'git diff' [options] blob blob:: - - This form is to view the differences between the raw - contents of two blob objects. - 'git diff' [--options] commit..commit [--] [path...]:: This is synonymous to the previous form. If commit on @@ -87,6 +82,11 @@ and the range notations (commit..commit and commit\...commit) do not mean a range as defined in the SPECIFYING RANGES section in linkgit:gitrevisions[7]. +'git diff' [options] blob blob:: + + This form is to view the differences between the raw + contents of two blob objects. + OPTIONS --- :git-diff: 1 -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] sha1_name: fix error message for @{N}, @{date}
On 21/05/2013 19:52, Junio C Hamano wrote: Ramkumar Ramachandra artag...@gmail.com writes: The empty string '' looks ugly and inconsistent with the output of branch@{N}. Replace it with the string 'current branch'. Wouldn't that be '*the* current branch'? More importantly, doesn't real_ref have the name of the branch? Suppose the user said git show @{1} instead of git show master@{1} while on 'master'. It could be argued that it may look nicer to say your current branch does not have enough update history instead of saying master does not... (i.e. different input to ask for the same thing, different output depending on the way the user asked). It also could be argued that they should produce the same diagnosis that is more informative. I am slightly leaning toward the latter. That would also avoid the complaint I was about to make that putting 'current branch' in scare quotes would be annoying. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] New kind of upstream branch: base branch
On 17/05/2013 22:51, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: On 15/05/2013 23:34, Felipe Contreras wrote: I think I'm using 'upstream' for something it was not intended to, and I think the current 'upstream' behavior should be split into 'upstream' and 'base'. I found myself thinking the same thing. It's really convenient being able to set your topic branch's upstream to another local branch, so What is that another local branch? ... And if that is your workflow, setting push.default to current (and setting remote.pushdefault to your publishing repository) should be a sufficient interim solution, and you do not need to set branch.$name.push to each and every branch you intend to push out, I think. I agree that using push.default current covers some cases - I hadn't really considered it - tended to just stick with upstream. current nearly does the job, but I will sometimes be wanting different names. What I'll often be doing is creating a topic branch based on master or origin/master. (I would hardly ever be updating master or pushing to origin/master myself, so I probably should be just doing origin/master, but I tend to create a local master just to save typing on all those git rebase origin/master). During work, to give others visibility, and the possibility to tinker with the topic branch during development (as we don't have full inter-site sharing of work areas), I would push the topic branch up to the central origin server, often with a kbracey/ prefix, partially for namespacing, and partially to indicate it's currently private work and subject to rebasing. I guess I could create the topic branch as kbracey/topic locally, but I'd rather not have to. So I'd like git rebase (-i) to move my topic branch up (origin/)master. And I'd like git push (-f) to send it to origin/kbracey/topic. And by extension, I suppose git pull --rebase to update origin/master and rebase. (Although I'm not much of a puller - I tend to fetch then rebase manually). The final releasing procedure for the topic branch would be to hand that branch over to an integrator, who would then merge/rebase it into master. And it would be ideal if the initial base and push tracking information could be set up automatically on the first git checkout -b/git branch and git push. (For one, I've always found it odd that there's an asymmetry - if you check out a topic branch from the server to work on or use it, you get a local copy with upstream set by default. But if you create a topic branch yourself then push it, the upstream isn't set by default - you need the -u flag. This seems odd to me, and I've seen others confused by this). Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] New kind of upstream branch: base branch
On 15/05/2013 23:34, Felipe Contreras wrote: I think I'm using 'upstream' for something it was not intended to, and I think the current 'upstream' behavior should be split into 'upstream' and 'base'. I found myself thinking the same thing. It's really convenient being able to set your topic branch's upstream to another local branch, so git rebase works without parameters. But then I can't use upstream to point to a remote version of that topic branch. I want my topic branch to know both that it's based on master (or origin/master), and that it's upstream is origin/topic. So, yes, here's a vote in favour of the general concept. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 01/15] decorate.c: compact table when growing
When growing the table, take the opportunity to compact it by removing entries with NULL decoration. Users may have removed decorations by passing NULL to insert_decoration. An object's table entry can't actually be removed during normal operation, as it would break the linear hash collision search. But we can remove NULL decoration entries when rebuilding the table. Signed-off-by: Kevin Bracey ke...@bracey.fi --- decorate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/decorate.c b/decorate.c index 2f8a63e..7cb5d29 100644 --- a/decorate.c +++ b/decorate.c @@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n) const struct object *base = old_hash[i].base; void *decoration = old_hash[i].decoration; - if (!base) + if (!decoration) continue; insert_decoration(n, base, decoration); } -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 09/15] t6012: update test for tweaked full-history traversal
From: Junio C Hamano gits...@pobox.com Signed-off-by: Junio C Hamano gits...@pobox.com --- t/t6012-rev-list-simplify.sh | 29 +++-- 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index dd6dc84..4e55872 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -14,21 +14,24 @@ unnote () { test_expect_success setup ' echo Hi there file - git add file - test_tick git commit -m Initial file + echo initial lost + git add file lost + test_tick git commit -m Initial file and lost note A git branch other-branch echo Hello file - git add file - test_tick git commit -m Modified file + echo second lost + git add file lost + test_tick git commit -m Modified file and lost note B git checkout other-branch echo Hello file - git add file + lost + git add file lost test_tick git commit -m Modified the file identically note C @@ -37,7 +40,9 @@ test_expect_success setup ' test_tick git commit -m Add another file note D - test_tick git merge -m merge master + test_tick + test_must_fail git merge -m merge master + lost git commit -a -m merge note E echo Yet another elif @@ -110,4 +115,16 @@ check_result 'I B A' -- file check_result 'I B A' --topo-order -- file check_result 'H' --first-parent -- another-file +check_result 'E C B A' --full-history E -- lost +test_expect_success 'full history simplification without parent' ' + printf %s\n E C B A expect + git log --pretty=$FMT --full-history E -- lost | + unnote actual + sed -e s/^.* \([^ ]*\) .*/\1/ check actual + test_cmp expect check || { + cat actual + false + } +' + test_done -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 11/15] simplify-merges: drop merge from irrelevant side branch
Reimplement commit 4b7f53da on top of the new simplify-merges infrastructure, tightening the condition to only consider root parents; the original version incorrectly dropped parents that were TREESAME to anything. Original log message follows. The merge simplification rule stated in 6546b59 (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Suggested-by: Junio C Hamano gits...@pobox.com Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 34 +- revision.c | 26 +- t/t6012-rev-list-simplify.sh | 2 +- 3 files changed, 47 insertions(+), 15 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index f41e865..b462f17 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to illustrate the differences between simplification settings. We assume that you are filtering for a file `foo` in this commit graph: --- - .-A---M---N---O---P -/ / / / / - I B C D E -\ / / / / - `-' + .-A---M---N---O---P---Q +/ / / / / / + I B C D E Y +\ / / / / / + `-' X --- -The horizontal line of history A---P is taken to be the first parent of +The horizontal line of history A---Q is taken to be the first parent of each merge. The commits are: * `I` is the initial commit, in which `foo` exists with contents @@ -369,6 +369,10 @@ each merge. The commits are: * `E` changes `quux` to xyzzy, and its merge `P` combines the strings to quux xyzzy. `P` is TREESAME to `O`, but not to `E`. +* `X` is an indpendent root commit that added a new file `side`, and `Y` + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and + `Q` is TREESAME to `P`, but not to `Y`. + 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting (via '\--parents' or '\--children') are used. The following settings @@ -409,7 +413,7 @@ parent lines. the example, we get + --- - I A B N D O P + I A B N D O P Q --- + `M` was excluded because it is TREESAME to both parents. `E`, @@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included themselves. This results in + --- - .-A---M---N---O---P + .-A---M---N---O---P---Q / / / / / I B / D / \ / / / / @@ -440,7 +444,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. +`N`, and `X`, `Y` and `Q`. In addition to the above settings, you can change whether TREESAME affects inclusion: @@ -470,9 +474,9 @@ history according to the following rules: * Set `C'` to `C`. + * Replace each parent `P` of `C'` with its simplification `P'`. In - the process, drop parents that are ancestors of other parents, and - remove duplicates, but take care to never drop all parents that - we are TREESAME to. + the process, drop parents that are ancestors of other parents or that are + root commits TREESAME to an empty tree, and remove duplicates, but take care + to never drop all parents that we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. @@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing
[PATCH v4 08/15] revision.c: Make --full-history consider more merges
History simplification previously always treated merges as TREESAME if they were TREESAME to any parent. While this was consistent with the default behaviour, this could be extremely unhelpful when searching detailed history, and could not be overridden. For example, if a merge had ignored a change, as if by -s ours, then: git log -m -p --full-history -Schange file would successfully locate change's addition but would not locate the merge that resolved against it. Futher, simplify_merges could drop the actual parent that a commit was TREESAME to, leaving it as a normal commit marked TREESAME that isn't actually TREESAME to its remaining parent. Now redefine a commit's TREESAME flag to be true only if a commit is TREESAME to _all_ of its parents. This doesn't affect either the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits), or full-history with parent rewriting (because all merges are output). But it does affect other modes. The clearest difference is that --full-history will show more merges - sufficient to ensure that -m -p --full-history log searches can really explain every change to the file, including those changes' ultimate fate in merges. Also modify simplify_merges to recalculate TREESAME after removing a parent. This is achieved by storing per-parent TREESAME flags on the initial scan, so the combined flag can be easily recomputed. This fixes some t6111 failures, but creates a couple of new ones - we are now showing some merges that don't need to be shown. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 6 +- revision.c | 241 - revision.h | 1 + t/t6019-rev-list-ancestry-path.sh | 2 +- t/t6111-rev-list-treesame.sh | 26 ++-- 5 files changed, 229 insertions(+), 47 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 55ddf33..d166384 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -409,10 +409,10 @@ parent lines. the example, we get + --- - I A B N D O + I A B N D O P --- + -`P` and `M` were excluded because they are TREESAME to a parent. `E`, +`M` was excluded because it is TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + @@ -440,7 +440,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. Note also that `P` was included despite being TREESAME. +`N`. In addition to the above settings, you can change whether TREESAME affects inclusion: diff --git a/revision.c b/revision.c index 7f7a8ab..64b86ae 100644 --- a/revision.c +++ b/revision.c @@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) return retval = 0 (tree_difference == REV_TREE_SAME); } +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; + +static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit) +{ + unsigned n = commit_list_count(commit-parents); + struct treesame_state *st = xcalloc(1, sizeof(*st) + n); + st-nparents = n; + add_decoration(revs-treesame, commit-object, st); + return st; +} + +/* + * Must be called immediately after removing the nth_parent from a commit's + * parent list, if we are maintaining the per-parent treesame[] decoration. + * This does not recalculate the master TREESAME flag - update_treesame() + * should be called to update it after a sequence of treesame[] modifications + * that may have affected it. + */ +static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned nth_parent) +{ + struct treesame_state *st; + int old_same; + + if (!commit-parents) { + /* +* Have just removed the only parent from a non-merge. +* Different handling, as we lack decoration. +*/ + if (nth_parent != 0) + die(compact_treesame %u, nth_parent); + old_same = !!(commit-object.flags TREESAME); + if (rev_same_tree_as_empty(revs, commit)) + commit-object.flags |= TREESAME; + else + commit-object.flags = ~TREESAME; + return old_same; + } + + st = lookup_decoration(revs-treesame, commit-object); + if (!st || nth_parent = st-nparents) + die(compact_treesame %u
[PATCH v4 14/15] revision.c: don't show all merges for --parents
When using --parents or --children, get_commit_action() previously showed all merges, even if TREESAME to both parents. This was intended to tie together the topology of the rewritten parents, but it was excessive - in fact we only need to show merges that have two or more relevant parents. Merges at the boundary do not necessarily need to be shown. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c | 22 +++--- t/t6111-rev-list-treesame.sh | 4 ++-- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/revision.c b/revision.c index 1c75070..edb7e1c 100644 --- a/revision.c +++ b/revision.c @@ -2760,10 +2760,7 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi if (revs-min_age != -1 (commit-date revs-min_age)) return commit_ignore; if (revs-min_parents || (revs-max_parents = 0)) { - int n = 0; - struct commit_list *p; - for (p = commit-parents; p; p = p-next) - n++; + int n = commit_list_count(commit-parents); if ((n revs-min_parents) || ((revs-max_parents = 0) (n revs-max_parents))) return commit_ignore; @@ -2773,12 +2770,23 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi if (revs-prune revs-dense) { /* Commit without changes? */ if (commit-object.flags TREESAME) { + int n; + struct commit_list *p; /* drop merges unless we want parenthood */ if (!want_ancestry(revs)) return commit_ignore; - /* non-merge - always ignore it */ - if (!commit-parents || !commit-parents-next) - return commit_ignore; + /* +* If we want ancestry, then need to keep any merges +* between relevant commits to tie together topology. +* For consistency with TREESAME and simplification +* use relevant here rather than just INTERESTING, +* to treat bottom commit(s) as part of the topology. +*/ + for (n = 0, p = commit-parents; p; p = p-next) + if (relevant_commit(p-item)) + if (++n = 2) + return commit_show; + return commit_ignore; } } return commit_show; diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh index e32b373..25cc8ad 100755 --- a/t/t6111-rev-list-treesame.sh +++ b/t/t6111-rev-list-treesame.sh @@ -139,7 +139,7 @@ check_result 'M L G' F..M --first-parent -- file # If we want history since E, then we're quite happy to ignore G that took E. check_result 'M L K J I H G' E..M --ancestry-path check_result 'M L J I H' E..M --ancestry-path -- file -check_outcome failure '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path --parents -- file # includes G +check_result '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path --parents -- file check_result '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path --simplify-merges -- file # Should still be able to ignore I-J branch in simple log, despite limiting @@ -168,7 +168,7 @@ check_result '(D)F (BA)D' B..F --full-history --parents -- file check_result '(B)F' B..F --simplify-merges -- file check_result 'F D' B..F --ancestry-path check_result 'F' B..F --ancestry-path -- file -check_outcome failure 'F' B..F --ancestry-path --parents -- file # includes D +check_result 'F' B..F --ancestry-path --parents -- file check_result 'F' B..F --ancestry-path --simplify-merges -- file check_result 'F D' B..F --first-parent check_result 'F' B..F --first-parent -- file -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 12/15] revision.c: add BOTTOM flag for commits
When performing edge-based operations on the revision graph, it can be useful to be able to identify the INTERESTING graph's connection(s) to the bottom commit(s) specified by the user. Conceptually when the user specifies A..B (== B ^A), they are asking for the history from A to B. The first connection from A onto the INTERESTING graph is part of that history, and should be considered. If we consider only INTERESTING nodes and their connections, then we're really only considering the history from A's immediate descendants to B. This patch does not change behaviour, but adds a new BOTTOM flag to indicate the bottom commits specified by the user, ready to be used by following patches. We immediately use the BOTTOM flag to return collect_bottom_commits() to its original approach of examining the pending commit list rather than the command line. This will ensure alignment of the definition of bottom with future patches. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c | 34 -- revision.h | 3 ++- 2 files changed, 18 insertions(+), 19 deletions(-) diff --git a/revision.c b/revision.c index 4f7446c..6607dab 100644 --- a/revision.c +++ b/revision.c @@ -909,16 +909,12 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li * to filter the result of A..B further to the ones that can actually * reach A. */ -static struct commit_list *collect_bottom_commits(struct rev_info *revs) +static struct commit_list *collect_bottom_commits(struct commit_list *list) { - struct commit_list *bottom = NULL; - int i; - for (i = 0; i revs-cmdline.nr; i++) { - struct rev_cmdline_entry *elem = revs-cmdline.rev[i]; - if ((elem-flags UNINTERESTING) - elem-item-type == OBJ_COMMIT) - commit_list_insert((struct commit *)elem-item, bottom); - } + struct commit_list *elem, *bottom = NULL; + for (elem = list; elem; elem = elem-next) + if (elem-item-object.flags BOTTOM) + commit_list_insert(elem-item, bottom); return bottom; } @@ -949,7 +945,7 @@ static int limit_list(struct rev_info *revs) struct commit_list *bottom = NULL; if (revs-ancestry_path) { - bottom = collect_bottom_commits(revs); + bottom = collect_bottom_commits(list); if (!bottom) die(--ancestry-path given but there are no bottom commits); } @@ -1121,7 +1117,7 @@ static int add_parents_only(struct rev_info *revs, const char *arg_, int flags) const char *arg = arg_; if (*arg == '^') { - flags ^= UNINTERESTING; + flags ^= UNINTERESTING | BOTTOM; arg++; } if (get_sha1_committish(arg, sha1)) @@ -1213,8 +1209,8 @@ static void prepare_show_merge(struct rev_info *revs) add_pending_object(revs, head-object, HEAD); add_pending_object(revs, other-object, MERGE_HEAD); bases = get_merge_bases(head, other, 1); - add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING); - add_pending_commit_list(revs, bases, UNINTERESTING); + add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM); + add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM); free_commit_list(bases); head-object.flags |= SYMMETRIC_LEFT; @@ -1250,13 +1246,15 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi int cant_be_filename = revarg_opt REVARG_CANNOT_BE_FILENAME; unsigned get_sha1_flags = 0; + flags = flags UNINTERESTING ? flags | BOTTOM : flags ~BOTTOM; + dotdot = strstr(arg, ..); if (dotdot) { unsigned char from_sha1[20]; const char *next = dotdot + 2; const char *this = arg; int symmetric = *next == '.'; - unsigned int flags_exclude = flags ^ UNINTERESTING; + unsigned int flags_exclude = flags ^ (UNINTERESTING | BOTTOM); static const char head_by_default[] = HEAD; unsigned int a_flags; @@ -1332,13 +1330,13 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi dotdot = strstr(arg, ^!); if (dotdot !dotdot[2]) { *dotdot = 0; - if (!add_parents_only(revs, arg, flags ^ UNINTERESTING)) + if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM))) *dotdot = '^'; } local_flags = 0; if (*arg == '^') { - local_flags = UNINTERESTING; + local_flags = UNINTERESTING | BOTTOM; arg++; } @@ -1815,7 +1813,7 @@ static int handle_revision_pseudo_opt(const char *submodule, handle_refs(submodule, revs, *flags
[PATCH v4 15/15] revision.c: make default history consider bottom commits
Previously, the default history treated bottom commits the same as any other UNINTERESTING commit, which could force it down side branches. Consider the following history: *A--*B---D--*F * marks !TREESAME parent paths \ /* `-C-' When requesting B..F, B is UNINTERESTING but TREESAME to D. C is !UNINTERESTING. So default following would go from D into the irrelevant side branch C to A, rather than to B. Note also that if there had been an extra !UNINTERESTING commit B1 between B and D, it wouldn't have gone down C. Change the default following to test relevant_commit() instead of !UNINTERESTING, so it can proceed straight from D to B, thus finishing the traversal of that path. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c | 2 +- t/t6111-rev-list-treesame.sh | 12 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/revision.c b/revision.c index edb7e1c..914ac78 100644 --- a/revision.c +++ b/revision.c @@ -684,7 +684,7 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) sha1_to_hex(p-object.sha1)); switch (rev_compare_tree(revs, p, commit)) { case REV_TREE_SAME: - if (!revs-simplify_history || (p-object.flags UNINTERESTING)) { + if (!revs-simplify_history || !relevant_commit(p)) { /* Even if a merge with an uninteresting * side branch brought the entire change * we are interested in, we do not want diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh index 25cc8ad..88b84df 100755 --- a/t/t6111-rev-list-treesame.sh +++ b/t/t6111-rev-list-treesame.sh @@ -146,8 +146,8 @@ check_result '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path --simplify-merges # to G. check_result 'M L K J I H' G..M check_result 'M H L K J I' G..M --topo-order -check_outcome failure 'M L H' G..M -- file # includes J I -check_outcome failure '(LH)M (G)L (G)H' G..M --parents -- file # includes J I +check_result 'M L H' G..M -- file +check_result '(LH)M (G)L (G)H' G..M --parents -- file check_result 'M L J I H' G..M --full-history -- file check_result 'M L K J I H' G..M --full-history --parents -- file check_result 'M H L J I' G..M --simplify-merges -- file @@ -161,8 +161,8 @@ check_result 'M H L J I' G..M --ancestry-path --simplify-merges -- file # But --full-history shouldn't drop D on its own - without simplification, # we can't decide if the merge from INTERESTING commit C was sensible. check_result 'F D C' B..F -check_outcome failure 'F' B..F -- file # includes D -check_outcome failure '(B)F' B..F --parents -- file # includes D +check_result 'F' B..F -- file +check_result '(B)F' B..F --parents -- file check_result 'F D' B..F --full-history -- file check_result '(D)F (BA)D' B..F --full-history --parents -- file check_result '(B)F' B..F --simplify-merges -- file @@ -174,8 +174,8 @@ check_result 'F D' B..F --first-parent check_result 'F' B..F --first-parent -- file # E...F should be equivalent to E F ^B, and be able to drop D as above. -check_outcome failure 'F' E F ^B -- file # includes D -check_outcome failure 'F' E...F -- file # includes D +check_result 'F' E F ^B -- file # includes D +check_result 'F' E...F -- file # includes D # Any sort of full history of C..F should show D, as it's the connection to C, # and it differs from it. -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 03/15] t6111: new TREESAME test set
Some side branching and odd merging to illustrate various flaws in revision list scans, particularly when limiting the list. Many expected failures, which will be gone by the end of the history traversal refinements series. Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6111-rev-list-treesame.sh | 184 +++ 1 file changed, 184 insertions(+) create mode 100755 t/t6111-rev-list-treesame.sh diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh new file mode 100755 index 000..b2bca77 --- /dev/null +++ b/t/t6111-rev-list-treesame.sh @@ -0,0 +1,184 @@ +#!/bin/sh +# +#,---E--. *H--. * marks !TREESAME parent paths +# /\ / \* +# *A--*B---D--*F-*G-K-*L-*M +# \ /* \ / +#`-C-' `-*I-*J +# +# A creates file, B and F change it. +# Odd merge G takes the old version from B. +# I changes it, but J reverts it, so K is TREESAME to both parents. +# H and L both change file, and M merges those changes. + +test_description='TREESAME and limiting' + +. ./test-lib.sh + +note () { + git tag $1 +} + +unnote () { + git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g +} + +test_expect_success setup ' + test_commit Initial file file Hi there A + git branch other-branch + + test_commit file=Hello file Hello B + git branch third-branch + + git checkout other-branch + test_commit Added other other Hello C + + git checkout master + test_merge D other-branch + + git checkout third-branch + test_commit Third file third Nothing E + + git checkout master + test_commit file=Blah file Blah F + + test_tick git merge --no-commit third-branch + git checkout third-branch file + git commit + note G + git branch fiddler-branch + + git checkout -b part2-branch + test_commit file=Part 2 file Part 2 H + + git checkout fiddler-branch + test_commit Bad commit file Silly I + + test_tick git revert I note J + + git checkout master + test_tick git merge --no-ff fiddler-branch + note K + + test_commit file=Part 1 file Part 1 L + + test_tick test_must_fail git merge part2-branch + test_commit M file Parts 1+2 +' + +FMT='tformat:%P%H | %s' + +# could we soup this up to optionally check parents? So (BA)C would check +# that C is shown and has parents B A. +check_outcome () { + outcome=$1 + shift + for c in $1 + do + echo $c + done expect + shift + param=$* + test_expect_$outcome log $param ' + git log --format=$FMT $param | + unnote actual + sed -e s/^.* \([^ ]*\) .*/\1/ check actual + test_cmp expect check || { + cat actual + false + } + ' +} + +check_result () { + check_outcome success $@ +} + +# Odd merge G drops a change in F. Important that G is listed in all +# except the most basic list. Achieving this means normal merge D will also be +# shown in normal full-history, as we can't distinguish unless we do a +# simplification pass. After simplification, D is dropped but G remains. +check_result 'M L K J I H G F E D C B A' +check_result 'M H L K J I G E F D C B A' --topo-order +check_result 'M L H B A' -- file +check_result 'M L H B A' --parents -- file +check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G +check_result 'M L K J I H G F D B A' --full-history --parents -- file +check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G +check_result 'M L K G F D B A' --first-parent +check_result 'M L G F B A' --first-parent -- file + +# Check that odd merge G remains shown when F is the bottom. +check_result 'M L K J I H G E' F..M +check_result 'M H L K J I G E' F..M --topo-order +check_result 'M L H' F..M -- file +check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so more useful than it may seem +check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G +check_result 'M L K J I H G' F..M --full-history --parents -- file +check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G +check_result 'M L K J I H G' F..M --ancestry-path +check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G +check_result 'M L K J I H G' F..M --ancestry-path --parents -- file +check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file +check_result 'M L K G' F..M --first-parent +check_result 'M L G' F..M --first-parent -- file + +# Note that G is pruned when E is the bottom, even if it's the same commit list +# If we want history since E, then we're quite happy to ignore G that took E. +check_result 'M L K J I H G' E..M --ancestry-path +check_result 'M L J I H
[PATCH v4 13/15] revision.c: discount side branches when computing TREESAME
Use the BOTTOM flag to define relevance for pruning. Relevant commits are those that are !UNINTERESTING or BOTTOM, and this allows us to identify irrelevant side branches (UNINTERESTING !BOTTOM). If a merge has relevant parents, and it is TREESAME to them, then do not let irrelevant parents cause the merge to be treated as !TREESAME. When considering simplification, don't always include all merges - merges with exactly one relevant parent can be simplified, if TREESAME according to the above rule. These two changes greatly increase simplification in limited, pruned revision lists. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c| 171 +- t/t6019-rev-list-ancestry-path.sh | 12 ++- t/t6111-rev-list-treesame.sh | 8 +- 3 files changed, 164 insertions(+), 27 deletions(-) diff --git a/revision.c b/revision.c index 6607dab..1c75070 100644 --- a/revision.c +++ b/revision.c @@ -333,6 +333,80 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * A definition of relevant commit that we can use to simplify limited graphs + * by eliminating side branches. + * + * A relevant commit is one that is !UNINTERESTING (ie we are including it + * in our list), or that is a specified BOTTOM commit. Then after computing + * a limited list, during processing we can generally ignore boundary merges + * coming from outside the graph, (ie from irrelevant parents), and treat + * those merges as if they were single-parent. TREESAME is defined to consider + * only relevant parents, if any. If we are TREESAME to our on-graph parents, + * we don't care if we were !TREESAME to non-graph parents. + * + * Treating bottom commits as relevant ensures that a limited graph's + * connection to the actual bottom commit is not viewed as a side branch, but + * treated as part of the graph. For example: + * + * Z...A---X---o---o---B + *. / + * W---Y + * + * When computing A..B, the A-X connection is at least as important as + * Y-X, despite A being flagged UNINTERESTING. + * + * And when computing --ancestry-path A..B, the A-X connection is more + * important than Y-X, despite both A and Y being flagged UNINTERESTING. + */ +static inline int relevant_commit(struct commit *commit) +{ + return (commit-object.flags (UNINTERESTING | BOTTOM)) != UNINTERESTING; +} + +/* + * Return a single relevant commit from a parent list. If we are a TREESAME + * commit, and this selects one of our parents, then we can safely simplify to + * that parent. + */ +static struct commit *one_relevant_parent(const struct rev_info *revs, + struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *relevant = NULL; + + if (!orig) + return NULL; + + /* +* For 1-parent commits, or if first-parent-only, then return that +* first parent (even if not relevant by the above definition). +* TREESAME will have been set purely on that parent. +*/ + if (revs-first_parent_only || !orig-next) + return orig-item; + + /* +* For multi-parent commits, identify a sole relevant parent, if any. +* If we have only one relevant parent, then TREESAME will be set purely +* with regard to that parent, and we can simplify accordingly. +* +* If we have more than one relevant parent, or no relevant parents +* (and multiple irrelevant ones), then we can't select a parent here +* and return NULL. +*/ + while (list) { + struct commit *commit = list-item; + list = list-next; + if (relevant_commit(commit)) { + if (relevant) + return NULL; + relevant = commit; + } + } + return relevant; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -502,27 +576,52 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit) if (commit-parents commit-parents-next) { unsigned n; struct treesame_state *st; + struct commit_list *p; + unsigned relevant_parents; + unsigned relevant_change, irrelevant_change; st = lookup_decoration(revs-treesame, commit-object); if (!st) die(update_treesame %s, sha1_to_hex(commit-object.sha1)); - commit-object.flags |= TREESAME; - for (n = 0; n st-nparents; n++) { - if (!st-treesame[n]) { - commit-object.flags = ~TREESAME; - break
[PATCH v4 10/15] simplify-merges: never remove all TREESAME parents
When simplifying an odd merge, such as one that used -s ours, we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent; if this would happen reinstate the first TREESAME parent - the one that the default log would have followed. This avoids producing a totally disjoint history from the default log when the default log is a better explanation of the end result, and aids visualisation of odd merges. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 +- revision.c | 69 ++ t/t6111-rev-list-treesame.sh | 4 +-- 3 files changed, 73 insertions(+), 3 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index d166384..f41e865 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -471,7 +471,8 @@ history according to the following rules: + * Replace each parent `P` of `C'` with its simplification `P'`. In the process, drop parents that are ancestors of other parents, and - remove duplicates. + remove duplicates, but take care to never drop all parents that + we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. diff --git a/revision.c b/revision.c index 64b86ae..62f399c 100644 --- a/revision.c +++ b/revision.c @@ -2136,6 +2136,73 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit) return marked; } +/* + * Awkward naming - this means one parent we are TREESAME to. + * cf mark_treesame_root_parents: root parents that are TREESAME (to an + * empty tree). Better name suggestions? + */ +static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *commit) +{ + struct treesame_state *ts = lookup_decoration(revs-treesame, commit-object); + struct commit *unmarked = NULL, *marked = NULL; + struct commit_list *p; + unsigned n; + + for (p = commit-parents, n = 0; p; p = p-next, n++) { + if (ts-treesame[n]) { + if (p-item-object.flags TMP_MARK) { + if (!marked) + marked = p-item; + } else { + if (!unmarked) { + unmarked = p-item; + break; + } + } + } + } + + /* +* If we are TREESAME to a marked-for-deletion parent, but not to any +* unmarked parents, unmark the first TREESAME parent. This is the +* parent that the default simplify_history==1 scan would have followed, +* and it doesn't make sense to omit that path when asking for a +* simplified full history. Retaining it improves the chances of +* understanding odd missed merges that took an old version of a file. +* +* Example: +* +* I*X A modified the file, but mainline merge X used +*\ /-s ours, so took the version from I. X is +* `-*A--' TREESAME to I and !TREESAME to A. +* +* Default log from X would produce I. Without this check, +* --full-history --simplify-merges would produce I-A-X, showing +* the merge commit X and that it changed A, but not making clear that +* it had just taken the I version. With this check, the topology above +* is retained. +* +* Note that it is possible that the simplification chooses a different +* TREESAME parent from the default, in which case this test doesn't +* activate, and we _do_ drop the default parent. Example: +* +* I--X A modified the file, but it was reverted in B, +*\/ meaning mainline merge X is TREESAME to both +**A-*B parents. +* +* Default log would produce I by following the first parent; +* --full-history --simplify-merges will produce I-A-B. But this is a +* reasonable result - it presents a logical full history leading from +* I to X, and X is not an important merge. +*/ + if (!unmarked marked) { + marked-object.flags = ~TMP_MARK; + return 1; + } + + return 0; +} + static int remove_marked_parents(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *p; @@ -2239,6 +2306,8 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c if (1 cnt) { int marked = mark_redundant_parents(revs, commit); if (marked) + marked -= leave_one_treesame_to_parent(revs
[PATCH v4 06/15] rev-list-options.txt: correct TREESAME for P
In the example given, P is not TREESAME to E. This doesn't affect the current result, but it will matter when we change behaviour. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 3bdbf5e..50bbff7 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -367,8 +367,7 @@ each merge. The commits are: `N` and `D` to foobarbaz; i.e., it is not TREESAME to any parent. * `E` changes `quux` to xyzzy, and its merge `P` combines the - strings to quux xyzzy. Despite appearing interesting, `P` is - TREESAME to all parents. + strings to quux xyzzy. `P` is TREESAME to `O`, but not to `E`. 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 07/15] Documentation: avoid uninteresting
The documentation of --boundary uses the term uninteresting, which is not used or defined anywhere else in the documentation. This is unhelpful and confusing to anyone who hasn't seen the UNINTERESTING flag in the source code. Change to use excluded, as per revisions.txt. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 50bbff7..55ddf33 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -271,8 +271,8 @@ See also linkgit:git-reflog[1]. --boundary:: - Output uninteresting commits at the boundary, which are usually - not shown. + Output excluded boundary commits. Boundary commits are + prefixed with `-`. -- -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 00/15] History traversal refinements
No new functionality or bug fixes since v3, just tidying: * Tests now use Junio's parent-checking functionality * BOTTOM flags now set in a neater fashion (I think), separating it out from the cmdline stuff. * Creation and use of BOTTOM flag now split into 4 separate commits - last version was too much for one commit, I feel. * Finally decided that relevant is the word I was looking for. Obvious, really. * On the subject of words, remove the only technical use of uninteresting that I found in the Documentation - I know that it always confused me until I read the source. This sequence is based on my 2-commit ancestry-path ... series, but no longer depends on it due to the new way the BOTTOM flag is initialised. But they both touch the t6019 test, so applying this on top will avoid conflicts. Junio C Hamano (2): t6111: allow checking the parents as well t6012: update test for tweaked full-history traversal Kevin Bracey (13): decorate.c: compact table when growing t6019: test file dropped in -s ours merge t6111: new TREESAME test set t6111: add parents to tests rev-list-options.txt: correct TREESAME for P Documentation: avoid uninteresting revision.c: Make --full-history consider more merges simplify-merges: never remove all TREESAME parents simplify-merges: drop merge from irrelevant side branch revision.c: add BOTTOM flag for commits revision.c: discount side branches when computing TREESAME revision.c: don't show all merges for --parents revision.c: make default history consider bottom commits Documentation/rev-list-options.txt | 42 +-- decorate.c | 2 +- revision.c | 539 - revision.h | 4 +- t/t6012-rev-list-simplify.sh | 31 ++- t/t6019-rev-list-ancestry-path.sh | 27 ++ t/t6111-rev-list-treesame.sh | 196 ++ 7 files changed, 750 insertions(+), 91 deletions(-) create mode 100755 t/t6111-rev-list-treesame.sh -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 04/15] t6111: allow checking the parents as well
From: Junio C Hamano gits...@pobox.com Signed-off-by: Junio C Hamano gits...@pobox.com --- t/t6111-rev-list-treesame.sh | 30 +- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh index b2bca77..1e4a550 100755 --- a/t/t6111-rev-list-treesame.sh +++ b/t/t6111-rev-list-treesame.sh @@ -20,7 +20,7 @@ note () { } unnote () { - git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g + git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\))\([ ]\)|\1\2|g } test_expect_success setup ' @@ -66,23 +66,34 @@ test_expect_success setup ' test_commit M file Parts 1+2 ' -FMT='tformat:%P%H | %s' - # could we soup this up to optionally check parents? So (BA)C would check # that C is shown and has parents B A. check_outcome () { outcome=$1 shift - for c in $1 - do - echo $c - done expect - shift + + case $1 in + *(*) + FMT=%P %H | %s + munge_actual= + s/^\([^ ]*\)\([^ ]*\) .*/(\1)\2/ + s/ //g + s/()// + + ;; + *) + FMT=%H | %s + munge_actual=s/^\([^ ]*\) .*/\1/ + ;; + esac + printf %s\n $1 expect + shift + param=$* test_expect_$outcome log $param ' git log --format=$FMT $param | unnote actual - sed -e s/^.* \([^ ]*\) .*/\1/ check actual + sed -e $munge_actual actual check test_cmp expect check || { cat actual false @@ -99,6 +110,7 @@ check_result () { # shown in normal full-history, as we can't distinguish unless we do a # simplification pass. After simplification, D is dropped but G remains. check_result 'M L K J I H G F E D C B A' +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G (D)F (B)E (BC)D (A)C (A)B A' check_result 'M H L K J I G E F D C B A' --topo-order check_result 'M L H B A' -- file check_result 'M L H B A' --parents -- file -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 05/15] t6111: add parents to tests
Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6111-rev-list-treesame.sh | 38 +++--- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh index 1e4a550..4d74d3c 100755 --- a/t/t6111-rev-list-treesame.sh +++ b/t/t6111-rev-list-treesame.sh @@ -66,8 +66,6 @@ test_expect_success setup ' test_commit M file Parts 1+2 ' -# could we soup this up to optionally check parents? So (BA)C would check -# that C is shown and has parents B A. check_outcome () { outcome=$1 shift @@ -109,14 +107,16 @@ check_result () { # except the most basic list. Achieving this means normal merge D will also be # shown in normal full-history, as we can't distinguish unless we do a # simplification pass. After simplification, D is dropped but G remains. +# Also, merge simplification of G should not drop the parent B that the default +# simple history follows. check_result 'M L K J I H G F E D C B A' check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G (D)F (B)E (BC)D (A)C (A)B A' check_result 'M H L K J I G E F D C B A' --topo-order check_result 'M L H B A' -- file -check_result 'M L H B A' --parents -- file +check_result '(LH)M (B)L (B)H (A)B A' --parents -- file check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G -check_result 'M L K J I H G F D B A' --full-history --parents -- file -check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G (D)F (BA)D (A)B A' --full-history --parents -- file +check_outcome failure '(LH)M (G)H (J)L (I)J (G)I (FB)G (B)F (A)B A' --simplify-merges -- file # drops G check_result 'M L K G F D B A' --first-parent check_result 'M L G F B A' --first-parent -- file @@ -124,14 +124,14 @@ check_result 'M L G F B A' --first-parent -- file check_result 'M L K J I H G E' F..M check_result 'M H L K J I G E' F..M --topo-order check_result 'M L H' F..M -- file -check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so more useful than it may seem +check_result '(LH)M (B)L (B)H' --parents F..M -- file check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G -check_result 'M L K J I H G' F..M --full-history --parents -- file -check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G' F..M --full-history --parents -- file +check_outcome failure '(LH)M (G)H (J)L (I)J (G)I (FB)G' F..M --simplify-merges -- file # drops G check_result 'M L K J I H G' F..M --ancestry-path check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G -check_result 'M L K J I H G' F..M --ancestry-path --parents -- file -check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G' F..M --ancestry-path --parents -- file +check_result '(LH)M (G)H (J)L (I)J (G)I (FE)G' F..M --ancestry-path --simplify-merges -- file check_result 'M L K G' F..M --first-parent check_result 'M L G' F..M --first-parent -- file @@ -139,15 +139,15 @@ check_result 'M L G' F..M --first-parent -- file # If we want history since E, then we're quite happy to ignore G that took E. check_result 'M L K J I H G' E..M --ancestry-path check_result 'M L J I H' E..M --ancestry-path -- file -check_outcome failure 'M L K J I H' E..M --ancestry-path --parents -- file # includes G -check_outcome failure 'M H L J I' E..M --ancestry-path --simplify-merges -- file # includes G +check_outcome failure '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path --parents -- file # includes G +check_outcome failure '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path --simplify-merges -- file # includes G # Should still be able to ignore I-J branch in simple log, despite limiting # to G. check_result 'M L K J I H' G..M check_result 'M H L K J I' G..M --topo-order check_outcome failure 'M L H' G..M -- file # includes J I -check_outcome failure 'M L H' G..M --parents -- file # includes J I +check_outcome failure '(LH)M (G)L (G)H' G..M --parents -- file # includes J I check_result 'M L J I H' G..M --full-history -- file check_result 'M L K J I H' G..M --full-history --parents -- file check_result 'M H L J I' G..M --simplify-merges -- file @@ -162,10 +162,10 @@ check_result 'M H L J I' G..M --ancestry-path --simplify-merges -- file # we can't decide if the merge from INTERESTING commit C was sensible. check_result 'F D C' B..F check_result 'F' B..F -- file -check_outcome failure 'F' B..F --parents -- file # includes D +check_outcome failure '(B)F' B..F --parents -- file # includes D check_outcome failure 'F D' B..F --full-history -- file # drops D prematurely -check_result 'F D' B..F --full-history --parents -- file -check_result 'F' B..F --simplify-merges -- file +check_result '(D)F (BA)D' B..F --full-history --parents -- file +check_result '(B
[PATCH v4 02/15] t6019: test file dropped in -s ours merge
In preparation for upcoming TREESAME work, check the result for G.t, which is dropped in -s ours merge L. The default rev-list is empty, as expected - it follows the first parent path where it never existed. Unfortunately, --ancestry-path is also empty. Merges H J and L are all TREESAME to 1 parent, so are treated as TREESAME and not shown. This is clearly undesirable in the case of merge L, which dropped our G.t by taking the non-ancestry-path version. Document this as a known failure, and expect H J L, the 3 merges along the path that had to chose G.t versions. Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6019-rev-list-ancestry-path.sh | 19 +++ 1 file changed, 19 insertions(+) diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index dd5b0e5..c3bc2e7 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -16,6 +16,9 @@ test_description='--ancestry-path' # # F...I == F G H I # --ancestry-path F...I == F H I +# +# G..M -- G.t == [nothing - was dropped in -s ours merge L] +# --ancestry-path G..M -- G.t == H J L . ./test-lib.sh @@ -89,6 +92,22 @@ test_expect_success 'rev-list --ancestry-path F...I' ' test_cmp expect actual ' +# G.t is dropped in an -s ours merge +test_expect_success 'rev-list G..M -- G.t' ' + expect + git rev-list --format=%s G..M -- G.t | + sed -e /^commit /d actual + test_cmp expect actual +' + +test_expect_failure 'rev-list --ancestry-path G..M -- G.t' ' + for c in H J L; do echo $c; done expect + git rev-list --ancestry-path --format=%s G..M -- G.t | + sed -e /^commit /d | + sort actual + test_cmp expect actual +' + # b---bc # / \ / # a X -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 0/2] merge-base: add --merge-child option
On 13/05/2013 01:22, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: git log --ancestry-path --left-right E...F --not $(git merge-base --all E F) which looks like we're having to repeat ourselves because it's not paying attention... You are half wrong; --left-right is about do we show the //= marker in the output?, so it is true that it does not make sense without ..., but the reverse is not true: A...B does not and should not imply --left-right. The repetition I meant is that by the definition of ancestry-path, the above would seem to be equivalent to git log --ancestry-path --left-right E F --not $(git merge-base --all E F) $(git merge-base --all E F) Anyway, revised separated-out version of the patch follows. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] Make --ancestry-path A...B work
This patch is a revised form of the one in my history traversal refinements series. Pulled out to allow it to proceed faster, given that John Keeping has found a use for the command. I suggest this is placed onto pu ahead of the existing series, dropping the equivalent final commit there. And then hopefully this can proceed to next faster. (Dropping that commit will drop the only --ancestry-path A...B test in t6111, meaning no immediate dependencies. But the next version of that series will be sent with t6111 testing and expecting a pass due to this fix being in.) Kevin Bracey (2): t6019: demonstrate --ancestry-path A...B breakage revision.c: treat A...B merge bases as if manually specified revision.c| 17 + revision.h| 1 + t/t6019-rev-list-ancestry-path.sh | 21 - 3 files changed, 38 insertions(+), 1 deletion(-) -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] revision.c: treat A...B merge bases as if manually specified
The documentation assures users that A...B is defined as A B --not $(git merge-base --all A B). This wasn't in fact quite true, because the calculated merge bases were not sent to add_rev_cmdline(). The main effect of this was that although git rev-list --ancestry-path A B --not $(git merge-base --all A B) worked, the simpler form git rev-list --ancestry-path A...B failed with a no bottom commits error. Other potential users of bottom commits could also be affected by this problem, if they examine revs-cmdline_info; I came across the issue in my proposed history traversal refinements series. So ensure that the calculated merge bases are sent to add_rev_cmdline(), flagged with new 'whence' enum value REV_CMD_MERGE_BASE. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c| 17 + revision.h| 1 + t/t6019-rev-list-ancestry-path.sh | 2 +- 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/revision.c b/revision.c index a67b615..7f7a8ab 100644 --- a/revision.c +++ b/revision.c @@ -915,6 +915,19 @@ static void add_rev_cmdline(struct rev_info *revs, info-nr++; } +static void add_rev_cmdline_list(struct rev_info *revs, +struct commit_list *commit_list, +int whence, +unsigned flags) +{ + while (commit_list) { + struct object *object = commit_list-item-object; + add_rev_cmdline(revs, object, sha1_to_hex(object-sha1), + whence, flags); + commit_list = commit_list-next; + } +} + struct all_refs_cb { int all_flags; int warned_bad_reflog; @@ -1092,6 +1105,7 @@ static void prepare_show_merge(struct rev_info *revs) add_pending_object(revs, head-object, HEAD); add_pending_object(revs, other-object, MERGE_HEAD); bases = get_merge_bases(head, other, 1); + add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING); add_pending_commit_list(revs, bases, UNINTERESTING); free_commit_list(bases); head-object.flags |= SYMMETRIC_LEFT; @@ -1179,6 +1193,9 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi if (symmetric) { exclude = get_merge_bases(a, b, 1); + add_rev_cmdline_list(revs, exclude, +REV_CMD_MERGE_BASE, +flags_exclude); add_pending_commit_list(revs, exclude, flags_exclude); free_commit_list(exclude); diff --git a/revision.h b/revision.h index 01bd2b7..878a555 100644 --- a/revision.h +++ b/revision.h @@ -35,6 +35,7 @@ struct rev_cmdline_info { REV_CMD_PARENTS_ONLY, REV_CMD_LEFT, REV_CMD_RIGHT, + REV_CMD_MERGE_BASE, REV_CMD_REV } whence; unsigned flags; diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index 5287f6a..dd5b0e5 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -81,7 +81,7 @@ test_expect_success 'rev-list F...I' ' test_cmp expect actual ' -test_expect_failure 'rev-list --ancestry-path F...I' ' +test_expect_success 'rev-list --ancestry-path F...I' ' for c in F H I; do echo $c; done expect git rev-list --ancestry-path --format=%s F...I | sed -e /^commit /d | -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] t6019: demonstrate --ancestry-path A...B breakage
Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6019-rev-list-ancestry-path.sh | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index 39b4cb0..5287f6a 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -13,6 +13,9 @@ test_description='--ancestry-path' # # D..M -- M.t == M # --ancestry-path D..M -- M.t == M +# +# F...I == F G H I +# --ancestry-path F...I == F H I . ./test-lib.sh @@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' ' test_cmp expect actual ' -test_expect_success 'rev-list --ancestry-patch D..M -- M.t' ' +test_expect_success 'rev-list --ancestry-path D..M -- M.t' ' echo M expect git rev-list --ancestry-path --format=%s D..M -- M.t | sed -e /^commit /d actual test_cmp expect actual ' +test_expect_success 'rev-list F...I' ' + for c in F G H I; do echo $c; done expect + git rev-list --format=%s F...I | + sed -e /^commit /d | + sort actual + test_cmp expect actual +' + +test_expect_failure 'rev-list --ancestry-path F...I' ' + for c in F H I; do echo $c; done expect + git rev-list --ancestry-path --format=%s F...I | + sed -e /^commit /d | + sort actual + test_cmp expect actual +' + # b---bc # / \ / # a X -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 0/2] merge-base: add --merge-child option
On 11/05/2013 15:23, John Keeping wrote: This is helpful when examining branches with disjoint roots, for example because one is periodically merged into a subtree of the other. $ git log --left-right F...E --not $(git merge-base --merge-child F E) F E Wouldn't --left-right --ancestry-path F...E do the job? I can't immediately see how that differs. Unfortunately, that syntax doesn't currently work - I just stumbled across this problem, and my history traversal refinements series in pu fixes it, somewhat incidentally to the main subject in there. However, without that patch, the alternative way of expressing it: --left-right --ancestry path F E --not $(git merge-base --all F E) should still work. Unless --left-right is magic for ...? If it is, then my patch is more significant than I thought. My series will also be better at optimising away D too, through a combination of my and Junio's efforts. Try it out. On this subject, is there any way to exclude a path from a log query? Is there a not operator for paths? Might be another way of doing this - disjoint histories probably have disjoint paths... Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 0/2] merge-base: add --merge-child option
On 12/05/2013 19:33, John Keeping wrote: On Sun, May 12, 2013 at 05:28:24PM +0100, John Keeping wrote: You're right (and I was wrong in my reply to Junio's parallel message) ancestry path does seem to be what I want: $ git rev-list --ancestry-path --left-right --count \ master...git-gui/master 20565 However, this doesn't seem to make a difference to the time taken when I add in --cherry-mark (which is why I was partially correct in the parallel thread - it doesn't have the effect on cherry-mark that I want it to): This seems to be caused by the code in revision.c::limit_list() which does the cherry detection then limits to left/right and only then applies the ancestry path. I haven't looked further than that, but is there any reason not to apply the ancestry path restriction before looking for patch-identical commits? That certainly sounds like it's doing it the wrong way round. At first sight, it seems obviously suboptimal. No I didn't. I forgot to update my $PATH when I built on master - those results are from pu. master says: fatal: --ancestry-path given but there are no bottom commits Well, then it looks like we have a user for that particular syntax. Seemed a bit esoteric to me :) Although I realised after sending my mail you could also use git log --ancestry-path --left-right E...F --not $(git merge-base --all E F) which looks like we're having to repeat ourselves because it's not paying attention... I hit it because one of my optimisations relies on knowing the bottom commits, and I made absolutely sure I was using the exactly same definition of bottom as --ancestry-path. And then I found that my optimisations didn't work properly with ... I suggest we pull my patch out from the more complex optimisation series so it can proceed to next faster. It shouldn't have to wait for all my new fancy stuff - it's fixing something that appears to be a clear bug. Although Junio did have a comment about the implementation - I'll revisit it tomorrow and send a revised version separately, if everyone thinks that's sensible. On this subject, is there any way to exclude a path from a log query? Is there a not operator for paths? Might be another way of doing this - disjoint histories probably have disjoint paths... That relates to another idea I had about optimizing the detection of patch-identical commits. If the smaller side of a symmetric difference is quite small (as it is likely to be if it's a topic branch), would it be a good idea to calculate the set of paths touched by commits on that side and then skip calculating the patch ID for any commits that touch paths outside that set. The tree comparison is a lot cheaper than doing a diff on all of the files. Sounds cute. Go for it :) Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 0/2] merge-base: add --merge-child option
On 12/05/2013 19:58, John Keeping wrote: With the patch below, the --ancestry-path version drops to under 2 seconds. I'm not sure if this is a good idea though. It helps me say I know nothing that isn't on the ancestry path can be patch-identical, so don't bother checking if it is but it regresses users who want the full cherry-pick check while only limiting the output. Hmm. Should an excluded commit be a valid comparator? Is it sensible/correct to show a left commit as = to a right commit that has been excluded by the revision specifiers? Doesn't sound right to me. I'm not convinced that there's a valid use-case that you're regressing here. If --ancestry-path is being misused (the user's assertion that non-ancestry doesn't matter is wrong) the error of not noting culled patch-identical commits is nothing compared to the fact we're already totally omitting the non-identical ones. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 5/9] revision.c: Make --full-history consider more merges
On 06/05/2013 23:45, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; I have been wondering if we want to do one-bit (not one-byte) per parent but no biggie ;-) I did start down that path, because I felt bad about bloat. But then I realised how much I would be complicating and slowing the code down to only save a few bytes each time we walk a merge with at least 5 parents, and I came to my senses. :) Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 10/9] revision.c: treat A...B merge bases as if manually specified
The documentation assures users that A...B is defined as 'r1 r2 --not $(git merge-base --all r1 r2)'. This isn't in fact quite true, because the calculated merge bases are not sent to add_rev_cmdline(). Previously, the effect was pretty minor - all that I can think of is that git rev-list --ancestry-path A B ^AB_base works, but git rev-list --ancestry-path A...B fails with a no bottom commits error. But now that all history walking cares about bottom commits, this failure to note the merge bases as bottoms can lead to worse results for A...B compared to A B ^AB_base. So ensure that the calculated merge bases are sent to add_rev_cmdline(), flagged as 'REV_CMD_MERGE_BASE'. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c | 9 +++-- revision.h | 2 ++ t/t6111-rev-list-treesame.sh | 4 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/revision.c b/revision.c index 99a3432..aad16a6 100644 --- a/revision.c +++ b/revision.c @@ -1305,12 +1305,16 @@ void init_revisions(struct rev_info *revs, const char *prefix) static void add_pending_commit_list(struct rev_info *revs, struct commit_list *commit_list, +int whence, unsigned int flags) { while (commit_list) { struct object *object = commit_list-item-object; + const char *sha1 = sha1_to_hex(object-sha1); object-flags |= flags; - add_pending_object(revs, object, sha1_to_hex(object-sha1)); + if (whence != REV_CMD_NONE) + add_rev_cmdline(revs, object, sha1, whence, flags); + add_pending_object(revs, object, sha1); commit_list = commit_list-next; } } @@ -1332,7 +1336,7 @@ static void prepare_show_merge(struct rev_info *revs) add_pending_object(revs, head-object, HEAD); add_pending_object(revs, other-object, MERGE_HEAD); bases = get_merge_bases(head, other, 1); - add_pending_commit_list(revs, bases, UNINTERESTING); + add_pending_commit_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING); free_commit_list(bases); head-object.flags |= SYMMETRIC_LEFT; @@ -1420,6 +1424,7 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi if (symmetric) { exclude = get_merge_bases(a, b, 1); add_pending_commit_list(revs, exclude, + REV_CMD_MERGE_BASE, flags_exclude); free_commit_list(exclude); a_flags = flags | SYMMETRIC_LEFT; diff --git a/revision.h b/revision.h index 765630a..b2ac5a3 100644 --- a/revision.h +++ b/revision.h @@ -32,10 +32,12 @@ struct rev_cmdline_info { struct object *item; const char *name; enum { + REV_CMD_NONE, REV_CMD_REF, REV_CMD_PARENTS_ONLY, REV_CMD_LEFT, REV_CMD_RIGHT, + REV_CMD_MERGE_BASE, REV_CMD_REV } whence; unsigned flags; diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh index 689d357..ded0b1a 100755 --- a/t/t6111-rev-list-treesame.sh +++ b/t/t6111-rev-list-treesame.sh @@ -161,6 +161,10 @@ check_result 'F' B..F --ancestry-path --simplify-merges -- file check_result 'F D' B..F --first-parent check_result 'F' B..F --first-parent -- file +# E...F should be equivalent to E F ^B, and be able to drop D as above. +check_result 'F' E F ^B -- file +check_result 'F' E...F -- file + # Any sort of full history of C..F should show D, as it's the connection to C, # and it differs from it. check_result 'F D B' C..F -- 1.8.3.rc0.28.g4b02ef5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 3/9] t6111: new TREESAME test set
Some side branching and odd merging to illustrate various flaws in revision list scans, particularly when limiting the list. Many expected failures, which will be gone by the end of the history traversal refinements series. Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6111-rev-list-treesame.sh | 180 +++ 1 file changed, 180 insertions(+) create mode 100755 t/t6111-rev-list-treesame.sh diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh new file mode 100755 index 000..602c02d --- /dev/null +++ b/t/t6111-rev-list-treesame.sh @@ -0,0 +1,180 @@ +#!/bin/sh +# +#,---E--. *H--. * marks !TREESAME parent paths +# /\ / \* +# *A--*B---D--*F-*G-K-*L-*M +# \ /* \ / +#`-C-' `-*I-*J +# +# A creates file, B and F change it. +# Odd merge G takes the old version from B. +# I changes it, but J reverts it. +# H and L both change it, and M merges those changes. + +test_description='TREESAME and limiting' + +. ./test-lib.sh + +note () { + git tag $1 +} + +unnote () { + git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g +} + +test_expect_success setup ' + test_commit Initial file file Hi there A + git branch other-branch + + test_commit file=Hello file Hello B + git branch third-branch + + git checkout other-branch + test_commit Added other other Hello C + + git checkout master + test_merge D other-branch + + git checkout third-branch + test_commit Third file third Nothing E + + git checkout master + test_commit file=Blah file Blah F + + test_tick git merge --no-commit third-branch + git checkout third-branch file + git commit + note G + git branch fiddler-branch + + git checkout -b part2-branch + test_commit file=Part 2 file Part 2 H + + git checkout fiddler-branch + test_commit Bad commit file Silly I + + test_tick git revert I note J + + git checkout master + test_tick git merge --no-ff fiddler-branch + note K + + test_commit file=Part 1 file Part 1 L + + test_tick test_must_fail git merge part2-branch + test_commit M file Parts 1+2 +' + +FMT='tformat:%P%H | %s' + +# could we soup this up to optionally check parents? So (BA)C would check +# that C is shown and has parents B A. +check_outcome () { + outcome=$1 + shift + for c in $1 + do + echo $c + done expect + shift + param=$* + test_expect_$outcome log $param ' + git log --format=$FMT $param | + unnote actual + sed -e s/^.* \([^ ]*\) .*/\1/ check actual + test_cmp expect check || { + cat actual + false + } + ' +} + +check_result () { + check_outcome success $@ +} + +# Odd merge G drops a change in F. Important that G is listed in all +# except the most basic list. Achieving this means normal merge D will also be +# shown in normal full-history, as we can't distinguish unless we do a +# simplification pass. After simplification, D is dropped but G remains. +check_result 'M L K J I H G F E D C B A' +check_result 'M H L K J I G E F D C B A' --topo-order +check_result 'M L H B A' -- file +check_result 'M L H B A' --parents -- file +check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G +check_result 'M L K J I H G F D B A' --full-history --parents -- file +check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G +check_result 'M L K G F D B A' --first-parent +check_result 'M L G F B A' --first-parent -- file + +# Check that odd merge G remains shown when F is the bottom. +check_result 'M L K J I H G E' F..M +check_result 'M H L K J I G E' F..M --topo-order +check_result 'M L H' F..M -- file +check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so more useful than it may seem +check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G +check_result 'M L K J I H G' F..M --full-history --parents -- file +check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G +check_result 'M L K J I H G' F..M --ancestry-path +check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G +check_result 'M L K J I H G' F..M --ancestry-path --parents -- file +check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file +check_result 'M L K G' F..M --first-parent +check_result 'M L G' F..M --first-parent -- file + +# Note that G is pruned when E is the bottom, even if it's the same commit list +# If we want history since E, then we're quite happy to ignore G that took E. +check_result 'M L K J I H G' E..M --ancestry-path +check_result 'M L J I H' E..M --ancestry-path -- file
[PATCH v3 6/9] t6012: update test for tweaked full-history traversal
From: Junio C Hamano gits...@pobox.com Signed-off-by: Junio C Hamano gits...@pobox.com --- t/t6012-rev-list-simplify.sh | 29 +++-- 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index dd6dc84..4e55872 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -14,21 +14,24 @@ unnote () { test_expect_success setup ' echo Hi there file - git add file - test_tick git commit -m Initial file + echo initial lost + git add file lost + test_tick git commit -m Initial file and lost note A git branch other-branch echo Hello file - git add file - test_tick git commit -m Modified file + echo second lost + git add file lost + test_tick git commit -m Modified file and lost note B git checkout other-branch echo Hello file - git add file + lost + git add file lost test_tick git commit -m Modified the file identically note C @@ -37,7 +40,9 @@ test_expect_success setup ' test_tick git commit -m Add another file note D - test_tick git merge -m merge master + test_tick + test_must_fail git merge -m merge master + lost git commit -a -m merge note E echo Yet another elif @@ -110,4 +115,16 @@ check_result 'I B A' -- file check_result 'I B A' --topo-order -- file check_result 'H' --first-parent -- another-file +check_result 'E C B A' --full-history E -- lost +test_expect_success 'full history simplification without parent' ' + printf %s\n E C B A expect + git log --pretty=$FMT --full-history E -- lost | + unnote actual + sed -e s/^.* \([^ ]*\) .*/\1/ check actual + test_cmp expect check || { + cat actual + false + } +' + test_done -- 1.8.3.rc0.28.g682c2d9 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 7/9] simplify-merges: never remove all TREESAME parents
When simplifying an odd merge, such as one that used -s ours, we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent; if this would happen reinstate the first TREESAME parent - the one that the default log would have followed. This avoids producing a totally disjoint history from the default log when the default log is a better explanation of the end result, and aids visualisation of odd merges. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 +- revision.c | 69 ++ t/t6111-rev-list-treesame.sh | 2 +- 3 files changed, 72 insertions(+), 2 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 1dad341..e23bdb0 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -471,7 +471,8 @@ history according to the following rules: + * Replace each parent `P` of `C'` with its simplification `P'`. In the process, drop parents that are ancestors of other parents, and - remove duplicates. + remove duplicates, but take care to never drop all parents that + we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. diff --git a/revision.c b/revision.c index c88ded8..7535757 100644 --- a/revision.c +++ b/revision.c @@ -2119,6 +2119,73 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit) return marked; } +/* + * Awkward naming - this means one parent we are TREESAME to. + * cf mark_treesame_root_parents: root parents that are TREESAME (to an + * empty tree). Better name suggestions? + */ +static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *commit) +{ + struct treesame_state *ts = lookup_decoration(revs-treesame, commit-object); + struct commit *unmarked = NULL, *marked = NULL; + struct commit_list *p; + unsigned n; + + for (p = commit-parents, n = 0; p; p = p-next, n++) { + if (ts-treesame[n]) { + if (p-item-object.flags TMP_MARK) { + if (!marked) + marked = p-item; + } else { + if (!unmarked) { + unmarked = p-item; + break; + } + } + } + } + + /* +* If we are TREESAME to a marked-for-deletion parent, but not to any +* unmarked parents, unmark the first TREESAME parent. This is the +* parent that the default simplify_history==1 scan would have followed, +* and it doesn't make sense to omit that path when asking for a +* simplified full history. Retaining it improves the chances of +* understanding odd missed merges that took an old version of a file. +* +* Example: +* +* I-X A modified the file, but mainline merge X used +*\ /-s ours, so took the version from I. X is +* `--A--' TREESAME to I and !TREESAME to A. +* +* Default log from X would produce I. Without this check, +* --full-history --simplify-merges would produce I-A-X, showing +* the merge commit X and that it changed A, but not making clear that +* it had just taken the I version. With this check, the topology above +* is retained. +* +* Note that it is possible that the simplification chooses a different +* TREESAME parent from the default, in which case this test doesn't +* activate, and we _do_ drop the default parent. Example: +* +* I--X A modified the file, but it was reverted in B, +*\/ meaning mainline merge X is TREESAME to both +* A--B parents. +* +* Default log would produce I by following the first parent; +* --full-history --simplify-merges will produce I-A-B. But this is a +* reasonable result - it presents a logical full history leading from +* I to X, and X is not an important merge. +*/ + if (!unmarked marked) { + marked-object.flags = ~TMP_MARK; + return 1; + } + + return 0; +} + static int remove_marked_parents(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *p; @@ -,6 +2289,8 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c if (1 cnt) { int marked = mark_redundant_parents(revs, commit); if (marked) + marked -= leave_one_treesame_to_parent(revs
[PATCH v3 0/9] History traversal refinements
New version - nothing much changed since v2.2, except the new test set to illustrate and prove the changes. Not sure about the t6111 numbering - there wasn't space where I really wanted to put it. And maybe it should be appended to one of the existing tests. You will note that I'm floundering for the name for the commits I care about in part 9. Currently at priority, but that's horrible, not least because it isn't an adjective. I think the word I really want is interesting, but that's already taken... Relevant, important? Junio C Hamano (1): t6012: update test for tweaked full-history traversal Kevin Bracey (8): decorate.c: compact table when growing t6019: test file dropped in -s ours merge t6111: new TREESAME test set rev-list-options.txt: correct TREESAME for P revision.c: Make --full-history consider more merges simplify-merges: never remove all TREESAME parents simplify-merges: drop merge from irrelevant side branch revision.c: discount side branches when computing TREESAME Documentation/rev-list-options.txt | 38 +-- decorate.c | 2 +- revision.c | 511 + revision.h | 4 +- t/t6012-rev-list-simplify.sh | 31 ++- t/t6019-rev-list-ancestry-path.sh | 29 ++- t/t6111-rev-list-treesame.sh | 180 + 7 files changed, 723 insertions(+), 72 deletions(-) create mode 100755 t/t6111-rev-list-treesame.sh -- 1.8.3.rc0.28.g682c2d9 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/9] t6019: test file dropped in -s ours merge
In preparation for upcoming TREESAME work, check the result for G.t, which is dropped in -s ours merge L. The default rev-list is empty, as expected - it follows the first parent path where it never existed. Unfortunately, --ancestry-path is also empty. Merges H J and L are all TREESAME to 1 parent, so are treated as TREESAME and not shown. This is clearly undesirable in the case of merge L, which dropped our G.t by taking the non-ancestry-path version. Document this as a known failure, and expect H J L, the 3 merges along the path that had to chose G.t versions. Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6019-rev-list-ancestry-path.sh | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index 39b4cb0..86ef908 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -13,6 +13,9 @@ test_description='--ancestry-path' # # D..M -- M.t == M # --ancestry-path D..M -- M.t == M +# +# G..M -- G.t == [nothing - was dropped in -s ours merge L] +# --ancestry-path G..M -- G.t == H J L . ./test-lib.sh @@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' ' test_cmp expect actual ' -test_expect_success 'rev-list --ancestry-patch D..M -- M.t' ' +test_expect_success 'rev-list --ancestry-path D..M -- M.t' ' echo M expect git rev-list --ancestry-path --format=%s D..M -- M.t | sed -e /^commit /d actual test_cmp expect actual ' +# G.t is dropped in an -s ours merge +test_expect_success 'rev-list G..M -- G.t' ' + expect + git rev-list --format=%s G..M -- G.t | + sed -e /^commit /d actual + test_cmp expect actual +' + +test_expect_failure 'rev-list --ancestry-path G..M -- G.t' ' + for c in H J L; do echo $c; done expect + git rev-list --ancestry-path --format=%s G..M -- G.t | + sed -e /^commit /d | + sort actual + test_cmp expect actual +' + # b---bc # / \ / # a X -- 1.8.3.rc0.28.g682c2d9 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/9] decorate.c: compact table when growing
When growing the table, take the opportunity to compact it by removing entries with NULL decoration. Users may have removed decorations by passing NULL to insert_decoration. An object's table entry can't actually be removed during normal operation, as it would break the linear hash collision search. But we can remove NULL decoration entries when rebuilding the table. Signed-off-by: Kevin Bracey ke...@bracey.fi --- decorate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/decorate.c b/decorate.c index 2f8a63e..7cb5d29 100644 --- a/decorate.c +++ b/decorate.c @@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n) const struct object *base = old_hash[i].base; void *decoration = old_hash[i].decoration; - if (!base) + if (!decoration) continue; insert_decoration(n, base, decoration); } -- 1.8.3.rc0.28.g682c2d9 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 8/9] simplify-merges: drop merge from irrelevant side branch
Reimplement commit 4b7f53da on top of the new simplify-merges infrastructure, tightening the condition to only consider root parents; the original version incorrectly dropped parents that were TREESAME to anything. Original log message follows. The merge simplification rule stated in 6546b59 (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Suggested-by: Junio C Hamano gits...@pobox.com Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 34 +- revision.c | 26 +- t/t6012-rev-list-simplify.sh | 2 +- 3 files changed, 47 insertions(+), 15 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index e23bdb0..b7fbc80 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to illustrate the differences between simplification settings. We assume that you are filtering for a file `foo` in this commit graph: --- - .-A---M---N---O---P -/ / / / / - I B C D E -\ / / / / - `-' + .-A---M---N---O---P---Q +/ / / / / / + I B C D E Y +\ / / / / / + `-' X --- -The horizontal line of history A---P is taken to be the first parent of +The horizontal line of history A---Q is taken to be the first parent of each merge. The commits are: * `I` is the initial commit, in which `foo` exists with contents @@ -369,6 +369,10 @@ each merge. The commits are: * `E` changes `quux` to xyzzy, and its merge `P` combines the strings to quux xyzzy. `P` is TREESAME to `O`, but not to `E`. +* `X` is an indpendent root commit that added a new file `side`, and `Y` + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and + `Q` is TREESAME to `P`, but not to `Y`. + 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting (via '\--parents' or '\--children') are used. The following settings @@ -409,7 +413,7 @@ parent lines. the example, we get + --- - I A B N D O P + I A B N D O P Q --- + `M` was excluded because it is TREESAME to both parents. `E`, @@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included themselves. This results in + --- - .-A---M---N---O---P + .-A---M---N---O---P---Q / / / / / I B / D / \ / / / / @@ -440,7 +444,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. +`N`, and `X`, `Y` and `Q`. In addition to the above settings, you can change whether TREESAME affects inclusion: @@ -470,9 +474,9 @@ history according to the following rules: * Set `C'` to `C`. + * Replace each parent `P` of `C'` with its simplification `P'`. In - the process, drop parents that are ancestors of other parents, and - remove duplicates, but take care to never drop all parents that - we are TREESAME to. + the process, drop parents that are ancestors of other parents or that are + root commits TREESAME to an empty tree, and remove duplicates, but take care + to never drop all parents that we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. @@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing
[PATCH v3 9/9] revision.c: discount side branches when computing TREESAME
Add a BOTTOM flag to commit objects, and use it to define priority for pruning. Priority commits are those that are !UNINTERESTING or BOTTOM, and this allows us to identify irrelevant side branches (UNINTERESTING !BOTTOM). If a merge has priority parents, and it is TREESAME to them, then do not let non-priority parents cause the merge to be treated as !TREESAME. When considering simplification, don't always include all merges - merges with exactly 1 priority parent can be simplified, if TREESAME according to the above rule. These two changes greatly increase simplification in limited revision lists - irrelevant merges from unlisted commits can be omitted. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c| 201 -- revision.h| 3 +- t/t6019-rev-list-ancestry-path.sh | 12 ++- t/t6111-rev-list-treesame.sh | 22 ++--- 4 files changed, 195 insertions(+), 43 deletions(-) diff --git a/revision.c b/revision.c index 20c7058..99a3432 100644 --- a/revision.c +++ b/revision.c @@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * A definition of priority commit that we can use to simplify limited graphs + * by eliminating side branches. + * + * A priority commit is one that is !UNINTERESTING (ie we are including it + * in our list), or that is a specified BOTTOM commit. Then after computing + * a limited list, during processing we can generally ignore boundary merges + * coming from outside the graph, (ie from non-priority parents), and treat + * those merges as if they were single-parent. TREESAME is defined to consider + * only priority parents, if any. If we are TREESAME to our on-graph parents, + * we don't care if we were !TREESAME to non-graph parents. + * + * Including bottom commits in priority ensures that a limited graph's + * connection to the actual bottom commit is not viewed as an uninteresting + * side branch, but treated as part of the graph. For example: + * + * Z...A---X---o---o---B + *. / + * W---Y + * + * When computing A..B, the A-X connection is at least as important as + * Y-X, despite A being flagged UNINTERESTING. + * + * And when computing --ancestry-path A..B, the A-X connection is more + * important than Y-X, despite both A and Y being flagged UNINTERESTING. + */ +static inline int priority_commit(struct commit *commit) +{ + return (commit-object.flags (UNINTERESTING | BOTTOM)) != UNINTERESTING; +} + +/* + * Return a single priority commit from a parent list. If we are a TREESAME + * commit, and this selects one of our parents, then we can safely simplify to + * that parent. + * + * For 1-parent commits, or if first-parent-only, then this returns the only + * parent. TREESAME will have been set purely on that parent. + * + * For multi-parent commits, identify a sole priority parent, if any. + * If we have only one priority parent, then TREESAME will be set purely + * with regard to that parent, and we can choose simplification appropriately. + * + * If we have more than one priority parent, or no priority parents + * (and multiple non-priority ones), then we can't select a parent here and + * return NULL. + */ +static struct commit *priority_select_parent(struct rev_info *revs, struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *priority = NULL; + + if (!orig) + return NULL; + + if (revs-first_parent_only || !orig-next) + return orig-item; + + while (list) { + struct commit *commit = list-item; + list = list-next; + if (priority_commit(commit)) { + if (priority) + return NULL; + priority = commit; + } + } + return priority; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -502,27 +572,52 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit) if (commit-parents commit-parents-next) { unsigned n; struct treesame_state *st; + struct commit_list *p; + unsigned priority_parents; + unsigned priority_change, nonpriority_change; st = lookup_decoration(revs-treesame, commit-object); if (!st) die(update_treesame %s, sha1_to_hex(commit-object.sha1)); - commit-object.flags |= TREESAME; - for (n = 0; n st-nparents; n++) { - if (!st-treesame[n]) { - commit-object.flags = ~TREESAME; - break; - } + priority_parents = 0
[PATCH v3 5/9] revision.c: Make --full-history consider more merges
History simplification previously always treated merges as TREESAME if they were TREESAME to any parent. While this was consistent with the default behaviour, this could be extremely unhelpful when searching detailed history, and could not be overridden. For example, if a merge had ignored a change, as if by -s ours, then: git log -m -p --full-history -Schange file would successfully locate change's addition but would not locate the merge that resolved against it. Futher, simplify_merges could drop the actual parent that a commit was TREESAME to, leaving it as a normal commit marked TREESAME that isn't actually TREESAME to its remaining parent. Now redefine a commit's TREESAME flag to be true only if a commit is TREESAME to _all_ of its parents. This doesn't affect either the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits), or full-history with parent rewriting (because all merges are output). But it does affect other modes. The clearest difference is that --full-history will show more merges - sufficient to ensure that -m -p --full-history log searches can really explain every change to the file, including those changes' ultimate fate in merges. Also modify simplify_merges to recalculate TREESAME after removing a parent. This is achieved by storing per-parent TREESAME flags on the initial scan, so the combined flag can be easily recomputed. This fixes some t6111 failures, but creates a couple of new ones - we are now showing some merges that don't need to be shown. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 6 +- revision.c | 241 - revision.h | 1 + t/t6019-rev-list-ancestry-path.sh | 2 +- t/t6111-rev-list-treesame.sh | 20 +-- 5 files changed, 226 insertions(+), 44 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 50bbff7..1dad341 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -409,10 +409,10 @@ parent lines. the example, we get + --- - I A B N D O + I A B N D O P --- + -`P` and `M` were excluded because they are TREESAME to a parent. `E`, +`M` was excluded because it is TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + @@ -440,7 +440,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. Note also that `P` was included despite being TREESAME. +`N`. In addition to the above settings, you can change whether TREESAME affects inclusion: diff --git a/revision.c b/revision.c index a67b615..c88ded8 100644 --- a/revision.c +++ b/revision.c @@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) return retval = 0 (tree_difference == REV_TREE_SAME); } +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; + +static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit) +{ + unsigned n = commit_list_count(commit-parents); + struct treesame_state *st = xcalloc(1, sizeof(*st) + n); + st-nparents = n; + add_decoration(revs-treesame, commit-object, st); + return st; +} + +/* + * Must be called immediately after removing the nth_parent from a commit's + * parent list, if we are maintaining the per-parent treesame[] decoration. + * This does not recalculate the master TREESAME flag - update_treesame() + * should be called to update it after a sequence of treesame[] modifications + * that may have affected it. + */ +static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned nth_parent) +{ + struct treesame_state *st; + int old_same; + + if (!commit-parents) { + /* +* Have just removed the only parent from a non-merge. +* Different handling, as we lack decoration. +*/ + if (nth_parent != 0) + die(compact_treesame %u, nth_parent); + old_same = !!(commit-object.flags TREESAME); + if (rev_same_tree_as_empty(revs, commit)) + commit-object.flags |= TREESAME; + else + commit-object.flags = ~TREESAME; + return old_same; + } + + st = lookup_decoration(revs-treesame, commit-object); + if (!st || nth_parent = st-nparents) + die(compact_treesame %u
Re: [PATCH] revision.c: Fix a sparse warning
On 04/05/2013 20:25, Ramsay Jones wrote: Sparse issues an 'sole_interesting' not declared. Should it be static? warning. In order to suppress the warning, since this symbol does not need more than file visibility, we simply add the static modifier to its declaration. Thanks! I'll include that fix. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
On 02/05/2013 23:05, Junio C Hamano wrote: Z...A===X---o---o---B \\/ W---Y OK, I think I understand it, and we are in agreement. For the purpose of the above paragraph, a side branch is what is not on the --ancestry-path, so all of the below examples are about the behaviour when --ancestry-path is used. Ah, in fact, no. In some previous mails I was concentrating on ancestry-path, but those 3 examples were really of ordinary A..B, with W+Y in the INTERESTING set. I think the side-branch logic remains sound and desirable even in the absence of --ancestry-path, so I don't think this is an ancestry-path change. (And from a basic naive usability point of view I'm much more interested in improving the more obvious modes than the rather more obscure --ancestry-path.) Ancestry path forces side branches to be ignored - it's the simple case for ignoring (and understanding) side branches. But if we let other modes know where the bottom is, they too can benefit from reliable side branch logic - we can find out if anything happened on side branches, but we can also ignore them completely if they turn out to be totally irrelevant. When not using ancestry-path, the side branches this patch works on are thosewhich go off and don't come back - they stub off at some UNINTERESTING commit other than the bottom(s). If no other limiting is set, they must have hit an ancestor of our BOTTOM commit(s); simplify-merges could have potentially pruned away if unlimited. And this patch restores that pruning ability - simplify-merges can rewrite them back to just 1 UNINTERESTING merge parent at the boundary (looking like an ancestry-path boundary), then this patch can chuck the boundary merge. Hey presto, irrelevant branch now invisible. And the patch also provides benefits to all other modes. I'll post v3 of the sequence tomorrow - it includes a new test which illustrates the changes - it's a 60-or-so-item test set, with about 15 failures in a variety of modes that get fixed by this sequence. I think that should make an excellent discussion topic. We'll see whether folks agree with my view about what should and shouldn't be shown... Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME
Add a BOTTOM flag to commit objects, and use it to define priority for pruning. Priority commits are those that are !UNINTERESTING or BOTTOM, and this allows us to identify irrelevant side branches (UNINTERESTING !BOTTOM). If a merge has priority parents, and it is TREESAME to them, then do not let non-priority parents cause the merge to be treated as !TREESAME. When considering simplification, don't always include all merges - merges with exactly 1 priority parent can be simplified, if TREESAME according to the above rule. These two changes greatly increase simplification in limited revision lists - irrelevant merges from unlisted commits can be omitted. Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c| 210 -- revision.h| 3 +- t/t6019-rev-list-ancestry-path.sh | 12 ++- 3 files changed, 188 insertions(+), 37 deletions(-) diff --git a/revision.c b/revision.c index ddaed33..2a7f08d 100644 --- a/revision.c +++ b/revision.c @@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * A definition of priority commit that we can use to simplify limited graphs + * by eliminating side branches. + * + * A priority commit is one that is !UNINTERESTING (ie we are including it + * in our list), or that is a specified BOTTOM commit. Then after computing + * a limited list, during processing we can generally ignore boundary merges + * coming from outside the graph, (ie from non-priority parents), and treat + * those merges as if they were single-parent. TREESAME is defined to consider + * only priority parents, if any. If we are TREESAME to our on-graph parents, + * we don't care if we were !TREESAME to non-graph parents. + * + * Including bottom commits in priority ensures that a limited graph's + * connection to the actual bottom commit is not viewed as an uninteresting + * side branch, but treated as part of the graph. For example: + * + * Z...A---X---o---o---B + *. / + * W---Y + * + * When computing A..B, the A-X connection is at least as important as + * Y-X, despite A being flagged UNINTERESTING. + * + * And when computing --ancestry-path A..B, the A-X connection is more + * important than Y-X, despite both A and Y being flagged UNINTERESTING. + */ +static inline int priority_commit(struct commit *commit) +{ + return (commit-object.flags (UNINTERESTING | BOTTOM)) != UNINTERESTING; +} + +/* + * Return a single priority commit from a parent list. If we are a TREESAME + * commit, and this selects one of our parents, then we can safely simplify to + * that parent. + * + * For 1-parent commits, or if first-parent-only, then this returns the only + * parent. TREESAME will have been set purely on that parent. + * + * For multi-parent commits, identify a sole priority parent, if any. + * If we have only one priority parent, then TREESAME will be set purely + * with regard to that parent, and we can choose simplification appropriately. + * + * If we have more than one priority parent, or no priority parents + * (and multiple non-priority ones), then we can't select a parent here and + * return NULL. + */ +struct commit *priority_select_parent(struct rev_info *revs, struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *priority = NULL; + + if (!orig) + return NULL; + + if (revs-first_parent_only || !orig-next) + return orig-item; + + while (list) { + struct commit *commit = list-item; + list = list-next; + if (priority_commit(commit)) { + if (priority) + return NULL; + priority = commit; + } + } + return priority; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -502,17 +572,28 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit) if (commit-parents commit-parents-next) { unsigned n; struct treesame_state *st; + struct commit_list *p; + unsigned priority_parents; + unsigned priority_change, nonpriority_change; st = lookup_decoration(revs-treesame, commit-object); if (!st) die(update_treesame %s, sha1_to_hex(commit-object.sha1)); - commit-object.flags |= TREESAME; - for (n = 0; n st-nparents; n++) { - if (!st-treesame[n]) { - commit-object.flags = ~TREESAME; - break; - } + priority_parents = 0; + priority_change = nonpriority_change = 0
[PATCH v2.2 8/8] revision.c: discount side branches when computing TREESAME
Add a BOTTOM flag to commit objects, and use it to define priority for pruning. Priority commits are those that are !UNINTERESTING or BOTTOM, and this allows us to identify irrelevant side branches (UNINTERESTING !BOTTOM). If a merge has priority parents, and it is TREESAME to them, then do not let non-priority parents cause the merge to be treated as !TREESAME. When considering simplification, don't always include all merges - merges with exactly 1 priority parent can be simplified, if TREESAME according to the above rule. These two changes greatly increase simplification in limited revision lists - irrelevant merges from unlisted commits can be omitted. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Something went wrong with v2.1; it got based on an old version of the series. This one should apply to v2 correctly. revision.c| 216 +- revision.h| 3 +- t/t6019-rev-list-ancestry-path.sh | 12 ++- 3 files changed, 199 insertions(+), 32 deletions(-) diff --git a/revision.c b/revision.c index 20c7058..e5d5091 100644 --- a/revision.c +++ b/revision.c @@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * A definition of priority commit that we can use to simplify limited graphs + * by eliminating side branches. + * + * A priority commit is one that is !UNINTERESTING (ie we are including it + * in our list), or that is a specified BOTTOM commit. Then after computing + * a limited list, during processing we can generally ignore boundary merges + * coming from outside the graph, (ie from non-priority parents), and treat + * those merges as if they were single-parent. TREESAME is defined to consider + * only priority parents, if any. If we are TREESAME to our on-graph parents, + * we don't care if we were !TREESAME to non-graph parents. + * + * Including bottom commits in priority ensures that a limited graph's + * connection to the actual bottom commit is not viewed as an uninteresting + * side branch, but treated as part of the graph. For example: + * + * Z...A---X---o---o---B + *. / + * W---Y + * + * When computing A..B, the A-X connection is at least as important as + * Y-X, despite A being flagged UNINTERESTING. + * + * And when computing --ancestry-path A..B, the A-X connection is more + * important than Y-X, despite both A and Y being flagged UNINTERESTING. + */ +static inline int priority_commit(struct commit *commit) +{ + return (commit-object.flags (UNINTERESTING | BOTTOM)) != UNINTERESTING; +} + +/* + * Return a single priority commit from a parent list. If we are a TREESAME + * commit, and this selects one of our parents, then we can safely simplify to + * that parent. + * + * For 1-parent commits, or if first-parent-only, then this returns the only + * parent. TREESAME will have been set purely on that parent. + * + * For multi-parent commits, identify a sole priority parent, if any. + * If we have only one priority parent, then TREESAME will be set purely + * with regard to that parent, and we can choose simplification appropriately. + * + * If we have more than one priority parent, or no priority parents + * (and multiple non-priority ones), then we can't select a parent here and + * return NULL. + */ +struct commit *priority_select_parent(struct rev_info *revs, struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *priority = NULL; + + if (!orig) + return NULL; + + if (revs-first_parent_only || !orig-next) + return orig-item; + + while (list) { + struct commit *commit = list-item; + list = list-next; + if (priority_commit(commit)) { + if (priority) + return NULL; + priority = commit; + } + } + return priority; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -502,27 +572,54 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit) if (commit-parents commit-parents-next) { unsigned n; struct treesame_state *st; + struct commit_list *p; + unsigned priority_parents; + unsigned priority_change, nonpriority_change; st = lookup_decoration(revs-treesame, commit-object); if (!st) die(update_treesame %s, sha1_to_hex(commit-object.sha1)); - commit-object.flags |= TREESAME; - for (n = 0; n st-nparents; n++) { - if (!st-treesame[n]) { - commit-object.flags = ~TREESAME; - break
Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents
On 01/05/2013 00:18, Junio C Hamano wrote: These rules paying more attention to UNINTERESTING do add a tricky wrinkle to behaviour. Because limited revision lists are conventionally expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING. OK. Thus its connection to the INTERESTING graph is not privileged over side branches, I take that its connection refers to the === link below, the nodes connected with --- form the INTERESTING graph, and Z...A===X---o---o---B \\/ W---Y side branches refer to the development that built W and Y and merged at X. And you are saying that A===X is not privileged over Y---X, with some definition of privileged I am not sure about. Okay, that's a good graph. The basic problem is that all the rules above try to identify a merge from an irrelevant branch and eliminate it. But how can we define what a side branch is? I think the rules I state are conceptually sound - a side branch is an extra parent coming in from outside our limited history graph. But the problem is at the bottom. In the event that someone specifies A..B with the above history, we get the resultant graph W-Y-X-o-o-B. A is not on that graph. So with the rules as they stand, A being UNINTERESTING makes it get treated as a side branch of X, which isn't good. A needs to be INTERESTING for the purpose of side-branch logic. So when someone says A..B and generates W-Y-X-o-o-B, we want to know that X's parent path (Z)-W-Y-X is the (possibly irrelevant) side branch, not (A)-X. Example undesirable behaviour, with A treated as a side branch: 1) If only commit A changed our file, and merge X took old version Y for some reason, under these rules --full-history A..B would show nothing. X doesn't consider A because it's UNINTERESTING. If there had been an intervening (even irrelevant) commit A1 between A1 and X, X would have been shown. 2) If only commit A changed our file, and merge X took A, with this rules--simplify-merges A..B would show X, with two rewritten UNINTERESTING parents A and Z. That's not what we want - Z is an irrelevant side-branch in this case. If there had been an intervening (unsimplifiable) commit A1 between A and X, X would have had INTERESTING parent A1, and WY would have been successfully discarded. 3) Even before my new rules, there is one existing place trying to spot side branches, and it can fail here for the same reasons. See simplify_history's even if a merge with an uninteresting side branch brought the entire change we are interested in test. It would do the wrong thing at X, if W had made a change and Y reverted it. git log A..B would show W and Y, which is not what we want. As the scan hits X, it follows parent Y rather than just go for first parent A, because it thinks A is an uninteresting side branch. Again, if there had been an intervening commit A1 before X, X would have followed A1 and W and Y would not have been shown. All 3 cases work if A is treated as INTERESTING for side-branch rules. We shouldn't have needed to put in an extra A1 commit to make them work. # D---E---F # / \ \ +#B---C-G0-G--H---I---J # / \ # A---K---L--M # Conceptually, the ancestry-path shouldn't get affected by any pathspec. The range --ancestry-path G0..M should be equivalent to the range G0..M ^F ^E ^K, and with the rule to ignore non-sameness with uninteresting side branches, I would have expected that H and J would be equally irrelevant, because E and F would be outside the graph we would want to look at sameness. Those two pathspecs produce the same graph of commits, and yes, they've always produced the same thing up until now. But we're trying to do something new(ish) here. We're trying to define side branch, to allow us to make more useful pruning comparisons. And we can't reliably define side branch unless we can reliably define where we're coming from. Looking at this case, given that graph of commits G-H-I-J-L-M (produced by any pathspec/flags), that is easy because it's linear. The bottom of the INTERESTING graph has a single parent, and we can follow it straight from bottom to top. We can deduce that non-merge G is bottom, and thus call the connections to E and F sides. (But that could have been wrong if the graph had been made some other way, and the user wasn't asking for history since G0). But if presented with H-I-J-L-M we get stuck. The lowest commit in our graph has 2 parents. How do we distinguish between E and G? Which is the side, and which is the bottom? We can only define side branch here if we know where our bottom is. The version of this patch as posted can't distinguish whether E or G is more important, so merge H always gets shown. And that makes me unhappy. And note that normal merge-simplification will often prune away boring non-merge commits, leaving the user-specified
[PATCH v2 5/8] t6012: update test for tweaked full-history traversal
From: Junio C Hamano gits...@pobox.com Signed-off-by: Junio C Hamano gits...@pobox.com --- t/t6012-rev-list-simplify.sh | 29 +++-- 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index dd6dc84..4e55872 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -14,21 +14,24 @@ unnote () { test_expect_success setup ' echo Hi there file - git add file - test_tick git commit -m Initial file + echo initial lost + git add file lost + test_tick git commit -m Initial file and lost note A git branch other-branch echo Hello file - git add file - test_tick git commit -m Modified file + echo second lost + git add file lost + test_tick git commit -m Modified file and lost note B git checkout other-branch echo Hello file - git add file + lost + git add file lost test_tick git commit -m Modified the file identically note C @@ -37,7 +40,9 @@ test_expect_success setup ' test_tick git commit -m Add another file note D - test_tick git merge -m merge master + test_tick + test_must_fail git merge -m merge master + lost git commit -a -m merge note E echo Yet another elif @@ -110,4 +115,16 @@ check_result 'I B A' -- file check_result 'I B A' --topo-order -- file check_result 'H' --first-parent -- another-file +check_result 'E C B A' --full-history E -- lost +test_expect_success 'full history simplification without parent' ' + printf %s\n E C B A expect + git log --pretty=$FMT --full-history E -- lost | + unnote actual + sed -e s/^.* \([^ ]*\) .*/\1/ check actual + test_cmp expect check || { + cat actual + false + } +' + test_done -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 6/8] simplify-merges: never remove all TREESAME parents
When simplifying an odd merge, such as one that used -s ours, we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent; if this would happen reinstate the first TREESAME parent - the one that the default log would have followed. This avoids producing a totally disjoint history from the default log when the default log is a better explanation of the end result, and aids visualisation of odd merges. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 +- revision.c | 69 ++ 2 files changed, 71 insertions(+), 1 deletion(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 1dad341..e23bdb0 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -471,7 +471,8 @@ history according to the following rules: + * Replace each parent `P` of `C'` with its simplification `P'`. In the process, drop parents that are ancestors of other parents, and - remove duplicates. + remove duplicates, but take care to never drop all parents that + we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. diff --git a/revision.c b/revision.c index c88ded8..7535757 100644 --- a/revision.c +++ b/revision.c @@ -2119,6 +2119,73 @@ static int mark_redundant_parents(struct rev_info *revs, struct commit *commit) return marked; } +/* + * Awkward naming - this means one parent we are TREESAME to. + * cf mark_treesame_root_parents: root parents that are TREESAME (to an + * empty tree). Better name suggestions? + */ +static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *commit) +{ + struct treesame_state *ts = lookup_decoration(revs-treesame, commit-object); + struct commit *unmarked = NULL, *marked = NULL; + struct commit_list *p; + unsigned n; + + for (p = commit-parents, n = 0; p; p = p-next, n++) { + if (ts-treesame[n]) { + if (p-item-object.flags TMP_MARK) { + if (!marked) + marked = p-item; + } else { + if (!unmarked) { + unmarked = p-item; + break; + } + } + } + } + + /* +* If we are TREESAME to a marked-for-deletion parent, but not to any +* unmarked parents, unmark the first TREESAME parent. This is the +* parent that the default simplify_history==1 scan would have followed, +* and it doesn't make sense to omit that path when asking for a +* simplified full history. Retaining it improves the chances of +* understanding odd missed merges that took an old version of a file. +* +* Example: +* +* I-X A modified the file, but mainline merge X used +*\ /-s ours, so took the version from I. X is +* `--A--' TREESAME to I and !TREESAME to A. +* +* Default log from X would produce I. Without this check, +* --full-history --simplify-merges would produce I-A-X, showing +* the merge commit X and that it changed A, but not making clear that +* it had just taken the I version. With this check, the topology above +* is retained. +* +* Note that it is possible that the simplification chooses a different +* TREESAME parent from the default, in which case this test doesn't +* activate, and we _do_ drop the default parent. Example: +* +* I--X A modified the file, but it was reverted in B, +*\/ meaning mainline merge X is TREESAME to both +* A--B parents. +* +* Default log would produce I by following the first parent; +* --full-history --simplify-merges will produce I-A-B. But this is a +* reasonable result - it presents a logical full history leading from +* I to X, and X is not an important merge. +*/ + if (!unmarked marked) { + marked-object.flags = ~TMP_MARK; + return 1; + } + + return 0; +} + static int remove_marked_parents(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *p; @@ -,6 +2289,8 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c if (1 cnt) { int marked = mark_redundant_parents(revs, commit); if (marked) + marked -= leave_one_treesame_to_parent(revs, commit); + if (marked
[PATCH v2 1/8] decorate.c: compact table when growing
When growing the table, take the opportunity to compact it by removing entries with NULL decoration. Users may have removed decorations by passing NULL to insert_decoration. An object's table entry can't actually be removed during normal operation, as it would break the linear hash collision search. But we can remove NULL decoration entries when rebuilding the table. Signed-off-by: Kevin Bracey ke...@bracey.fi --- decorate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/decorate.c b/decorate.c index 2f8a63e..7cb5d29 100644 --- a/decorate.c +++ b/decorate.c @@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n) const struct object *base = old_hash[i].base; void *decoration = old_hash[i].decoration; - if (!base) + if (!decoration) continue; insert_decoration(n, base, decoration); } -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P
In the example given, P is not TREESAME to E. This doesn't affect the current result, but it will matter when we change behaviour. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 3bdbf5e..50bbff7 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -367,8 +367,7 @@ each merge. The commits are: `N` and `D` to foobarbaz; i.e., it is not TREESAME to any parent. * `E` changes `quux` to xyzzy, and its merge `P` combines the - strings to quux xyzzy. Despite appearing interesting, `P` is - TREESAME to all parents. + strings to quux xyzzy. `P` is TREESAME to `O`, but not to `E`. 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/8] History traversal refinements
Okay, here's what I'll call v2 of this series. In the 3 parts from before (4,6 7), I've addressed the comments from Junio and David, corrected some errors, reconstructed the main commit message, and made some adjustments in preparation for part 8. New part 1 is just me making amends for writing NULL into decoration and leaving cruft behind in part 4. New part 2 expands the ancestry-path test - which is useful because it's full of -s ours merges. New part 3 has a little look at the TREESAME documentation bug - maybe we should add Junio's little asterisk decorations. Part 5 is Junio's test, in the correct place in the sequence. (Not sure if it's valid to send that with git send-email - I'll find out). And finally part 8 is a first attempt at the new UNINTERESTING/TREESAME interaction logic. I'm pretty happy with the results it produces, but it's an even more deep and scary change than the earlier parts. And we obviously need some more new tests - the effects of these changes are almost non-existent on the pre-existing set. I'd like to beg for any volunteers here - I'm not that proficient at shell scripting, and on top of that something like this could really do with an independent set of eyes checking that the claimed benefits actually match the results. (And that the claims are understandable.) Junio C Hamano (1): t6012: update test for tweaked full-history traversal Kevin Bracey (7): decorate.c: compact table when growing t6019: test file dropped in -s ours merge rev-list-options.txt: correct TREESAME for P revision.c: Make --full-history consider more merges simplify-merges: never remove all TREESAME parents simplify-merges: drop merge from irrelevant side branch revision.c: discount UNINTERESTING parents Documentation/rev-list-options.txt | 38 ++-- decorate.c | 2 +- revision.c | 453 + revision.h | 1 + t/t6012-rev-list-simplify.sh | 31 ++- t/t6019-rev-list-ancestry-path.sh | 37 ++- 6 files changed, 494 insertions(+), 68 deletions(-) -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch
Reimplement commit 4b7f53da on top of the new simplify-merges infrastructure, tightening the condition to only consider root parents; the original version incorrectly dropped parents that were TREESAME to anything. Original log message follows. The merge simplification rule stated in 6546b59 (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Suggested-by: Junio C Hamano gits...@pobox.com Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 34 +- revision.c | 26 +- t/t6012-rev-list-simplify.sh | 2 +- 3 files changed, 47 insertions(+), 15 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index e23bdb0..b7fbc80 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to illustrate the differences between simplification settings. We assume that you are filtering for a file `foo` in this commit graph: --- - .-A---M---N---O---P -/ / / / / - I B C D E -\ / / / / - `-' + .-A---M---N---O---P---Q +/ / / / / / + I B C D E Y +\ / / / / / + `-' X --- -The horizontal line of history A---P is taken to be the first parent of +The horizontal line of history A---Q is taken to be the first parent of each merge. The commits are: * `I` is the initial commit, in which `foo` exists with contents @@ -369,6 +369,10 @@ each merge. The commits are: * `E` changes `quux` to xyzzy, and its merge `P` combines the strings to quux xyzzy. `P` is TREESAME to `O`, but not to `E`. +* `X` is an indpendent root commit that added a new file `side`, and `Y` + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and + `Q` is TREESAME to `P`, but not to `Y`. + 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting (via '\--parents' or '\--children') are used. The following settings @@ -409,7 +413,7 @@ parent lines. the example, we get + --- - I A B N D O P + I A B N D O P Q --- + `M` was excluded because it is TREESAME to both parents. `E`, @@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included themselves. This results in + --- - .-A---M---N---O---P + .-A---M---N---O---P---Q / / / / / I B / D / \ / / / / @@ -440,7 +444,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. +`N`, and `X`, `Y` and `Q`. In addition to the above settings, you can change whether TREESAME affects inclusion: @@ -470,9 +474,9 @@ history according to the following rules: * Set `C'` to `C`. + * Replace each parent `P` of `C'` with its simplification `P'`. In - the process, drop parents that are ancestors of other parents, and - remove duplicates, but take care to never drop all parents that - we are TREESAME to. + the process, drop parents that are ancestors of other parents or that are + root commits TREESAME to an empty tree, and remove duplicates, but take care + to never drop all parents that we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. @@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing
[PATCH v2 4/8] revision.c: Make --full-history consider more merges
History simplification previously always treated merges as TREESAME if they were TREESAME to any parent. While this was consistent with the default behaviour, this could be extremely unhelpful when searching detailed history, and could not be overridden. For example, if a merge had ignored a change, as if by -s ours, then: git log -m -p --full-history -Schange file would successfully locate change's addition but would not locate the merge that resolved against it. Futher, simplify_merges could drop the actual parent that a commit was TREESAME to, leaving it as a normal commit marked TREESAME that isn't actually TREESAME to its remaining parent. Now redefine a commit's TREESAME flag to be true only if a commit is TREESAME to _all_ of its parents. This doesn't affect either the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits), or full-history with parent rewriting (because all merges are output). But it does affect other modes. The clearest difference is that --full-history will show more merges - sufficient to ensure that -m -p --full-history log searches can really explain every change to the file, including those changes' ultimate fate in merges. Also modify simplify_merges to recalculate TREESAME after removing a parent. This is achieved by storing per-parent TREESAME flags on the initial scan, so the combined flag can be easily recomputed. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 6 +- revision.c | 241 - revision.h | 1 + t/t6019-rev-list-ancestry-path.sh | 2 +- 4 files changed, 216 insertions(+), 34 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 50bbff7..1dad341 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -409,10 +409,10 @@ parent lines. the example, we get + --- - I A B N D O + I A B N D O P --- + -`P` and `M` were excluded because they are TREESAME to a parent. `E`, +`M` was excluded because it is TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + @@ -440,7 +440,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. Note also that `P` was included despite being TREESAME. +`N`. In addition to the above settings, you can change whether TREESAME affects inclusion: diff --git a/revision.c b/revision.c index a67b615..c88ded8 100644 --- a/revision.c +++ b/revision.c @@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) return retval = 0 (tree_difference == REV_TREE_SAME); } +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; + +static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit) +{ + unsigned n = commit_list_count(commit-parents); + struct treesame_state *st = xcalloc(1, sizeof(*st) + n); + st-nparents = n; + add_decoration(revs-treesame, commit-object, st); + return st; +} + +/* + * Must be called immediately after removing the nth_parent from a commit's + * parent list, if we are maintaining the per-parent treesame[] decoration. + * This does not recalculate the master TREESAME flag - update_treesame() + * should be called to update it after a sequence of treesame[] modifications + * that may have affected it. + */ +static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned nth_parent) +{ + struct treesame_state *st; + int old_same; + + if (!commit-parents) { + /* +* Have just removed the only parent from a non-merge. +* Different handling, as we lack decoration. +*/ + if (nth_parent != 0) + die(compact_treesame %u, nth_parent); + old_same = !!(commit-object.flags TREESAME); + if (rev_same_tree_as_empty(revs, commit)) + commit-object.flags |= TREESAME; + else + commit-object.flags = ~TREESAME; + return old_same; + } + + st = lookup_decoration(revs-treesame, commit-object); + if (!st || nth_parent = st-nparents) + die(compact_treesame %u, nth_parent); + + old_same = st-treesame[nth_parent]; + memmove(st-treesame + nth_parent, + st-treesame + nth_parent + 1, + st-nparents
[PATCH v2 8/8] revision.c: discount UNINTERESTING parents
The simplification and rewriting logic previously paid little heed to whether parents were UNINTERESTING, leading to situations where limited histories could unnecessarily include a lot of irrelevant merges along the boundary. Tighten up the rules to properly account for limited lists: 1) If a merge has INTERESTING parents, and it is TREESAME to them, then do not let UNINTERESTING parents cause the merge to be treated as !TREESAME. If we match our walked parents, we don't care if we don't match unwalked parents. 2) Do not let UNINTERESTING parents prevent commits from being simplified or omitted: merges with exactly 1 INTERESTING parent that they are TREESAME to can be treated as a non-merge commit. 3) When rewriting parents, we don't need to show all merges - only merges with 2 or more INTERESTING parents are required to hold topology together. These changes greatly increase simplification in pruned, limited revision lists - irrelevant merges from unlisted or partially listed side branches can be omitted. These rules paying more attention to UNINTERESTING do add a tricky wrinkle to behaviour. Because limited revision lists are conventionally expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING. Thus its connection to the INTERESTING graph is not privileged over side branches, and this can lead to its first descendant merge being shown for no particularly good reason. See t6019's --ancestry-path G..M -- G.t for an example of this effect. Merges H and J are semantically identical and equally irrelevant, from the point of view of tracking the history of G.t, but H is shown and J isn't. Bottom commit G is marked UNINTERESTING, and thus isn't privileged over E, so H is shown because it differs from E. Whereas higher up the graph, I is INTERESTING and thus privileged over F, so we don't care that J differs from F. So should we treat bottom commits as interesting for the rules above? Signed-off-by: Kevin Bracey ke...@bracey.fi --- revision.c| 141 -- t/t6019-rev-list-ancestry-path.sh | 20 -- 2 files changed, 134 insertions(+), 27 deletions(-) diff --git a/revision.c b/revision.c index 20c7058..aa814bc 100644 --- a/revision.c +++ b/revision.c @@ -333,6 +333,45 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * Compute whether we have only one interesting parent. If we are TREESAME, + * and this returns a parent, then we can safely simplify to that parent. + * + * For 1-parent commits, or if first-parent-only, then this returns the only + * parent (whether UNINTERESTING or not, that is our interesting parent). + * TREESAME will have been set purely on that parent. + * + * For multi-parent commits, identify a sole INTERESTING parent, if any. + * If we have only 1 INTERESTING parent, then TREESAME will be set purely + * with regard to that parent, and we can choose simplification appropriately. + * + * If we have more than one INTERESTING parent, or no INTERESTING parents + * (and multiple UNINTERESTING ones), then we can't choose a parent to follow, + * and we should not be simplified. + */ +struct commit *sole_interesting(struct rev_info *revs, struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *interesting = NULL; + + if (!orig) + return NULL; + + if (revs-first_parent_only || !orig-next) + return orig-item; + + while (list) { + struct commit *commit = list-item; + list = list-next; + if (!(commit-object.flags UNINTERESTING)) { + if (interesting) + return NULL; + interesting = commit; + } + } + return interesting; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -502,27 +541,53 @@ static unsigned update_treesame(struct rev_info *revs, struct commit *commit) if (commit-parents commit-parents-next) { unsigned n; struct treesame_state *st; + struct commit_list *p; + unsigned interesting_parents; + unsigned uninteresting_change, interesting_change; st = lookup_decoration(revs-treesame, commit-object); if (!st) die(update_treesame %s, sha1_to_hex(commit-object.sha1)); - commit-object.flags |= TREESAME; - for (n = 0; n st-nparents; n++) { - if (!st-treesame[n]) { - commit-object.flags = ~TREESAME; - break; + interesting_parents = 0; + uninteresting_change = interesting_change = 0; + for (p = commit-parents, n = 0; p; n++, p = p-next
[PATCH v2 2/8] t6019: test file dropped in -s ours merge
In preparation for upcoming TREESAME work, check the result for G.t, which is dropped in -s ours merge L. The default rev-list is empty, as expected - it follows the first parent path where it never existed. Unfortunately, --ancestry-path is also empty. Merges H J and L are all TREESAME to 1 parent, so are treated as TREESAME and not shown. This is clearly undesirable in the case of merge L, which dropped our G.t by taking the non-ancestry-path version. Document this as a known failure, and expect H J L, the 3 merges along the path that had to choose between G.t versions. Signed-off-by: Kevin Bracey ke...@bracey.fi --- t/t6019-rev-list-ancestry-path.sh | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index 39b4cb0..86ef908 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -13,6 +13,9 @@ test_description='--ancestry-path' # # D..M -- M.t == M # --ancestry-path D..M -- M.t == M +# +# G..M -- G.t == [nothing - was dropped in -s ours merge L] +# --ancestry-path G..M -- G.t == H J L . ./test-lib.sh @@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' ' test_cmp expect actual ' -test_expect_success 'rev-list --ancestry-patch D..M -- M.t' ' +test_expect_success 'rev-list --ancestry-path D..M -- M.t' ' echo M expect git rev-list --ancestry-path --format=%s D..M -- M.t | sed -e /^commit /d actual test_cmp expect actual ' +# G.t is dropped in an -s ours merge +test_expect_success 'rev-list G..M -- G.t' ' + expect + git rev-list --format=%s G..M -- G.t | + sed -e /^commit /d actual + test_cmp expect actual +' + +test_expect_failure 'rev-list --ancestry-path G..M -- G.t' ' + for c in H J L; do echo $c; done expect + git rev-list --ancestry-path --format=%s G..M -- G.t | + sed -e /^commit /d | + sort actual + test_cmp expect actual +' + # b---bc # / \ / # a X -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges
On 28/04/2013 21:38, Junio C Hamano wrote: @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li * NEEDSWORK: decide if we want to remove parents that are * not marked with TMP_MARK from commit-parents for commits * in the resulting list. We may not want to do that, though. +* +* Maybe it should be considered if we are TREESAME to such +* parents - now possible with stored per-parent flags. */ Hmm, that is certainly a thought. My comment's wrong though. Reconsidering, what I think needs removing is actually off-ancestry parents that we are !TREESAME to, when we are TREESAME on the ancestry path. I thought I read you meant exactly that, i.e. !TREESAME, but now I re-read what is quoted, you did say we are TREESAME ;-). I think I agree with you that we do not want any side branch that is not on the ancestry path we are interested in to affect the sameness assigned to the merge commit. I did a trial implementation of this in limit_to_ancestry(), and the result was lovely, but in the end I decided it's not actually the right place to do it. The logic is more general than that; this isn't just an ancestry-path issue, and I think hiding parents isn't the right way to go about it anyway. To slightly generalise your own wording: I think the rule is we do not want any side branch that is UNINTERESTING to affect the sameness assigned to the merge commit. I think that rule applies to all dense, pruned modes. Having experimented with some of the annoyingly complex merge paths that originally prompted this series, it seems this rule makes a huge difference, and it's useful whether asking --simplify-merges A..B file or --ancestry-path A..B file. At present, either query will show lots of really boring merge commits of topic branches at the boundary, with 1 INTERESTING parent that they're TREESAME too, and 1 UNINTERESTING parent that they may or may not be TREESAME to, depending on how old the base of that topic branch was. Most such commits are of no relevance to our history whatsoever. In the case of --simplify-merges, the fact that they're UNINTERESTING actually _prevented_ their simplification - if it had been allowed to follow the UNINTERESTING path back further, it would have reached an ancestor, and been found redundant. So limiting the rev-list actually increases the number of merges shown. We can lose all those boring commits with these two changes: 1) Previously TREESAME was defined as this commit matches at least 1 parent. My first patch changes it to this commit matches all parents. It should be refined further to this commit matches all INTERESTING parents, if it has any, else all (UNINTERESTING) parents. (Can we word that better?) Note that this fancy rule collapses to the same straightforward TREESAME check as ever for 0- or 1-parent commits. 2) simplify_merges currently will not simplify commits unless they have exactly 1 parent. That's not what we want. We only need to preserve commits that don't have exactly 1 INTERESTING parent. Those 2 rules produce the desirable result: if we have a merge commit with exactly 1 INTERESTING parent it is TREESAME to, it is always simplified away - any other UNINTERESTING parents it may have did not affect our code, so we don't care about whether we were TREESAME to them or not, and as we don't want to see any of the UNINTERESTING parents themselves, the merge is not worth showing. This makes a massive difference on some of my searches, reducing the total commits shown by a factor of 5 to 10, greatly improving the signal-to-noise ratio. I'll put together a trial patch at the end of the next iteration of the series that implements this logic. I need to think a bit more - I think get_commit_action needs a similar INTERESTING check for merges too, to get the same sort of effect without relying on simplify_merges. Parent rewriting shouldn't necessitate keeping all merges - only merges with 2+ INTERESTING parents. * * .-A---M---N---O---P /*/ /* /* /* I B C D E \ /* / /* / `-' I've added '*' next to each arc between a commit-pair whose contents at 'foo' are different to the illustration, following the set-up the manual describes. E is the same as I for 'foo' and P would resolve 'foo' to be the same as O. I think that sort of thing could be a useful patch to the docs. Given this error, and this change, I think this example may want a slight rethink. Do we want a proper messing with other paths but TREESAME merge example? Say if E's parent was O, P would not be TREESAME and not included in --full-history. I am not sure if I follow your last sentence. Do you mean this topology, where E's sole parent is O, i.e. E / \ N---O---P /*
Re: [RFC/PATCH 2/3] simplify-merges: never remove all TREESAME parents
On 28/04/2013 02:02, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: In the event of an odd merge, we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent - in the event of such a merge it's useful to see where we came actually from came. came from :) @@ -2106,8 +2106,32 @@ static int remove_marked_parents(struct rev_info *revs, struct commit *commit) { struct treesame_state *ts = lookup_decoration(revs-treesame, commit-object); struct commit_list **pp, *p; + struct commit *su = NULL, *sm = NULL; What do su and sm stand for? same, unmarked and same, marked were what was in my head. Could you explain here a bit more the reason why we do not want to remove them and why -s ours is so significant that it deserves to be singled out? And why randomly picking one that is redundant (because it is an ancestor of some other parent) is an improvement? I feel it's consistent with the default non-full-history behaviour. The parent that we choose not to remove is the same one that the default log with simplify_history==1 would have followed: the first parent we are TREESAME to. Or at least that's the intent. So this parent would normally be singled out, and it's not an arbitrary (or random) choice. It feels wrong to me that --full-history --simplify-merges could produce a disjoint history from the default. I think it should include the default simple history. If this code triggered, it should produce something quite distinctive in a graphical view, which will help find odd/broken merges. The remove-redundant logic first marks commits that are ancestors of other commits in the original set, without taking treesame[] into account at all. If the final objective of the code is to keep paths that consists of non-treesame[] commits, perhaps the logic needs to be changed to reject non-treesame[] commits that are ancestors of other non-treesame[] commits, or something? Maybe - the main simplify_merges machinery could certainly make use of the treesame[] information, and one of the motivations for introducing it was to open up the possibility of richer analysis. I'll spend a little time to think about whether we want a more fundamental change to the original scan. But this patch as it stands was an easy change to make with clearly limited scope and relatively little risk - I specifically wanted just to include our default simple parent. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges
On 28/04/2013 01:36, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: Historically TREESAME was set on a commit if it was TREESAME to _any_ of its parents. This is not optimal, as such a merge could still be worth showing, particularly if it is an odd -s ours merge that (possibly accidentally) dropped a change. ... and with options like --full-history or --simplify-merges are used to get more complete history, I think. git log path without these options is a tool to get one version of simplified history that explains the end result, and by definition, the side branch merged by -s ours did _not_ contribute anything to the end result. Yeah, I'm not happy with this commit message - I knocked it up separately from my first pass, which I didn't have to hand. Next version will combine it with the original, which better distinguished the default mode, and specifically addressed the --full-history -S search problem. That's key - that I really want such searches to be able to track the entire life of a change on a side branch, not potentially showing just its birth as now, but also always including any ultimate merge death. (I think that we may be able to refine --ancestry-path to give an even tighter pinpoint, but --full-history should definitely include the information, as per its name). Do we want to discard the decoration data when the commit becomes a non-merge? Would seem reasonable, and would also help make concrete why we update TREESAME immediately, and not in update_treesame(), but I didn't spot a mechanism to discard decoration. I'll recheck. + commit-object.flags |= TREESAME; + for (n = 0; n st-nparents; n++) { + if (!st-treesame[n]) { + commit-object.flags = ~TREESAME; + break; + } + } Can a commit that earlier was marked as TREESAME become not TREESAME? Wouldn't simplification only increase sameness, never decrease? That's true - I paid attention to that earlier when it really mattered due to the cost of recalculating it with try_to_simplify_commit(). Not sure that it matters so much any more, and I don't see how we can use that information to change this scan for !treesame loop. I could insert an if (!commit-object.flags TREESAME) test to skip the entire update. I'd be inclined to do that as the caller of update_treesame(). I think update_treesame() itself should be general-purpose without assumptions about what changes have been made, so it's a pure treesame[]-TREESAME calculation, without TREESAME as an input. (Aside - just occurred to me we could swap the loop for strlen(st-treesame) == st-nparents, if we kept a zero terminator in the array. Maybe a bit too smart-ass?) + for (pp = commit-parents; +(parent = *pp) != NULL; +pp = parent-next, nth_parent++) { I see the reason to change from while to for is because you wanted to count, and I think it makes sense; but it is more readable to initialise the counter here, too, if that is the case. I.e. for (pp = commit-parents, nth_parent = 0; !(parent = *pp); pp = parent-next, nth_parent++) { Agree on nth_parent, but !(parent = *pp) isn't (parent = *pp) != NULL, mind. Did you mean !!? In which case I still prefer it my way. + if (!tree_changed) + ts-treesame[0] = 1; Have we made any two tree comparison at this point to set this one? Ahh, this is tricky. You do this in the _second_ iteration of the loop, so tree_changed here is from inspecting the first parent, not the one we are looking at (i.e. *p). Yes, this is the we've reached our second iteration, so from now on we're dealing a merge if {} block. I'll clarify this in the comment at the top, and note that we're populating the newly-allocated treesame[] from our first iteration. @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li * NEEDSWORK: decide if we want to remove parents that are * not marked with TMP_MARK from commit-parents for commits * in the resulting list. We may not want to do that, though. +* +* Maybe it should be considered if we are TREESAME to such +* parents - now possible with stored per-parent flags. */ Hmm, that is certainly a thought. My comment's wrong though. Reconsidering, what I think needs removing is actually off-ancestry parents that we are !TREESAME to, when we are TREESAME on the ancestry path. I've realised while testing this that there's been one thing that's confused me repeatedly, and I think this comment was an example of it. The example in the rev-list-options manual is wrong. .-A---M---N---O---P / / / / / I B C D E
Re: [RFC/PATCH] Make --full-history consider more merges
On 25/04/2013 21:19, Junio C Hamano wrote: How many decorations are we talking about here? One for each merge commit in the entire history? Do we have a cue that can tell us when we are really done with a commit that lets us discard the associated data as we go on digging, keeping the size of our working set somewhat bounded, perhaps proportional to the number of commits in our rev_info-commits queue? As it stands, it's one decoration per interesting merge commit that's processed by try_to_simplify_commit(), if simplify_merges is set. (Only simplify_merges needs the information at present - conceivably limit_to_ancestry could use it too). I don't know how to go about discarding the data, or when it could be done - I'm not clear enough on the wider sequencing machinery in revision.c yet. I have a first draft of a set of patches, which will follow this message. 1/3 addresses the initial get simplify_merges right, passing your test. That patch is written to make it easy to extend with the next two, which add a never eliminate all our TREESAME parents rule, and reinstate your revertedeliminate irrelevant side branches. The error I spotted in that was that you weren't checking that you were eliminating root commits - with that fixed it seems sound to me. I need to create a new test for patch 2, but this version passes the full test suite, including fixing the known failure in t6012, and it also gets the examples in the manual right. (I've extended the examples to include the irrelevant side branch - not sure this is worthwhile, as it's such an unusual case, and I think that section is confusing enough for newbies...) Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges
Historically TREESAME was set on a commit if it was TREESAME to _any_ of its parents. This is not optimal, as such a merge could still be worth showing, particularly if it is an odd -s ours merge that (possibly accidentally) dropped a change. And the problem is further compounded by the fact that simplify_merges could drop the actual parent that a commit was TREESAME to, leaving it as a normal commit marked TREESAME that isn't actually TREESAME to its remaining parent. This commit redefines TREESAME to be true only if a commit is TREESAME to _all_ of its parents. This doesn't affect either the default simplify_history behaviour (because partially TREESAME merges are turned into normal commits), or full-history with parent rewriting (because all merges are output). But it does affect other modes. It also modifies simplify_merges to recalculate TREESAME after removing a parent. This is achieved by storing per-parent TREESAME flags on the initial scan, so the combined flag can be easily recomputed. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 2 +- revision.c | 220 - revision.h | 1 + 3 files changed, 192 insertions(+), 31 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 3bdbf5e..380db48 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -413,7 +413,7 @@ parent lines. I A B N D O --- + -`P` and `M` were excluded because they are TREESAME to a parent. `E`, +`P` and `M` were excluded because they are TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + diff --git a/revision.c b/revision.c index a67b615..176eb7b 100644 --- a/revision.c +++ b/revision.c @@ -429,10 +429,85 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) return retval = 0 (tree_difference == REV_TREE_SAME); } +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; + +static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit) +{ + unsigned n = commit_list_count(commit-parents); + struct treesame_state *st = xcalloc(1, sizeof(*st) + n); + st-nparents = n; + add_decoration(revs-treesame, commit-object, st); + return st; +} + +static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned parent) +{ + struct treesame_state *st = lookup_decoration(revs-treesame, commit-object); + int old_same; + + if (!st || parent = st-nparents) + die(compact_treesame %u, parent); + + /* Can be useful to indicate sameness of removed parent */ + old_same = st-treesame[parent]; + + memmove(st-treesame + parent, + st-treesame + parent + 1, + st-nparents - parent - 1); + + /* If we've just become a non-merge commit, update TREESAME +* immediately, in case we trigger an early-exit optimisation. +* (Note that there may be a mismatch between commit-parents and the +* treesame_state at this stage, as we may be in mid-rewrite). +* If still a merge, defer update until update_treesame(). +*/ + switch (--st-nparents) { + case 0: + if (rev_same_tree_as_empty(revs, commit)) + commit-object.flags |= TREESAME; + else + commit-object.flags = ~TREESAME; + break; + case 1: + if (st-treesame[0] revs-dense) + commit-object.flags |= TREESAME; + else + commit-object.flags = ~TREESAME; + break; + } + + return old_same; +} + +static unsigned update_treesame(struct rev_info *revs, struct commit *commit) +{ + /* If 0 or 1 parents, TREESAME should be valid already */ + if (commit-parents commit-parents-next) { + int n = 0; + struct treesame_state *st; + + st = lookup_decoration(revs-treesame, commit-object); + if (!st) die(update_treesame); + commit-object.flags |= TREESAME; + for (n = 0; n st-nparents; n++) { + if (!st-treesame[n]) { + commit-object.flags = ~TREESAME; + break; + } + } + } + + return commit-object.flags TREESAME; +} + static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *parent; - int tree_changed = 0, tree_same = 0, nth_parent = 0; + struct treesame_state *ts = NULL; + int tree_changed = 0
[RFC/PATCH 2/3] simplify-merges: never remove all TREESAME parents
In the event of an odd merge, we may find ourselves TREESAME to apparently redundant parents. Prevent simplify_merges() from removing every TREESAME parent - in the event of such a merge it's useful to see where we came actually from came. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 3 ++- revision.c | 24 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 380db48..0832004 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -472,7 +472,8 @@ history according to the following rules: + * Replace each parent `P` of `C'` with its simplification `P'`. In the process, drop parents that are ancestors of other parents, and - remove duplicates. + remove duplicates, but take care to never drop all parents that + we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME, it remains. diff --git a/revision.c b/revision.c index 176eb7b..4e27c9a 100644 --- a/revision.c +++ b/revision.c @@ -2106,8 +2106,32 @@ static int remove_marked_parents(struct rev_info *revs, struct commit *commit) { struct treesame_state *ts = lookup_decoration(revs-treesame, commit-object); struct commit_list **pp, *p; + struct commit *su = NULL, *sm = NULL; int n, removed = 0; + /* Prescan - look for both marked and unmarked TREESAME parents */ + for (p = commit-parents, n = 0; p; p = p-next, n++) { + if (ts-treesame[n]) { + if (p-item-object.flags TMP_MARK) { + if (!sm) sm = p-item; + } + else { + if (!su) { + su = p-item; + break; + } + } + } + } + + /* If we are TREESAME to a marked-for-deletion parent, but not to any +* unmarked parents, unmark the first TREESAME parent. We don't want +* to remove our real parent in the event of an -s ours type +* merge. +*/ + if (!su sm) + sm-object.flags = ~TMP_MARK; + pp = commit-parents; n = 0; while ((p = *pp) != NULL) { -- 1.8.2.1.632.gd2b1879 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH 3/3] simplify-merges: drop merge from irrelevant side branch
Reimplement commit 4b7f53da on top of the new simplify-merges infrastructure, tightening the condition to only consider root parents; the original version incorrectly dropped parents that were TREESAME to anything. Original log message follows. The merge simplification rule stated in 6546b59 (revision traversal: show full history with merge simplification, 2008-07-31) still treated merge commits too specially. Namely, in a history with this shape: ---o---o---M / x---x---x where three 'x' were on a history completely unrelated to the main history 'o' and do not touch any of the paths we are following, we still said that after simplifying all of the parents of M, 'x' (which is the leftmost 'x' that rightmost 'x simplifies down to) and 'o' (which would be the last commit on the main history that touches the paths we are following) are independent from each other, and both need to be kept. That is incorrect; when the side branch 'x' never touches the paths, it should be removed to allow M to simplify down to the last commit on the main history that touches the paths. Suggested-by: Junio C Hamano gits...@pobox.com Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/rev-list-options.txt | 35 ++- revision.c | 26 +- t/t6012-rev-list-simplify.sh | 2 +- 3 files changed, 48 insertions(+), 15 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 0832004..db45401 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to illustrate the differences between simplification settings. We assume that you are filtering for a file `foo` in this commit graph: --- - .-A---M---N---O---P -/ / / / / - I B C D E -\ / / / / - `-' + .-A---M---N---O---P---Q +/ / / / / / + I B C D E Y +\ / / / / / + `-' X --- -The horizontal line of history A---P is taken to be the first parent of +The horizontal line of history A---Q is taken to be the first parent of each merge. The commits are: * `I` is the initial commit, in which `foo` exists with contents @@ -370,6 +370,10 @@ each merge. The commits are: strings to quux xyzzy. Despite appearing interesting, `P` is TREESAME to all parents. +* `X` is an indpendent root commit that added a new file `side`, and `Y` + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and + `Q` is TREESAME to all parents. + 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting (via '\--parents' or '\--children') are used. The following settings @@ -413,7 +417,7 @@ parent lines. I A B N D O --- + -`P` and `M` were excluded because they are TREESAME to both parents. `E`, +`Q`, `P` and `M` were excluded because they are TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + @@ -431,7 +435,7 @@ Along each parent, prune away commits that are not included themselves. This results in + --- - .-A---M---N---O---P + .-A---M---N---O---P---Q / / / / / I B / D / \ / / / / @@ -441,7 +445,8 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. Note also that `P` was included despite being TREESAME. +`N`, and `X`, `Y` and `Q`. Note also that `P` and `Q` were included despite +being TREESAME. In addition to the above settings, you can change whether TREESAME affects inclusion: @@ -471,9 +476,9 @@ history according to the following rules: * Set `C'` to `C`. + * Replace each parent `P` of `C'` with its simplification `P'`. In - the process, drop parents that are ancestors of other parents, and - remove duplicates, but take care to never drop all parents that - we are TREESAME to. + the process, drop parents that are ancestors of other parents or that are + root commits TREESAME to an empty tree, and remove duplicates, but take care + to never drop all parents that we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or 1 parents), a boundary commit, or !TREESAME
Re: [RFC/PATCH] Make --full-history consider more merges
On 25/04/2013 04:59, Junio C Hamano wrote: Junio C Hamano gits...@pobox.com writes: So, given all that, revised patch below: I tried to squeeze the minimum test I sent $gmane/220919 to the test suite. I think the do not use --parents option for this test switch needs to be cleaned up a bit more, but it fails without your patch and does pass with your patch. I somehow was hoping that your fix to TREESAME semantics would also correct the known breakage documented in that test, but it seems that I was too greedy ;-) Thanks for the test addition. Maybe we will be able to satisfy your greed in this series. There could be more worth doing here, and I think getting TREESAME precise is key. I think I do want to take the step of storing treesame per parent. And once we do that, as well as avoiding the expensive re-diff, we have much richer information readily available as a simplification input (and output). I'm working on a patch that does this - filling in an initial treesame[] array as a decoration in try_to_simplify_commit() is easy, and maintaining the array through later parent rewrites isn't as onerous as I feared - there are only a few places that rewrite parents after the initial scan. With a couple of helper functions to do things like delete nth, I think it'll be quite tidy. I believe that simplify_merges itself needs at least one addition, and could use the treesame[] array to do it: if after doing reduce_heads, a commit is now different to all remaining parents, but there was a TREESAME parent eliminated, that parent should be reinstated. That would clearly highlight missed merges, showing both that older TREESAME parent and the newer !TREESAME parent that would have been taken in a normal merge. And maybe there's more simplify_merges could do, if it had this full TREESAME information available. (But even after you do all this stuff to get the right commits out, we then hit a niggle of mine that gitk forces --cc diffs - even if it shows shows the offending merge commit, you can't get it to do a diff...) Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] Make --full-history consider more merges
On 25/04/2013 19:51, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: Thanks for the test addition. Maybe we will be able to satisfy your greed in this series. There could be more worth doing here, and I think getting TREESAME precise is key. It is perfectly fine to do things one step at a time. Let's get the --full-history change into a good shape first and worry about the more complex case after we are done. So do you see the rerun of try_to_simplify_commit() as acceptable? I'm really not happy with it - it was fine for an initial proof-of-concept, but it's an obvious waste of CPU cycles to recompute something we already figured out, and I'm uncomfortable with the fact that the function potentially does more than just compute TREESAME; by inspection it seems safe given the known context inside simplify_merges(), but it feels like something waiting to trip us up. The latter could be dealt with by breaking try_to_simplify_commit() up, but that feels like a diversion. I'd rather just get on and make this first patch store and use the per-parent-treesame flags if feasible. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] Make --full-history consider more merges
On 22/04/2013 22:49, Junio C Hamano wrote: So in addition to have some change and there is no same parent case, under _some_ condition we avoid marking a merge not worth showing (i.e. TREESAME) if there is any change. And the condition is !simplify_history and !simplify_merges, which would cover --full-history, but I am not sure if requiring !simplify_merges is correct. You're right, it isn't correct. Logically, the test should be for simplify_history. The original patch was overly cautious about limiting it to --full-history. And it avoided breaking a test - specifically t6016.7. simplify_merges() should just be post- processing full history with parents as you say, not changing its behaviour. I've dug into this more, and am now fairly confident about my analysis. I believe the problem arises because TREESAME's definition was conflated with the default simplification behaviour, unnecessarily. Merges were defined as TREESAME if they matched ANY parent. That's fine if simplify_history is set, but then in that case a merge which is TREESAME to any parent is turned into a normal commit based on that parent anyway, and that simplified commit is inherently TREESAME, regardless of how we define TREESAME for merges. If simplify_history isn't set, and we don't want ancestry, then that TREESAME definition is problematic. It avoids showing merges that don't have anything actually new, but does lead to this issue that git log -Sxxx --full-history can't locate a dropped change. If simplify_history is set, and we do want ancestry, then it doesn't matter about the TREESAME definition because it shows all merges, regardless of the TREESAME flag. Thus adding --parents to the above command means it can find it, but only because it drags _every_ merge into consideration. Should that be necessary? Futher, if we add simplify_merges, then TREESAME becomes a problem. After merge simplification, we can be left with a single-parent commit that doesn't match its parent but is skipped due to its TREESAME flag being set: it was TREESAME to a parent that got dropped. I think the correction is that TREESAME for a commit needs to be redefined to be this commit matches all its (remaining) parents, rather than this commit matches at least 1 of its (remaining) parents. And when we eliminate parents, we have to check for the flag changing, but we should have been doing that anyway. Now, what effect does post-processing have on TREESAME? Parent rewriting due to elimination of duplicates, and making dense has no effect, as there we're always replacing a commit with something it was TREESAME to, so that doesn't affect TREESAMEness of its children. But reduce_heads() in simplify_merges() can have an effect. In the example shown in the comment, X may or may not be TREESAME to its remaining single parent. And test 6016.7 covers one instance of that - reduce_heads() eliminates branches B and C, because they didn't touch bar.txt, and the rewritten parents were then ancestors of the A path. But that that drop ancestor paths logic totally ignores TREESAMEness, and we don't know whether the merge kept the bar.txt from A, or eliminated it by an -s ours take of B. In the 6016.7 case, the merge was originally marked TREESAME, as it matched A. When B and C were eliminated, it remained TREESAME, and this wasn't recalculated. And that happened to be correct in this case, because the merge was normal. If on the other hand, the merge had taken the old version B with -s ours, or if there had been some other semantic change arising in the merge, then simplify_merges would still have eliminated B and C, and left a normal commit differing from parent A, but still marked TREESAME from having matched B. So that would have wrongly hidden the merge. Recalculation is already necessary to spot the odd case. If we redefine TREESAME, then recalculation becomes necessary to handle the normal case - the merge will not initially be TREESAME, as it differs from B and C. But once B and C are eliminated, we should then determine that the commit is TREESAME from matching its remaining parent A. Simplification should only ever increase the sameness, so we can avoid recalculation if TREESAME is already set. Without this recalculation, 6016.7 fails. So I believe that if reduce_heads() does eliminate any parents, and TREESAME wasn't set, then we have to recalculate TREESAME, by running over the remaining parents. A call to try_to_simplify_commit() does the job, but it feels hacky, and obviously it's slow. It would be nice if we could remember per-parent TREESAME flags, and shuffle them when we change parents, but that could be fiddly. Also if limit_to_ancestry() did eliminate parents from outside the ancestry, as per the NEEDSWORK comment, then that would also want to do the TREESAME recalculation. (And it could conceivably be very useful, helping pick up only the merges that went against your base commit). So, given all that, revised patch below: ---
[PATCH] cherry-pick/revert: make usage say 'commit-ish...'
The usage string for cherry-pick and revert has never been updated to reflect their ability to handle multiple commits. Other documentation is already correct. Signed-off-by: Kevin Bracey ke...@bracey.fi --- builtin/revert.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/builtin/revert.c b/builtin/revert.c index c5e36b9..0401fdb 100644 --- a/builtin/revert.c +++ b/builtin/revert.c @@ -19,13 +19,13 @@ */ static const char * const revert_usage[] = { - N_(git revert [options] commit-ish), + N_(git revert [options] commit-ish...), N_(git revert subcommand), NULL }; static const char * const cherry_pick_usage[] = { - N_(git cherry-pick [options] commit-ish), + N_(git cherry-pick [options] commit-ish...), N_(git cherry-pick subcommand), NULL }; -- 1.8.2.255.g39c5835 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH] Make --full-history consider more merges
History simplification previously always treated merges as TREESAME if they were TREESAME to any parent. While the desired default behaviour, this could be extremely unhelpful when searching detailed history, and could not be overridden. For example, if a merge had ignored a change, as if by -s ours, then: git log -m -p --full-history -Schange file would successfully locate change's addition but would not locate the merge that resolved against it. This patch changes the simplification so that when --full-history is specified, a merge is treated as TREESAME only if it is TREESAME to _all_ parents. This means the command above locates a merge that dropped change. Signed-off-by: Kevin Bracey ke...@bracey.fi --- This would address my problem case - it passes existing tests, and covers my (all-too-common) problem. But it would also need documentation changes and a new test. revision.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/revision.c b/revision.c index eb98128..96fe3f5 100644 --- a/revision.c +++ b/revision.c @@ -516,8 +516,14 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) } die(bad tree compare for commit %s, sha1_to_hex(commit-object.sha1)); } - if (tree_changed !tree_same) - return; + + if (tree_changed) { + if (!tree_same) + return; + + if (!revs-simplify_history !revs-simplify_merges) + return; + } commit-object.flags |= TREESAME; } -- 1.8.2.255.g39c5835 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Locating merge that dropped a change
On 09/04/2013 21:00, Kevin Bracey wrote: So, how to automatically find a merge that ignored a known change? I think I've found the problem. It only doesn't work _if you specify the file_. Specifically, if I was missing an addition, my first attempt to find it would be git log -p -m -Saddition file If the addition was lost in a merge, that doesn't even show the addition, which is surprising, but intentional. The addition isn't part of the HEAD version of file, so no point going down that path of the merge. Fine. However, I expected this to work: git log --full-history -p -m -Saddition file But it doesn't. It finds the addition, but _not_ the loss in the merge commit. But this does work: git log -p -m -Saddition That really feels like a bug to me. By specifying a file, I've made it miss the change, and I can see no way to get the change without making it a full-tree operation. Looking at try_to_simplify_commit() it appears the merge that removed the addition is excluded because it's TREESAME to its _other_ parent. But with --full-history, we should only be eliminating a merge if its TREESAME to all parents, surely? Otherwise we have this case that we show a commit but not its reversion. And the code doing this looks broken, or at least illogical - the parent loop sets tree_same for a same parent in --full-history, rather than immediately setting the TREESAME flag, so maybe previous authors were thinking about this. But setting tree_same guarantees that TREESAME is always set on exit anyway, so it's pointless (unless I'm missing something). It does appear this is documented behaviour in the manual: full-history without parent rewriting .. P and M were excluded because they are TREESAME to _a_ parent. I would say that they should have been excluded due to being TREESAME to _all_ parents. I really don't want a merge where one version of my file was chosen over another excluded from its so-called full history. Now, obviously in a sane environment, most merges won't be that interesting, as they'll just be propagating wanted patches from the real commits already in the log. But I'd like some way to find merges that drop code in a specified file, and surely --full-history is it? Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Locating merge that dropped a change
This morning, I was struggling (not for the first time) to produce a Git command that would identify a merge commit that dropped a change. I could see where it was added, but couldn't automate finding out why it wasn't any longer in HEAD. All the permutations of --full-history, -m, -S, -G on git log I could think of did not get me anywhere. As long as I had --full-history, they could find the original commit that had added the change, but not the merge commit that had dropped it by taking the other parent. So, how to automatically find a merge that ignored a known change? And then for visualisation purposes, how do you persuade gitk's diff display to actually show that that merge commit removed the change from one of its parents? Again, -m didn't seem to work. Help appreciated! Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] checkout: abbreviate hash in suggest_reattach
After printing the list of left-behind commits (with abbreviated hashes), use an abbreviated hash in the suggested 'git branch' command; there's no point in outputting a full 40-character hex string in some friendly advice. Signed-off-by: Kevin Bracey ke...@bracey.fi --- builtin/checkout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index a9c1b5a..e168bfb 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -700,7 +700,7 @@ static void suggest_reattach(struct commit *commit, struct rev_info *revs) If you want to keep them by creating a new branch, this may be a good time\nto do so with:\n\n git branch new_branch_name %s\n\n), - sha1_to_hex(commit-object.sha1)); + find_unique_abbrev(commit-object.sha1, DEFAULT_ABBREV)); } /* -- 1.8.2.445.gfcda34b -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
--simplify-merges breakage
Commit 4b7f53d (simplify-merges: drop merge from irrelevant side branch) appears to have broken simplify-merges. [Credit to git bisect run - and now I see the report problems immediately message on the recent merge of that change.] I was trying to get my head around history simplification, by creating a repo with the example shown in the manual.The result of gitk --simplify-merges foo was far from pretty. With the example in the manual, you should get this: .-A---M---N---O / / / I B D \ / / `-' After 4b7f53 you instead get: .-A---M N---O / / / / I B / D \ / / / `-' Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 0/3] git-merge-one-file error reporting
Style clean up, as requested, followed by the fix to the both sides added handling for git-merge-one-file. This is based on v4 of my p4merge series, as they touch the same area. Kevin Bracey (3): git-merge-one-file: style cleanup git-merge-one-file: send ERROR: messages to stderr git-merge-one-file: revise merge error reporting git-merge-one-file.sh | 50 +- 1 file changed, 25 insertions(+), 25 deletions(-) -- 1.8.2.rc3.21.g744ac65 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 1/2] mergetools/p4merge: swap LOCAL and REMOTE
Reverse LOCAL and REMOTE when invoking P4Merge as a mergetool, so that the incoming branch is now in the left-hand, blue triangle pane, and the current branch is in the right-hand, green circle pane. This change makes use of P4Merge consistent with its built-in help, its reference documentation, and Perforce itself. But most importantly, it makes merge results clearer. P4Merge is not totally symmetrical between left and right; despite changing a few text labels from theirs/ours to left/right when invoked manually, it still retains its original Perforce theirs/ours viewpoint. Most obviously, in the result pane P4Merge shows changes that are common to both branches in green. This is on the basis of the current branch being green, as it is when invoked from Perforce; it means that lines in the result are blue if and only if they are being changed by the merge, making the resulting diff clearer. Note that P4Merge now shows ours on the right for both diff and merge, unlike other diff/mergetools, which always have REMOTE on the right. But observe that REMOTE is the working tree (ie ours) for a diff, while it's another branch (ie theirs) for a merge. Ours and theirs are reversed for a rebase - see git help rebase. However, this does produce the desired show the results of this commit effect in P4Merge - changes that remain in the rebased commit (in your branch, but not in the new base) appear in blue; changes that do not appear in the rebased commit (from the new base, or common to both) are in green. If Perforce had rebase, they'd probably not swap ours/theirs, but make P4Merge show common changes in blue, picking out our changes in green. We can't do that, so this is next best. Signed-off-by: Kevin Bracey ke...@bracey.fi Reviewed-by: David Aguilar dav...@gmail.com --- No change to part 1/2 from previous version, apart from Reviewed-by. Part 2/2 is modified. mergetools/p4merge | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mergetools/p4merge b/mergetools/p4merge index 8a36916..46b3a5a 100644 --- a/mergetools/p4merge +++ b/mergetools/p4merge @@ -22,7 +22,7 @@ diff_cmd () { merge_cmd () { touch $BACKUP $base_present || $BASE - $merge_tool_path $BASE $LOCAL $REMOTE $MERGED + $merge_tool_path $BASE $REMOTE $LOCAL $MERGED check_unchanged } -- 1.8.2.rc3.21.g744ac65 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 3/3] git-merge-one-file: revise merge error reporting
Commit 718135e improved the merge error reporting for the resolve strategy's merge conflict and permission conflict cases, but led to a malformed ERROR: in myfile.c message in the case of a file added differently. This commit reverts that change, and uses an alternative approach without this flaw. Signed-off-by: Kevin Bracey ke...@bracey.fi --- git-merge-one-file.sh | 22 +++--- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index 39b7799..e231d20 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -107,10 +107,12 @@ case ${1:-.}${2:-.}${3:-.} in ;; esac + ret=0 src2=$(git-unpack-file $3) case $1 in '') - echo Added $4 in both, but differently. + echo ERROR: Added $4 in both, but differently. 2 + ret=1 orig=$(git-unpack-file $2) create_virtual_base $orig $src2 ;; @@ -124,11 +126,10 @@ case ${1:-.}${2:-.}${3:-.} in # would confuse merge greatly. src1=$(git-unpack-file $2) git merge-file $src1 $orig $src2 - ret=$? - msg= - if test $ret != 0 + if test $? != 0 then - msg='content conflict' + echo ERROR: Content conflict in $4 2 + ret=1 fi # Create the working tree file, using our tree version from the @@ -138,21 +139,12 @@ case ${1:-.}${2:-.}${3:-.} in if test $6 != $7 then - if test -n $msg - then - msg=$msg, - fi - msg=${msg}permissions conflict: $5-$6,$7 - ret=1 - fi - if test -z $1 - then + echo ERROR: Permissions conflict: $5-$6,$7 2 ret=1 fi if test $ret != 0 then - echo ERROR: $msg in $4 2 exit 1 fi exec git update-index -- $4 -- 1.8.2.rc3.21.g744ac65 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4 2/2] mergetools/p4merge: create a base if none available
Originally, with no base, Git gave P4Merge $LOCAL as a dummy base: p4merge $LOCAL $LOCAL $REMOTE $MERGED Commit 0a0ec7bd changed this to: p4merge empty file $LOCAL $REMOTE $MERGED to avoid the problem of being unable to save in some circumstances with similar inputs. Unfortunately this approach produces much worse results on differing inputs. P4Merge really regards the blank file as the base, and once you have just a couple of differences between the two branches you end up with one a massive full-file conflict. The 3-way diff is not readable, and you have to invoke difftool MERGE_HEAD HEAD manually to get a useful view. The original approach appears to have invoked special 2-way merge behaviour in P4Merge that occurs only if the base filename is or equal to the left input. You get a good visual comparison, and it does not auto-resolve differences. (Normally if one branch matched the base, it would autoresolve to the other branch). But there appears to be no way of getting this 2-way behaviour and being able to reliably save. Having base==left appears to be triggering other assumptions. There are tricks the user can use to force the save icon on, but it's not intuitive. So we now follow a suggestion given in the original patch's discussion: generate a virtual base, consisting of the lines common to the two branches. This is the same as the technique used in resolve and octopus merges, so we relocate that code to a shared function. Note that if there are no differences at the same location, this technique can lead to automatic resolution without conflict, combining everything from the 2 files. As with the other merges using this technique, we assume the user will inspect the result before saving. Signed-off-by: Kevin Bracey ke...@bracey.fi Reviewed-by: David Aguilar dav...@gmail.com --- Minor change from v3: that version moved initialisation of src1 higher up, detaching it from its associated comment. This move was only required by earlier versions, so v4 leaves src1 in its original position. Added Reviewed-by footer. Documentation/git-sh-setup.txt | 6 ++ git-merge-one-file.sh | 18 +- git-sh-setup.sh| 12 mergetools/p4merge | 6 +- 4 files changed, 28 insertions(+), 14 deletions(-) diff --git a/Documentation/git-sh-setup.txt b/Documentation/git-sh-setup.txt index 6a9f66d..5d709d0 100644 --- a/Documentation/git-sh-setup.txt +++ b/Documentation/git-sh-setup.txt @@ -82,6 +82,12 @@ get_author_ident_from_commit:: outputs code for use with eval to set the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE variables for a given commit. +create_virtual_base:: + modifies the first file so only lines in common with the + second file remain. If there is insufficient common material, + then the first file is left empty. The result is suitable + as a virtual base input for a 3-way merge. + GIT --- Part of the linkgit:git[1] suite diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index 3373c04..255c07a 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -104,30 +104,22 @@ case ${1:-.}${2:-.}${3:-.} in ;; esac - src2=`git-unpack-file $3` + src2=$(git-unpack-file $3) case $1 in '') echo Added $4 in both, but differently. - # This extracts OUR file in $orig, and uses git apply to - # remove lines that are unique to ours. - orig=`git-unpack-file $2` - sz0=`wc -c $orig` - @@DIFF@@ -u -La/$orig -Lb/$orig $orig $src2 | git apply --no-add - sz1=`wc -c $orig` - - # If we do not have enough common material, it is not - # worth trying two-file merge using common subsections. - expr $sz0 \ $sz1 \* 2 /dev/null || : $orig + orig=$(git-unpack-file $2) + create_virtual_base $orig $src2 ;; *) echo Auto-merging $4 - orig=`git-unpack-file $1` + orig=$(git-unpack-file $1) ;; esac # Be careful for funny filename such as -L in $4, which # would confuse merge greatly. - src1=`git-unpack-file $2` + src1=$(git-unpack-file $2) git merge-file $src1 $orig $src2 ret=$? msg= diff --git a/git-sh-setup.sh b/git-sh-setup.sh index 9cfbe7f..2f78359 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -249,6 +249,18 @@ clear_local_git_env() { unset $(git rev-parse --local-env-vars) } +# Generate a virtual base file for a two-file merge. Uses git apply to +# remove lines from $1 that are not in $2, leaving only common lines. +create_virtual_base() { + sz0=$(wc -c $1) + @@DIFF@@ -u -La/$1 -Lb/$1 $1 $2 | git apply --no-add + sz1=$(wc -c $1) + + # If we do not have enough common material
Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting
On 13/03/2013 19:57, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: - echo Added $4 in both, but differently. + echo ERROR: Added $4 in both, but differently. + ret=1 The problem you identified may be worth fixing, but I do not think this change is correct. This message is at the same severity level as the message on the other arm of this case that says Auto-merging $4. In that other case arm, we are attempting a true three-way merge, and in this case arm, we are attempting a similar three-way merge using your virtual base. Neither has found any error in this case arm yet. The messages are both informational, not an error. I do not think you would want to set ret=1 until you see content conflict. I disagree here. At the minute, it does set ret to 1 (but further down the code - bringing it up here next to the ERROR print clarifies that), and will report the merge as failed, conflict in the 3-way merge or not. Which I think is correct. We have to stop for user inspection here. We do have a fake base; we can't trust the 3-way merge with it. The virtual 3-way merge will take ABCDE and ABDE and produce ABCDE without conflict. That's flat wrong if the real base they failed to tell Git about was ABCDE. Despite being useful, I'm still slightly uncomfortable that it can produce something without any conflict markers. The user really needs to look at properly. (And one interesting related glitch, or at least thing that puzzled me when it happened. This is from memory, so may be slightly mistaken, but what seemed to happen was that if you have rerere enabled, then mergetool tends to say nothing to merge, because it relies on rerere remaining, which relies on conflict markers. I think you could still force a mergetool up by specifying the specific file though.) Maybe the virtual base itself should be different. Maybe it should put a ??? marker in place of every unique line. So you get: Left ABCEFGH Right XABCDEFJH - Merge result |XABC|DEFG|JH VBase ?ABC?EF??H That actually feels like it may be the correct answer here. And it's effectively what P4Merge does in its 2-way mode I failed to invoke. (At least for the result view). Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting
On 14/03/2013 16:56, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: Maybe the virtual base itself should be different. Maybe it should put a ??? marker in place of every unique line. So you get: Left ABCEFGH Right XABCDEFJH - Merge result |XABC|DEFG|JH VBase ?ABC?EF??H That actually feels like it may be the correct answer here. Interesting, though the approach has downsides with the diff3 conflict style, no? Well, yes, but I would assume that we would forcibly select normal diff here somehow, if we aren't already. We should be - turning ABCDEFGH vs ABCD into ABCDEFGH|EFGH= is silly. This topic has a lot in common with the zdiff3 discussion going on. The concern there is about large chunks of similar code appearing on two sides, and not being in the base, leading to useless diff3. This is just the special case of the base being totally empty. The thought on zdiff3 philosophy was that common additions should be treated as resolved, and not appear inside conflict markers. That's exactly what we'd be doing. So, same conflict as above, but this time embedded in a larger file, using zdiff3 logic: LeftaabaacaaABCEFGHeee Baseeee - zdiff3 aaadab|a=faacaaABC|DEFG|JHeee Right aaadaafaABCDEFJHeee Note that I've chosen to suppress the = marker if the lines surrounding the conflict are not in the base. I think that helps highlight the fact that we're in a diff2 section. EFG|=JH reads like an assertion that the base has EFH. Whereas EFG|JH avoids that. So, anyway, commonality with zdiff3 would be good. Even if we can't share code, we should at least share the general style of result. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting
On 14/03/2013 19:31, Kevin Bracey wrote: On 14/03/2013 16:56, Junio C Hamano wrote: Well, yes, but I would assume that we would forcibly select normal diff here somehow, if we aren't already. We should be - turning ABCDEFGH vs ABCD into ABCDEFGH|EFGH= is silly. Doh. But anyway, we don't want to waste space with |= markers, and make the same surrounding code is in the base suggestion. So we should be selecting diff. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/3] mergetools/p4merge: create a base if none available
Originally, with no base, Git gave P4Merge $LOCAL as a dummy base: p4merge $LOCAL $LOCAL $REMOTE $MERGED Commit 0a0ec7bd changed this to: p4merge empty file $LOCAL $REMOTE $MERGED to avoid the problem of being unable to save in some circumstances with similar inputs. Unfortunately this approach produces much worse results on differing inputs. P4Merge really regards the blank file as the base, and once you have just a couple of differences between the two branches you end up with one a massive full-file conflict. The 3-way diff is not readable, and you have to invoke difftool MERGE_HEAD HEAD manually to get a useful view. The original approach appears to have invoked special 2-way merge behaviour in P4Merge that occurs only if the base filename is or equal to the left input. You get a good visual comparison, and it does not auto-resolve differences. (Normally if one branch matched the base, it would autoresolve to the other branch). But there appears to be no way of getting this 2-way behaviour and being able to reliably save. Having base==left appears to be triggering other assumptions. There are tricks the user can use to force the save icon on, but it's not intuitive. So we now follow a suggestion given in the original patch's discussion: generate a virtual base, consisting of the lines common to the two branches. This is the same as the technique used in resolve and octopus merges, so we relocate that code to a shared function. Note that if there are no differences at the same location, this technique can lead to automatic resolution without conflict, combining everything from the 2 files. As with the other merges using this technique, we assume the user will inspect the result before saving. Signed-off-by: Kevin Bracey ke...@bracey.fi --- Documentation/git-sh-setup.txt | 6 ++ git-merge-one-file.sh | 18 +- git-sh-setup.sh| 12 mergetools/p4merge | 6 +- 4 files changed, 28 insertions(+), 14 deletions(-) diff --git a/Documentation/git-sh-setup.txt b/Documentation/git-sh-setup.txt index 6a9f66d..5d709d0 100644 --- a/Documentation/git-sh-setup.txt +++ b/Documentation/git-sh-setup.txt @@ -82,6 +82,12 @@ get_author_ident_from_commit:: outputs code for use with eval to set the GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE variables for a given commit. +create_virtual_base:: + modifies the first file so only lines in common with the + second file remain. If there is insufficient common material, + then the first file is left empty. The result is suitable + as a virtual base input for a 3-way merge. + GIT --- Part of the linkgit:git[1] suite diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index f612cb8..0f164e5 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -104,30 +104,22 @@ case ${1:-.}${2:-.}${3:-.} in ;; esac - src2=`git-unpack-file $3` + src1=$(git-unpack-file $2) + src2=$(git-unpack-file $3) case $1 in '') echo Added $4 in both, but differently. - # This extracts OUR file in $orig, and uses git apply to - # remove lines that are unique to ours. - orig=`git-unpack-file $2` - sz0=`wc -c $orig` - @@DIFF@@ -u -La/$orig -Lb/$orig $orig $src2 | git apply --no-add - sz1=`wc -c $orig` - - # If we do not have enough common material, it is not - # worth trying two-file merge using common subsections. - expr $sz0 \ $sz1 \* 2 /dev/null || : $orig + orig=$(git-unpack-file $2) + create_virtual_base $orig $src2 ;; *) echo Auto-merging $4 - orig=`git-unpack-file $1` + orig=$(git-unpack-file $1) ;; esac # Be careful for funny filename such as -L in $4, which # would confuse merge greatly. - src1=`git-unpack-file $2` git merge-file $src1 $orig $src2 ret=$? msg= diff --git a/git-sh-setup.sh b/git-sh-setup.sh index 795edd2..349a5d4 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -249,6 +249,18 @@ clear_local_git_env() { unset $(git rev-parse --local-env-vars) } +# Generate a virtual base file for a two-file merge. Uses git apply to +# remove lines from $1 that are not in $2, leaving only common lines. +create_virtual_base() { + sz0=$(wc -c $1) + @@DIFF@@ -u -La/$1 -Lb/$1 $1 $2 | git apply --no-add + sz1=$(wc -c $1) + + # If we do not have enough common material, it is not + # worth trying two-file merge using common subsections. + expr $sz0 \ $sz1 \* 2 /dev/null || : $1 +} + # Platform specific tweaks to work around some commands case $(uname -s) in diff --git a/mergetools/p4merge b/mergetools/p4merge index 46b3a5a..5a608ab
[PATCH v3 3/3] git-merge-one-file: revise merge error reporting
Commit 718135e improved the merge error reporting for the resolve strategy's merge conflict and permission conflict cases, but led to a malformed ERROR: in myfile.c message in the case of a file added differently. This commit reverts that change, and uses an alternative approach without this flaw. Signed-off-by: Kevin Bracey ke...@bracey.fi --- git-merge-one-file.sh | 20 +++- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh index 0f164e5..78b07a8 100755 --- a/git-merge-one-file.sh +++ b/git-merge-one-file.sh @@ -104,11 +104,13 @@ case ${1:-.}${2:-.}${3:-.} in ;; esac + ret=0 src1=$(git-unpack-file $2) src2=$(git-unpack-file $3) case $1 in '') - echo Added $4 in both, but differently. + echo ERROR: Added $4 in both, but differently. + ret=1 orig=$(git-unpack-file $2) create_virtual_base $orig $src2 ;; @@ -121,10 +123,9 @@ case ${1:-.}${2:-.}${3:-.} in # Be careful for funny filename such as -L in $4, which # would confuse merge greatly. git merge-file $src1 $orig $src2 - ret=$? - msg= - if [ $ret -ne 0 ]; then - msg='content conflict' + if [ $? -ne 0 ]; then + echo ERROR: Content conflict in $4 + ret=1 fi # Create the working tree file, using our tree version from the @@ -133,18 +134,11 @@ case ${1:-.}${2:-.}${3:-.} in rm -f -- $orig $src1 $src2 if [ $6 != $7 ]; then - if [ -n $msg ]; then - msg=$msg, - fi - msg=${msg}permissions conflict: $5-$6,$7 - ret=1 - fi - if [ $1 = '' ]; then + echo ERROR: Permissions conflict: $5-$6,$7 ret=1 fi if [ $ret -ne 0 ]; then - echo ERROR: $msg in $4 exit 1 fi exec git update-index -- $4 -- 1.8.2.rc3.7.g1100d09.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/3] mergetools/p4merge: swap LOCAL and REMOTE
Reverse LOCAL and REMOTE when invoking P4Merge as a mergetool, so that the incoming branch is now in the left-hand, blue triangle pane, and the current branch is in the right-hand, green circle pane. This change makes use of P4Merge consistent with its built-in help, its reference documentation, and Perforce itself. But most importantly, it makes merge results clearer. P4Merge is not totally symmetrical between left and right; despite changing a few text labels from theirs/ours to left/right when invoked manually, it still retains its original Perforce theirs/ours viewpoint. Most obviously, in the result pane P4Merge shows changes that are common to both branches in green. This is on the basis of the current branch being green, as it is when invoked from Perforce; it means that lines in the result are blue if and only if they are being changed by the merge, making the resulting diff clearer. Note that P4Merge now shows ours on the right for both diff and merge, unlike other diff/mergetools, which always have REMOTE on the right. But observe that REMOTE is the working tree (ie ours) for a diff, while it's another branch (ie theirs) for a merge. Ours and theirs are reversed for a rebase - see git help rebase. However, this does produce the desired show the results of this commit effect in P4Merge - changes that remain in the rebased commit (in your branch, but not in the new base) appear in blue; changes that do not appear in the rebased commit (from the new base, or common to both) are in green. If Perforce had rebase, they'd probably not swap ours/theirs, but make P4Merge show common changes in blue, picking out our changes in green. We can't do that, so this is next best. Signed-off-by: Kevin Bracey ke...@bracey.fi --- mergetools/p4merge | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mergetools/p4merge b/mergetools/p4merge index 8a36916..46b3a5a 100644 --- a/mergetools/p4merge +++ b/mergetools/p4merge @@ -22,7 +22,7 @@ diff_cmd () { merge_cmd () { touch $BACKUP $base_present || $BASE - $merge_tool_path $BASE $LOCAL $REMOTE $MERGED + $merge_tool_path $BASE $REMOTE $LOCAL $MERGED check_unchanged } -- 1.8.2.rc3.7.g1100d09.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] git.c: make usage match manual page
Re-ordered option list in command-line usage to match the manual page. Also makes it less than 80-characters wide. Signed-off-by: Kevin Bracey ke...@bracey.fi --- git.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git.c b/git.c index d33f9b3..2a98624 100644 --- a/git.c +++ b/git.c @@ -6,10 +6,10 @@ #include run-command.h const char git_usage_string[] = - git [--version] [--exec-path[=path]] [--html-path] [--man-path] [--info-path]\n + git [--version] [--help] [-c name=value]\n + [--exec-path[=path]] [--html-path] [--man-path] [--info-path]\n [-p|--paginate|--no-pager] [--no-replace-objects] [--bare]\n [--git-dir=path] [--work-tree=path] [--namespace=name]\n - [-c name=value] [--help]\n command [args]; const char git_more_info_string[] = -- 1.8.2.rc3.7.g1100d09.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] git.c: make usage match manual page
On 11/03/2013 21:58, Junio C Hamano wrote: Kevin Bracey ke...@bracey.fi writes: Re-ordered option list in command-line usage to match the manual page. Also makes it less than 80-characters wide. Thanks (s/Re-ordered/reorder/ and s/makes/make/, though). Got it. But I'm going to reword it, to follow the history of the manual change. Is git.c the only one whose -h output does not match the manual synopsis? Generally, -h just puts options in the synopsis, and then prints a line per option, so most commands don't really match the manual show all options on one line style anyway. git.c is atypical. (Something else to look at for the whole git help thing? Should git -h print a option list in that style?) But, yes, I've found a few others that are show almost the same thing as the manual but with subtle pointless differences. git remote, for example. That's a larger project, I feel; the 80-column thing is key here. Kevin -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html