Re: Weird revision walk behaviour

2018-05-31 Thread Kevin Bracey

On 31/05/2018 08:43, Jeff King wrote:


If there are zero parents (neither relevant nor irrelevant), is it still
TREESAME? I would say in theory yes.


Not sure - I think roots are such a special case that TREESAME 
effectively doesn't matter. We always test for roots first.

  So what I was proposing would be to
rewrite the parents to the empty set.
That feels a bit radical - I believe we need to retain (some) parent 
information for modes that show it (eg the dangling unfilled circles in 
gitk). And making it a root I think could cause other problems with 
making it look like we have a disjoint history. I believe the next 
simplification step may be trying to follow down to the common root.

What next here? It looks like we have a proposed solution. Do you want
to try to work up a set of tests based on what you wrote earlier?
I was hoping Gábor would carry on, as he's made a start... I was just 
planning to back-seat drive.

I'd also love to hear from Junio as the expert in this area, but I think
he's been a bit busy with maintainer stuff recently. So maybe I should
just be patient. :)

Likewise - I have been quite deep into this, but it was a quite short 
window of investigation a long time ago, and I've not looked at it 
since. Would like input from someone with more active knowledge.


Kevin



Re: Weird revision walk behaviour

2018-05-30 Thread Kevin Bracey

On 30/05/2018 00:04, Jeff King wrote:


Do we even need to do the parent rewriting here? By definition those
parents aren't interesting, and we're TREESAME to whatever is in
treesame_parents. So conceptually it seems like we just need a flag "I
found a treesame parent", but we only convert that into a TREESAME flag
if there are no relevant parents.


I think it's necessary to make the rules consistent. To mark the commit 
as TREESAME here when it's not TREESAME to all its parents would be 
inconsistent with the definition of the TREESAME flag used everywhere else:


* Original definition: "A commit is TREESAME if it is treesame to any 
parent"

* d0af66 definition: "A commit is TREESAME if it is treesame to all parents"
* Current 4d8266 definition: "A commit is TREESAME if it is treesame to 
all relevant parents; if no relevant parents then if it is treesame to 
all (irrelevant) parents."


The current problem is that the node is not marked TREESAME, but that's 
consistent with the definition. I think we do have to rewrite the commit 
so it is TREESAME as per the definition. Not flag it as TREESAME in 
violation of it.


It's possible you *could* get away with just flagging, because we never 
recompute the TREESAME flag in simple history mode. But it would be a 
cheat, and it may have other side effects. It means this node would 
remain a special rare case for others to trip up on later.  And I don't 
think it simplifies the scan. Remembering 
"pointer-to-first-treesame-parent" (not a list) for the rewrite is no 
more complex than remembering "bool-there-was-a-treesame-parent".  (A 
bool is what earlier code did - it worked for the original TREESAME 
definition. My patch series dropped that bool without replacement - 
missing this all-irrelevant case).


In the simple history mode, the assumption is we're "simplifying away 
merges up-front" here; we won't (and can't) rewrite parents later in a 
way that needs to recompute TREESAME. In the initial scan when all 
parents are relevant and we matched one, the commit became TREESAME as 
per the new definition immediately because of the rewrite.  This applies 
the equivalent rewrite when no relevant parents, consistent with the 
general concept, and without changing the TREESAME definition.


Kevin




Re: Weird revision walk behaviour

2018-05-29 Thread Kevin Bracey

On 29/05/2018 01:06, SZEDER Gábor wrote:


So, without investing nearly enough time to understand what is going
on, I massaged the above diffs into this:

Cool.


+   treesame_parents = 
xmalloc(sizeof(*treesame_parents));
There's no need to actually record a list here. This is just for the 
simple history. We are only interested into becoming a non-merge to 1 
treesame parent, so I think we just need to record a pointer to the 
first one we see, just as this would exit immediately for the first 
relevant one. For the full-history case, we already have the full "which 
parents are treesame" recording mechanism just above, but it only kicks 
in for merge commits and only when settings require it. Adding a malloc 
here would be significant machinery overhead.

FWIW, the test suite passes with the above patch applied.
I doubt there's an existing case like this anywhere in the revision test 
suite :) . And this patch is focused enough that it *should* only be 
changing the behaviour of this very specific case. As such, it does feel 
a little like a kludge, but I think it's fine because it's aligning the 
simple-history analysis with the "analyse relevant parents if any, else 
analyse irrelevant" rule of the full-history.


And here is the small PoC test case to illustrate the issue, which
fails without but succeeds with the above patch.  Eventually it should
be part of 't6012-rev-list-simplify.sh', of course, but I haven't
looked into that yet.
It may be there's enough criss-crossy history to test here to merit 
breaking out into a second test series.


+#   B---M2   master
+#  / \ /
+# A   X
+#  \ / \
+#   C---M1   b2
+#
+# and modify 'file' in commits 'A' and 'B', so one of 'M1's parents
+# ('B') is TREESAME wrt. 'file'.

I guess we'll be wanting test cases for A..B2, B..B2 and C..B2, and some 
where the the base is "some other child of B or C".  "B..B2" is no 
longer a pure set subtraction for simplification as B is UNINTERESTING 
(ie not in the set) but RELEVANT (because you named it as a bottom 
commit), so B..B2 actually still leaves M1 with 2 relevant parents. 
You'd want test cases covering B relevant+C irrelevant and B 
irrelevant+C relevant, which means subtracting them without naming them 
- so name a child of one.


And then we need to think about whether we want it displayed in each of 
the other modes for each of those queries...


Kevin



Re: Weird revision walk behaviour

2018-05-27 Thread Kevin Bracey

On 24/05/2018 23:26, Kevin Bracey wrote:



On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote:


   $ git log --oneline master..ba95710a3b -- ci/
   ea44c0a594 Merge branch 'bw/protocol-v2' into 
jt/partial-clone-proto-v2


In this case, we're hitting a merge commit which is not on master, but 
it has two parents which both are. Which, IIRC, means the merge commit 
is INTERESTING with two UNINTERESTING parents; and we are TREESAME to 
only one of them.


The commit changing the logic of TREESAME you identified believes that 
those TREESAME changes for merges which were intended to improve 
fuller history modes shouldn't affect the simple history "because 
partially TREESAME merges are turned into normal commits". Clearly 
that didn't happen here.


Haven't currently got a development environment set up here, but I've 
been looking at the code.Here's a proposal, untested, as a potential 
starting point if anyone wants to consider a proper patch.


The simplify_history first-scan logic never actually turned merges into 
simple commits unless they were TREESAME to a relevant/interesting 
parent.  Anything where the TREESAME parent was UNINTERESTING was 
retained as a merge, but had its TREESAME flag set, and that permitted 
later simplification.


With the redefinition of the TREESAME flag, this merge commit is no 
longer TREESAME, and as the decoration logic to refine TREESAME isn't 
active for simplify_history, it doesn't get cleaned up (even if it would 
be in full history?)


I think the answer may be to add an extra post-process step on the 
initial loop to handle this special case. Something like:


        case REV_TREE_SAME:
            if (!revs->simplify_history || !relevant_commit(p)) {
                /* Even if a merge with an uninteresting
                 * side branch brought the entire change
                 * we are interested in, we do not want
                 * to lose the other branches of this
                 * merge, so we just keep going.
                 */
                if (ts)
                    ts->treesame[nth_parent] = 1;
+   /* But we note it for potential later simplification */
+       if (!treesame_parent)
+    treesame_parent = p;
    continue;
 }

...

After loop:

+ if (relevant_parents == 0 && revs->simplify_history && 
treesame_parent) {
+   treesame_parent->next = NULL;// Repeats code from loop - 
share somehow?

+   commit->parents = treesame_parent;
+   commit->object.flags |= TREESAME;
+   return;
+    }

 /*
  * TREESAME is straightforward for single-parent commits. For merge

The other option would be to take off the " || !relevant_commit(p)" 
test, but I'm assuming that is still needed for other cases.


Kevin




Re: Weird revision walk behaviour

2018-05-24 Thread Kevin Bracey

On 23/05/2018 20:35, Jeff King wrote:

On Wed, May 23, 2018 at 01:32:46PM -0400, Jeff King wrote:


On Wed, May 23, 2018 at 07:10:58PM +0200, SZEDER Gábor wrote:


   $ git log --oneline master..ba95710a3b -- ci/
   ea44c0a594 Merge branch 'bw/protocol-v2' into jt/partial-clone-proto-v2


I keep some older builds around, and it does not reproduce with v1.6.6.3
(that's my usual goto for "old"). Bisecting turns up d0af663e42
(revision.c: Make --full-history consider more merges, 2013-05-16).  It
looks like an unintended change (the commit message claims that the
non-full-history case shouldn't be affected).

There's more discussion in the thread at:

   
https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/

I haven't absorbed it all yet, but I'm adding Junio to the cc.



In this case, we're hitting a merge commit which is not on master, but 
it has two parents which both are. Which, IIRC, means the merge commit 
is INTERESTING with two UNINTERESTING parents; and we are TREESAME to 
only one of them.


The commit changing the logic of TREESAME you identified believes that 
those TREESAME changes for merges which were intended to improve fuller 
history modes shouldn't affect the simple history "because partially 
TREESAME merges are turned into normal commits". Clearly that didn't 
happen here.


I think we need to look at why that isn't happening, and if it can be 
made to happen. The problem is that this commit is effectively the base 
of the graph - it's got a double-connection to the UNINTERESTING set, 
and maybe that prevented the simple history "follow 1 TREESAME" logic 
from kicking in. Maybe it won't follow 1 TREESAME to UNINTERESTING.


I know there were quite a few changes later in the series to try to 
reconcile the simple and full history, for the cases where the simple 
history takes a weird path because of its love of TREESAME parents, 
hiding evil merges. But I believe the simple history behaviour was 
supposed to remain as-is - take first TREESAME always.


Kevin




Re: Weird revision walk behaviour

2018-05-24 Thread Kevin Bracey

On 23/05/2018 20:35, Jeff King wrote:

There's more discussion in the thread at:

   
https://public-inbox.org/git/1366658602-12254-1-git-send-email-ke...@bracey.fi/

I haven't absorbed it all yet, but I'm adding Junio to the cc.


Just to ack that I've seen the discussion, but I can't identify the 
code's reasoning at the moment. My recollection is that I accepted while 
coming up with the algorithm that it might err slightly on the side of 
false positives in the display - there were some merge cases I was 
unable to fully distinguish whether or not the merge had lost a change 
it shouldn't have done, and if I was uncertain I'd rather show it than not.


The first commit was not originally intended to alter behaviour for 
anything other than --full-history, but later in the chain there was 
specific consideration into tracking the path to the specified "bottom" 
commit. It may be that's part of what's happening here.


Kevin






Re: [PATCH 1/3] add QSORT

2016-10-05 Thread Kevin Bracey

On 04/10/2016 23:31, René Scharfe wrote:



So let's summarize; here's the effect of a raw qsort(3) call:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  is skipped
0 >0  no   qsort  is skipped
1  0  no   qsort  is skipped (bad!) **
1 >0  yes  qsort  is skipped **

Right - row 3 may not be a bug from the point of view of your internals, 
but it means you violate the API of qsort.Therefore a fix is required.



With the micro-optimization removed (nmemb > 0) the matrix gets simpler:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   noop   is executed
0 >0  no   qsort  is skipped
1  0  no   noop   is executed
1 >0  yes  qsort  is skipped **

And with your NULL check (array != NULL) we'd get:

array == NULL  nmemb  bug  QSORT  following NULL check
-  -  ---  -  
0  0  no   qsort  reuses check result
0 >0  no   qsort  reuses check result
1  0  no   noop   reuses check result
1 >0  yes  noop   reuses check result

Did I get it right?  AFAICS all variants (except raw qsort) are safe 
-- no useful NULL checks are removed, and buggy code should be noticed 
by segfaults in code accessing the sorted array.

I think your tables are correct.

But I disagree that you could ever call invoking the "" lines safe. 
Unless you have documentation on what limit GCC (and your other 
compilers) are prepared to put on the undefined behaviour of violating 
that "non-null" constraint.


Up to now dereferencing a null pointer has been implicitly (or 
explicitly?) defined as simply generating SIGSEGV. And that has 
naturally extended into NULL passed to library implementations. But 
that's no longer true - it seems bets are somewhat off.


But, as long as you are confident you never invoke that line without a 
program bug - ie an API precondition of your own QSORT is that NULL is 
legal iff nmemb is zero, then I guess it's fine. Behaviour is defined, 
as long as you don't violate your internal preconditions.


Kevin




Re: [PATCH 1/3] add QSORT

2016-10-04 Thread Kevin Bracey

On 04/10/2016 01:00, René Scharfe wrote:

Am 03.10.2016 um 19:09 schrieb Kevin Bracey:

As such, NULL checks can still be elided even with your change. If you
effectively change your example to:

if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing
that in the compiler explorer - it effectively turns that into "else
if".


We don't support array == NULL together with nmemb > 1, so a segfault 
is to be expected in such cases, and thus NULL checks can be removed 
safely.



Possibly true in practice.

But technically wrong by the C standard - behaviour is undefined if the 
qsort pointer is invalid. You can't formally expect the defined 
behaviour of a segfault when sending NULL into qsort. (Hell, maybe the 
qsort has its own NULL check and silently returns! cf printf - some 
printfs will segfault when passed NULL, some print "(null)"). I've 
worked on systems that don't fault reads to NULL, only writes, so those 
might not segfault there, if NULL appeared sorted...


And obviously there's the language lawyer favourite possibility of the 
call causing nasal flying monkeys or whatever.


So if it's not a program error for array to be NULL and nmemb to be zero 
in your code, and you want a diagnostic for array=NULL, nmemb non-zero, 
I think you should put that diagnostic into sane_qsort as an assert or 
something, not rely on qsort's undefined behaviour being a segfault.


sane_qsort(blah)
{
 if (nmemb >= 1) {
 assert(array);
 qsort(array, nmemb, ...);
 }
}

Can't invoke undefined behaviour from NULL without triggering the 
assert. (Could still have other invalid pointers, of course).


Usually I am on the side of "no NULL checks", as I make the assumption 
that we will get a segfault as soon as NULL pointers are used, and those 
are generally easy to diagnose. But seeing a compiler invoking this sort 
of new trickery due to invoking undefined behaviour is making me more 
nervous about doing so...



To make that check really work, you have to do:

if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.


It would be safe, but arguably too much so, because non-empty arrays 
with NULL wouldn't segfault anymore, and thus become harder to 
identify as the programming errors they are.
Well, you get the print. Although I guess you're worrying about the 
second if being real code, not a debugging check.


I must say, this is quite a courageous new optimisation from GCC. It 
strikes me as finding a language lawyer loophole that seems to have been 
intended for something else (mapping library functions directly onto 
CISCy CPU intrinsics), and using it to invent a whole new optimisation 
that seems more likely to trigger bugs than optimise any significant 
amount of code in a desirable way.


Doubly weird as there's no (standard) language support for this. I don't 
know how you'd define "my_qsort" that triggered the same optimisations.


I've seen similar 
library-knowledge-without-any-way-to-reproduce-in-user-code 
optimisations like "malloc returns a new pointer that doesn't alias with 
anything existing" (and no way to reproduce the optimisation with 
my_malloc_wrapper). But those seemed to have a clear performance 
benefit, without any obvious traps. Doubtful about this one.


Kevin



Re: [PATCH 1/3] add QSORT

2016-10-03 Thread Kevin Bracey

On 01/10/2016 19:19, René Scharfe wrote:

It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

This optimization is dangerous when combined with the convention of
using a NULL pointer for empty arrays.  Diagnosing an affected NULL
check is probably quite hard -- it's right there in the code after all
and not all compilers remove it.


Hang on, hang on. This is either a compiler bug, or you're wrong on your 
assumption about the specification of qsort.


Either way, the extra layer of indirection is not proper protection. The 
unwanted compiler optimisation you're inadvertently triggering could 
still be triggered through the inline.


Now, looking at the C standard, this isn't actually clear to me. The 
standard says that if you call qsort with nmemb zero, the pointer still 
has to be "valid". Not totally clear to me if NULL is valid, by their 
definition in C99 7.1.4. Googling hasn't given me a concrete answer.


The compiler seems to think that NULL wouldn't be valid, so because 
you've called qsort on it, you've invoked undefined behaviour if it's 
NULL, so it's free to elide the NULL check.


Kevin



Re: [PATCH 1/3] add QSORT

2016-10-03 Thread Kevin Bracey

On 01/10/2016 19:19, René Scharfe wrote:


It's hard to imagine an implementation of qsort(3) that can't handle
zero elements.  QSORT's safety feature is that it prevents the compiler
from removing NULL checks for the array pointer.  E.g. the last two
lines in the following example could be optimized away:

qsort(ptr, n, sizeof(*ptr), fn);
if (!ptr)
do_stuff();

You can see that on https://godbolt.org/g/JwS99b -- an awesome website
for exploring compilation results for small snippets, by the way.

Ah, second attempt. Originally misread the original code, and didn't 
understand what it was doing.


I get it now.

A nasty trap I hadn't been aware of - I was under the impression NULL + 
zero length was generally legal, but the C standard does indeed not give 
you a specific out for NULL to library functions in that case.


As such, NULL checks can still be elided even with your change. If you 
effectively change your example to:


if (nmemb > 1)
qsort(array, nmemb, size, cmp);
if (!array)
printf("array is NULL\n");

array may only be checked for NULL if nmemb <= 1. You can see GCC doing 
that in the compiler explorer - it effectively turns that into "else 
if".  To make that check really work, you have to do:


if (array)
qsort(array, nmemb, size, cmp);
else
printf("array is NULL\n");

So maybe your "sane_qsort" should be checking array, not nmemb.

Kevin



Re: Reset by checkout?

2014-06-09 Thread Kevin Bracey

On 07/06/2014 17:52, Philip Oakley wrote:



Just to say there has been a similar confusion about 'git reset' 
reported on the Git Users group for the case of reset with added 
(staged), but uncommitted changes being wiped out, which simlarly 
reports on the difficulty of explaining some of the conditions 
especially when some are wrong ;-)


https://groups.google.com/forum/#!topic/git-users/27_FxIV_100


I'm coming around to the view that git reset mode should be (almost) 
demoted to plumbing, leaving only the reset file that reverses add 
file as everyday Porcelain.


I think reset --keep and --merge were a step in the wrong direction, 
at least for the Porcelain - trying to make reset mode more useful, 
rather than less necessary. Normal users shouldn't be needing to touch 
these hard-to-explain-and-slightly-dangerous commands.


The addition of --abort to merge and other commands was much more 
solid. They helped a lot, and I think we should follow that model by 
adding --undo to various commands. That would mop up all the common 
resets, in conjunction with Atsushi's proposed checkout -u 
alternative to -B, which I quite like.


Main few:

commit --undo = reset --soft HEAD^
merge --undo  = reset --keep HEAD^
rebase --undo = reset --keep ORIG_HEAD   [bug report: rebase -p doesn't 
set ORIG_HEAD reliably]
pull --undo = merge/rebase --undo depending on rebase settings [could we 
go nuts and undo the fetch too?]


Bonus:

commit --amend --undo: reset --soft HEAD@{1}

The undos can also have a bit of extra veneer that checks the log/reflog 
for whether it matches the proposed undo, and also checks the upstream 
to see if the thing being undone is already public.


Given those, I honestly don't think I'd ever need to explain git reset 
mode to anyone again. Which would be nice...


(Note I propose no --mixed equivalent for the commit undos, but it's 
easy enough to follow the commit --undo with a normal git reset. I'd 
rather re-document the normal git reset under commit --undo than add 
and document yet another option).


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset by checkout?

2014-06-03 Thread Kevin Bracey

On 03/06/2014 00:54, Junio C Hamano wrote:


Not that I can think of a better way to update these descriptions,
and not that I am opposing to update these descriptions to make it
easier for new people to learn, but I am not sure if these treat
ORIG_HEAD and the changes since that commit as separate entities
is a good approach to do so.

Somewhat frustrated, not by your patch but by being unable to
suggest a better way X-.




I know. I started off myself knowing what I meant to say, and then got 
bogged down somewhat trying to be detailed enough for a full 
explanation. I think it's just inherently very hard for anyone to 
visualise what these do in the /general/ case.


This is one of those commands where the structure of a man page gets in 
the way. We have to give a summary of what the mode options /do/, but 
that's not what people want to know. They want to know what they're /for/.


(And, to some extent, reset, like checkout, is two separate commands. 
One being the path manipulator, the other being the HEAD manipulator. 
Just bogs us down further).


I think these are the most important HEAD resets, covering 95%+ of uses:

   git reset --soft HEAD~n
   git reset HEAD~n
   git reset --keep HEAD~n
   git reset --keep ORIG_HEAD
   git reset --keep @{n}
   git reset --keep some other arbitary place

(and possibly

   git reset --merge

although I think this should be fully covered by git xxx --abort - 
maybe a couple of those missing like git stash pop/apply --abort?)


Anything more than those, I think, are pretty far-fetched. I can't 100% 
grok --soft/--mixed onto a different branch, for example. (But at 
least we do define those cases in the A/B/C/D discussion section for 
the real geeks.)


Maybe we just need to tighten up the EXAMPLES section? Give it 
easy-to-locate path/--soft/--mixed/--keep subheadings, covering all 
those common use cases (in clean trees...), including a before/after git 
status views. Then normal users could skip the top technical section 
waffling about indexes and go straight there instead.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Reset by checkout?

2014-06-01 Thread Kevin Bracey

On 01/06/2014 07:26, Atsushi Nakagawa wrote:

Kevin Bracey ke...@bracey.fi wrote:

The original git reset --hard used to be a pretty top-level command.
It was used for aborting merges in particular. But I think it now
stands out as being one of the only really dangerous porcelain
commands, and I can't think of any real workflow it's still useful
for.

My thoughts exactly.  I think the 'reset --soft/--mixed/--hard' pattern
is so ingrained, that many people just don't realize there's a safer
alternative.  (I've heard work mates on more than one occasion
recommending 'reset --hard' as the go-to command for discarding commits.)

I believe this is likely because many third party GUI tools just don't
support 'reset --keep', and these tools present a Reset... dialog with
the de facto Soft/Mixed/Hard options.  (Even 'gitk' does this.)

True on the GUI - hard really needs demotion.

It would help if the documentation explained better straight off what 
the different reset modes are intended /for/ in a more practical way, 
rather than the technical jargon.


There is the EXAMPLES section, but I think the problem is that it's 
not clearly laid out by mode, meaning people checking to see what git 
reset can do are inclined to go first to the --xxx mode list in 
DESCRIPTION, and stop there, baffled, probably not finding any example 
for that mode. Maybe the examples should have clearer --option 
subheadings? (And all the existing examples for --hard should really 
suggest --keep instead).


But given that the DISCUSSION section now has the full internal 
details on what exactly each mode does in every state, and now that we 
have more than the simple soft/mixed/hard to deal with, I think the 
main DESCRIPTION could be simplified for end users.


Most useful for visualisation, I feel, would just showing what git 
status will look like afterwards, primarily from the point of view of a 
backwards reset to HEAD~n. In particular, normal users don't think in 
terms of the absolute contents of the index, but rather in terms of diffs.


Maybe something like this:

All modes move the current branch pointer so that HEAD now points to 
the specified commit. ORIG_HEAD is set to the original HEAD location. 
The modes differ in what happens to the contents of ORIG_HEAD, that are 
no longer on the reset branch; and also what happens to your 
not-yet-committed changes.


--soft
 Retains the contents of ORIG_HEAD in the index+work area, leaving 
the difference as changes to be committed. git reset --soft HEAD~1 
would be the first step if you want to remove the last commit, but 
intend to recommit most or all of its changes.


git status after reset --soft shows:

  To be committed:
   Changes in ORIG_HEAD relative to HEAD
   (+Any initial staged changes)

  Not staged:
   (Any initial unstaged changes)

--mixed (default)
Retains the contents of ORIG_HEAD in the work area, leaving the 
difference as unstaged changes. git reset HEAD~1 would be the first 
step if you want to remove the last commit, and think again from scratch 
about which of its changes should be committed.


git status after reset --mixed shows:

   Not staged:
   Changes in ORIG_HEAD relative to HEAD
   (+Any initial staged changes)
   (+Any initial unstaged changes)

--keep
   The contents of ORIG_HEAD are dropped, leaving the work area and 
index containing the new HEAD; your uncommitted changes to unaffected 
files are retained. If you have uncommitted changes to any files that 
differ in the proposed new HEAD, the operation is refused; you would 
need to git stash first. git reset --keep HEAD~1 can be used to 
totally remove the last commit. (This removal can itself be undone with 
another git reset --keep ORIG_HEAD, or git reset --keep 
branch@{n} - see git-reflog(1)). git reset --keep is a safe 
alternative to --hard, and is roughly equivalent to git checkout -B 
current-branch-name.


git status after reset --keep shows:

   Not staged
   (Any initial staged changes)  [should these be left staged, as 
per git checkout?]

   (+Any initial unstaged changes)

--hard
   All other changes are dropped, and the work area and index are 
forcibly reset to the new HEAD. Note that this is dangerous if used 
carelessly: ALL uncommitted changes to ALL tracked files will be lost, 
even if you were only trying to drop an unrelated commit that didn't 
touch those files. Older documentation often recommends git reset 
--hard to undo commits; the newer --keep option is a much better 
alternative in almost all cases.


git status after reset --hard shows:

   Work area clean (or untracked files present)

--merge
   Performs the operation of git merge --abort, intended for use 
during a merge resolution - see git-merge(1) for more information. This 
form is not normally used directly. [Not really true? Still the best 
command to abort git checkout --merge/git stash pop|apply? Do those 
need --abort?]




If people just forgot about '--hard

Re: Reset by checkout?

2014-05-31 Thread Kevin Bracey

On 31/05/2014 08:46, Atsushi Nakagawa wrote:

   `git checkout -B current-branch-name tree-ish`

This is such an useful notion that I can fathom why there isn't a better,
first-tier, alternative.q
I'm 100% in agreement. Reset current branch to X is an extremely 
common operation, and I use this all the time. But having to actually 
name the current branch is silly, and like you, I'm prone to swapping 
the parameters.


I guess in theory using checkout allows fancier extra options like 
--merge and --patch, but I don't think I've ever used those with 
checkout, let alone this mode, where I really do just want a reset, 
with safety checks.


The original git reset --hard used to be a pretty top-level command. 
It was used for aborting merges in particular. But I think it now stands 
out as being one of the only really dangerous porcelain commands, and I 
can't think of any real workflow it's still useful for. Maybe it could 
now be modified to warn and require -f to overwrite anything in the 
working tree?


While digging into this, it seems git reset --keep is actually pretty 
close to git checkout -B current branch. It certainly won't lose 
your workspace file, but unlike checkout it /does /forget what you've 
staged, which could be annoying. Maybe that could be modified to keep 
the index too?


(I like your alias.become - might try that).

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Unicode: update of combining code points

2014-04-17 Thread Kevin Bracey

On 16/04/2014 22:58, Torsten Bögershausen wrote:

Excellent, thanks for the pointers.
Running the script below shows that
0X00AD SOFT HYPHEN should have zero length (and some others too).
I wonder if that is really the case, and which one of the last 2 lines
in the script is the right one.

What does this mean for us:
CfFormat  a format control character

Maybe dig back through the Git logs to check the original logic, but the 
comments suggest that Cf characters have been viewed as zero-width. 
That makes sense - they're usually markers indicating things like 
bidirectional text flow, so won't be taking space. (Although they may be 
causing even more extreme layout effects...)


Soft-hyphen is noted as an explicit exception to the rule in the utf8.c 
comments. As of Unicode 4.0, it's supposed to be a character indicating 
a point where a hyphen could be placed if a line-wrap occurs, and if 
that wrap happens, then it can actually take up 1 space, otherwise not. 
So its width could be either 0 or 1, depending. Or, quite likely, the 
terminal doesn't treat it specially, and it always just looks like a 
hyphen... Thus we err on the safe side and give it width 1.


See http://en.wikipedia.org/wiki/Soft_hyphen for background.

The comments suggest adding -00AD +1160-11FF to the uniset command 
line for that tweak and for composing Hangul. (The +200B tweak isn't 
necessary any more - Zero-Width Space U+200B became Cf officially in 
Unicode 4.0.1:


http://en.wikipedia.org/wiki/Zero-width_space
http://www.unicode.org/review/resolved-pri.html#pri21
)

All of this is only really an approximation - a best-effort attempt to 
figure out the width of a string without any actual communication with 
the display device. So it'll never be perfect. The choice between double 
and single width in particular will often be unpredictable, unless you 
had deeper locale knowledge.


Actually, while doing this, I've realised that this was originally 
Markus Kuhn's implementation, and that is acknowledged at the top of the 
file:


http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

Good, because he knows what he's doing.

Kevin




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] utf8.c: partially update to version 6.3

2014-04-17 Thread Kevin Bracey

On 17/04/2014 14:47, Torsten Bögershausen wrote:

./uniset/uniset +cat=Me +cat=Mn +cat=Cf -00AD +1160-11FF +200B c


200B isn't a special case any more, as its database properties have been 
changed, so you can slightly simplify this command (both in the commit 
message and the comments).


And while you're here I think it's probably worth updating the width=2 
test now too, to make this a complete update to Unicode 6.3. I can see 
it needs a few extensions (eg A960-A97C, FE10-FE19, 1F200-1F251).


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Unicode: update of combining code points

2014-04-16 Thread Kevin Bracey

On 16/04/2014 07:48, Torsten Bögershausen wrote:

On 15.04.14 21:10, Peter Krefting wrote:

Torsten Bögershausen:


diff --git a/utf8.c b/utf8.c
index a831d50..77c28d4 100644
--- a/utf8.c
+++ b/utf8.c

Is there a script that generates this code from the Unicode database files, or 
did you hand-update it?


Some of the code points which have 0 length on the display are called
combining, others are called vowels or accents.
E.g. 5BF is not marked any of them, but if you look at the glyph, it should
be combining (please correct me if that is wrong).


Indeed it is combining (more specifically it has General Category 
Nonspacing_Mark = Mn).




If I could have found a file which indicates for each code point, what it
is, I could write a script.



The most complete and machine-readable data are in these files:

http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt

The general categories can also be seen more legibly in:

http://www.unicode.org/Public/UCD/latest/ucd/extracted/DerivedGeneralCategory.txt

For docs, see:

http://www.unicode.org/reports/tr44/
http://www.unicode.org/reports/tr11/
http://www.unicode.org/ucd/

The existing utf8.c comments describe the attributes being selected from 
the tables (general categories Cf,Mn,Me, East Asian Width W, 
F). And they suggest that the combining character table was originally 
auto-generated from UnicodeData.txt with a uniset tool. Presumably this?


https://github.com/depp/uniset

The fullwidth-checking code looks like it was done by hand, although 
apparently uniset can process EastAsianWidth.txt.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: breakage in revision traversal with pathspec

2013-09-11 Thread Kevin Bracey

On 11/09/2013 01:23, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


On 10/09/2013 20:19, Junio C Hamano wrote:

This command

  $ git log v1.8.3.1..v1.8.4 -- git-cvsserver.perl

reports that a merge 766f0f8ef7 (which did not touch the specified
path at all) touches it.

Bisecting points at d0af663e (revision.c: Make --full-history
consider more merges, 2013-05-16).


That merge appearing *with* --full-history would seem like correct
behaviour to me. Or at least it's what I intended.

... But it shouldn't
appear if the user does not ask for --full-history.


Well, there is a functioning semi-work-around for now: avoid difficult 
non-linear questions like v1.8.3.1..v1.8.4. A question like 
v1.8.3..v1.8.4 is a lot easier to visualise, and it does already omit 
the merge.


On reflection I'm not sure what we should for the simple history view 
of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't get a 
chance to reconsider the merge as being zero-parent, and we do have this 
little section of graph to traverse at the bottom:


  1.8.3
oxxxx---x--- (x = included, o = 
excluded, *=!treesame)

/
   /*
  o--x--x--x--x

In effect, we do have a linear section of history to follow, and the 
file does change in the middle of that line. It may be quite hard to 
come up with a solid rule to hide the merge that doesn't go wrong 
somewhere else.


The current rules for this are

1) if identical to any on-graph parent, follow that one, and rewrite the 
merge as a non-merge. We currently do not follow to an identical 
off-graph parent. This long-standing comment in try_to_simplify_commit 
applies: Even if a merge with an uninteresting side branch brought the 
entire change we are interested in, we do not want to lose the other 
branches of this merge, so we just keep going. For this query, the 
mainline link to 1.8.3 is the uninteresting side branch! If you do 
specify v1.8.3..v1.8.4, then v1.8.3 becomes on-graph thanks to other 
new rules, and this rule does kick in, hiding the merge.


2) If rule 1 doesn't activate, and it remains as a merge, hide it if 
treesame to all on-graph parents. Previously this rule was hide if 
treesame to any parent, and so that would have hidden the merge.


Now, when I changed rule 2, I did not think this would affect the 
default log. See my commit message:


Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its [later: on-graph] parent. This doesn't 
affect ... the default
simplify_history behaviour (because partially TREESAME merges are 
turned

into normal commits)...

Whoops - partially TREESAME merges are not always turned into normal 
commits.


Maybe the fix is to define TREESAME differently for simplify_history - 
to use the old definition of identical to any parent in that case. I'm 
not sure that's right though.


I currently feel instinctively more disposed to dropping the older 
don't follow off-graph identical parents rule. Let the default history 
go straight to v1.8.3 even though it goes off the graph, stopping us 
traversing the topic branch.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: breakage in revision traversal with pathspec

2013-09-11 Thread Kevin Bracey

On 11/09/2013 21:24, Jonathan Nieder wrote:

Kevin Bracey wrote:


On reflection I'm not sure what we should for the simple history
view of v1.8.3.1..v1.8.4. We're not rewriting parents, so we don't
get a chance to reconsider the merge as being zero-parent, and we do
have this little section of graph to traverse at the bottom:

   1.8.3
 oxxxx---x--- (x = included, o = excluded, 
*=!treesame)
 /
/*
   o--x--x--x--x

[...]

1) if identical to any on-graph parent, follow that one, and rewrite
the merge as a non-merge. We currently do not follow to an identical
off-graph parent. This long-standing comment in try_to_simplify_commit
applies: Even if a merge with an uninteresting side branch brought
the entire change we are interested in, we do not want to lose the
other branches of this merge, so we just keep going.


[...]

I currently feel instinctively more disposed to dropping the older
don't follow off-graph identical parents rule. Let the default
history go straight to v1.8.3 even though it goes off the graph,
stopping us traversing the topic branch.

Thanks for this analysis.  Interesting.

The rule (1) comes from v1.3.0-rc1~13^2~6: ...

I think you're right that dropping the don't follow off-graph
treesame parents rule would be a sensible change.  The usual point of
the follow the treesame parent rule is to avoid drawing undue
attention to merges of ancient history where some of the parents are
side-branches with an old version of the files being tracked and did
not actually change those files.  That rationale applies just as much
for a merge on top of an UNINTERESTING rev as any other merge.
I agree about the rationale still applying - why not follow off-graph, 
unless you're doing --ancestry-path? (Fortunately ancestry_path already 
disables simplify_history). That makes more sense if you try to ignore 
the misleading comment. In a typical v1..v3 range, the temporal 
limiting means that it's paths to the mainline that will tend to be 
marked UNINTERESTING, not to the topic branches...


But I can imagine going off graph it may previously have tripped up 
other parts of the code. It could be that this Git 1.3.0 rule ended up 
covering over some of the older merge hiding logic flakiness. Maybe it's 
no longer necessary. I'll do some experiments.


Now, one bit of news - I have just figured out why gitk is behaving 
differently. It transforms .. before it reaches git.


To see the effect at the command line: git log v1.8.3..v.1.8.4 hides 
the merge, but git log ^v1.8.3 v1.8.4 shows it. Whoops. A new example 
of a dotty shorthand not being exactly equivalent.


In the .. case the v1.8.3 tag gets peeled before being sent to 
add_rev_cmdline , and the mark bottom commits logic works. But in the 
^ case, the v1.8.3 doesn't get peeled. Junio - any thoughts on the 
correct place to fix that? (And gitk actually does ^tag-sha, just to 
be odd, so that needs to be handled too). Should these things be peeled 
in revs-cmdline or not? We should be consistent.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] am: replace uses of --resolved with --continue

2013-06-26 Thread Kevin Bracey
git am was previously modified to provide --continue for consistency
with rebase, merge etc, and the documentation changed to showing
--continue as the primary form.

Complete the work by replacing remaining uses of --resolved by
--continue, most notably in suggested command reminders.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/git-am.txt  | 4 ++--
 Documentation/user-manual.txt | 2 +-
 git-am.sh | 8 
 t/t7512-status-help.sh| 4 ++--
 wt-status.c   | 2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt
index 5bbe7b6..54d8461 100644
--- a/Documentation/git-am.txt
+++ b/Documentation/git-am.txt
@@ -132,7 +132,7 @@ default.   You can use `--no-utf8` to override this.
 --resolvemsg=msg::
When a patch failure occurs, msg will be printed
to the screen before exiting.  This overrides the
-   standard message informing you to use `--resolved`
+   standard message informing you to use `--continue`
or `--skip` to handle the failure.  This is solely
for internal use between 'git rebase' and 'git am'.
 
@@ -176,7 +176,7 @@ aborts in the middle.  You can recover from this in one of 
two ways:
 
 . hand resolve the conflict in the working directory, and update
   the index file to bring it into a state that the patch should
-  have produced.  Then run the command with the '--resolved' option.
+  have produced.  Then run the command with the '--continue' option.
 
 The command refuses to process new mailboxes until the current
 operation is finished, so if you decide to start over from scratch,
diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index e831cc2..8218cf9 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -1835,7 +1835,7 @@ Once the index is updated with the results of the conflict
 resolution, instead of creating a new commit, just run
 
 -
-$ git am --resolved
+$ git am --continue
 -
 
 and Git will create the commit for you and continue applying the
diff --git a/git-am.sh b/git-am.sh
index 9f44509..7ea40fe 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -6,7 +6,7 @@ SUBDIRECTORY_OK=Yes
 OPTIONS_KEEPDASHDASH=
 OPTIONS_SPEC=\
 git am [options] [(mbox|Maildir)...]
-git am [options] (--resolved | --skip | --abort)
+git am [options] (--continue | --skip | --abort)
 --
 i,interactive   run interactively
 b,binary*   (historical option -- no-op)
@@ -102,7 +102,7 @@ stop_here_user_resolve () {
printf '%s\n' $resolvemsg
stop_here $1
 fi
-eval_gettextln When you have resolved this problem, run \\$cmdline 
--resolved\.
+eval_gettextln When you have resolved this problem, run \\$cmdline 
--continue\.
 If you prefer to skip this patch, run \\$cmdline --skip\ instead.
 To restore the original branch and stop patching, run \\$cmdline --abort\.
 
@@ -523,7 +523,7 @@ Use \git am --abort\ to remove it.)
esac
fi
 
-   # Make sure we are not given --skip, --resolved, nor --abort
+   # Make sure we are not given --skip, --continue, nor --abort
test $skip$resolved$abort =  ||
die $(gettext Resolve operation not in progress, we are not 
resuming.)
 
@@ -670,7 +670,7 @@ do
#  - patch is the patch body.
#
# When we are resuming, these files are either already prepared
-   # by the user, or the user can tell us to do so by --resolved flag.
+   # by the user, or the user can tell us to do so by --continue flag.
case $resume in
'')
if test -f $dotest/rebasing
diff --git a/t/t7512-status-help.sh b/t/t7512-status-help.sh
index 4f09bec..bd8aab0 100755
--- a/t/t7512-status-help.sh
+++ b/t/t7512-status-help.sh
@@ -510,7 +510,7 @@ test_expect_success 'status in an am session: file already 
exists' '
cat expected -\EOF 
# On branch am_already_exists
# You are in the middle of an am session.
-   #   (fix conflicts and then run git am --resolved)
+   #   (fix conflicts and then run git am --continue)
#   (use git am --skip to skip this patch)
#   (use git am --abort to restore the original branch)
#
@@ -532,7 +532,7 @@ test_expect_success 'status in an am session: file does not 
exist' '
cat expected -\EOF 
# On branch am_not_exists
# You are in the middle of an am session.
-   #   (fix conflicts and then run git am --resolved)
+   #   (fix conflicts and then run git am --continue)
#   (use git am --skip to skip this patch)
#   (use git am --abort to restore the original branch)
#
diff --git a/wt-status.c b/wt-status.c
index 438a40d..b191c65 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -826,7 +826,7 @@ static void show_am_in_progress(struct

[PATCH] Documentation: Move git diff blob blob

2013-06-18 Thread Kevin Bracey
The section describing git diff blob blob had been placed in a
position that disrupted the statement This is synonymous to the
previous form.

Reorder to place this form after all the commit-using forms, and the
note applying to them. Also mention this form in the initial description
paragraph.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/git-diff.txt | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index a7b4620..78d6d50 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -18,8 +18,8 @@ SYNOPSIS
 DESCRIPTION
 ---
 Show changes between the working tree and the index or a tree, changes
-between the index and a tree, changes between two trees, or changes
-between two files on disk.
+between the index and a tree, changes between two trees, changes between
+two blob objects, or changes between two files on disk.
 
 'git diff' [--options] [--] [path...]::
 
@@ -56,11 +56,6 @@ directories. This behavior can be forced by --no-index.
This is to view the changes between two arbitrary
commit.
 
-'git diff' [options] blob blob::
-
-   This form is to view the differences between the raw
-   contents of two blob objects.
-
 'git diff' [--options] commit..commit [--] [path...]::
 
This is synonymous to the previous form.  If commit on
@@ -87,6 +82,11 @@ and the range notations (commit..commit and
 commit\...commit) do not mean a range as defined in the
 SPECIFYING RANGES section in linkgit:gitrevisions[7].
 
+'git diff' [options] blob blob::
+
+   This form is to view the differences between the raw
+   contents of two blob objects.
+
 OPTIONS
 ---
 :git-diff: 1
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] sha1_name: fix error message for @{N}, @{date}

2013-05-21 Thread Kevin Bracey

On 21/05/2013 19:52, Junio C Hamano wrote:

Ramkumar Ramachandra artag...@gmail.com writes:


The empty string '' looks ugly and inconsistent with the output of
branch@{N}.  Replace it with the string 'current branch'.

Wouldn't that be '*the* current branch'?

More importantly, doesn't real_ref have the name of the branch?

Suppose the user said git show @{1} instead of git show
master@{1} while on 'master'.

It could be argued that it may look nicer to say your current
branch does not have enough update history instead of saying
master does not... (i.e. different input to ask for the same
thing, different output depending on the way the user asked).  It
also could be argued that they should produce the same diagnosis
that is more informative.

I am slightly leaning toward the latter.
That would also avoid the complaint I was about to make that putting 
'current branch' in scare quotes would be annoying.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] New kind of upstream branch: base branch

2013-05-18 Thread Kevin Bracey

On 17/05/2013 22:51, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


On 15/05/2013 23:34, Felipe Contreras wrote:

   I think I'm using 'upstream' for something it was not intended to,
and
I think the current 'upstream' behavior should be split into
'upstream' and 'base'.


I found myself thinking the same thing. It's really convenient being
able to set your topic branch's upstream to another local branch, so

What is that another local branch? ...  And if that is your workflow, setting
push.default to current (and setting remote.pushdefault to your
publishing repository) should be a sufficient interim solution, and
you do not need to set branch.$name.push to each and every branch
you intend to push out, I think.


I agree that using push.default current covers some cases - I hadn't 
really considered it - tended to just stick with upstream. current 
nearly does the job, but I will sometimes be wanting different names.


What I'll often be doing is creating a topic branch based on master or 
origin/master. (I would hardly ever be updating master or pushing to 
origin/master myself, so I probably should be just doing origin/master, 
but I tend to create a local master just to save typing on all those 
git rebase origin/master).


During work, to give others visibility, and the possibility to tinker 
with the topic branch during development (as we don't have full 
inter-site sharing of work areas), I would push the topic branch up to 
the central origin server, often with a kbracey/ prefix, partially 
for namespacing, and partially to indicate it's currently private work 
and subject to rebasing.  I guess I could create the topic branch as 
kbracey/topic locally, but I'd rather not have to.


So I'd like git rebase (-i) to move my topic branch up 
(origin/)master. And I'd like git push (-f) to send it to 
origin/kbracey/topic. And by extension, I suppose git pull --rebase 
to update origin/master and rebase. (Although I'm not much of a puller - 
I tend to fetch then rebase manually).


The final releasing procedure for the topic branch would be to hand that 
branch over to an integrator, who would then merge/rebase it into master.


And it would be ideal if the initial base and push tracking information 
could be set up automatically on the first git checkout -b/git 
branch and git push. (For one, I've always found it odd that there's 
an asymmetry - if you check out a topic branch from the server to work 
on or use it, you get a local copy with upstream set by default. But if 
you create a topic branch yourself then push it, the upstream isn't set 
by default - you need the -u flag. This seems odd to me, and I've seen 
others confused by this).


Kevin


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] New kind of upstream branch: base branch

2013-05-17 Thread Kevin Bracey

On 15/05/2013 23:34, Felipe Contreras wrote:
  
I think I'm using 'upstream' for something it was not intended to, and

I think the current 'upstream' behavior should be split into
'upstream' and 'base'.

I found myself thinking the same thing. It's really convenient being 
able to set your topic branch's upstream to another local branch, so git 
rebase works without parameters. But then I can't use upstream to point 
to a remote version of that topic branch. I want my topic branch to know 
both that it's based on master (or origin/master), and that it's 
upstream is origin/topic.


So, yes, here's a vote in favour of the general concept.

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 01/15] decorate.c: compact table when growing

2013-05-16 Thread Kevin Bracey
When growing the table, take the opportunity to compact it by removing
entries with NULL decoration.

Users may have removed decorations by passing NULL to
insert_decoration. An object's table entry can't actually be removed
during normal operation, as it would break the linear hash collision
search. But we can remove NULL decoration entries when rebuilding the
table.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 decorate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/decorate.c b/decorate.c
index 2f8a63e..7cb5d29 100644
--- a/decorate.c
+++ b/decorate.c
@@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n)
const struct object *base = old_hash[i].base;
void *decoration = old_hash[i].decoration;
 
-   if (!base)
+   if (!decoration)
continue;
insert_decoration(n, base, decoration);
}
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 09/15] t6012: update test for tweaked full-history traversal

2013-05-16 Thread Kevin Bracey
From: Junio C Hamano gits...@pobox.com

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 t/t6012-rev-list-simplify.sh | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index dd6dc84..4e55872 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -14,21 +14,24 @@ unnote () {
 
 test_expect_success setup '
echo Hi there file 
-   git add file 
-   test_tick  git commit -m Initial file 
+   echo initial lost 
+   git add file lost 
+   test_tick  git commit -m Initial file and lost 
note A 
 
git branch other-branch 
 
echo Hello file 
-   git add file 
-   test_tick  git commit -m Modified file 
+   echo second lost 
+   git add file lost 
+   test_tick  git commit -m Modified file and lost 
note B 
 
git checkout other-branch 
 
echo Hello file 
-   git add file 
+   lost 
+   git add file lost 
test_tick  git commit -m Modified the file identically 
note C 
 
@@ -37,7 +40,9 @@ test_expect_success setup '
test_tick  git commit -m Add another file 
note D 
 
-   test_tick  git merge -m merge master 
+   test_tick 
+   test_must_fail git merge -m merge master 
+   lost  git commit -a -m merge 
note E 
 
echo Yet another elif 
@@ -110,4 +115,16 @@ check_result 'I B A' -- file
 check_result 'I B A' --topo-order -- file
 check_result 'H' --first-parent -- another-file
 
+check_result 'E C B A' --full-history E -- lost
+test_expect_success 'full history simplification without parent' '
+   printf %s\n E C B A expect 
+   git log --pretty=$FMT --full-history E -- lost |
+   unnote actual 
+   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   test_cmp expect check || {
+   cat actual
+   false
+   }
+'
+
 test_done
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 11/15] simplify-merges: drop merge from irrelevant side branch

2013-05-16 Thread Kevin Bracey
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.

Original log message follows.

The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially.  Namely, in a history with this
shape:

---o---o---M
  /
 x---x---x

where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.

That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.

Suggested-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 34 +-
 revision.c | 26 +-
 t/t6012-rev-list-simplify.sh   |  2 +-
 3 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index f41e865..b462f17 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -342,13 +342,13 @@ In the following, we will always refer to the same 
example history to
 illustrate the differences between simplification settings.  We assume
 that you are filtering for a file `foo` in this commit graph:
 ---
- .-A---M---N---O---P
-/ /   /   /   /
-   I B   C   D   E
-\   /   /   /   /
- `-'
+ .-A---M---N---O---P---Q
+/ /   /   /   /   /
+   I B   C   D   E   Y
+\   /   /   /   /   /
+ `-'   X
 ---
-The horizontal line of history A---P is taken to be the first parent of
+The horizontal line of history A---Q is taken to be the first parent of
 each merge.  The commits are:
 
 * `I` is the initial commit, in which `foo` exists with contents
@@ -369,6 +369,10 @@ each merge.  The commits are:
 * `E` changes `quux` to xyzzy, and its merge `P` combines the
   strings to quux xyzzy.  `P` is TREESAME to `O`, but not to `E`.
 
+* `X` is an indpendent root commit that added a new file `side`, and `Y`
+  modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
+  `Q` is TREESAME to `P`, but not to `Y`.
+
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
 (via '\--parents' or '\--children') are used.  The following settings
@@ -409,7 +413,7 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O  P
+   I  A  B  N  D  O  P  Q
 ---
 +
 `M` was excluded because it is TREESAME to both parents.  `E`,
@@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included
 themselves.  This results in
 +
 ---
- .-A---M---N---O---P
+ .-A---M---N---O---P---Q
 / /   /   /   /
I B   /   D   /
 \   /   /   /   /
@@ -440,7 +444,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.
+`N`, and `X`, `Y` and `Q`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
@@ -470,9 +474,9 @@ history according to the following rules:
 * Set `C'` to `C`.
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
-  the process, drop parents that are ancestors of other parents, and
-  remove duplicates, but take care to never drop all parents that
-  we are TREESAME to.
+  the process, drop parents that are ancestors of other parents or that are
+  root commits TREESAME to an empty tree, and remove duplicates, but take care
+  to never drop all parents that we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
@@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing

[PATCH v4 08/15] revision.c: Make --full-history consider more merges

2013-05-16 Thread Kevin Bracey
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.

While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by -s
ours, then:

  git log -m -p --full-history -Schange file

would successfully locate change's addition but would not locate the
merge that resolved against it.

Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.

Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.

Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.

This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |   6 +-
 revision.c | 241 -
 revision.h |   1 +
 t/t6019-rev-list-ancestry-path.sh  |   2 +-
 t/t6111-rev-list-treesame.sh   |  26 ++--
 5 files changed, 229 insertions(+), 47 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 55ddf33..d166384 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -409,10 +409,10 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O
+   I  A  B  N  D  O  P
 ---
 +
-`P` and `M` were excluded because they are TREESAME to a parent.  `E`,
+`M` was excluded because it is TREESAME to both parents.  `E`,
 `C` and `B` were all walked, but only `B` was !TREESAME, so the others
 do not appear.
 +
@@ -440,7 +440,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.  Note also that `P` was included despite being TREESAME.
+`N`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
diff --git a/revision.c b/revision.c
index 7f7a8ab..64b86ae 100644
--- a/revision.c
+++ b/revision.c
@@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, 
struct commit *commit)
return retval = 0  (tree_difference == REV_TREE_SAME);
 }
 
+struct treesame_state {
+   unsigned int nparents;
+   unsigned char treesame[FLEX_ARRAY];
+};
+
+static struct treesame_state *initialise_treesame(struct rev_info *revs, 
struct commit *commit)
+{
+   unsigned n = commit_list_count(commit-parents);
+   struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+   st-nparents = n;
+   add_decoration(revs-treesame, commit-object, st);
+   return st;
+}
+
+/*
+ * Must be called immediately after removing the nth_parent from a commit's
+ * parent list, if we are maintaining the per-parent treesame[] decoration.
+ * This does not recalculate the master TREESAME flag - update_treesame()
+ * should be called to update it after a sequence of treesame[] modifications
+ * that may have affected it.
+ */
+static int compact_treesame(struct rev_info *revs, struct commit *commit, 
unsigned nth_parent)
+{
+   struct treesame_state *st;
+   int old_same;
+
+   if (!commit-parents) {
+   /*
+* Have just removed the only parent from a non-merge.
+* Different handling, as we lack decoration.
+*/
+   if (nth_parent != 0)
+   die(compact_treesame %u, nth_parent);
+   old_same = !!(commit-object.flags  TREESAME);
+   if (rev_same_tree_as_empty(revs, commit))
+   commit-object.flags |= TREESAME;
+   else
+   commit-object.flags = ~TREESAME;
+   return old_same;
+   }
+
+   st = lookup_decoration(revs-treesame, commit-object);
+   if (!st || nth_parent = st-nparents)
+   die(compact_treesame %u

[PATCH v4 14/15] revision.c: don't show all merges for --parents

2013-05-16 Thread Kevin Bracey
When using --parents or --children, get_commit_action() previously showed
all merges, even if TREESAME to both parents.

This was intended to tie together the topology of the rewritten parents,
but it was excessive - in fact we only need to show merges that have two
or more relevant parents. Merges at the boundary do not necessarily need
to be shown.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c   | 22 +++---
 t/t6111-rev-list-treesame.sh |  4 ++--
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/revision.c b/revision.c
index 1c75070..edb7e1c 100644
--- a/revision.c
+++ b/revision.c
@@ -2760,10 +2760,7 @@ enum commit_action get_commit_action(struct rev_info 
*revs, struct commit *commi
if (revs-min_age != -1  (commit-date  revs-min_age))
return commit_ignore;
if (revs-min_parents || (revs-max_parents = 0)) {
-   int n = 0;
-   struct commit_list *p;
-   for (p = commit-parents; p; p = p-next)
-   n++;
+   int n = commit_list_count(commit-parents);
if ((n  revs-min_parents) ||
((revs-max_parents = 0)  (n  revs-max_parents)))
return commit_ignore;
@@ -2773,12 +2770,23 @@ enum commit_action get_commit_action(struct rev_info 
*revs, struct commit *commi
if (revs-prune  revs-dense) {
/* Commit without changes? */
if (commit-object.flags  TREESAME) {
+   int n;
+   struct commit_list *p;
/* drop merges unless we want parenthood */
if (!want_ancestry(revs))
return commit_ignore;
-   /* non-merge - always ignore it */
-   if (!commit-parents || !commit-parents-next)
-   return commit_ignore;
+   /*
+* If we want ancestry, then need to keep any merges
+* between relevant commits to tie together topology.
+* For consistency with TREESAME and simplification
+* use relevant here rather than just INTERESTING,
+* to treat bottom commit(s) as part of the topology.
+*/
+   for (n = 0, p = commit-parents; p; p = p-next)
+   if (relevant_commit(p-item))
+   if (++n = 2)
+   return commit_show;
+   return commit_ignore;
}
}
return commit_show;
diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index e32b373..25cc8ad 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -139,7 +139,7 @@ check_result 'M L G' F..M --first-parent -- file
 # If we want history since E, then we're quite happy to ignore G that took E.
 check_result 'M L K J I H G' E..M --ancestry-path
 check_result 'M L J I H' E..M --ancestry-path -- file
-check_outcome failure '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path 
--parents -- file # includes G
+check_result '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path --parents 
-- file
 check_result '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path 
--simplify-merges -- file
 
 # Should still be able to ignore I-J branch in simple log, despite limiting
@@ -168,7 +168,7 @@ check_result '(D)F (BA)D' B..F --full-history --parents -- 
file
 check_result '(B)F' B..F --simplify-merges -- file
 check_result 'F D' B..F --ancestry-path
 check_result 'F' B..F --ancestry-path -- file
-check_outcome failure 'F' B..F --ancestry-path --parents -- file # includes D
+check_result 'F' B..F --ancestry-path --parents -- file
 check_result 'F' B..F --ancestry-path --simplify-merges -- file
 check_result 'F D' B..F --first-parent
 check_result 'F' B..F --first-parent -- file
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 12/15] revision.c: add BOTTOM flag for commits

2013-05-16 Thread Kevin Bracey
When performing edge-based operations on the revision graph, it can be
useful to be able to identify the INTERESTING graph's connection(s) to
the bottom commit(s) specified by the user.

Conceptually when the user specifies A..B (== B ^A), they are asking
for the history from A to B. The first connection from A onto the
INTERESTING graph is part of that history, and should be considered. If
we consider only INTERESTING nodes and their connections, then we're
really only considering the history from A's immediate descendants to B.

This patch does not change behaviour, but adds a new BOTTOM flag to
indicate the bottom commits specified by the user, ready to be used by
following patches.

We immediately use the BOTTOM flag to return collect_bottom_commits() to
its original approach of examining the pending commit list rather than
the command line. This will ensure alignment of the definition of
bottom with future patches.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c | 34 --
 revision.h |  3 ++-
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/revision.c b/revision.c
index 4f7446c..6607dab 100644
--- a/revision.c
+++ b/revision.c
@@ -909,16 +909,12 @@ static void limit_to_ancestry(struct commit_list *bottom, 
struct commit_list *li
  * to filter the result of A..B further to the ones that can actually
  * reach A.
  */
-static struct commit_list *collect_bottom_commits(struct rev_info *revs)
+static struct commit_list *collect_bottom_commits(struct commit_list *list)
 {
-   struct commit_list *bottom = NULL;
-   int i;
-   for (i = 0; i  revs-cmdline.nr; i++) {
-   struct rev_cmdline_entry *elem = revs-cmdline.rev[i];
-   if ((elem-flags  UNINTERESTING) 
-   elem-item-type == OBJ_COMMIT)
-   commit_list_insert((struct commit *)elem-item, 
bottom);
-   }
+   struct commit_list *elem, *bottom = NULL;
+   for (elem = list; elem; elem = elem-next)
+   if (elem-item-object.flags  BOTTOM)
+   commit_list_insert(elem-item, bottom);
return bottom;
 }
 
@@ -949,7 +945,7 @@ static int limit_list(struct rev_info *revs)
struct commit_list *bottom = NULL;
 
if (revs-ancestry_path) {
-   bottom = collect_bottom_commits(revs);
+   bottom = collect_bottom_commits(list);
if (!bottom)
die(--ancestry-path given but there are no bottom 
commits);
}
@@ -1121,7 +1117,7 @@ static int add_parents_only(struct rev_info *revs, const 
char *arg_, int flags)
const char *arg = arg_;
 
if (*arg == '^') {
-   flags ^= UNINTERESTING;
+   flags ^= UNINTERESTING | BOTTOM;
arg++;
}
if (get_sha1_committish(arg, sha1))
@@ -1213,8 +1209,8 @@ static void prepare_show_merge(struct rev_info *revs)
add_pending_object(revs, head-object, HEAD);
add_pending_object(revs, other-object, MERGE_HEAD);
bases = get_merge_bases(head, other, 1);
-   add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING);
-   add_pending_commit_list(revs, bases, UNINTERESTING);
+   add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | 
BOTTOM);
+   add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM);
free_commit_list(bases);
head-object.flags |= SYMMETRIC_LEFT;
 
@@ -1250,13 +1246,15 @@ int handle_revision_arg(const char *arg_, struct 
rev_info *revs, int flags, unsi
int cant_be_filename = revarg_opt  REVARG_CANNOT_BE_FILENAME;
unsigned get_sha1_flags = 0;
 
+   flags = flags  UNINTERESTING ? flags | BOTTOM : flags  ~BOTTOM;
+
dotdot = strstr(arg, ..);
if (dotdot) {
unsigned char from_sha1[20];
const char *next = dotdot + 2;
const char *this = arg;
int symmetric = *next == '.';
-   unsigned int flags_exclude = flags ^ UNINTERESTING;
+   unsigned int flags_exclude = flags ^ (UNINTERESTING | BOTTOM);
static const char head_by_default[] = HEAD;
unsigned int a_flags;
 
@@ -1332,13 +1330,13 @@ int handle_revision_arg(const char *arg_, struct 
rev_info *revs, int flags, unsi
dotdot = strstr(arg, ^!);
if (dotdot  !dotdot[2]) {
*dotdot = 0;
-   if (!add_parents_only(revs, arg, flags ^ UNINTERESTING))
+   if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | 
BOTTOM)))
*dotdot = '^';
}
 
local_flags = 0;
if (*arg == '^') {
-   local_flags = UNINTERESTING;
+   local_flags = UNINTERESTING | BOTTOM;
arg++;
}
 
@@ -1815,7 +1813,7 @@ static int handle_revision_pseudo_opt(const char 
*submodule,
handle_refs(submodule, revs, *flags

[PATCH v4 15/15] revision.c: make default history consider bottom commits

2013-05-16 Thread Kevin Bracey
Previously, the default history treated bottom commits the same as any
other UNINTERESTING commit, which could force it down side branches.

Consider the following history:

   *A--*B---D--*F * marks !TREESAME parent paths
 \ /*
  `-C-'

When requesting B..F, B is UNINTERESTING but TREESAME to D. C is
!UNINTERESTING.

So default following would go from D into the irrelevant side branch C
to A, rather than to B.  Note also that if there had been an extra
!UNINTERESTING commit B1 between B and D, it wouldn't have gone down C.

Change the default following to test relevant_commit() instead of
!UNINTERESTING, so it can proceed straight from D to B, thus finishing
the traversal of that path.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c   |  2 +-
 t/t6111-rev-list-treesame.sh | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/revision.c b/revision.c
index edb7e1c..914ac78 100644
--- a/revision.c
+++ b/revision.c
@@ -684,7 +684,7 @@ static void try_to_simplify_commit(struct rev_info *revs, 
struct commit *commit)
sha1_to_hex(p-object.sha1));
switch (rev_compare_tree(revs, p, commit)) {
case REV_TREE_SAME:
-   if (!revs-simplify_history || (p-object.flags  
UNINTERESTING)) {
+   if (!revs-simplify_history || !relevant_commit(p)) {
/* Even if a merge with an uninteresting
 * side branch brought the entire change
 * we are interested in, we do not want
diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index 25cc8ad..88b84df 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -146,8 +146,8 @@ check_result '(LH)M (E)H (J)L (I)J (E)I' E..M 
--ancestry-path --simplify-merges
 # to G.
 check_result 'M L K J I H' G..M
 check_result 'M H L K J I' G..M --topo-order
-check_outcome failure 'M L H' G..M -- file # includes J I
-check_outcome failure '(LH)M (G)L (G)H' G..M --parents -- file # includes J I
+check_result 'M L H' G..M -- file
+check_result '(LH)M (G)L (G)H' G..M --parents -- file
 check_result 'M L J I H' G..M --full-history -- file
 check_result 'M L K J I H' G..M --full-history --parents -- file
 check_result 'M H L J I' G..M --simplify-merges -- file
@@ -161,8 +161,8 @@ check_result 'M H L J I' G..M --ancestry-path 
--simplify-merges -- file
 # But --full-history shouldn't drop D on its own - without simplification,
 # we can't decide if the merge from INTERESTING commit C was sensible.
 check_result 'F D C' B..F
-check_outcome failure 'F' B..F -- file # includes D
-check_outcome failure '(B)F' B..F --parents -- file # includes D
+check_result 'F' B..F -- file
+check_result '(B)F' B..F --parents -- file
 check_result 'F D' B..F --full-history -- file
 check_result '(D)F (BA)D' B..F --full-history --parents -- file
 check_result '(B)F' B..F --simplify-merges -- file
@@ -174,8 +174,8 @@ check_result 'F D' B..F --first-parent
 check_result 'F' B..F --first-parent -- file
 
 # E...F should be equivalent to E F ^B, and be able to drop D as above.
-check_outcome failure 'F' E F ^B -- file # includes D
-check_outcome failure 'F' E...F -- file # includes D
+check_result 'F' E F ^B -- file # includes D
+check_result 'F' E...F -- file # includes D
 
 # Any sort of full history of C..F should show D, as it's the connection to C,
 # and it differs from it.
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 03/15] t6111: new TREESAME test set

2013-05-16 Thread Kevin Bracey
Some side branching and odd merging to illustrate various flaws in
revision list scans, particularly when limiting the list.

Many expected failures, which will be gone by the end of the history
traversal refinements series.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6111-rev-list-treesame.sh | 184 +++
 1 file changed, 184 insertions(+)
 create mode 100755 t/t6111-rev-list-treesame.sh

diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
new file mode 100755
index 000..b2bca77
--- /dev/null
+++ b/t/t6111-rev-list-treesame.sh
@@ -0,0 +1,184 @@
+#!/bin/sh
+#
+#,---E--.   *H--. * marks !TREESAME parent paths
+#   /\ / \*
+# *A--*B---D--*F-*G-K-*L-*M
+#   \ /*   \   /
+#`-C-'  `-*I-*J
+#
+# A creates file, B and F change it.
+# Odd merge G takes the old version from B.
+# I changes it, but J reverts it, so K is TREESAME to both parents.
+# H and L both change file, and M merges those changes.
+
+test_description='TREESAME and limiting'
+
+. ./test-lib.sh
+
+note () {
+   git tag $1
+}
+
+unnote () {
+   git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g
+}
+
+test_expect_success setup '
+   test_commit Initial file file Hi there A 
+   git branch other-branch 
+
+   test_commit file=Hello file Hello B 
+   git branch third-branch 
+
+   git checkout other-branch 
+   test_commit Added other other Hello C 
+
+   git checkout master 
+   test_merge D other-branch 
+
+   git checkout third-branch 
+   test_commit Third file third Nothing E 
+
+   git checkout master 
+   test_commit file=Blah file Blah F 
+
+   test_tick  git merge --no-commit third-branch 
+   git checkout third-branch file 
+   git commit 
+   note G 
+   git branch fiddler-branch 
+
+   git checkout -b part2-branch 
+   test_commit file=Part 2 file Part 2 H 
+
+   git checkout fiddler-branch 
+   test_commit Bad commit file Silly I 
+
+   test_tick  git revert I  note J 
+
+   git checkout master 
+   test_tick  git merge --no-ff fiddler-branch 
+   note K
+
+   test_commit file=Part 1 file Part 1 L 
+
+   test_tick  test_must_fail git merge part2-branch 
+   test_commit M file Parts 1+2
+'
+
+FMT='tformat:%P%H | %s'
+
+# could we soup this up to optionally check parents? So (BA)C would check
+# that C is shown and has parents B A.
+check_outcome () {
+   outcome=$1
+   shift
+   for c in $1
+   do
+   echo $c
+   done expect 
+   shift 
+   param=$* 
+   test_expect_$outcome log $param '
+   git log --format=$FMT $param |
+   unnote actual 
+   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   test_cmp expect check || {
+   cat actual
+   false
+   }
+   '
+}
+
+check_result () {
+   check_outcome success $@
+}
+
+# Odd merge G drops a change in F. Important that G is listed in all
+# except the most basic list. Achieving this means normal merge D will also be
+# shown in normal full-history, as we can't distinguish unless we do a
+# simplification pass. After simplification, D is dropped but G remains.
+check_result 'M L K J I H G F E D C B A'
+check_result 'M H L K J I G E F D C B A' --topo-order
+check_result 'M L H B A' -- file
+check_result 'M L H B A' --parents -- file
+check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G
+check_result 'M L K J I H G F D B A' --full-history --parents -- file
+check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G
+check_result 'M L K G F D B A' --first-parent
+check_result 'M L G F B A' --first-parent -- file
+
+# Check that odd merge G remains shown when F is the bottom.
+check_result 'M L K J I H G E' F..M
+check_result 'M H L K J I G E' F..M --topo-order
+check_result 'M L H' F..M -- file
+check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so 
more useful than it may seem
+check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G
+check_result 'M L K J I H G' F..M --full-history --parents -- file
+check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G
+check_result 'M L K J I H G' F..M --ancestry-path
+check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G
+check_result 'M L K J I H G' F..M --ancestry-path --parents -- file
+check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file
+check_result 'M L K G' F..M --first-parent
+check_result 'M L G' F..M --first-parent -- file
+
+# Note that G is pruned when E is the bottom, even if it's the same commit list
+# If we want history since E, then we're quite happy to ignore G that took E.
+check_result 'M L K J I H G' E..M --ancestry-path
+check_result 'M L J I H

[PATCH v4 13/15] revision.c: discount side branches when computing TREESAME

2013-05-16 Thread Kevin Bracey
Use the BOTTOM flag to define relevance for pruning. Relevant commits
are those that are !UNINTERESTING or BOTTOM, and this allows us to
identify irrelevant side branches (UNINTERESTING  !BOTTOM).

If a merge has relevant parents, and it is TREESAME to them, then do not
let irrelevant parents cause the merge to be treated as !TREESAME.

When considering simplification, don't always include all merges -
merges with exactly one relevant parent can be simplified, if TREESAME
according to the above rule.

These two changes greatly increase simplification in limited, pruned
revision lists.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c| 171 +-
 t/t6019-rev-list-ancestry-path.sh |  12 ++-
 t/t6111-rev-list-treesame.sh  |   8 +-
 3 files changed, 164 insertions(+), 27 deletions(-)

diff --git a/revision.c b/revision.c
index 6607dab..1c75070 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,80 @@ static int everybody_uninteresting(struct commit_list 
*orig)
 }
 
 /*
+ * A definition of relevant commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A relevant commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from irrelevant parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only relevant parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Treating bottom commits as relevant ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as a side branch, but
+ * treated as part of the graph. For example:
+ *
+ *   Z...A---X---o---o---B
+ *. /
+ * W---Y
+ *
+ * When computing A..B, the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path A..B, the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int relevant_commit(struct commit *commit)
+{
+   return (commit-object.flags  (UNINTERESTING | BOTTOM)) != 
UNINTERESTING;
+}
+
+/*
+ * Return a single relevant commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ */
+static struct commit *one_relevant_parent(const struct rev_info *revs,
+ struct commit_list *orig)
+{
+   struct commit_list *list = orig;
+   struct commit *relevant = NULL;
+
+   if (!orig)
+   return NULL;
+
+   /*
+* For 1-parent commits, or if first-parent-only, then return that
+* first parent (even if not relevant by the above definition).
+* TREESAME will have been set purely on that parent.
+*/
+   if (revs-first_parent_only || !orig-next)
+   return orig-item;
+
+   /*
+* For multi-parent commits, identify a sole relevant parent, if any.
+* If we have only one relevant parent, then TREESAME will be set purely
+* with regard to that parent, and we can simplify accordingly.
+*
+* If we have more than one relevant parent, or no relevant parents
+* (and multiple irrelevant ones), then we can't select a parent here
+* and return NULL.
+*/
+   while (list) {
+   struct commit *commit = list-item;
+   list = list-next;
+   if (relevant_commit(commit)) {
+   if (relevant)
+   return NULL;
+   relevant = commit;
+   }
+   }
+   return relevant;
+}
+
+/*
  * The goal is to get REV_TREE_NEW as the result only if the
  * diff consists of all '+' (and no other changes), REV_TREE_OLD
  * if the whole diff is removal of old data, and otherwise
@@ -502,27 +576,52 @@ static unsigned update_treesame(struct rev_info *revs, 
struct commit *commit)
if (commit-parents  commit-parents-next) {
unsigned n;
struct treesame_state *st;
+   struct commit_list *p;
+   unsigned relevant_parents;
+   unsigned relevant_change, irrelevant_change;
 
st = lookup_decoration(revs-treesame, commit-object);
if (!st)
die(update_treesame %s, 
sha1_to_hex(commit-object.sha1));
-   commit-object.flags |= TREESAME;
-   for (n = 0; n  st-nparents; n++) {
-   if (!st-treesame[n]) {
-   commit-object.flags = ~TREESAME;
-   break

[PATCH v4 10/15] simplify-merges: never remove all TREESAME parents

2013-05-16 Thread Kevin Bracey
When simplifying an odd merge, such as one that used -s ours, we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.

This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |  3 +-
 revision.c | 69 ++
 t/t6111-rev-list-treesame.sh   |  4 +--
 3 files changed, 73 insertions(+), 3 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index d166384..f41e865 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -471,7 +471,8 @@ history according to the following rules:
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
   the process, drop parents that are ancestors of other parents, and
-  remove duplicates.
+  remove duplicates, but take care to never drop all parents that
+  we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
diff --git a/revision.c b/revision.c
index 64b86ae..62f399c 100644
--- a/revision.c
+++ b/revision.c
@@ -2136,6 +2136,73 @@ static int mark_redundant_parents(struct rev_info *revs, 
struct commit *commit)
return marked;
 }
 
+/*
+ * Awkward naming - this means one parent we are TREESAME to.
+ * cf mark_treesame_root_parents: root parents that are TREESAME (to an
+ * empty tree). Better name suggestions?
+ */
+static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit 
*commit)
+{
+   struct treesame_state *ts = lookup_decoration(revs-treesame, 
commit-object);
+   struct commit *unmarked = NULL, *marked = NULL;
+   struct commit_list *p;
+   unsigned n;
+
+   for (p = commit-parents, n = 0; p; p = p-next, n++) {
+   if (ts-treesame[n]) {
+   if (p-item-object.flags  TMP_MARK) {
+   if (!marked)
+   marked = p-item;
+   } else {
+   if (!unmarked) {
+   unmarked = p-item;
+   break;
+   }
+   }
+   }
+   }
+
+   /*
+* If we are TREESAME to a marked-for-deletion parent, but not to any
+* unmarked parents, unmark the first TREESAME parent. This is the
+* parent that the default simplify_history==1 scan would have followed,
+* and it doesn't make sense to omit that path when asking for a
+* simplified full history. Retaining it improves the chances of
+* understanding odd missed merges that took an old version of a file.
+*
+* Example:
+*
+*   I*X   A modified the file, but mainline merge X used
+*\   /-s ours, so took the version from I. X is
+* `-*A--' TREESAME to I and !TREESAME to A.
+*
+* Default log from X would produce I. Without this check,
+* --full-history --simplify-merges would produce I-A-X, showing
+* the merge commit X and that it changed A, but not making clear that
+* it had just taken the I version. With this check, the topology above
+* is retained.
+*
+* Note that it is possible that the simplification chooses a different
+* TREESAME parent from the default, in which case this test doesn't
+* activate, and we _do_ drop the default parent. Example:
+*
+*   I--X A modified the file, but it was reverted in B,
+*\/  meaning mainline merge X is TREESAME to both
+**A-*B   parents.
+*
+* Default log would produce I by following the first parent;
+* --full-history --simplify-merges will produce I-A-B. But this is a
+* reasonable result - it presents a logical full history leading from
+* I to X, and X is not an important merge.
+*/
+   if (!unmarked  marked) {
+   marked-object.flags = ~TMP_MARK;
+   return 1;
+   }
+
+   return 0;
+}
+
 static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
 {
struct commit_list **pp, *p;
@@ -2239,6 +2306,8 @@ static struct commit_list **simplify_one(struct rev_info 
*revs, struct commit *c
if (1  cnt) {
int marked = mark_redundant_parents(revs, commit);
if (marked)
+   marked -= leave_one_treesame_to_parent(revs

[PATCH v4 06/15] rev-list-options.txt: correct TREESAME for P

2013-05-16 Thread Kevin Bracey
In the example given, P is not TREESAME to E. This doesn't affect the
current result, but it will matter when we change behaviour.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 3bdbf5e..50bbff7 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -367,8 +367,7 @@ each merge.  The commits are:
   `N` and `D` to foobarbaz; i.e., it is not TREESAME to any parent.
 
 * `E` changes `quux` to xyzzy, and its merge `P` combines the
-  strings to quux xyzzy.  Despite appearing interesting, `P` is
-  TREESAME to all parents.
+  strings to quux xyzzy.  `P` is TREESAME to `O`, but not to `E`.
 
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 07/15] Documentation: avoid uninteresting

2013-05-16 Thread Kevin Bracey
The documentation of --boundary uses the term uninteresting, which is
not used or defined anywhere else in the documentation. This is
unhelpful and confusing to anyone who hasn't seen the UNINTERESTING
flag in the source code.

Change to use excluded, as per revisions.txt.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 50bbff7..55ddf33 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -271,8 +271,8 @@ See also linkgit:git-reflog[1].
 
 --boundary::
 
-   Output uninteresting commits at the boundary, which are usually
-   not shown.
+   Output excluded boundary commits. Boundary commits are
+   prefixed with `-`.
 
 --
 
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 00/15] History traversal refinements

2013-05-16 Thread Kevin Bracey
No new functionality or bug fixes since v3, just tidying:

* Tests now use Junio's parent-checking functionality
* BOTTOM flags now set in a neater fashion (I think),
  separating it out from the cmdline stuff.
* Creation and use of BOTTOM flag now split into 4 separate
  commits - last version was too much for one commit, I feel.
* Finally decided that relevant is the word I was looking
  for. Obvious, really.
* On the subject of words, remove the only technical use
  of uninteresting that I found in the Documentation - I know
  that it always confused me until I read the source.

This sequence is based on my 2-commit ancestry-path ... series,
but no longer depends on it due to the new way the BOTTOM flag
is initialised. But they both touch the t6019 test, so
applying this on top will avoid conflicts.

Junio C Hamano (2):
  t6111: allow checking the parents as well
  t6012: update test for tweaked full-history traversal

Kevin Bracey (13):
  decorate.c: compact table when growing
  t6019: test file dropped in -s ours merge
  t6111: new TREESAME test set
  t6111: add parents to tests
  rev-list-options.txt: correct TREESAME for P
  Documentation: avoid uninteresting
  revision.c: Make --full-history consider more merges
  simplify-merges: never remove all TREESAME parents
  simplify-merges: drop merge from irrelevant side branch
  revision.c: add BOTTOM flag for commits
  revision.c: discount side branches when computing TREESAME
  revision.c: don't show all merges for --parents
  revision.c: make default history consider bottom commits

 Documentation/rev-list-options.txt |  42 +--
 decorate.c |   2 +-
 revision.c | 539 -
 revision.h |   4 +-
 t/t6012-rev-list-simplify.sh   |  31 ++-
 t/t6019-rev-list-ancestry-path.sh  |  27 ++
 t/t6111-rev-list-treesame.sh   | 196 ++
 7 files changed, 750 insertions(+), 91 deletions(-)
 create mode 100755 t/t6111-rev-list-treesame.sh

-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 04/15] t6111: allow checking the parents as well

2013-05-16 Thread Kevin Bracey
From: Junio C Hamano gits...@pobox.com

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 t/t6111-rev-list-treesame.sh | 30 +-
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index b2bca77..1e4a550 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -20,7 +20,7 @@ note () {
 }
 
 unnote () {
-   git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g
+   git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\))\([   
]\)|\1\2|g
 }
 
 test_expect_success setup '
@@ -66,23 +66,34 @@ test_expect_success setup '
test_commit M file Parts 1+2
 '
 
-FMT='tformat:%P%H | %s'
-
 # could we soup this up to optionally check parents? So (BA)C would check
 # that C is shown and has parents B A.
 check_outcome () {
outcome=$1
shift
-   for c in $1
-   do
-   echo $c
-   done expect 
-   shift 
+
+   case $1 in
+   *(*)
+   FMT=%P %H | %s
+   munge_actual=
+   s/^\([^ ]*\)\([^ ]*\) .*/(\1)\2/
+   s/ //g
+   s/()//
+   
+   ;;
+   *)
+   FMT=%H | %s
+   munge_actual=s/^\([^ ]*\) .*/\1/
+   ;;
+   esac 
+   printf %s\n $1 expect 
+   shift
+
param=$* 
test_expect_$outcome log $param '
git log --format=$FMT $param |
unnote actual 
-   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   sed -e $munge_actual actual check 
test_cmp expect check || {
cat actual
false
@@ -99,6 +110,7 @@ check_result () {
 # shown in normal full-history, as we can't distinguish unless we do a
 # simplification pass. After simplification, D is dropped but G remains.
 check_result 'M L K J I H G F E D C B A'
+check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G (D)F (B)E (BC)D (A)C (A)B 
A'
 check_result 'M H L K J I G E F D C B A' --topo-order
 check_result 'M L H B A' -- file
 check_result 'M L H B A' --parents -- file
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 05/15] t6111: add parents to tests

2013-05-16 Thread Kevin Bracey
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6111-rev-list-treesame.sh | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index 1e4a550..4d74d3c 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -66,8 +66,6 @@ test_expect_success setup '
test_commit M file Parts 1+2
 '
 
-# could we soup this up to optionally check parents? So (BA)C would check
-# that C is shown and has parents B A.
 check_outcome () {
outcome=$1
shift
@@ -109,14 +107,16 @@ check_result () {
 # except the most basic list. Achieving this means normal merge D will also be
 # shown in normal full-history, as we can't distinguish unless we do a
 # simplification pass. After simplification, D is dropped but G remains.
+# Also, merge simplification of G should not drop the parent B that the default
+# simple history follows.
 check_result 'M L K J I H G F E D C B A'
 check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G (D)F (B)E (BC)D (A)C (A)B 
A'
 check_result 'M H L K J I G E F D C B A' --topo-order
 check_result 'M L H B A' -- file
-check_result 'M L H B A' --parents -- file
+check_result '(LH)M (B)L (B)H (A)B A' --parents -- file
 check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G
-check_result 'M L K J I H G F D B A' --full-history --parents -- file
-check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G
+check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G (D)F (BA)D (A)B A' 
--full-history --parents -- file
+check_outcome failure '(LH)M (G)H (J)L (I)J (G)I (FB)G (B)F (A)B A' 
--simplify-merges -- file # drops G
 check_result 'M L K G F D B A' --first-parent
 check_result 'M L G F B A' --first-parent -- file
 
@@ -124,14 +124,14 @@ check_result 'M L G F B A' --first-parent -- file
 check_result 'M L K J I H G E' F..M
 check_result 'M H L K J I G E' F..M --topo-order
 check_result 'M L H' F..M -- file
-check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so 
more useful than it may seem
+check_result '(LH)M (B)L (B)H' --parents F..M -- file
 check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G
-check_result 'M L K J I H G' F..M --full-history --parents -- file
-check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G
+check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G' F..M --full-history 
--parents -- file
+check_outcome failure '(LH)M (G)H (J)L (I)J (G)I (FB)G' F..M --simplify-merges 
-- file # drops G
 check_result 'M L K J I H G' F..M --ancestry-path
 check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G
-check_result 'M L K J I H G' F..M --ancestry-path --parents -- file
-check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file
+check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G' F..M --ancestry-path 
--parents -- file
+check_result '(LH)M (G)H (J)L (I)J (G)I (FE)G' F..M --ancestry-path 
--simplify-merges -- file
 check_result 'M L K G' F..M --first-parent
 check_result 'M L G' F..M --first-parent -- file
 
@@ -139,15 +139,15 @@ check_result 'M L G' F..M --first-parent -- file
 # If we want history since E, then we're quite happy to ignore G that took E.
 check_result 'M L K J I H G' E..M --ancestry-path
 check_result 'M L J I H' E..M --ancestry-path -- file
-check_outcome failure 'M L K J I H' E..M --ancestry-path --parents -- file # 
includes G
-check_outcome failure 'M H L J I' E..M --ancestry-path --simplify-merges -- 
file # includes G
+check_outcome failure '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path 
--parents -- file # includes G
+check_outcome failure '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path 
--simplify-merges -- file # includes G
 
 # Should still be able to ignore I-J branch in simple log, despite limiting
 # to G.
 check_result 'M L K J I H' G..M
 check_result 'M H L K J I' G..M --topo-order
 check_outcome failure 'M L H' G..M -- file # includes J I
-check_outcome failure 'M L H' G..M --parents -- file # includes J I
+check_outcome failure '(LH)M (G)L (G)H' G..M --parents -- file # includes J I
 check_result 'M L J I H' G..M --full-history -- file
 check_result 'M L K J I H' G..M --full-history --parents -- file
 check_result 'M H L J I' G..M --simplify-merges -- file
@@ -162,10 +162,10 @@ check_result 'M H L J I' G..M --ancestry-path 
--simplify-merges -- file
 # we can't decide if the merge from INTERESTING commit C was sensible.
 check_result 'F D C' B..F
 check_result 'F' B..F -- file
-check_outcome failure 'F' B..F --parents -- file # includes D
+check_outcome failure '(B)F' B..F --parents -- file # includes D
 check_outcome failure 'F D' B..F --full-history -- file # drops D prematurely
-check_result 'F D' B..F --full-history --parents -- file
-check_result 'F' B..F --simplify-merges -- file
+check_result '(D)F (BA)D' B..F --full-history --parents -- file
+check_result '(B

[PATCH v4 02/15] t6019: test file dropped in -s ours merge

2013-05-16 Thread Kevin Bracey
In preparation for upcoming TREESAME work, check the result for G.t,
which is dropped in -s ours merge L. The default rev-list is empty, as
expected - it follows the first parent path where it never existed.

Unfortunately, --ancestry-path is also empty. Merges H J and L are all
TREESAME to 1 parent, so are treated as TREESAME and not shown. This is
clearly undesirable in the case of merge L, which dropped our G.t by
taking the non-ancestry-path version. Document this as a known failure,
and expect H J L, the 3 merges along the path that had to chose G.t
versions.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6019-rev-list-ancestry-path.sh | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/t/t6019-rev-list-ancestry-path.sh 
b/t/t6019-rev-list-ancestry-path.sh
index dd5b0e5..c3bc2e7 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -16,6 +16,9 @@ test_description='--ancestry-path'
 #
 #  F...I == F G H I
 #  --ancestry-path F...I == F H I
+#
+#  G..M -- G.t == [nothing - was dropped in -s ours merge L]
+#  --ancestry-path G..M -- G.t == H J L
 
 . ./test-lib.sh
 
@@ -89,6 +92,22 @@ test_expect_success 'rev-list --ancestry-path F...I' '
test_cmp expect actual
 '
 
+# G.t is dropped in an -s ours merge
+test_expect_success 'rev-list G..M -- G.t' '
+   expect 
+   git rev-list --format=%s G..M -- G.t |
+   sed -e /^commit /d actual 
+   test_cmp expect actual
+'
+
+test_expect_failure 'rev-list --ancestry-path G..M -- G.t' '
+   for c in H J L; do echo $c; done expect 
+   git rev-list --ancestry-path --format=%s G..M -- G.t |
+   sed -e /^commit /d |
+   sort actual 
+   test_cmp expect actual
+'
+
 #   b---bc
 #  / \ /
 # a   X
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 0/2] merge-base: add --merge-child option

2013-05-13 Thread Kevin Bracey

On 13/05/2013 01:22, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


git log --ancestry-path --left-right E...F --not $(git merge-base
--all E F)

which looks like we're having to repeat ourselves because it's not
paying attention...

You are half wrong; --left-right is about do we show the //=
marker in the output?, so it is true that it does not make sense
without ..., but the reverse is not true: A...B does not and
should not imply --left-right.

The repetition I meant is that by the definition of ancestry-path, the 
above would seem to be equivalent to


  git log --ancestry-path --left-right E F --not $(git merge-base --all E F) 
$(git merge-base --all E F)

Anyway, revised separated-out version of the patch follows.

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] Make --ancestry-path A...B work

2013-05-13 Thread Kevin Bracey
This patch is a revised form of the one in my history traversal refinements
series. Pulled out to allow it to proceed faster, given that John Keeping has
found a use for the command.

I suggest this is placed onto pu ahead of the existing series, dropping the
equivalent final commit there. And then hopefully this can proceed to next
faster.

(Dropping that commit will drop the only --ancestry-path A...B test in t6111,
meaning no immediate dependencies. But the next version of that series will be
sent with t6111 testing and expecting a pass due to this fix being in.)

Kevin Bracey (2):
  t6019: demonstrate --ancestry-path A...B breakage
  revision.c: treat A...B merge bases as if manually specified

 revision.c| 17 +
 revision.h|  1 +
 t/t6019-rev-list-ancestry-path.sh | 21 -
 3 files changed, 38 insertions(+), 1 deletion(-)

-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] revision.c: treat A...B merge bases as if manually specified

2013-05-13 Thread Kevin Bracey
The documentation assures users that A...B is defined as A B --not
$(git merge-base --all A B). This wasn't in fact quite true, because
the calculated merge bases were not sent to add_rev_cmdline().

The main effect of this was that although

  git rev-list --ancestry-path A B --not $(git merge-base --all A B)

worked, the simpler form

  git rev-list --ancestry-path A...B

failed with a no bottom commits error.

Other potential users of bottom commits could also be affected by this
problem, if they examine revs-cmdline_info; I came across the issue in
my proposed history traversal refinements series.

So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged with new 'whence' enum value REV_CMD_MERGE_BASE.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c| 17 +
 revision.h|  1 +
 t/t6019-rev-list-ancestry-path.sh |  2 +-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index a67b615..7f7a8ab 100644
--- a/revision.c
+++ b/revision.c
@@ -915,6 +915,19 @@ static void add_rev_cmdline(struct rev_info *revs,
info-nr++;
 }
 
+static void add_rev_cmdline_list(struct rev_info *revs,
+struct commit_list *commit_list,
+int whence,
+unsigned flags)
+{
+   while (commit_list) {
+   struct object *object = commit_list-item-object;
+   add_rev_cmdline(revs, object, sha1_to_hex(object-sha1),
+   whence, flags);
+   commit_list = commit_list-next;
+   }
+}
+
 struct all_refs_cb {
int all_flags;
int warned_bad_reflog;
@@ -1092,6 +1105,7 @@ static void prepare_show_merge(struct rev_info *revs)
add_pending_object(revs, head-object, HEAD);
add_pending_object(revs, other-object, MERGE_HEAD);
bases = get_merge_bases(head, other, 1);
+   add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING);
add_pending_commit_list(revs, bases, UNINTERESTING);
free_commit_list(bases);
head-object.flags |= SYMMETRIC_LEFT;
@@ -1179,6 +1193,9 @@ int handle_revision_arg(const char *arg_, struct rev_info 
*revs, int flags, unsi
 
if (symmetric) {
exclude = get_merge_bases(a, b, 1);
+   add_rev_cmdline_list(revs, exclude,
+REV_CMD_MERGE_BASE,
+flags_exclude);
add_pending_commit_list(revs, exclude,
flags_exclude);
free_commit_list(exclude);
diff --git a/revision.h b/revision.h
index 01bd2b7..878a555 100644
--- a/revision.h
+++ b/revision.h
@@ -35,6 +35,7 @@ struct rev_cmdline_info {
REV_CMD_PARENTS_ONLY,
REV_CMD_LEFT,
REV_CMD_RIGHT,
+   REV_CMD_MERGE_BASE,
REV_CMD_REV
} whence;
unsigned flags;
diff --git a/t/t6019-rev-list-ancestry-path.sh 
b/t/t6019-rev-list-ancestry-path.sh
index 5287f6a..dd5b0e5 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -81,7 +81,7 @@ test_expect_success 'rev-list F...I' '
test_cmp expect actual
 '
 
-test_expect_failure 'rev-list --ancestry-path F...I' '
+test_expect_success 'rev-list --ancestry-path F...I' '
for c in F H I; do echo $c; done expect 
git rev-list --ancestry-path --format=%s F...I |
sed -e /^commit /d |
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] t6019: demonstrate --ancestry-path A...B breakage

2013-05-13 Thread Kevin Bracey
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6019-rev-list-ancestry-path.sh | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh 
b/t/t6019-rev-list-ancestry-path.sh
index 39b4cb0..5287f6a 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -13,6 +13,9 @@ test_description='--ancestry-path'
 #
 #  D..M -- M.t == M
 #  --ancestry-path D..M -- M.t == M
+#
+#  F...I == F G H I
+#  --ancestry-path F...I == F H I
 
 . ./test-lib.sh
 
@@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' '
test_cmp expect actual
 '
 
-test_expect_success 'rev-list --ancestry-patch D..M -- M.t' '
+test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
echo M expect 
git rev-list --ancestry-path --format=%s D..M -- M.t |
sed -e /^commit /d actual 
test_cmp expect actual
 '
 
+test_expect_success 'rev-list F...I' '
+   for c in F G H I; do echo $c; done expect 
+   git rev-list --format=%s F...I |
+   sed -e /^commit /d |
+   sort actual 
+   test_cmp expect actual
+'
+
+test_expect_failure 'rev-list --ancestry-path F...I' '
+   for c in F H I; do echo $c; done expect 
+   git rev-list --ancestry-path --format=%s F...I |
+   sed -e /^commit /d |
+   sort actual 
+   test_cmp expect actual
+'
+
 #   b---bc
 #  / \ /
 # a   X
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 0/2] merge-base: add --merge-child option

2013-05-12 Thread Kevin Bracey

On 11/05/2013 15:23, John Keeping wrote:

This is helpful when examining branches with disjoint roots, for example
because one is periodically merged into a subtree of the other.



 $ git log --left-right F...E --not $(git merge-base --merge-child F E)
  F
  E



Wouldn't --left-right --ancestry-path F...E do the job? I can't 
immediately see how that differs.


Unfortunately, that syntax doesn't currently work - I just stumbled 
across this problem, and my history traversal refinements series in pu 
fixes it, somewhat incidentally to the main subject in there.


However, without that patch, the alternative way of expressing it:

--left-right --ancestry path F E --not $(git merge-base --all F E)

should still work. Unless --left-right is magic for ...? If it is, 
then my patch is more significant than I thought.


My series will also be better at optimising away D too, through a 
combination of my and Junio's efforts. Try it out.


On this subject, is there any way to exclude a path from a log query? Is 
there a not operator for paths? Might be another way of doing this - 
disjoint histories probably have disjoint paths...


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 0/2] merge-base: add --merge-child option

2013-05-12 Thread Kevin Bracey

On 12/05/2013 19:33, John Keeping wrote:

On Sun, May 12, 2013 at 05:28:24PM +0100, John Keeping wrote:

You're right (and I was wrong in my reply to Junio's parallel message)
ancestry path does seem to be what I want:

 $ git rev-list --ancestry-path --left-right --count \
 master...git-gui/master
 20565

However, this doesn't seem to make a difference to the time taken when I
add in --cherry-mark (which is why I was partially correct in the
parallel thread - it doesn't have the effect on cherry-mark that I want
it to):

This seems to be caused by the code in revision.c::limit_list() which
does the cherry detection then limits to left/right and only then
applies the ancestry path.  I haven't looked further than that, but is
there any reason not to apply the ancestry path restriction before
looking for patch-identical commits?


That certainly sounds like it's doing it the wrong way round. At first 
sight, it seems obviously suboptimal.



No I didn't.  I forgot to update my $PATH when I built on master - those
results are from pu.  master says:

 fatal: --ancestry-path given but there are no bottom commits


Well, then it looks like we have a user for that particular syntax. 
Seemed a bit esoteric to me :)  Although I realised after sending my 
mail you could also use


   git log --ancestry-path --left-right E...F --not $(git merge-base 
--all E F)


which looks like we're having to repeat ourselves because it's not 
paying attention...


I hit it because one of my optimisations relies on knowing the bottom 
commits, and I made absolutely sure I was using the exactly same 
definition of bottom as --ancestry-path. And then I found that my 
optimisations didn't work properly with ...


I suggest we pull my patch out from the more complex optimisation series 
so it can proceed to next faster. It shouldn't have to wait for all my 
new fancy stuff - it's fixing something that appears to be a clear bug.


Although Junio did have a comment about the implementation - I'll 
revisit it tomorrow and send a revised version separately, if everyone 
thinks that's sensible.





On this subject, is there any way to exclude a path from a log query? Is
there a not operator for paths? Might be another way of doing this -
disjoint histories probably have disjoint paths...

That relates to another idea I had about optimizing the detection of
patch-identical commits.  If the smaller side of a symmetric difference
is quite small (as it is likely to be if it's a topic branch), would it
be a good idea to calculate the set of paths touched by commits on that
side and then skip calculating the patch ID for any commits that touch
paths outside that set.  The tree comparison is a lot cheaper than doing
a diff on all of the files.


Sounds cute. Go for it :)

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 0/2] merge-base: add --merge-child option

2013-05-12 Thread Kevin Bracey

On 12/05/2013 19:58, John Keeping wrote:


With the patch below, the --ancestry-path version drops to under 2
seconds.

I'm not sure if this is a good idea though.  It helps me say I know
nothing that isn't on the ancestry path can be patch-identical, so don't
bother checking if it is but it regresses users who want the full
cherry-pick check while only limiting the output.


Hmm. Should an excluded commit be a valid comparator? Is it 
sensible/correct to show a left commit as = to a right commit that has 
been excluded by the revision specifiers? Doesn't sound right to me.


I'm not convinced that there's a valid use-case that you're regressing 
here. If --ancestry-path is being misused (the user's assertion that 
non-ancestry doesn't matter is wrong) the error of not noting culled 
patch-identical commits is nothing compared to the fact we're already 
totally omitting the non-identical ones.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 5/9] revision.c: Make --full-history consider more merges

2013-05-07 Thread Kevin Bracey

On 06/05/2013 23:45, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


+struct treesame_state {
+   unsigned int nparents;
+   unsigned char treesame[FLEX_ARRAY];
+};

I have been wondering if we want to do one-bit (not one-byte) per
parent but no biggie ;-)


I did start down that path, because I felt bad about bloat.

But then I realised how much I would be complicating and slowing the 
code down to only save a few bytes each time we walk a merge with at 
least 5 parents, and I came to my senses. :)


Kevin


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 10/9] revision.c: treat A...B merge bases as if manually specified

2013-05-06 Thread Kevin Bracey
The documentation assures users that A...B is defined as 'r1 r2 --not
$(git merge-base --all r1 r2)'. This isn't in fact quite true, because
the calculated merge bases are not sent to add_rev_cmdline().

Previously, the effect was pretty minor - all that I can think of is
that git rev-list --ancestry-path A B ^AB_base works, but git
rev-list --ancestry-path A...B fails with a no bottom commits error.

But now that all history walking cares about bottom commits, this
failure to note the merge bases as bottoms can lead to worse results for
A...B compared to A B ^AB_base.

So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged as 'REV_CMD_MERGE_BASE'.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c   | 9 +++--
 revision.h   | 2 ++
 t/t6111-rev-list-treesame.sh | 4 
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index 99a3432..aad16a6 100644
--- a/revision.c
+++ b/revision.c
@@ -1305,12 +1305,16 @@ void init_revisions(struct rev_info *revs, const char 
*prefix)
 
 static void add_pending_commit_list(struct rev_info *revs,
 struct commit_list *commit_list,
+int whence,
 unsigned int flags)
 {
while (commit_list) {
struct object *object = commit_list-item-object;
+   const char *sha1 = sha1_to_hex(object-sha1);
object-flags |= flags;
-   add_pending_object(revs, object, sha1_to_hex(object-sha1));
+   if (whence != REV_CMD_NONE)
+   add_rev_cmdline(revs, object, sha1, whence, flags);
+   add_pending_object(revs, object, sha1);
commit_list = commit_list-next;
}
 }
@@ -1332,7 +1336,7 @@ static void prepare_show_merge(struct rev_info *revs)
add_pending_object(revs, head-object, HEAD);
add_pending_object(revs, other-object, MERGE_HEAD);
bases = get_merge_bases(head, other, 1);
-   add_pending_commit_list(revs, bases, UNINTERESTING);
+   add_pending_commit_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING);
free_commit_list(bases);
head-object.flags |= SYMMETRIC_LEFT;
 
@@ -1420,6 +1424,7 @@ int handle_revision_arg(const char *arg_, struct rev_info 
*revs, int flags, unsi
if (symmetric) {
exclude = get_merge_bases(a, b, 1);
add_pending_commit_list(revs, exclude,
+   REV_CMD_MERGE_BASE,
flags_exclude);
free_commit_list(exclude);
a_flags = flags | SYMMETRIC_LEFT;
diff --git a/revision.h b/revision.h
index 765630a..b2ac5a3 100644
--- a/revision.h
+++ b/revision.h
@@ -32,10 +32,12 @@ struct rev_cmdline_info {
struct object *item;
const char *name;
enum {
+   REV_CMD_NONE,
REV_CMD_REF,
REV_CMD_PARENTS_ONLY,
REV_CMD_LEFT,
REV_CMD_RIGHT,
+   REV_CMD_MERGE_BASE,
REV_CMD_REV
} whence;
unsigned flags;
diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
index 689d357..ded0b1a 100755
--- a/t/t6111-rev-list-treesame.sh
+++ b/t/t6111-rev-list-treesame.sh
@@ -161,6 +161,10 @@ check_result 'F' B..F --ancestry-path --simplify-merges -- 
file
 check_result 'F D' B..F --first-parent
 check_result 'F' B..F --first-parent -- file
 
+# E...F should be equivalent to E F ^B, and be able to drop D as above.
+check_result 'F' E F ^B -- file
+check_result 'F' E...F -- file
+
 # Any sort of full history of C..F should show D, as it's the connection to C,
 # and it differs from it.
 check_result 'F D B' C..F
-- 
1.8.3.rc0.28.g4b02ef5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 3/9] t6111: new TREESAME test set

2013-05-05 Thread Kevin Bracey
Some side branching and odd merging to illustrate various flaws in
revision list scans, particularly when limiting the list.

Many expected failures, which will be gone by the end of the history
traversal refinements series.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6111-rev-list-treesame.sh | 180 +++
 1 file changed, 180 insertions(+)
 create mode 100755 t/t6111-rev-list-treesame.sh

diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh
new file mode 100755
index 000..602c02d
--- /dev/null
+++ b/t/t6111-rev-list-treesame.sh
@@ -0,0 +1,180 @@
+#!/bin/sh
+#
+#,---E--.   *H--. * marks !TREESAME parent paths
+#   /\ / \*
+# *A--*B---D--*F-*G-K-*L-*M
+#   \ /*   \   /
+#`-C-'  `-*I-*J
+#
+# A creates file, B and F change it.
+# Odd merge G takes the old version from B.
+# I changes it, but J reverts it.
+# H and L both change it, and M merges those changes.
+
+test_description='TREESAME and limiting'
+
+. ./test-lib.sh
+
+note () {
+   git tag $1
+}
+
+unnote () {
+   git name-rev --tags --stdin | sed -e s|$_x40 (tags/\([^)]*\)) |\1 |g
+}
+
+test_expect_success setup '
+   test_commit Initial file file Hi there A 
+   git branch other-branch 
+
+   test_commit file=Hello file Hello B 
+   git branch third-branch 
+
+   git checkout other-branch 
+   test_commit Added other other Hello C 
+
+   git checkout master 
+   test_merge D other-branch 
+
+   git checkout third-branch 
+   test_commit Third file third Nothing E 
+
+   git checkout master 
+   test_commit file=Blah file Blah F 
+
+   test_tick  git merge --no-commit third-branch 
+   git checkout third-branch file 
+   git commit 
+   note G 
+   git branch fiddler-branch 
+
+   git checkout -b part2-branch 
+   test_commit file=Part 2 file Part 2 H 
+
+   git checkout fiddler-branch 
+   test_commit Bad commit file Silly I 
+
+   test_tick  git revert I  note J 
+
+   git checkout master 
+   test_tick  git merge --no-ff fiddler-branch 
+   note K
+
+   test_commit file=Part 1 file Part 1 L 
+
+   test_tick  test_must_fail git merge part2-branch 
+   test_commit M file Parts 1+2
+'
+
+FMT='tformat:%P%H | %s'
+
+# could we soup this up to optionally check parents? So (BA)C would check
+# that C is shown and has parents B A.
+check_outcome () {
+   outcome=$1
+   shift
+   for c in $1
+   do
+   echo $c
+   done expect 
+   shift 
+   param=$* 
+   test_expect_$outcome log $param '
+   git log --format=$FMT $param |
+   unnote actual 
+   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   test_cmp expect check || {
+   cat actual
+   false
+   }
+   '
+}
+
+check_result () {
+   check_outcome success $@
+}
+
+# Odd merge G drops a change in F. Important that G is listed in all
+# except the most basic list. Achieving this means normal merge D will also be
+# shown in normal full-history, as we can't distinguish unless we do a
+# simplification pass. After simplification, D is dropped but G remains.
+check_result 'M L K J I H G F E D C B A'
+check_result 'M H L K J I G E F D C B A' --topo-order
+check_result 'M L H B A' -- file
+check_result 'M L H B A' --parents -- file
+check_outcome failure 'M L J I H G F D B A' --full-history -- file # drops G
+check_result 'M L K J I H G F D B A' --full-history --parents -- file
+check_outcome failure 'M H L J I G F B A' --simplify-merges -- file # drops G
+check_result 'M L K G F D B A' --first-parent
+check_result 'M L G F B A' --first-parent -- file
+
+# Check that odd merge G remains shown when F is the bottom.
+check_result 'M L K J I H G E' F..M
+check_result 'M H L K J I G E' F..M --topo-order
+check_result 'M L H' F..M -- file
+check_result 'M L H' F..M --parents -- file # L+H's parents rewritten to B, so 
more useful than it may seem
+check_outcome failure 'M L J I H G' F..M --full-history -- file # drops G
+check_result 'M L K J I H G' F..M --full-history --parents -- file
+check_outcome failure 'M H L J I G' F..M --simplify-merges -- file # drops G
+check_result 'M L K J I H G' F..M --ancestry-path
+check_outcome failure 'M L J I H G' F..M --ancestry-path -- file # drops G
+check_result 'M L K J I H G' F..M --ancestry-path --parents -- file
+check_result 'M H L J I G' F..M --ancestry-path --simplify-merges -- file
+check_result 'M L K G' F..M --first-parent
+check_result 'M L G' F..M --first-parent -- file
+
+# Note that G is pruned when E is the bottom, even if it's the same commit list
+# If we want history since E, then we're quite happy to ignore G that took E.
+check_result 'M L K J I H G' E..M --ancestry-path
+check_result 'M L J I H' E..M --ancestry-path -- file

[PATCH v3 6/9] t6012: update test for tweaked full-history traversal

2013-05-05 Thread Kevin Bracey
From: Junio C Hamano gits...@pobox.com

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 t/t6012-rev-list-simplify.sh | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index dd6dc84..4e55872 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -14,21 +14,24 @@ unnote () {
 
 test_expect_success setup '
echo Hi there file 
-   git add file 
-   test_tick  git commit -m Initial file 
+   echo initial lost 
+   git add file lost 
+   test_tick  git commit -m Initial file and lost 
note A 
 
git branch other-branch 
 
echo Hello file 
-   git add file 
-   test_tick  git commit -m Modified file 
+   echo second lost 
+   git add file lost 
+   test_tick  git commit -m Modified file and lost 
note B 
 
git checkout other-branch 
 
echo Hello file 
-   git add file 
+   lost 
+   git add file lost 
test_tick  git commit -m Modified the file identically 
note C 
 
@@ -37,7 +40,9 @@ test_expect_success setup '
test_tick  git commit -m Add another file 
note D 
 
-   test_tick  git merge -m merge master 
+   test_tick 
+   test_must_fail git merge -m merge master 
+   lost  git commit -a -m merge 
note E 
 
echo Yet another elif 
@@ -110,4 +115,16 @@ check_result 'I B A' -- file
 check_result 'I B A' --topo-order -- file
 check_result 'H' --first-parent -- another-file
 
+check_result 'E C B A' --full-history E -- lost
+test_expect_success 'full history simplification without parent' '
+   printf %s\n E C B A expect 
+   git log --pretty=$FMT --full-history E -- lost |
+   unnote actual 
+   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   test_cmp expect check || {
+   cat actual
+   false
+   }
+'
+
 test_done
-- 
1.8.3.rc0.28.g682c2d9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 7/9] simplify-merges: never remove all TREESAME parents

2013-05-05 Thread Kevin Bracey
When simplifying an odd merge, such as one that used -s ours, we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.

This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |  3 +-
 revision.c | 69 ++
 t/t6111-rev-list-treesame.sh   |  2 +-
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 1dad341..e23bdb0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -471,7 +471,8 @@ history according to the following rules:
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
   the process, drop parents that are ancestors of other parents, and
-  remove duplicates.
+  remove duplicates, but take care to never drop all parents that
+  we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
diff --git a/revision.c b/revision.c
index c88ded8..7535757 100644
--- a/revision.c
+++ b/revision.c
@@ -2119,6 +2119,73 @@ static int mark_redundant_parents(struct rev_info *revs, 
struct commit *commit)
return marked;
 }
 
+/*
+ * Awkward naming - this means one parent we are TREESAME to.
+ * cf mark_treesame_root_parents: root parents that are TREESAME (to an
+ * empty tree). Better name suggestions?
+ */
+static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit 
*commit)
+{
+   struct treesame_state *ts = lookup_decoration(revs-treesame, 
commit-object);
+   struct commit *unmarked = NULL, *marked = NULL;
+   struct commit_list *p;
+   unsigned n;
+
+   for (p = commit-parents, n = 0; p; p = p-next, n++) {
+   if (ts-treesame[n]) {
+   if (p-item-object.flags  TMP_MARK) {
+   if (!marked)
+   marked = p-item;
+   } else {
+   if (!unmarked) {
+   unmarked = p-item;
+   break;
+   }
+   }
+   }
+   }
+
+   /*
+* If we are TREESAME to a marked-for-deletion parent, but not to any
+* unmarked parents, unmark the first TREESAME parent. This is the
+* parent that the default simplify_history==1 scan would have followed,
+* and it doesn't make sense to omit that path when asking for a
+* simplified full history. Retaining it improves the chances of
+* understanding odd missed merges that took an old version of a file.
+*
+* Example:
+*
+*   I-X   A modified the file, but mainline merge X used
+*\   /-s ours, so took the version from I. X is
+* `--A--' TREESAME to I and !TREESAME to A.
+*
+* Default log from X would produce I. Without this check,
+* --full-history --simplify-merges would produce I-A-X, showing
+* the merge commit X and that it changed A, but not making clear that
+* it had just taken the I version. With this check, the topology above
+* is retained.
+*
+* Note that it is possible that the simplification chooses a different
+* TREESAME parent from the default, in which case this test doesn't
+* activate, and we _do_ drop the default parent. Example:
+*
+*   I--X A modified the file, but it was reverted in B,
+*\/  meaning mainline merge X is TREESAME to both
+* A--B   parents.
+*
+* Default log would produce I by following the first parent;
+* --full-history --simplify-merges will produce I-A-B. But this is a
+* reasonable result - it presents a logical full history leading from
+* I to X, and X is not an important merge.
+*/
+   if (!unmarked  marked) {
+   marked-object.flags = ~TMP_MARK;
+   return 1;
+   }
+
+   return 0;
+}
+
 static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
 {
struct commit_list **pp, *p;
@@ -,6 +2289,8 @@ static struct commit_list **simplify_one(struct rev_info 
*revs, struct commit *c
if (1  cnt) {
int marked = mark_redundant_parents(revs, commit);
if (marked)
+   marked -= leave_one_treesame_to_parent(revs

[PATCH v3 0/9] History traversal refinements

2013-05-05 Thread Kevin Bracey
New version - nothing much changed since v2.2, except the new test
set to illustrate and prove the changes. Not sure about the t6111
numbering - there wasn't space where I really wanted to put it.
And maybe it should be appended to one of the existing tests.

You will note that I'm floundering for the name for the commits I care
about in part 9. Currently at priority, but that's horrible, not least
because it isn't an adjective. I think the word I really want is
interesting, but that's already taken... Relevant, important?

Junio C Hamano (1):
  t6012: update test for tweaked full-history traversal

Kevin Bracey (8):
  decorate.c: compact table when growing
  t6019: test file dropped in -s ours merge
  t6111: new TREESAME test set
  rev-list-options.txt: correct TREESAME for P
  revision.c: Make --full-history consider more merges
  simplify-merges: never remove all TREESAME parents
  simplify-merges: drop merge from irrelevant side branch
  revision.c: discount side branches when computing TREESAME

 Documentation/rev-list-options.txt |  38 +--
 decorate.c |   2 +-
 revision.c | 511 +
 revision.h |   4 +-
 t/t6012-rev-list-simplify.sh   |  31 ++-
 t/t6019-rev-list-ancestry-path.sh  |  29 ++-
 t/t6111-rev-list-treesame.sh   | 180 +
 7 files changed, 723 insertions(+), 72 deletions(-)
 create mode 100755 t/t6111-rev-list-treesame.sh

-- 
1.8.3.rc0.28.g682c2d9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/9] t6019: test file dropped in -s ours merge

2013-05-05 Thread Kevin Bracey
In preparation for upcoming TREESAME work, check the result for G.t,
which is dropped in -s ours merge L. The default rev-list is empty, as
expected - it follows the first parent path where it never existed.

Unfortunately, --ancestry-path is also empty. Merges H J and L are all
TREESAME to 1 parent, so are treated as TREESAME and not shown. This is
clearly undesirable in the case of merge L, which dropped our G.t by
taking the non-ancestry-path version. Document this as a known failure,
and expect H J L, the 3 merges along the path that had to chose G.t
versions.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6019-rev-list-ancestry-path.sh | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh 
b/t/t6019-rev-list-ancestry-path.sh
index 39b4cb0..86ef908 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -13,6 +13,9 @@ test_description='--ancestry-path'
 #
 #  D..M -- M.t == M
 #  --ancestry-path D..M -- M.t == M
+#
+#  G..M -- G.t == [nothing - was dropped in -s ours merge L]
+#  --ancestry-path G..M -- G.t == H J L
 
 . ./test-lib.sh
 
@@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' '
test_cmp expect actual
 '
 
-test_expect_success 'rev-list --ancestry-patch D..M -- M.t' '
+test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
echo M expect 
git rev-list --ancestry-path --format=%s D..M -- M.t |
sed -e /^commit /d actual 
test_cmp expect actual
 '
 
+# G.t is dropped in an -s ours merge
+test_expect_success 'rev-list G..M -- G.t' '
+   expect 
+   git rev-list --format=%s G..M -- G.t |
+   sed -e /^commit /d actual 
+   test_cmp expect actual
+'
+
+test_expect_failure 'rev-list --ancestry-path G..M -- G.t' '
+   for c in H J L; do echo $c; done expect 
+   git rev-list --ancestry-path --format=%s G..M -- G.t |
+   sed -e /^commit /d |
+   sort actual 
+   test_cmp expect actual
+'
+
 #   b---bc
 #  / \ /
 # a   X
-- 
1.8.3.rc0.28.g682c2d9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/9] decorate.c: compact table when growing

2013-05-05 Thread Kevin Bracey
When growing the table, take the opportunity to compact it by removing
entries with NULL decoration.

Users may have removed decorations by passing NULL to
insert_decoration. An object's table entry can't actually be removed
during normal operation, as it would break the linear hash collision
search. But we can remove NULL decoration entries when rebuilding the
table.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 decorate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/decorate.c b/decorate.c
index 2f8a63e..7cb5d29 100644
--- a/decorate.c
+++ b/decorate.c
@@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n)
const struct object *base = old_hash[i].base;
void *decoration = old_hash[i].decoration;
 
-   if (!base)
+   if (!decoration)
continue;
insert_decoration(n, base, decoration);
}
-- 
1.8.3.rc0.28.g682c2d9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 8/9] simplify-merges: drop merge from irrelevant side branch

2013-05-05 Thread Kevin Bracey
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.

Original log message follows.

The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially.  Namely, in a history with this
shape:

---o---o---M
  /
 x---x---x

where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.

That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.

Suggested-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 34 +-
 revision.c | 26 +-
 t/t6012-rev-list-simplify.sh   |  2 +-
 3 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index e23bdb0..b7fbc80 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -342,13 +342,13 @@ In the following, we will always refer to the same 
example history to
 illustrate the differences between simplification settings.  We assume
 that you are filtering for a file `foo` in this commit graph:
 ---
- .-A---M---N---O---P
-/ /   /   /   /
-   I B   C   D   E
-\   /   /   /   /
- `-'
+ .-A---M---N---O---P---Q
+/ /   /   /   /   /
+   I B   C   D   E   Y
+\   /   /   /   /   /
+ `-'   X
 ---
-The horizontal line of history A---P is taken to be the first parent of
+The horizontal line of history A---Q is taken to be the first parent of
 each merge.  The commits are:
 
 * `I` is the initial commit, in which `foo` exists with contents
@@ -369,6 +369,10 @@ each merge.  The commits are:
 * `E` changes `quux` to xyzzy, and its merge `P` combines the
   strings to quux xyzzy.  `P` is TREESAME to `O`, but not to `E`.
 
+* `X` is an indpendent root commit that added a new file `side`, and `Y`
+  modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
+  `Q` is TREESAME to `P`, but not to `Y`.
+
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
 (via '\--parents' or '\--children') are used.  The following settings
@@ -409,7 +413,7 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O  P
+   I  A  B  N  D  O  P  Q
 ---
 +
 `M` was excluded because it is TREESAME to both parents.  `E`,
@@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included
 themselves.  This results in
 +
 ---
- .-A---M---N---O---P
+ .-A---M---N---O---P---Q
 / /   /   /   /
I B   /   D   /
 \   /   /   /   /
@@ -440,7 +444,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.
+`N`, and `X`, `Y` and `Q`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
@@ -470,9 +474,9 @@ history according to the following rules:
 * Set `C'` to `C`.
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
-  the process, drop parents that are ancestors of other parents, and
-  remove duplicates, but take care to never drop all parents that
-  we are TREESAME to.
+  the process, drop parents that are ancestors of other parents or that are
+  root commits TREESAME to an empty tree, and remove duplicates, but take care
+  to never drop all parents that we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
@@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing

[PATCH v3 9/9] revision.c: discount side branches when computing TREESAME

2013-05-05 Thread Kevin Bracey
Add a BOTTOM flag to commit objects, and use it to define priority for
pruning. Priority commits are those that are !UNINTERESTING or BOTTOM,
and this allows us to identify irrelevant side branches (UNINTERESTING
 !BOTTOM).

If a merge has priority parents, and it is TREESAME to them, then do
not let non-priority parents cause the merge to be treated as
!TREESAME.

When considering simplification, don't always include all merges -
merges with exactly 1 priority parent can be simplified, if TREESAME
according to the above rule.

These two changes greatly increase simplification in limited revision
lists - irrelevant merges from unlisted commits can be omitted.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c| 201 --
 revision.h|   3 +-
 t/t6019-rev-list-ancestry-path.sh |  12 ++-
 t/t6111-rev-list-treesame.sh  |  22 ++---
 4 files changed, 195 insertions(+), 43 deletions(-)

diff --git a/revision.c b/revision.c
index 20c7058..99a3432 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list 
*orig)
 }
 
 /*
+ * A definition of priority commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A priority commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from non-priority parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only priority parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Including bottom commits in priority ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as an uninteresting
+ * side branch, but treated as part of the graph. For example:
+ *
+ *   Z...A---X---o---o---B
+ *. /
+ * W---Y
+ *
+ * When computing A..B, the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path A..B, the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int priority_commit(struct commit *commit)
+{
+   return (commit-object.flags  (UNINTERESTING | BOTTOM)) != 
UNINTERESTING;
+}
+
+/*
+ * Return a single priority commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent. TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole priority parent, if any.
+ * If we have only one priority parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one priority parent, or no priority parents
+ * (and multiple non-priority ones), then we can't select a parent here and
+ * return NULL.
+ */
+static struct commit *priority_select_parent(struct rev_info *revs, struct 
commit_list *orig)
+{
+   struct commit_list *list = orig;
+   struct commit *priority = NULL;
+
+   if (!orig)
+   return NULL;
+
+   if (revs-first_parent_only || !orig-next)
+   return orig-item;
+
+   while (list) {
+   struct commit *commit = list-item;
+   list = list-next;
+   if (priority_commit(commit)) {
+   if (priority)
+   return NULL;
+   priority = commit;
+   }
+   }
+   return priority;
+}
+
+/*
  * The goal is to get REV_TREE_NEW as the result only if the
  * diff consists of all '+' (and no other changes), REV_TREE_OLD
  * if the whole diff is removal of old data, and otherwise
@@ -502,27 +572,52 @@ static unsigned update_treesame(struct rev_info *revs, 
struct commit *commit)
if (commit-parents  commit-parents-next) {
unsigned n;
struct treesame_state *st;
+   struct commit_list *p;
+   unsigned priority_parents;
+   unsigned priority_change, nonpriority_change;
 
st = lookup_decoration(revs-treesame, commit-object);
if (!st)
die(update_treesame %s, 
sha1_to_hex(commit-object.sha1));
-   commit-object.flags |= TREESAME;
-   for (n = 0; n  st-nparents; n++) {
-   if (!st-treesame[n]) {
-   commit-object.flags = ~TREESAME;
-   break;
-   }
+   priority_parents = 0

[PATCH v3 5/9] revision.c: Make --full-history consider more merges

2013-05-05 Thread Kevin Bracey
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.

While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by -s
ours, then:

  git log -m -p --full-history -Schange file

would successfully locate change's addition but would not locate the
merge that resolved against it.

Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.

Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.

Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.

This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |   6 +-
 revision.c | 241 -
 revision.h |   1 +
 t/t6019-rev-list-ancestry-path.sh  |   2 +-
 t/t6111-rev-list-treesame.sh   |  20 +--
 5 files changed, 226 insertions(+), 44 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 50bbff7..1dad341 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -409,10 +409,10 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O
+   I  A  B  N  D  O  P
 ---
 +
-`P` and `M` were excluded because they are TREESAME to a parent.  `E`,
+`M` was excluded because it is TREESAME to both parents.  `E`,
 `C` and `B` were all walked, but only `B` was !TREESAME, so the others
 do not appear.
 +
@@ -440,7 +440,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.  Note also that `P` was included despite being TREESAME.
+`N`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
diff --git a/revision.c b/revision.c
index a67b615..c88ded8 100644
--- a/revision.c
+++ b/revision.c
@@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, 
struct commit *commit)
return retval = 0  (tree_difference == REV_TREE_SAME);
 }
 
+struct treesame_state {
+   unsigned int nparents;
+   unsigned char treesame[FLEX_ARRAY];
+};
+
+static struct treesame_state *initialise_treesame(struct rev_info *revs, 
struct commit *commit)
+{
+   unsigned n = commit_list_count(commit-parents);
+   struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+   st-nparents = n;
+   add_decoration(revs-treesame, commit-object, st);
+   return st;
+}
+
+/*
+ * Must be called immediately after removing the nth_parent from a commit's
+ * parent list, if we are maintaining the per-parent treesame[] decoration.
+ * This does not recalculate the master TREESAME flag - update_treesame()
+ * should be called to update it after a sequence of treesame[] modifications
+ * that may have affected it.
+ */
+static int compact_treesame(struct rev_info *revs, struct commit *commit, 
unsigned nth_parent)
+{
+   struct treesame_state *st;
+   int old_same;
+
+   if (!commit-parents) {
+   /*
+* Have just removed the only parent from a non-merge.
+* Different handling, as we lack decoration.
+*/
+   if (nth_parent != 0)
+   die(compact_treesame %u, nth_parent);
+   old_same = !!(commit-object.flags  TREESAME);
+   if (rev_same_tree_as_empty(revs, commit))
+   commit-object.flags |= TREESAME;
+   else
+   commit-object.flags = ~TREESAME;
+   return old_same;
+   }
+
+   st = lookup_decoration(revs-treesame, commit-object);
+   if (!st || nth_parent = st-nparents)
+   die(compact_treesame %u

Re: [PATCH] revision.c: Fix a sparse warning

2013-05-04 Thread Kevin Bracey

On 04/05/2013 20:25, Ramsay Jones wrote:

Sparse issues an 'sole_interesting' not declared. Should it be static?
warning. In order to suppress the warning, since this symbol does not
need more than file visibility, we simply add the static modifier to
its declaration.



Thanks! I'll include that fix.

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents

2013-05-04 Thread Kevin Bracey
On 02/05/2013 23:05, Junio C Hamano wrote:

   Z...A===X---o---o---B
\\/
 W---Y

 OK, I think I understand it, and we are in agreement.  For the
 purpose of the above paragraph, a side branch is what is not on the
 --ancestry-path, so all of the below examples are about the
 behaviour when --ancestry-path is used.

Ah, in fact, no. In some previous mails I was concentrating on
ancestry-path, but those 3 examples were really of ordinary A..B, with
W+Y in the INTERESTING set. I think the side-branch logic remains sound
and desirable even in the absence of --ancestry-path, so I don't think
this is an ancestry-path change. (And from a basic naive usability point
of view I'm much more interested in improving the more obvious modes
than the rather more obscure --ancestry-path.)

Ancestry path forces side branches to be ignored - it's the simple
case for ignoring (and understanding) side branches. But if we let other
modes know where the bottom is, they too can benefit from reliable side
branch logic - we can find out if anything happened on side branches,
but we can also ignore them completely if they turn out to be totally
irrelevant.

When not using ancestry-path, the side branches this patch works on are
thosewhich go off and don't come back - they stub off at some
UNINTERESTING commit other than the bottom(s). If no other limiting is
set, they must have hit an ancestor of our BOTTOM commit(s);
simplify-merges could have potentially pruned away if unlimited. And
this patch restores that pruning ability - simplify-merges can rewrite
them back to just 1 UNINTERESTING merge parent at the boundary (looking
like an ancestry-path boundary), then this patch can chuck the boundary
merge. Hey presto, irrelevant branch now invisible. And the patch also
provides benefits to all other modes.

I'll post v3 of the sequence tomorrow - it includes a new test which
illustrates the changes - it's a 60-or-so-item test set, with about 15
failures in a variety of modes that get fixed by this sequence. I
think that should make an excellent discussion topic. We'll see whether
folks agree with my view about what should and shouldn't be shown...

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2.1 8/8] revision.c: discount side branches when computing TREESAME

2013-05-02 Thread Kevin Bracey
Add a BOTTOM flag to commit objects, and use it to define priority for
pruning. Priority commits are those that are !UNINTERESTING or BOTTOM,
and this allows us to identify irrelevant side branches (UNINTERESTING
 !BOTTOM).

If a merge has priority parents, and it is TREESAME to them, then do
not let non-priority parents cause the merge to be treated as
!TREESAME.

When considering simplification, don't always include all merges -
merges with exactly 1 priority parent can be simplified, if TREESAME
according to the above rule.

These two changes greatly increase simplification in limited revision
lists - irrelevant merges from unlisted commits can be omitted.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c| 210 --
 revision.h|   3 +-
 t/t6019-rev-list-ancestry-path.sh |  12 ++-
 3 files changed, 188 insertions(+), 37 deletions(-)

diff --git a/revision.c b/revision.c
index ddaed33..2a7f08d 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list 
*orig)
 }
 
 /*
+ * A definition of priority commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A priority commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from non-priority parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only priority parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Including bottom commits in priority ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as an uninteresting
+ * side branch, but treated as part of the graph. For example:
+ *
+ *   Z...A---X---o---o---B
+ *. /
+ * W---Y
+ *
+ * When computing A..B, the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path A..B, the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int priority_commit(struct commit *commit)
+{
+   return (commit-object.flags  (UNINTERESTING | BOTTOM)) != 
UNINTERESTING;
+}
+
+/*
+ * Return a single priority commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent. TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole priority parent, if any.
+ * If we have only one priority parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one priority parent, or no priority parents
+ * (and multiple non-priority ones), then we can't select a parent here and
+ * return NULL.
+ */
+struct commit *priority_select_parent(struct rev_info *revs, struct 
commit_list *orig)
+{
+   struct commit_list *list = orig;
+   struct commit *priority = NULL;
+
+   if (!orig)
+   return NULL;
+
+   if (revs-first_parent_only || !orig-next)
+   return orig-item;
+
+   while (list) {
+   struct commit *commit = list-item;
+   list = list-next;
+   if (priority_commit(commit)) {
+   if (priority)
+   return NULL;
+   priority = commit;
+   }
+   }
+   return priority;
+}
+
+/*
  * The goal is to get REV_TREE_NEW as the result only if the
  * diff consists of all '+' (and no other changes), REV_TREE_OLD
  * if the whole diff is removal of old data, and otherwise
@@ -502,17 +572,28 @@ static unsigned update_treesame(struct rev_info *revs, 
struct commit *commit)
if (commit-parents  commit-parents-next) {
unsigned n;
struct treesame_state *st;
+   struct commit_list *p;
+   unsigned priority_parents;
+   unsigned priority_change, nonpriority_change;
 
st = lookup_decoration(revs-treesame, commit-object);
if (!st)
die(update_treesame %s, 
sha1_to_hex(commit-object.sha1));
-   commit-object.flags |= TREESAME;
-   for (n = 0; n  st-nparents; n++) {
-   if (!st-treesame[n]) {
-   commit-object.flags = ~TREESAME;
-   break;
-   }
+   priority_parents = 0;
+   priority_change = nonpriority_change = 0

[PATCH v2.2 8/8] revision.c: discount side branches when computing TREESAME

2013-05-02 Thread Kevin Bracey
Add a BOTTOM flag to commit objects, and use it to define priority for
pruning. Priority commits are those that are !UNINTERESTING or BOTTOM,
and this allows us to identify irrelevant side branches (UNINTERESTING
 !BOTTOM).

If a merge has priority parents, and it is TREESAME to them, then do
not let non-priority parents cause the merge to be treated as
!TREESAME.

When considering simplification, don't always include all merges -
merges with exactly 1 priority parent can be simplified, if TREESAME
according to the above rule.

These two changes greatly increase simplification in limited revision
lists - irrelevant merges from unlisted commits can be omitted.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---

Something went wrong with v2.1; it got based on an old version of the series.
This one should apply to v2 correctly.

 revision.c| 216 +-
 revision.h|   3 +-
 t/t6019-rev-list-ancestry-path.sh |  12 ++-
 3 files changed, 199 insertions(+), 32 deletions(-)

diff --git a/revision.c b/revision.c
index 20c7058..e5d5091 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,76 @@ static int everybody_uninteresting(struct commit_list 
*orig)
 }
 
 /*
+ * A definition of priority commit that we can use to simplify limited graphs
+ * by eliminating side branches.
+ *
+ * A priority commit is one that is !UNINTERESTING (ie we are including it
+ * in our list), or that is a specified BOTTOM commit. Then after computing
+ * a limited list, during processing we can generally ignore boundary merges
+ * coming from outside the graph, (ie from non-priority parents), and treat
+ * those merges as if they were single-parent. TREESAME is defined to consider
+ * only priority parents, if any. If we are TREESAME to our on-graph parents,
+ * we don't care if we were !TREESAME to non-graph parents.
+ *
+ * Including bottom commits in priority ensures that a limited graph's
+ * connection to the actual bottom commit is not viewed as an uninteresting
+ * side branch, but treated as part of the graph. For example:
+ *
+ *   Z...A---X---o---o---B
+ *. /
+ * W---Y
+ *
+ * When computing A..B, the A-X connection is at least as important as
+ * Y-X, despite A being flagged UNINTERESTING.
+ *
+ * And when computing --ancestry-path A..B, the A-X connection is more
+ * important than Y-X, despite both A and Y being flagged UNINTERESTING.
+ */
+static inline int priority_commit(struct commit *commit)
+{
+   return (commit-object.flags  (UNINTERESTING | BOTTOM)) != 
UNINTERESTING;
+}
+
+/*
+ * Return a single priority commit from a parent list. If we are a TREESAME
+ * commit, and this selects one of our parents, then we can safely simplify to
+ * that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent. TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole priority parent, if any.
+ * If we have only one priority parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one priority parent, or no priority parents
+ * (and multiple non-priority ones), then we can't select a parent here and
+ * return NULL.
+ */
+struct commit *priority_select_parent(struct rev_info *revs, struct 
commit_list *orig)
+{
+   struct commit_list *list = orig;
+   struct commit *priority = NULL;
+
+   if (!orig)
+   return NULL;
+
+   if (revs-first_parent_only || !orig-next)
+   return orig-item;
+
+   while (list) {
+   struct commit *commit = list-item;
+   list = list-next;
+   if (priority_commit(commit)) {
+   if (priority)
+   return NULL;
+   priority = commit;
+   }
+   }
+   return priority;
+}
+
+/*
  * The goal is to get REV_TREE_NEW as the result only if the
  * diff consists of all '+' (and no other changes), REV_TREE_OLD
  * if the whole diff is removal of old data, and otherwise
@@ -502,27 +572,54 @@ static unsigned update_treesame(struct rev_info *revs, 
struct commit *commit)
if (commit-parents  commit-parents-next) {
unsigned n;
struct treesame_state *st;
+   struct commit_list *p;
+   unsigned priority_parents;
+   unsigned priority_change, nonpriority_change;
 
st = lookup_decoration(revs-treesame, commit-object);
if (!st)
die(update_treesame %s, 
sha1_to_hex(commit-object.sha1));
-   commit-object.flags |= TREESAME;
-   for (n = 0; n  st-nparents; n++) {
-   if (!st-treesame[n]) {
-   commit-object.flags = ~TREESAME;
-   break

Re: [PATCH v2 8/8] revision.c: discount UNINTERESTING parents

2013-05-02 Thread Kevin Bracey

On 01/05/2013 00:18, Junio C Hamano wrote:


These rules paying more attention to UNINTERESTING do add a tricky
wrinkle to behaviour. Because limited revision lists are conventionally
expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING.

OK.


Thus
its connection to the INTERESTING graph is not privileged over side
branches,

I take that its connection refers to the === link below, the
nodes connected with --- form the INTERESTING graph, and

  Z...A===X---o---o---B
   \\/
W---Y
 
side branches refer to the development that built W and Y and

merged at X.  And you are saying that A===X is not privileged over
Y---X, with some definition of privileged I am not sure about.


Okay, that's a good graph. The basic problem is that all the rules above 
try to identify a merge from an irrelevant branch and eliminate it. But 
how can we define what a side branch is?


I think the rules I state are conceptually sound - a side branch is an 
extra parent coming in from outside our limited history graph. But the 
problem is at the bottom. In the event that someone specifies A..B 
with the above history, we get the resultant graph W-Y-X-o-o-B. A is 
not on that graph. So with the rules as they stand, A being 
UNINTERESTING makes it get treated as a side branch of X, which isn't 
good. A needs to be INTERESTING for the purpose of side-branch logic.


So when someone says A..B and generates W-Y-X-o-o-B, we want to know 
that X's parent path (Z)-W-Y-X is the (possibly irrelevant) side 
branch, not (A)-X.


Example undesirable behaviour, with A treated as a side branch:

1) If only commit A changed our file, and merge X took old version Y 
for some reason, under these rules --full-history A..B would show 
nothing. X doesn't consider A because it's UNINTERESTING. If there had 
been an intervening (even irrelevant) commit A1 between A1 and X, X 
would have been shown.


2) If only commit A changed our file, and merge X took A, with this 
rules--simplify-merges A..B would show X, with two rewritten 
UNINTERESTING parents A and Z. That's not what we want - Z is an 
irrelevant side-branch in this case. If there had been an intervening 
(unsimplifiable) commit A1 between A and X, X would have had INTERESTING 
parent A1, and WY would have been successfully discarded.


3) Even before my new rules, there is one existing place trying to spot 
side branches, and it can fail here for the same reasons. See 
simplify_history's even if a merge with an uninteresting side branch 
brought the entire change we are interested in test. It would do the 
wrong thing at X, if W had made a change and Y reverted it. git log 
A..B would show W and Y, which is not what we want. As the scan hits X, 
it follows parent Y rather than just go for first parent A, because it 
thinks A is an uninteresting side branch. Again, if there had been an 
intervening commit A1 before X, X would have followed A1 and W and Y 
would not have been shown.


All 3 cases work if A is treated as INTERESTING for side-branch rules. 
We shouldn't have needed to put in an extra A1 commit to make them work.



  #  D---E---F
  # / \   \
+#B---C-G0-G--H---I---J
  #   / \
  #  A---K---L--M
  #


Conceptually, the ancestry-path shouldn't get affected by any
pathspec. The range --ancestry-path G0..M should be equivalent to
the range G0..M ^F ^E ^K, and with the rule to ignore non-sameness
with uninteresting side branches, I would have expected that H and J
would be equally irrelevant, because E and F would be outside the
graph we would want to look at sameness.


Those two pathspecs produce the same graph of commits, and yes, they've 
always produced the same thing up until now. But we're trying to do 
something new(ish) here. We're trying to define side branch, to allow 
us to make more useful pruning comparisons. And we can't reliably define 
side branch unless we can reliably define where we're coming from.


Looking at this case, given that graph of commits G-H-I-J-L-M (produced 
by any pathspec/flags), that is easy because it's linear. The bottom 
of the INTERESTING graph has a single parent, and we can follow it 
straight from bottom to top. We can deduce that non-merge G is bottom, 
and thus call the connections to E and F sides. (But that could have 
been wrong if the graph had been made some other way, and the user 
wasn't asking for history since G0).


But if presented with H-I-J-L-M we get stuck. The lowest commit in our 
graph has 2 parents. How do we distinguish between E and G? Which is the 
side, and which is the bottom? We can only define side branch here if 
we know where our bottom is. The version of this patch as posted can't 
distinguish whether E or G is more important, so merge H always gets 
shown. And that makes me unhappy. And note that normal 
merge-simplification will often prune away boring non-merge commits, 
leaving the user-specified 

[PATCH v2 5/8] t6012: update test for tweaked full-history traversal

2013-04-30 Thread Kevin Bracey
From: Junio C Hamano gits...@pobox.com

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 t/t6012-rev-list-simplify.sh | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh
index dd6dc84..4e55872 100755
--- a/t/t6012-rev-list-simplify.sh
+++ b/t/t6012-rev-list-simplify.sh
@@ -14,21 +14,24 @@ unnote () {
 
 test_expect_success setup '
echo Hi there file 
-   git add file 
-   test_tick  git commit -m Initial file 
+   echo initial lost 
+   git add file lost 
+   test_tick  git commit -m Initial file and lost 
note A 
 
git branch other-branch 
 
echo Hello file 
-   git add file 
-   test_tick  git commit -m Modified file 
+   echo second lost 
+   git add file lost 
+   test_tick  git commit -m Modified file and lost 
note B 
 
git checkout other-branch 
 
echo Hello file 
-   git add file 
+   lost 
+   git add file lost 
test_tick  git commit -m Modified the file identically 
note C 
 
@@ -37,7 +40,9 @@ test_expect_success setup '
test_tick  git commit -m Add another file 
note D 
 
-   test_tick  git merge -m merge master 
+   test_tick 
+   test_must_fail git merge -m merge master 
+   lost  git commit -a -m merge 
note E 
 
echo Yet another elif 
@@ -110,4 +115,16 @@ check_result 'I B A' -- file
 check_result 'I B A' --topo-order -- file
 check_result 'H' --first-parent -- another-file
 
+check_result 'E C B A' --full-history E -- lost
+test_expect_success 'full history simplification without parent' '
+   printf %s\n E C B A expect 
+   git log --pretty=$FMT --full-history E -- lost |
+   unnote actual 
+   sed -e s/^.*   \([^ ]*\) .*/\1/ check actual 
+   test_cmp expect check || {
+   cat actual
+   false
+   }
+'
+
 test_done
-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 6/8] simplify-merges: never remove all TREESAME parents

2013-04-30 Thread Kevin Bracey
When simplifying an odd merge, such as one that used -s ours, we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.

This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |  3 +-
 revision.c | 69 ++
 2 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 1dad341..e23bdb0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -471,7 +471,8 @@ history according to the following rules:
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
   the process, drop parents that are ancestors of other parents, and
-  remove duplicates.
+  remove duplicates, but take care to never drop all parents that
+  we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
diff --git a/revision.c b/revision.c
index c88ded8..7535757 100644
--- a/revision.c
+++ b/revision.c
@@ -2119,6 +2119,73 @@ static int mark_redundant_parents(struct rev_info *revs, 
struct commit *commit)
return marked;
 }
 
+/*
+ * Awkward naming - this means one parent we are TREESAME to.
+ * cf mark_treesame_root_parents: root parents that are TREESAME (to an
+ * empty tree). Better name suggestions?
+ */
+static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit 
*commit)
+{
+   struct treesame_state *ts = lookup_decoration(revs-treesame, 
commit-object);
+   struct commit *unmarked = NULL, *marked = NULL;
+   struct commit_list *p;
+   unsigned n;
+
+   for (p = commit-parents, n = 0; p; p = p-next, n++) {
+   if (ts-treesame[n]) {
+   if (p-item-object.flags  TMP_MARK) {
+   if (!marked)
+   marked = p-item;
+   } else {
+   if (!unmarked) {
+   unmarked = p-item;
+   break;
+   }
+   }
+   }
+   }
+
+   /*
+* If we are TREESAME to a marked-for-deletion parent, but not to any
+* unmarked parents, unmark the first TREESAME parent. This is the
+* parent that the default simplify_history==1 scan would have followed,
+* and it doesn't make sense to omit that path when asking for a
+* simplified full history. Retaining it improves the chances of
+* understanding odd missed merges that took an old version of a file.
+*
+* Example:
+*
+*   I-X   A modified the file, but mainline merge X used
+*\   /-s ours, so took the version from I. X is
+* `--A--' TREESAME to I and !TREESAME to A.
+*
+* Default log from X would produce I. Without this check,
+* --full-history --simplify-merges would produce I-A-X, showing
+* the merge commit X and that it changed A, but not making clear that
+* it had just taken the I version. With this check, the topology above
+* is retained.
+*
+* Note that it is possible that the simplification chooses a different
+* TREESAME parent from the default, in which case this test doesn't
+* activate, and we _do_ drop the default parent. Example:
+*
+*   I--X A modified the file, but it was reverted in B,
+*\/  meaning mainline merge X is TREESAME to both
+* A--B   parents.
+*
+* Default log would produce I by following the first parent;
+* --full-history --simplify-merges will produce I-A-B. But this is a
+* reasonable result - it presents a logical full history leading from
+* I to X, and X is not an important merge.
+*/
+   if (!unmarked  marked) {
+   marked-object.flags = ~TMP_MARK;
+   return 1;
+   }
+
+   return 0;
+}
+
 static int remove_marked_parents(struct rev_info *revs, struct commit *commit)
 {
struct commit_list **pp, *p;
@@ -,6 +2289,8 @@ static struct commit_list **simplify_one(struct rev_info 
*revs, struct commit *c
if (1  cnt) {
int marked = mark_redundant_parents(revs, commit);
if (marked)
+   marked -= leave_one_treesame_to_parent(revs, commit);
+   if (marked

[PATCH v2 1/8] decorate.c: compact table when growing

2013-04-30 Thread Kevin Bracey
When growing the table, take the opportunity to compact it by removing
entries with NULL decoration.

Users may have removed decorations by passing NULL to
insert_decoration. An object's table entry can't actually be removed
during normal operation, as it would break the linear hash collision
search. But we can remove NULL decoration entries when rebuilding the
table.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 decorate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/decorate.c b/decorate.c
index 2f8a63e..7cb5d29 100644
--- a/decorate.c
+++ b/decorate.c
@@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n)
const struct object *base = old_hash[i].base;
void *decoration = old_hash[i].decoration;
 
-   if (!base)
+   if (!decoration)
continue;
insert_decoration(n, base, decoration);
}
-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/8] rev-list-options.txt: correct TREESAME for P

2013-04-30 Thread Kevin Bracey
In the example given, P is not TREESAME to E. This doesn't affect the
current result, but it will matter when we change behaviour.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 3bdbf5e..50bbff7 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -367,8 +367,7 @@ each merge.  The commits are:
   `N` and `D` to foobarbaz; i.e., it is not TREESAME to any parent.
 
 * `E` changes `quux` to xyzzy, and its merge `P` combines the
-  strings to quux xyzzy.  Despite appearing interesting, `P` is
-  TREESAME to all parents.
+  strings to quux xyzzy.  `P` is TREESAME to `O`, but not to `E`.
 
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/8] History traversal refinements

2013-04-30 Thread Kevin Bracey
Okay, here's what I'll call v2 of this series. In the 3 parts from
before (4,6  7), I've addressed the comments from Junio and David,
corrected some errors, reconstructed the main commit message, and made
some adjustments in preparation for part 8.

New part 1 is just me making amends for writing NULL into decoration
and leaving cruft behind in part 4.

New part 2 expands the ancestry-path test - which is useful
because it's full of -s ours merges.

New part 3 has a little look at the TREESAME documentation bug -
maybe we should add Junio's little asterisk decorations.

Part 5 is Junio's test, in the correct place in the sequence. (Not
sure if it's valid to send that with git send-email - I'll find out).

And finally part 8 is a first attempt at the new UNINTERESTING/TREESAME
interaction logic. I'm pretty happy with the results it produces,
but it's an even more deep and scary change than the earlier
parts.

And we obviously need some more new tests - the effects of these changes
are almost non-existent on the pre-existing set. I'd like to beg for any
volunteers here - I'm not that proficient at shell scripting, and on
top of that something like this could really do with an independent
set of eyes checking that the claimed benefits actually match the
results. (And that the claims are understandable.)

Junio C Hamano (1):
  t6012: update test for tweaked full-history traversal

Kevin Bracey (7):
  decorate.c: compact table when growing
  t6019: test file dropped in -s ours merge
  rev-list-options.txt: correct TREESAME for P
  revision.c: Make --full-history consider more merges
  simplify-merges: never remove all TREESAME parents
  simplify-merges: drop merge from irrelevant side branch
  revision.c: discount UNINTERESTING parents

 Documentation/rev-list-options.txt |  38 ++--
 decorate.c |   2 +-
 revision.c | 453 +
 revision.h |   1 +
 t/t6012-rev-list-simplify.sh   |  31 ++-
 t/t6019-rev-list-ancestry-path.sh  |  37 ++-
 6 files changed, 494 insertions(+), 68 deletions(-)

-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 7/8] simplify-merges: drop merge from irrelevant side branch

2013-04-30 Thread Kevin Bracey
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.

Original log message follows.

The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially.  Namely, in a history with this
shape:

---o---o---M
  /
 x---x---x

where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.

That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.

Suggested-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 34 +-
 revision.c | 26 +-
 t/t6012-rev-list-simplify.sh   |  2 +-
 3 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index e23bdb0..b7fbc80 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -342,13 +342,13 @@ In the following, we will always refer to the same 
example history to
 illustrate the differences between simplification settings.  We assume
 that you are filtering for a file `foo` in this commit graph:
 ---
- .-A---M---N---O---P
-/ /   /   /   /
-   I B   C   D   E
-\   /   /   /   /
- `-'
+ .-A---M---N---O---P---Q
+/ /   /   /   /   /
+   I B   C   D   E   Y
+\   /   /   /   /   /
+ `-'   X
 ---
-The horizontal line of history A---P is taken to be the first parent of
+The horizontal line of history A---Q is taken to be the first parent of
 each merge.  The commits are:
 
 * `I` is the initial commit, in which `foo` exists with contents
@@ -369,6 +369,10 @@ each merge.  The commits are:
 * `E` changes `quux` to xyzzy, and its merge `P` combines the
   strings to quux xyzzy.  `P` is TREESAME to `O`, but not to `E`.
 
+* `X` is an indpendent root commit that added a new file `side`, and `Y`
+  modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
+  `Q` is TREESAME to `P`, but not to `Y`.
+
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
 (via '\--parents' or '\--children') are used.  The following settings
@@ -409,7 +413,7 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O  P
+   I  A  B  N  D  O  P  Q
 ---
 +
 `M` was excluded because it is TREESAME to both parents.  `E`,
@@ -430,7 +434,7 @@ Along each parent, prune away commits that are not included
 themselves.  This results in
 +
 ---
- .-A---M---N---O---P
+ .-A---M---N---O---P---Q
 / /   /   /   /
I B   /   D   /
 \   /   /   /   /
@@ -440,7 +444,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.
+`N`, and `X`, `Y` and `Q`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
@@ -470,9 +474,9 @@ history according to the following rules:
 * Set `C'` to `C`.
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
-  the process, drop parents that are ancestors of other parents, and
-  remove duplicates, but take care to never drop all parents that
-  we are TREESAME to.
+  the process, drop parents that are ancestors of other parents or that are
+  root commits TREESAME to an empty tree, and remove duplicates, but take care
+  to never drop all parents that we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
@@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing

[PATCH v2 4/8] revision.c: Make --full-history consider more merges

2013-04-30 Thread Kevin Bracey
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.

While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by -s
ours, then:

  git log -m -p --full-history -Schange file

would successfully locate change's addition but would not locate the
merge that resolved against it.

Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.

Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.

Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |   6 +-
 revision.c | 241 -
 revision.h |   1 +
 t/t6019-rev-list-ancestry-path.sh  |   2 +-
 4 files changed, 216 insertions(+), 34 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 50bbff7..1dad341 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -409,10 +409,10 @@ parent lines.
the example, we get
 +
 ---
-   I  A  B  N  D  O
+   I  A  B  N  D  O  P
 ---
 +
-`P` and `M` were excluded because they are TREESAME to a parent.  `E`,
+`M` was excluded because it is TREESAME to both parents.  `E`,
 `C` and `B` were all walked, but only `B` was !TREESAME, so the others
 do not appear.
 +
@@ -440,7 +440,7 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.  Note also that `P` was included despite being TREESAME.
+`N`.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
diff --git a/revision.c b/revision.c
index a67b615..c88ded8 100644
--- a/revision.c
+++ b/revision.c
@@ -429,10 +429,100 @@ static int rev_same_tree_as_empty(struct rev_info *revs, 
struct commit *commit)
return retval = 0  (tree_difference == REV_TREE_SAME);
 }
 
+struct treesame_state {
+   unsigned int nparents;
+   unsigned char treesame[FLEX_ARRAY];
+};
+
+static struct treesame_state *initialise_treesame(struct rev_info *revs, 
struct commit *commit)
+{
+   unsigned n = commit_list_count(commit-parents);
+   struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+   st-nparents = n;
+   add_decoration(revs-treesame, commit-object, st);
+   return st;
+}
+
+/*
+ * Must be called immediately after removing the nth_parent from a commit's
+ * parent list, if we are maintaining the per-parent treesame[] decoration.
+ * This does not recalculate the master TREESAME flag - update_treesame()
+ * should be called to update it after a sequence of treesame[] modifications
+ * that may have affected it.
+ */
+static int compact_treesame(struct rev_info *revs, struct commit *commit, 
unsigned nth_parent)
+{
+   struct treesame_state *st;
+   int old_same;
+
+   if (!commit-parents) {
+   /*
+* Have just removed the only parent from a non-merge.
+* Different handling, as we lack decoration.
+*/
+   if (nth_parent != 0)
+   die(compact_treesame %u, nth_parent);
+   old_same = !!(commit-object.flags  TREESAME);
+   if (rev_same_tree_as_empty(revs, commit))
+   commit-object.flags |= TREESAME;
+   else
+   commit-object.flags = ~TREESAME;
+   return old_same;
+   }
+
+   st = lookup_decoration(revs-treesame, commit-object);
+   if (!st || nth_parent = st-nparents)
+   die(compact_treesame %u, nth_parent);
+
+   old_same = st-treesame[nth_parent];
+   memmove(st-treesame + nth_parent,
+   st-treesame + nth_parent + 1,
+   st-nparents

[PATCH v2 8/8] revision.c: discount UNINTERESTING parents

2013-04-30 Thread Kevin Bracey
The simplification and rewriting logic previously paid little heed to
whether parents were UNINTERESTING, leading to situations where limited
histories could unnecessarily include a lot of irrelevant merges along
the boundary. Tighten up the rules to properly account for limited
lists:

1) If a merge has INTERESTING parents, and it is TREESAME to them, then
do not let UNINTERESTING parents cause the merge to be treated as
!TREESAME. If we match our walked parents, we don't care if we don't
match unwalked parents.

2) Do not let UNINTERESTING parents prevent commits from being
simplified or omitted: merges with exactly 1 INTERESTING parent that
they are TREESAME to can be treated as a non-merge commit.

3) When rewriting parents, we don't need to show all merges - only merges
with 2 or more INTERESTING parents are required to hold topology together.

These changes greatly increase simplification in pruned, limited
revision lists - irrelevant merges from unlisted or partially listed
side branches can be omitted.

These rules paying more attention to UNINTERESTING do add a tricky
wrinkle to behaviour. Because limited revision lists are conventionally
expressed as A..B (ie B ^A), the bottom commit is UNINTERESTING. Thus
its connection to the INTERESTING graph is not privileged over side
branches, and this can lead to its first descendant merge being shown
for no particularly good reason.

See t6019's --ancestry-path G..M -- G.t for an example of this effect.
Merges H and J are semantically identical and equally irrelevant, from
the point of view of tracking the history of G.t, but H is shown and J
isn't. Bottom commit G is marked UNINTERESTING, and thus isn't
privileged over E, so H is shown because it differs from E.  Whereas
higher up the graph, I is INTERESTING and thus privileged over F, so we
don't care that J differs from F.

So should we treat bottom commits as interesting for the rules above?

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 revision.c| 141 --
 t/t6019-rev-list-ancestry-path.sh |  20 --
 2 files changed, 134 insertions(+), 27 deletions(-)

diff --git a/revision.c b/revision.c
index 20c7058..aa814bc 100644
--- a/revision.c
+++ b/revision.c
@@ -333,6 +333,45 @@ static int everybody_uninteresting(struct commit_list 
*orig)
 }
 
 /*
+ * Compute whether we have only one interesting parent. If we are TREESAME,
+ * and this returns a parent, then we can safely simplify to that parent.
+ *
+ * For 1-parent commits, or if first-parent-only, then this returns the only
+ * parent (whether UNINTERESTING or not, that is our interesting parent).
+ * TREESAME will have been set purely on that parent.
+ *
+ * For multi-parent commits, identify a sole INTERESTING parent, if any.
+ * If we have only 1 INTERESTING parent, then TREESAME will be set purely
+ * with regard to that parent, and we can choose simplification appropriately.
+ *
+ * If we have more than one INTERESTING parent, or no INTERESTING parents
+ * (and multiple UNINTERESTING ones), then we can't choose a parent to follow,
+ * and we should not be simplified.
+ */
+struct commit *sole_interesting(struct rev_info *revs, struct commit_list 
*orig)
+{
+   struct commit_list *list = orig;
+   struct commit *interesting = NULL;
+
+   if (!orig)
+   return NULL;
+
+   if (revs-first_parent_only || !orig-next)
+   return orig-item;
+
+   while (list) {
+   struct commit *commit = list-item;
+   list = list-next;
+   if (!(commit-object.flags  UNINTERESTING)) {
+   if (interesting)
+   return NULL;
+   interesting = commit;
+   }
+   }
+   return interesting;
+}
+
+/*
  * The goal is to get REV_TREE_NEW as the result only if the
  * diff consists of all '+' (and no other changes), REV_TREE_OLD
  * if the whole diff is removal of old data, and otherwise
@@ -502,27 +541,53 @@ static unsigned update_treesame(struct rev_info *revs, 
struct commit *commit)
if (commit-parents  commit-parents-next) {
unsigned n;
struct treesame_state *st;
+   struct commit_list *p;
+   unsigned interesting_parents;
+   unsigned uninteresting_change, interesting_change;
 
st = lookup_decoration(revs-treesame, commit-object);
if (!st)
die(update_treesame %s, 
sha1_to_hex(commit-object.sha1));
-   commit-object.flags |= TREESAME;
-   for (n = 0; n  st-nparents; n++) {
-   if (!st-treesame[n]) {
-   commit-object.flags = ~TREESAME;
-   break;
+   interesting_parents = 0;
+   uninteresting_change = interesting_change = 0;
+   for (p = commit-parents, n = 0; p; n++, p = p-next

[PATCH v2 2/8] t6019: test file dropped in -s ours merge

2013-04-30 Thread Kevin Bracey
In preparation for upcoming TREESAME work, check the result for G.t,
which is dropped in -s ours merge L. The default rev-list is empty, as
expected - it follows the first parent path where it never existed.

Unfortunately, --ancestry-path is also empty. Merges H J and L are all
TREESAME to 1 parent, so are treated as TREESAME and not shown. This is
clearly undesirable in the case of merge L, which dropped our G.t by
taking the non-ancestry-path version. Document this as a known failure,
and expect H J L, the 3 merges along the path that had to choose
between G.t versions.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 t/t6019-rev-list-ancestry-path.sh | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/t/t6019-rev-list-ancestry-path.sh 
b/t/t6019-rev-list-ancestry-path.sh
index 39b4cb0..86ef908 100755
--- a/t/t6019-rev-list-ancestry-path.sh
+++ b/t/t6019-rev-list-ancestry-path.sh
@@ -13,6 +13,9 @@ test_description='--ancestry-path'
 #
 #  D..M -- M.t == M
 #  --ancestry-path D..M -- M.t == M
+#
+#  G..M -- G.t == [nothing - was dropped in -s ours merge L]
+#  --ancestry-path G..M -- G.t == H J L
 
 . ./test-lib.sh
 
@@ -63,13 +66,29 @@ test_expect_success 'rev-list D..M -- M.t' '
test_cmp expect actual
 '
 
-test_expect_success 'rev-list --ancestry-patch D..M -- M.t' '
+test_expect_success 'rev-list --ancestry-path D..M -- M.t' '
echo M expect 
git rev-list --ancestry-path --format=%s D..M -- M.t |
sed -e /^commit /d actual 
test_cmp expect actual
 '
 
+# G.t is dropped in an -s ours merge
+test_expect_success 'rev-list G..M -- G.t' '
+   expect 
+   git rev-list --format=%s G..M -- G.t |
+   sed -e /^commit /d actual 
+   test_cmp expect actual
+'
+
+test_expect_failure 'rev-list --ancestry-path G..M -- G.t' '
+   for c in H J L; do echo $c; done expect 
+   git rev-list --ancestry-path --format=%s G..M -- G.t |
+   sed -e /^commit /d |
+   sort actual 
+   test_cmp expect actual
+'
+
 #   b---bc
 #  / \ /
 # a   X
-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges

2013-04-29 Thread Kevin Bracey

On 28/04/2013 21:38, Junio C Hamano wrote:



   @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct
commit_list *bottom, struct commit_list *li
 * NEEDSWORK: decide if we want to remove parents that are
 * not marked with TMP_MARK from commit-parents for commits
 * in the resulting list.  We may not want to do that, though.
+*
+* Maybe it should be considered if we are TREESAME to such
+* parents - now possible with stored per-parent flags.
 */

Hmm, that is certainly a thought.

My comment's wrong though. Reconsidering, what I think needs removing
is actually off-ancestry parents that we are !TREESAME to, when we are
TREESAME on the ancestry path.

I thought I read you meant exactly that, i.e. !TREESAME, but now I
re-read what is quoted, you did say we are TREESAME ;-).  I think
I agree with you that we do not want any side branch that is not on
the ancestry path we are interested in to affect the sameness
assigned to the merge commit.


I did a trial implementation of this in limit_to_ancestry(), and the 
result was lovely, but in the end I decided it's not actually the right 
place to do it. The logic is more general than that; this isn't just an 
ancestry-path issue, and I think hiding parents isn't the right way to 
go about it anyway.


To slightly generalise your own wording: I think the rule is we do not 
want any side branch that is UNINTERESTING to affect the sameness 
assigned to the merge commit. I think that rule applies to all dense, 
pruned modes.


Having experimented with some of the annoyingly complex merge paths that 
originally prompted this series, it seems this rule makes a huge 
difference, and it's useful whether asking --simplify-merges A..B 
file or --ancestry-path A..B file.


At present, either query will show lots of really boring merge commits 
of topic branches at the boundary, with 1 INTERESTING parent that 
they're TREESAME too, and 1 UNINTERESTING parent that they may or may 
not be TREESAME to, depending on how old the base of that topic branch 
was. Most such commits are of no relevance to our history whatsoever. In 
the case of --simplify-merges, the fact that they're UNINTERESTING 
actually _prevented_ their simplification - if it had been allowed to 
follow the UNINTERESTING path back further, it would have reached an 
ancestor, and been found redundant. So limiting the rev-list actually 
increases the number of merges shown.


We can lose all those boring commits with these two changes:

1) Previously TREESAME was defined as this commit matches at least 1 
parent. My first patch changes it to this commit matches all parents. 
It should be refined further to this commit matches all INTERESTING 
parents, if it has any, else all (UNINTERESTING) parents. (Can we word 
that better?) Note that this fancy rule collapses to the same 
straightforward TREESAME check as ever for 0- or 1-parent commits.


2) simplify_merges currently will not simplify commits unless they have 
exactly 1 parent. That's not what we want. We only need to preserve 
commits that don't have exactly 1 INTERESTING parent.


Those 2 rules produce the desirable result: if we have a merge commit 
with exactly 1 INTERESTING parent it is TREESAME to, it is always 
simplified away - any other UNINTERESTING parents it may have did not 
affect our code, so we don't care about whether we were TREESAME to them 
or not, and as we don't want to see any of the UNINTERESTING parents 
themselves, the merge is not worth showing.


This makes a massive difference on some of my searches, reducing the 
total commits shown by a factor of 5 to 10, greatly improving the 
signal-to-noise ratio.


I'll put together a trial patch at the end of the next iteration of the 
series that implements this logic. I need to think a bit more - I think 
get_commit_action needs a similar INTERESTING check for merges too, to 
get the same sort of effect without relying on simplify_merges. Parent 
rewriting shouldn't necessitate keeping all merges - only merges with 2+ 
INTERESTING parents.




   *   *
   .-A---M---N---O---P
  /*/   /*  /*  /*
 I B   C   D   E
  \   /*  /   /*  /
   `-'

I've added '*' next to each arc between a commit-pair whose contents
at 'foo' are different to the illustration, following the set-up the
manual describes.  E is the same as I for 'foo' and P would resolve
'foo' to be the same as O.


I think that sort of thing could be a useful patch to the docs.



Given this error, and this change, I think this example may want a
slight rethink. Do we want a proper messing with other paths but
TREESAME merge example? Say if E's parent was O, P would not be
TREESAME and not included in --full-history.

I am not sure if I follow your last sentence.

Do you mean this topology, where E's sole parent is O, i.e.

   E
  / \
N---O---P
/*
   

Re: [RFC/PATCH 2/3] simplify-merges: never remove all TREESAME parents

2013-04-28 Thread Kevin Bracey

On 28/04/2013 02:02, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


In the event of an odd merge, we may find ourselves TREESAME to
apparently redundant parents. Prevent simplify_merges() from removing
every TREESAME parent - in the event of such a merge it's useful to see
where we came actually from came.


came from :)



@@ -2106,8 +2106,32 @@ static int remove_marked_parents(struct rev_info *revs, 
struct commit *commit)
  {
struct treesame_state *ts = lookup_decoration(revs-treesame, 
commit-object);
struct commit_list **pp, *p;
+   struct commit *su = NULL, *sm = NULL;

What do su and sm stand for?


same, unmarked and same, marked were what was in my head.



Could you explain here a bit more the reason why we do not want to
remove them and why -s ours is so significant that it deserves to
be singled out?  And why randomly picking one that is redundant
(because it is an ancestor of some other parent) is an improvement?


I feel it's consistent with the default non-full-history behaviour. The 
parent that we choose not to remove is the  same one that the default 
log with simplify_history==1 would have followed: the first parent we 
are TREESAME to. Or at least that's the intent. So this parent would 
normally be singled out, and it's not an arbitrary (or random) choice.


It feels wrong to me that --full-history --simplify-merges could produce 
a disjoint history from the default. I think it should include the 
default simple history. If this code triggered, it should produce 
something quite distinctive in a graphical view, which will help find 
odd/broken merges.




The remove-redundant logic first marks commits that are ancestors
of other commits in the original set, without taking treesame[] into
account at all.  If the final objective of the code is to keep paths
that consists of non-treesame[] commits, perhaps the logic needs to
be changed to reject non-treesame[] commits that are ancestors of
other non-treesame[] commits, or something?


Maybe - the main simplify_merges machinery could certainly make use of 
the treesame[] information, and one of the motivations for introducing 
it was to open up the possibility of richer analysis. I'll spend a 
little time to think about whether we want a more fundamental change to 
the original scan.


But this patch as it stands was an easy change to make with clearly 
limited scope and relatively little risk - I specifically wanted just to 
include our default simple parent.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges

2013-04-28 Thread Kevin Bracey

On 28/04/2013 01:36, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


Historically TREESAME was set on a commit if it was TREESAME to _any_ of
its parents. This is not optimal, as such a merge could still be worth
showing, particularly if it is an odd -s ours merge that (possibly
accidentally) dropped a change.

... and with options like --full-history or --simplify-merges are
used to get more complete history, I think.  git log path without
these options is a tool to get one version of simplified history
that explains the end result, and by definition, the side branch
merged by -s ours did _not_ contribute anything to the end result.


Yeah, I'm not happy with this commit message - I knocked it up 
separately from my first pass, which I didn't have to hand. Next version 
will combine it with the original, which better distinguished the 
default mode, and specifically addressed the --full-history -S search 
problem.


That's key - that I really want such searches to be able to track the 
entire life of a change on a side branch, not potentially showing just 
its birth as now, but also always including any ultimate merge death. (I 
think that we may be able to refine --ancestry-path to give an even 
tighter pinpoint, but --full-history should definitely include the 
information, as per its name).




Do we want to discard the decoration data when the commit becomes a
non-merge?


Would seem reasonable, and would also help make concrete why we update 
TREESAME immediately, and not in update_treesame(), but I didn't spot a 
mechanism to discard decoration. I'll recheck.





+   commit-object.flags |= TREESAME;
+   for (n = 0; n  st-nparents; n++) {
+   if (!st-treesame[n]) {
+   commit-object.flags = ~TREESAME;
+   break;
+   }
+   }

Can a commit that earlier was marked as TREESAME become not TREESAME?
Wouldn't simplification only increase sameness, never decrease?


That's true - I paid attention to that earlier when it really mattered 
due to the cost of recalculating it with try_to_simplify_commit(). Not 
sure that it matters so much any more, and I don't see how we can use 
that information to change this scan for !treesame loop.


I could insert an if (!commit-object.flags  TREESAME) test to skip 
the entire update. I'd be inclined to do that as the caller of 
update_treesame(). I think update_treesame() itself should be 
general-purpose without assumptions about what changes have been made, 
so it's a pure treesame[]-TREESAME calculation, without TREESAME as an 
input.


(Aside - just occurred to me we could swap the loop for 
strlen(st-treesame) == st-nparents, if we kept a zero terminator in 
the array. Maybe a bit too smart-ass?)






+   for (pp = commit-parents;
+(parent = *pp) != NULL;
+pp = parent-next, nth_parent++) {

I see the reason to change from while to for is because you wanted
to count, and I think it makes sense; but it is more readable to
initialise the counter here, too, if that is the case. I.e.

for (pp = commit-parents, nth_parent = 0;
 !(parent = *pp);
 pp = parent-next, nth_parent++) {


Agree on nth_parent, but  !(parent = *pp)  isn't (parent = *pp) != 
NULL, mind. Did you mean !!? In which case I still prefer it my way.




+   if (!tree_changed)
+   ts-treesame[0] = 1;
Have we made any two tree comparison at this point to set this one?
Ahh, this is tricky.  You do this in the _second_ iteration of the
loop, so tree_changed here is from inspecting the first parent, not
the one we are looking at (i.e. *p).


Yes, this is the we've reached our second iteration, so from now on 
we're dealing a merge if {} block. I'll clarify this in the comment at 
the top, and note that we're populating the newly-allocated treesame[] 
from our first iteration.




  
@@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li

 * NEEDSWORK: decide if we want to remove parents that are
 * not marked with TMP_MARK from commit-parents for commits
 * in the resulting list.  We may not want to do that, though.
+*
+* Maybe it should be considered if we are TREESAME to such
+* parents - now possible with stored per-parent flags.
 */

Hmm, that is certainly a thought.


My comment's wrong though. Reconsidering, what I think needs removing is 
actually off-ancestry parents that we are !TREESAME to, when we are 
TREESAME on the ancestry path.


I've realised while testing this that there's been one thing that's 
confused me repeatedly, and I think this comment was an example of it. 
The example in the rev-list-options manual is wrong.


  .-A---M---N---O---P
 / /   /   /   /
I B   C   D   E

Re: [RFC/PATCH] Make --full-history consider more merges

2013-04-26 Thread Kevin Bracey

On 25/04/2013 21:19, Junio C Hamano wrote:
How many decorations are we talking about here? One for each merge 
commit in the entire history? Do we have a cue that can tell us when 
we are really done with a commit that lets us discard the associated 
data as we go on digging, keeping the size of our working set somewhat 
bounded, perhaps proportional to the number of commits in our 
rev_info-commits queue? 


As it stands, it's one decoration per interesting merge commit that's 
processed by try_to_simplify_commit(), if simplify_merges is set. (Only 
simplify_merges needs the information at present - conceivably 
limit_to_ancestry could use it too). I don't know how to go about 
discarding the data, or when it could be done - I'm not clear enough on 
the wider sequencing machinery in revision.c yet.


I have a first draft of a set of patches, which will follow this message.

1/3 addresses the initial get simplify_merges right, passing your 
test. That patch is written to make it easy to extend with the next two, 
which add a never eliminate all our TREESAME parents rule, and 
reinstate your revertedeliminate irrelevant side branches. The error I 
spotted in that was that you weren't checking that you were eliminating 
root commits - with that fixed it seems sound to me.


I need to create a new test for patch 2, but this version passes the 
full test suite, including fixing the known failure in t6012, and it 
also gets the examples in the manual right. (I've extended the examples 
to include the irrelevant side branch - not sure this is worthwhile, as 
it's such an unusual case, and I think that section is confusing enough 
for newbies...)


Kevin




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges

2013-04-26 Thread Kevin Bracey
Historically TREESAME was set on a commit if it was TREESAME to _any_ of
its parents. This is not optimal, as such a merge could still be worth
showing, particularly if it is an odd -s ours merge that (possibly
accidentally) dropped a change. And the problem is further compounded by
the fact that simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.

This commit redefines TREESAME to be true only if a commit is TREESAME to
_all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes.

It also modifies simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |   2 +-
 revision.c | 220 -
 revision.h |   1 +
 3 files changed, 192 insertions(+), 31 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 3bdbf5e..380db48 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -413,7 +413,7 @@ parent lines.
I  A  B  N  D  O
 ---
 +
-`P` and `M` were excluded because they are TREESAME to a parent.  `E`,
+`P` and `M` were excluded because they are TREESAME to both parents.  `E`,
 `C` and `B` were all walked, but only `B` was !TREESAME, so the others
 do not appear.
 +
diff --git a/revision.c b/revision.c
index a67b615..176eb7b 100644
--- a/revision.c
+++ b/revision.c
@@ -429,10 +429,85 @@ static int rev_same_tree_as_empty(struct rev_info *revs, 
struct commit *commit)
return retval = 0  (tree_difference == REV_TREE_SAME);
 }
 
+struct treesame_state {
+   unsigned int nparents;
+   unsigned char treesame[FLEX_ARRAY];
+};
+
+static struct treesame_state *initialise_treesame(struct rev_info *revs, 
struct commit *commit)
+{
+   unsigned n = commit_list_count(commit-parents);
+   struct treesame_state *st = xcalloc(1, sizeof(*st) + n);
+   st-nparents = n;
+   add_decoration(revs-treesame, commit-object, st);
+   return st;
+}
+
+static int compact_treesame(struct rev_info *revs, struct commit *commit, 
unsigned parent)
+{
+   struct treesame_state *st = lookup_decoration(revs-treesame, 
commit-object);
+   int old_same;
+
+   if (!st || parent = st-nparents)
+   die(compact_treesame %u, parent);
+
+   /* Can be useful to indicate sameness of removed parent */
+   old_same = st-treesame[parent];
+
+   memmove(st-treesame + parent,
+   st-treesame + parent + 1,
+   st-nparents - parent - 1);
+
+   /* If we've just become a non-merge commit, update TREESAME
+* immediately, in case we trigger an early-exit optimisation.
+* (Note that there may be a mismatch between commit-parents and the
+* treesame_state at this stage, as we may be in mid-rewrite).
+* If still a merge, defer update until update_treesame().
+*/
+   switch (--st-nparents) {
+   case 0:
+   if (rev_same_tree_as_empty(revs, commit))
+   commit-object.flags |= TREESAME;
+   else
+   commit-object.flags = ~TREESAME;
+   break;
+   case 1:
+   if (st-treesame[0]  revs-dense)
+   commit-object.flags |= TREESAME;
+   else
+   commit-object.flags = ~TREESAME;
+   break;
+   }
+
+   return old_same;
+}
+
+static unsigned update_treesame(struct rev_info *revs, struct commit *commit)
+{
+   /* If 0 or 1 parents, TREESAME should be valid already */
+   if (commit-parents  commit-parents-next) {
+   int n = 0;
+   struct treesame_state *st;
+
+   st = lookup_decoration(revs-treesame, commit-object);
+   if (!st) die(update_treesame);
+   commit-object.flags |= TREESAME;
+   for (n = 0; n  st-nparents; n++) {
+   if (!st-treesame[n]) {
+   commit-object.flags = ~TREESAME;
+   break;
+   }
+   }
+   }
+
+   return commit-object.flags  TREESAME;
+}
+
 static void try_to_simplify_commit(struct rev_info *revs, struct commit 
*commit)
 {
struct commit_list **pp, *parent;
-   int tree_changed = 0, tree_same = 0, nth_parent = 0;
+   struct treesame_state *ts = NULL;
+   int tree_changed = 0

[RFC/PATCH 2/3] simplify-merges: never remove all TREESAME parents

2013-04-26 Thread Kevin Bracey
In the event of an odd merge, we may find ourselves TREESAME to
apparently redundant parents. Prevent simplify_merges() from removing
every TREESAME parent - in the event of such a merge it's useful to see
where we came actually from came.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt |  3 ++-
 revision.c | 24 
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 380db48..0832004 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -472,7 +472,8 @@ history according to the following rules:
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
   the process, drop parents that are ancestors of other parents, and
-  remove duplicates.
+  remove duplicates, but take care to never drop all parents that
+  we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME, it remains.
diff --git a/revision.c b/revision.c
index 176eb7b..4e27c9a 100644
--- a/revision.c
+++ b/revision.c
@@ -2106,8 +2106,32 @@ static int remove_marked_parents(struct rev_info *revs, 
struct commit *commit)
 {
struct treesame_state *ts = lookup_decoration(revs-treesame, 
commit-object);
struct commit_list **pp, *p;
+   struct commit *su = NULL, *sm = NULL;
int n, removed = 0;
 
+   /* Prescan - look for both marked and unmarked TREESAME parents */
+   for (p = commit-parents, n = 0; p; p = p-next, n++) {
+   if (ts-treesame[n]) {
+   if (p-item-object.flags  TMP_MARK) {
+   if (!sm) sm = p-item;
+   }
+   else {
+   if (!su) {
+   su = p-item;
+   break;
+   }
+   }
+   }
+   }
+
+   /* If we are TREESAME to a marked-for-deletion parent, but not to any
+* unmarked parents, unmark the first TREESAME parent. We don't want
+* to remove our real parent in the event of an -s ours type
+* merge.
+*/
+   if (!su  sm)
+   sm-object.flags = ~TMP_MARK;
+
pp = commit-parents;
n = 0;
while ((p = *pp) != NULL) {
-- 
1.8.2.1.632.gd2b1879

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH 3/3] simplify-merges: drop merge from irrelevant side branch

2013-04-26 Thread Kevin Bracey
Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.

Original log message follows.

The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially.  Namely, in a history with this
shape:

---o---o---M
  /
 x---x---x

where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.

That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.

Suggested-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/rev-list-options.txt | 35 ++-
 revision.c | 26 +-
 t/t6012-rev-list-simplify.sh   |  2 +-
 3 files changed, 48 insertions(+), 15 deletions(-)

diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 0832004..db45401 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -342,13 +342,13 @@ In the following, we will always refer to the same 
example history to
 illustrate the differences between simplification settings.  We assume
 that you are filtering for a file `foo` in this commit graph:
 ---
- .-A---M---N---O---P
-/ /   /   /   /
-   I B   C   D   E
-\   /   /   /   /
- `-'
+ .-A---M---N---O---P---Q
+/ /   /   /   /   /
+   I B   C   D   E   Y
+\   /   /   /   /   /
+ `-'   X
 ---
-The horizontal line of history A---P is taken to be the first parent of
+The horizontal line of history A---Q is taken to be the first parent of
 each merge.  The commits are:
 
 * `I` is the initial commit, in which `foo` exists with contents
@@ -370,6 +370,10 @@ each merge.  The commits are:
   strings to quux xyzzy.  Despite appearing interesting, `P` is
   TREESAME to all parents.
 
+* `X` is an indpendent root commit that added a new file `side`, and `Y`
+  modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and
+  `Q` is TREESAME to all parents.
+
 'rev-list' walks backwards through history, including or excluding
 commits based on whether '\--full-history' and/or parent rewriting
 (via '\--parents' or '\--children') are used.  The following settings
@@ -413,7 +417,7 @@ parent lines.
I  A  B  N  D  O
 ---
 +
-`P` and `M` were excluded because they are TREESAME to both parents.  `E`,
+`Q`, `P` and `M` were excluded because they are TREESAME to both parents.  `E`,
 `C` and `B` were all walked, but only `B` was !TREESAME, so the others
 do not appear.
 +
@@ -431,7 +435,7 @@ Along each parent, prune away commits that are not included
 themselves.  This results in
 +
 ---
- .-A---M---N---O---P
+ .-A---M---N---O---P---Q
 / /   /   /   /
I B   /   D   /
 \   /   /   /   /
@@ -441,7 +445,8 @@ themselves.  This results in
 Compare to '\--full-history' without rewriting above.  Note that `E`
 was pruned away because it is TREESAME, but the parent list of P was
 rewritten to contain `E`'s parent `I`.  The same happened for `C` and
-`N`.  Note also that `P` was included despite being TREESAME.
+`N`, and `X`, `Y` and `Q`.  Note also that `P` and `Q` were included despite
+being TREESAME.
 
 In addition to the above settings, you can change whether TREESAME
 affects inclusion:
@@ -471,9 +476,9 @@ history according to the following rules:
 * Set `C'` to `C`.
 +
 * Replace each parent `P` of `C'` with its simplification `P'`.  In
-  the process, drop parents that are ancestors of other parents, and
-  remove duplicates, but take care to never drop all parents that
-  we are TREESAME to.
+  the process, drop parents that are ancestors of other parents or that are
+  root commits TREESAME to an empty tree, and remove duplicates, but take care
+  to never drop all parents that we are TREESAME to.
 +
 * If after this parent rewriting, `C'` is a root or merge commit (has
   zero or 1 parents), a boundary commit, or !TREESAME

Re: [RFC/PATCH] Make --full-history consider more merges

2013-04-25 Thread Kevin Bracey

On 25/04/2013 04:59, Junio C Hamano wrote:

Junio C Hamano gits...@pobox.com writes:


So, given all that, revised patch below:

I tried to squeeze the minimum test I sent $gmane/220919 to the test
suite.  I think the do not use --parents option for this test
switch needs to be cleaned up a bit more, but it fails without your
patch and does pass with your patch.

I somehow was hoping that your fix to TREESAME semantics would also
correct the known breakage documented in that test, but it seems
that I was too greedy ;-)
Thanks for the test addition. Maybe we will be able to satisfy your 
greed in this series. There could be more worth doing here, and I think 
getting TREESAME precise is key.


I think I do want to take the step of storing treesame per parent. And 
once we do that, as well as avoiding the expensive re-diff, we have much 
richer information readily available as a simplification input (and output).


I'm working on a patch that does this - filling in an initial treesame[] 
array as a decoration in try_to_simplify_commit() is easy, and 
maintaining the array through later parent rewrites isn't as onerous as 
I feared - there are only a few places that rewrite parents after the 
initial scan. With a couple of helper functions to do things like 
delete nth, I think it'll be quite tidy.


I believe that simplify_merges itself needs at least one addition, and 
could use the treesame[] array to do it: if after doing reduce_heads, a 
commit is now different to all remaining parents, but there was a 
TREESAME parent eliminated, that parent should be reinstated. That would 
clearly highlight missed merges, showing both that older TREESAME 
parent and the newer !TREESAME parent that would have been taken in a 
normal merge.


And maybe there's more simplify_merges could do, if it had this full 
TREESAME information available.


(But even after you do all this stuff to get the right commits out, we 
then hit a niggle of mine that gitk forces --cc diffs - even if it shows 
shows the offending merge commit, you can't get it to do a diff...)


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] Make --full-history consider more merges

2013-04-25 Thread Kevin Bracey

On 25/04/2013 19:51, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


Thanks for the test addition. Maybe we will be able to satisfy your
greed in this series. There could be more worth doing here, and I
think getting TREESAME precise is key.

It is perfectly fine to do things one step at a time.  Let's get
the --full-history change into a good shape first and worry about
the more complex case after we are done.


So do you see the rerun of try_to_simplify_commit() as acceptable? I'm 
really not happy with it - it was fine for an initial proof-of-concept, 
but it's an obvious waste of CPU cycles to recompute something we 
already figured out, and I'm uncomfortable with the fact that the 
function potentially does more than just compute TREESAME; by inspection 
it seems safe given the known context inside simplify_merges(), but it 
feels like something waiting to trip us up. The latter could be dealt 
with by breaking try_to_simplify_commit() up, but that feels like a 
diversion. I'd rather just get on and make this first patch store and 
use the per-parent-treesame flags if feasible.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH] Make --full-history consider more merges

2013-04-23 Thread Kevin Bracey
On 22/04/2013 22:49, Junio C Hamano wrote:

 So in addition to have some change and there is no same parent
 case, under _some_ condition we avoid marking a merge not worth
 showing (i.e. TREESAME) if there is any change.

 And the condition is !simplify_history and !simplify_merges, which
 would cover --full-history, but I am not sure if requiring
 !simplify_merges is correct.

You're right, it isn't correct. Logically, the test should be for
simplify_history. The original patch was overly cautious about
limiting it to --full-history. And it avoided breaking a test -
specifically t6016.7. simplify_merges() should just be post-
processing full history with parents as you say, not changing its
behaviour.

I've dug into this more, and am now fairly confident about my analysis.
I believe the problem arises because TREESAME's definition was
conflated with the default simplification behaviour, unnecessarily.
Merges were defined as TREESAME if they matched ANY parent.

That's fine if simplify_history is set, but then in that case a merge
which is TREESAME to any parent is turned into a normal commit based
on that parent anyway, and that simplified commit is inherently
TREESAME, regardless of how we define TREESAME for merges.

If simplify_history isn't set, and we don't want ancestry, then that
TREESAME definition is problematic. It avoids showing merges that
don't have anything actually new, but does lead to this issue that
git log -Sxxx --full-history can't locate a dropped change.

If simplify_history is set, and we do want ancestry, then it doesn't
matter about the TREESAME definition because it shows all merges,
regardless of the TREESAME flag. Thus adding --parents to the above
command means it can find it, but only because it drags _every_ merge
into consideration. Should that be necessary?

Futher, if we add simplify_merges, then TREESAME becomes a problem.
After merge simplification, we can be left with a single-parent
commit that doesn't match its parent but is skipped due to its
TREESAME flag being set: it was TREESAME to a parent that got dropped.

I think the correction is that TREESAME for a commit needs to be
redefined to be this commit matches all its (remaining) parents,
rather than this commit matches at least 1 of its (remaining)
parents. And when we eliminate parents, we have to check for the
flag changing, but we should have been doing that anyway.

Now, what effect does post-processing have on TREESAME? Parent
rewriting due to elimination of duplicates, and making dense has no
effect, as there we're always replacing a commit with something it
was TREESAME to, so that doesn't affect TREESAMEness of its children.

But reduce_heads() in simplify_merges() can have an effect. In the
example shown in the comment, X may or may not be TREESAME to its
remaining single parent. And test 6016.7 covers one instance of
that - reduce_heads() eliminates branches B and C, because they
didn't touch bar.txt, and the rewritten parents were then ancestors
of the A path. But that that drop ancestor paths logic totally
ignores TREESAMEness, and we don't know whether the merge kept
the bar.txt from A, or eliminated it by an -s ours take of B.

In the 6016.7 case, the merge was originally marked TREESAME, as it
matched A. When B and C were eliminated, it remained TREESAME, and
this wasn't recalculated. And that happened to be correct in this
case, because the merge was normal.

If on the other hand, the merge had taken the old version B with
-s ours, or if there had been some other semantic change arising
in the merge, then simplify_merges would still have eliminated
B and C, and left a normal commit differing from parent A, but still
marked TREESAME from having matched B. So that would have wrongly
hidden the merge. Recalculation is already necessary to spot the
odd case.

If we redefine TREESAME, then recalculation becomes necessary to
handle the normal case - the merge will not initially be TREESAME,
as it differs from B and C. But once B and C are eliminated, we
should then determine that the commit is TREESAME from matching
its remaining parent A. Simplification should only ever increase
the sameness, so we can avoid recalculation if TREESAME is already
set. Without this recalculation, 6016.7 fails.

So I believe that if reduce_heads() does eliminate any parents, and
TREESAME wasn't set, then we have to recalculate TREESAME, by
running over the remaining parents. A call to
try_to_simplify_commit() does the job, but it feels hacky, and
obviously it's slow. It would be nice if we could remember
per-parent TREESAME flags, and shuffle them when we change
parents, but that could be fiddly.

Also if limit_to_ancestry() did eliminate parents from outside the
ancestry, as per the NEEDSWORK comment, then that would also want
to do the TREESAME recalculation. (And it could conceivably be very
useful, helping pick up only the merges that went against your base
commit).

So, given all that, revised patch below:

---
 

[PATCH] cherry-pick/revert: make usage say 'commit-ish...'

2013-04-22 Thread Kevin Bracey
The usage string for cherry-pick and revert has never been updated to
reflect their ability to handle multiple commits. Other documentation is
already correct.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 builtin/revert.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/revert.c b/builtin/revert.c
index c5e36b9..0401fdb 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -19,13 +19,13 @@
  */
 
 static const char * const revert_usage[] = {
-   N_(git revert [options] commit-ish),
+   N_(git revert [options] commit-ish...),
N_(git revert subcommand),
NULL
 };
 
 static const char * const cherry_pick_usage[] = {
-   N_(git cherry-pick [options] commit-ish),
+   N_(git cherry-pick [options] commit-ish...),
N_(git cherry-pick subcommand),
NULL
 };
-- 
1.8.2.255.g39c5835

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH] Make --full-history consider more merges

2013-04-22 Thread Kevin Bracey
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.

While the desired default behaviour, this could be extremely unhelpful
when searching detailed history, and could not be overridden. For
example, if a merge had ignored a change, as if by -s ours, then:

  git log -m -p --full-history -Schange file

would successfully locate change's addition but would not locate the
merge that resolved against it.

This patch changes the simplification so that when --full-history is
specified, a merge is treated as TREESAME only if it is TREESAME to
_all_ parents. This means the command above locates a merge that dropped
change.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
This would address my problem case - it passes existing tests, and covers
my (all-too-common) problem. But it would also need documentation changes and
a new test.

 revision.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/revision.c b/revision.c
index eb98128..96fe3f5 100644
--- a/revision.c
+++ b/revision.c
@@ -516,8 +516,14 @@ static void try_to_simplify_commit(struct rev_info *revs, 
struct commit *commit)
}
die(bad tree compare for commit %s, 
sha1_to_hex(commit-object.sha1));
}
-   if (tree_changed  !tree_same)
-   return;
+
+   if (tree_changed) {
+   if (!tree_same)
+   return;
+
+   if (!revs-simplify_history  !revs-simplify_merges)
+   return;
+   }
commit-object.flags |= TREESAME;
 }
 
-- 
1.8.2.255.g39c5835

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Locating merge that dropped a change

2013-04-11 Thread Kevin Bracey

On 09/04/2013 21:00, Kevin Bracey wrote:


So, how to automatically find a merge that ignored a known change?


I think I've found the problem. It only doesn't work _if you specify the 
file_.


Specifically, if I was missing an addition, my first attempt to find it 
would be


  git log -p -m -Saddition file

If the addition was lost in a merge, that doesn't even show the 
addition, which is surprising, but intentional. The addition isn't part 
of the HEAD version of file, so no point going down that path of the 
merge. Fine. However, I expected this to work:


  git log --full-history -p -m -Saddition file

But it doesn't. It finds the addition, but _not_ the loss in the merge 
commit.


But this does work:

  git log -p -m -Saddition

That really feels like a bug to me. By specifying a file, I've made it 
miss the change, and I can see no way to get the change without making 
it a full-tree operation.


Looking at try_to_simplify_commit() it appears the merge that removed 
the addition is excluded because it's TREESAME to its _other_ parent. 
But with --full-history, we should only be eliminating a merge if its 
TREESAME to all parents, surely? Otherwise we have this case that we 
show a commit but not its reversion.


And the code doing this looks broken, or at least illogical - the parent 
loop sets tree_same for a same parent in --full-history, rather than 
immediately setting the TREESAME flag, so maybe previous authors were 
thinking about this. But setting tree_same guarantees that TREESAME is 
always set on exit anyway, so it's pointless (unless I'm missing something).


It does appear this is documented behaviour in the manual: full-history 
without parent rewriting .. P and M were excluded because they are 
TREESAME to _a_ parent. I would say that they should have been excluded 
due to being TREESAME to _all_ parents. I really don't want a merge 
where one version of my file was chosen over another excluded from its 
so-called full history.


Now, obviously in a sane environment, most merges won't be that 
interesting, as they'll just be propagating wanted patches from the real 
commits already in the log. But I'd like some way to find merges that 
drop code in a specified file, and surely --full-history is it?


Kevin


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Locating merge that dropped a change

2013-04-09 Thread Kevin Bracey
This morning, I was struggling (not for the first time) to produce a Git 
command that would identify a merge commit that dropped a change. I 
could see where it was added, but couldn't automate finding out why it 
wasn't any longer in HEAD.


All the permutations of --full-history, -m, -S, -G on git log 
I could think of did not get me anywhere. As long as I had 
--full-history, they could find the original commit that had added the 
change, but not the merge commit that had dropped it by taking the other 
parent.


So, how to automatically find a merge that ignored a known change?

And then for visualisation purposes, how do you persuade gitk's diff 
display to actually show that that merge commit removed the change from 
one of its parents? Again, -m didn't seem to work.


Help appreciated!

Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] checkout: abbreviate hash in suggest_reattach

2013-04-08 Thread Kevin Bracey
After printing the list of left-behind commits (with abbreviated
hashes), use an abbreviated hash in the suggested 'git branch' command;
there's no point in outputting a full 40-character hex string in some
friendly advice.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 builtin/checkout.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index a9c1b5a..e168bfb 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -700,7 +700,7 @@ static void suggest_reattach(struct commit *commit, struct 
rev_info *revs)
If you want to keep them by creating a new branch, 
this may be a good time\nto do so with:\n\n
 git branch new_branch_name %s\n\n),
-   sha1_to_hex(commit-object.sha1));
+   find_unique_abbrev(commit-object.sha1, 
DEFAULT_ABBREV));
 }
 
 /*
-- 
1.8.2.445.gfcda34b

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--simplify-merges breakage

2013-04-08 Thread Kevin Bracey
Commit 4b7f53d (simplify-merges: drop merge from irrelevant side 
branch) appears to have broken simplify-merges.


[Credit to git bisect run - and now I see the report problems 
immediately message on the recent merge of that change.]


I was trying to get my head around history simplification, by creating a 
repo with the example shown in the manual.The result of gitk 
--simplify-merges foo was far from pretty.


With the example in the manual, you should get this:

  .-A---M---N---O
 / /   /
I B   D
 \   /   /
  `-'

After 4b7f53 you instead get:

  .-A---M   N---O
/ /   /   /
I B   /   D
 \   /   /   /
  `-'

Kevin


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/3] git-merge-one-file error reporting

2013-03-24 Thread Kevin Bracey
Style clean up, as requested, followed by the fix to the both sides added
handling for git-merge-one-file.

This is based on v4 of my p4merge series, as they touch the same area.

Kevin Bracey (3):
  git-merge-one-file: style cleanup
  git-merge-one-file: send ERROR: messages to stderr
  git-merge-one-file: revise merge error reporting

 git-merge-one-file.sh | 50 +-
 1 file changed, 25 insertions(+), 25 deletions(-)

-- 
1.8.2.rc3.21.g744ac65

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 1/2] mergetools/p4merge: swap LOCAL and REMOTE

2013-03-24 Thread Kevin Bracey
Reverse LOCAL and REMOTE when invoking P4Merge as a mergetool, so that
the incoming branch is now in the left-hand, blue triangle pane, and the
current branch is in the right-hand, green circle pane.

This change makes use of P4Merge consistent with its built-in help, its
reference documentation, and Perforce itself. But most importantly, it
makes merge results clearer. P4Merge is not totally symmetrical between
left and right; despite changing a few text labels from theirs/ours to
left/right when invoked manually, it still retains its original
Perforce theirs/ours viewpoint.

Most obviously, in the result pane P4Merge shows changes that are common
to both branches in green. This is on the basis of the current branch
being green, as it is when invoked from Perforce; it means that lines in
the result are blue if and only if they are being changed by the merge,
making the resulting diff clearer.

Note that P4Merge now shows ours on the right for both diff and merge,
unlike other diff/mergetools, which always have REMOTE on the right.
But observe that REMOTE is the working tree (ie ours) for a diff,
while it's another branch (ie theirs) for a merge.

Ours and theirs are reversed for a rebase - see git help rebase.
However, this does produce the desired show the results of this commit
effect in P4Merge - changes that remain in the rebased commit (in your
branch, but not in the new base) appear in blue; changes that do not
appear in the rebased commit (from the new base, or common to both) are
in green. If Perforce had rebase, they'd probably not swap ours/theirs,
but make P4Merge show common changes in blue, picking out our changes in
green. We can't do that, so this is next best.

Signed-off-by: Kevin Bracey ke...@bracey.fi
Reviewed-by: David Aguilar dav...@gmail.com
---
No change to part 1/2 from previous version, apart from Reviewed-by.
Part 2/2 is modified.

 mergetools/p4merge | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mergetools/p4merge b/mergetools/p4merge
index 8a36916..46b3a5a 100644
--- a/mergetools/p4merge
+++ b/mergetools/p4merge
@@ -22,7 +22,7 @@ diff_cmd () {
 merge_cmd () {
touch $BACKUP
$base_present || $BASE
-   $merge_tool_path $BASE $LOCAL $REMOTE $MERGED
+   $merge_tool_path $BASE $REMOTE $LOCAL $MERGED
check_unchanged
 }
 
-- 
1.8.2.rc3.21.g744ac65

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/3] git-merge-one-file: revise merge error reporting

2013-03-24 Thread Kevin Bracey
Commit 718135e improved the merge error reporting for the resolve
strategy's merge conflict and permission conflict cases, but led to a
malformed ERROR:  in myfile.c message in the case of a file added
differently.

This commit reverts that change, and uses an alternative approach without
this flaw.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 git-merge-one-file.sh | 22 +++---
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 39b7799..e231d20 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -107,10 +107,12 @@ case ${1:-.}${2:-.}${3:-.} in
;;
esac
 
+   ret=0
src2=$(git-unpack-file $3)
case $1 in
'')
-   echo Added $4 in both, but differently.
+   echo ERROR: Added $4 in both, but differently. 2
+   ret=1
orig=$(git-unpack-file $2)
create_virtual_base $orig $src2
;;
@@ -124,11 +126,10 @@ case ${1:-.}${2:-.}${3:-.} in
# would confuse merge greatly.
src1=$(git-unpack-file $2)
git merge-file $src1 $orig $src2
-   ret=$?
-   msg=
-   if test $ret != 0
+   if test $? != 0
then
-   msg='content conflict'
+   echo ERROR: Content conflict in $4 2
+   ret=1
fi
 
# Create the working tree file, using our tree version from the
@@ -138,21 +139,12 @@ case ${1:-.}${2:-.}${3:-.} in
 
if test $6 != $7
then
-   if test -n $msg
-   then
-   msg=$msg, 
-   fi
-   msg=${msg}permissions conflict: $5-$6,$7
-   ret=1
-   fi
-   if test -z $1
-   then
+   echo ERROR: Permissions conflict: $5-$6,$7 2
ret=1
fi
 
if test $ret != 0
then
-   echo ERROR: $msg in $4 2
exit 1
fi
exec git update-index -- $4
-- 
1.8.2.rc3.21.g744ac65

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/2] mergetools/p4merge: create a base if none available

2013-03-24 Thread Kevin Bracey
Originally, with no base, Git gave P4Merge $LOCAL as a dummy base:

   p4merge $LOCAL $LOCAL $REMOTE $MERGED

Commit 0a0ec7bd changed this to:

   p4merge empty file $LOCAL $REMOTE $MERGED

to avoid the problem of being unable to save in some circumstances with
similar inputs.

Unfortunately this approach produces much worse results on differing
inputs. P4Merge really regards the blank file as the base, and once you
have just a couple of differences between the two branches you end up
with one a massive full-file conflict. The 3-way diff is not readable,
and you have to invoke difftool MERGE_HEAD HEAD manually to get a
useful view.

The original approach appears to have invoked special 2-way merge
behaviour in P4Merge that occurs only if the base filename is  or
equal to the left input.  You get a good visual comparison, and it does
not auto-resolve differences. (Normally if one branch matched the base,
it would autoresolve to the other branch).

But there appears to be no way of getting this 2-way behaviour and being
able to reliably save. Having base==left appears to be triggering other
assumptions. There are tricks the user can use to force the save icon
on, but it's not intuitive.

So we now follow a suggestion given in the original patch's discussion:
generate a virtual base, consisting of the lines common to the two
branches. This is the same as the technique used in resolve and octopus
merges, so we relocate that code to a shared function.

Note that if there are no differences at the same location, this
technique can lead to automatic resolution without conflict, combining
everything from the 2 files.  As with the other merges using this
technique, we assume the user will inspect the result before saving.

Signed-off-by: Kevin Bracey ke...@bracey.fi
Reviewed-by: David Aguilar dav...@gmail.com
---
Minor change from v3: that version moved initialisation of src1 higher up,
detaching it from its associated comment. This move was only required by
earlier versions, so v4 leaves src1 in its original position.

Added Reviewed-by footer.

 Documentation/git-sh-setup.txt |  6 ++
 git-merge-one-file.sh  | 18 +-
 git-sh-setup.sh| 12 
 mergetools/p4merge |  6 +-
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-sh-setup.txt b/Documentation/git-sh-setup.txt
index 6a9f66d..5d709d0 100644
--- a/Documentation/git-sh-setup.txt
+++ b/Documentation/git-sh-setup.txt
@@ -82,6 +82,12 @@ get_author_ident_from_commit::
outputs code for use with eval to set the GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE variables for a given commit.
 
+create_virtual_base::
+   modifies the first file so only lines in common with the
+   second file remain. If there is insufficient common material,
+   then the first file is left empty. The result is suitable
+   as a virtual base input for a 3-way merge.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 3373c04..255c07a 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -104,30 +104,22 @@ case ${1:-.}${2:-.}${3:-.} in
;;
esac
 
-   src2=`git-unpack-file $3`
+   src2=$(git-unpack-file $3)
case $1 in
'')
echo Added $4 in both, but differently.
-   # This extracts OUR file in $orig, and uses git apply to
-   # remove lines that are unique to ours.
-   orig=`git-unpack-file $2`
-   sz0=`wc -c $orig`
-   @@DIFF@@ -u -La/$orig -Lb/$orig $orig $src2 | git apply --no-add
-   sz1=`wc -c $orig`
-
-   # If we do not have enough common material, it is not
-   # worth trying two-file merge using common subsections.
-   expr $sz0 \ $sz1 \* 2 /dev/null || : $orig
+   orig=$(git-unpack-file $2)
+   create_virtual_base $orig $src2
;;
*)
echo Auto-merging $4
-   orig=`git-unpack-file $1`
+   orig=$(git-unpack-file $1)
;;
esac
 
# Be careful for funny filename such as -L in $4, which
# would confuse merge greatly.
-   src1=`git-unpack-file $2`
+   src1=$(git-unpack-file $2)
git merge-file $src1 $orig $src2
ret=$?
msg=
diff --git a/git-sh-setup.sh b/git-sh-setup.sh
index 9cfbe7f..2f78359 100644
--- a/git-sh-setup.sh
+++ b/git-sh-setup.sh
@@ -249,6 +249,18 @@ clear_local_git_env() {
unset $(git rev-parse --local-env-vars)
 }
 
+# Generate a virtual base file for a two-file merge. Uses git apply to
+# remove lines from $1 that are not in $2, leaving only common lines.
+create_virtual_base() {
+   sz0=$(wc -c $1)
+   @@DIFF@@ -u -La/$1 -Lb/$1 $1 $2 | git apply --no-add
+   sz1=$(wc -c $1)
+
+   # If we do not have enough common material

Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting

2013-03-14 Thread Kevin Bracey

On 13/03/2013 19:57, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


-   echo Added $4 in both, but differently.
+   echo ERROR: Added $4 in both, but differently.
+   ret=1

The problem you identified may be worth fixing, but I do not think
this change is correct.

This message is at the same severity level as the message on the
other arm of this case that says Auto-merging $4.  In that other
case arm, we are attempting a true three-way merge, and in this case
arm, we are attempting a similar three-way merge using your virtual
base.

Neither has found any error in this case arm yet.  The messages are
both informational, not an error.  I do not think you would want
to set ret=1 until you see content conflict.


I disagree here. At the minute, it does set ret to 1 (but further down 
the code - bringing it up here next to the ERROR print clarifies 
that), and will report the merge as failed, conflict in the 3-way merge 
or not. Which I think is correct.


We have to stop for user inspection here. We do have a fake base; we 
can't trust the 3-way merge with it.


The virtual 3-way merge will take ABCDE and ABDE and produce ABCDE 
without conflict. That's flat wrong if the real base they failed to tell 
Git about was ABCDE.


Despite being useful, I'm still slightly uncomfortable that it can 
produce something without any conflict markers. The user really needs to 
look at properly.


(And one interesting related glitch, or at least thing that puzzled me 
when it happened. This is from memory, so may be slightly mistaken, but 
what seemed to happen was that if you have rerere enabled, then 
mergetool tends to say nothing to merge, because it relies on rerere 
remaining, which relies on conflict markers. I think you could still 
force a mergetool up by specifying the specific file though.)


Maybe the virtual base itself should be different. Maybe it should put a 
??? marker in place of every unique line. So you get:


Left   ABCEFGH
Right XABCDEFJH  - Merge result |XABC|DEFG|JH
VBase ?ABC?EF??H

That actually feels like it may be the correct answer here. And it's 
effectively what P4Merge does in its 2-way mode I failed to invoke. 
(At least for the result view).


Kevin


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting

2013-03-14 Thread Kevin Bracey

On 14/03/2013 16:56, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


Maybe the virtual base itself should be different. Maybe it should put
a ??? marker in place of every unique line. So you get:

Left   ABCEFGH
Right XABCDEFJH  - Merge result |XABC|DEFG|JH
VBase ?ABC?EF??H

That actually feels like it may be the correct answer here.

Interesting, though the approach has downsides with the diff3
conflict style, no?

Well, yes, but I would assume that we would forcibly select normal diff 
here somehow, if we aren't already. We should be - turning ABCDEFGH vs 
ABCD into ABCDEFGH|EFGH= is silly.


This topic has a lot in common with the zdiff3 discussion going on. The 
concern there is about large chunks of similar code appearing on two 
sides, and not being in the base, leading to useless diff3.


This is just the special case of the base being totally empty.

The thought on zdiff3 philosophy was that common additions should be 
treated as resolved, and not appear inside conflict markers. That's 
exactly what we'd be doing.  So, same conflict as above, but this time 
embedded in a larger file, using zdiff3 logic:


LeftaabaacaaABCEFGHeee
Baseeee - zdiff3 
aaadab|a=faacaaABC|DEFG|JHeee

Right   aaadaafaABCDEFJHeee

Note that I've chosen to suppress the = marker if the lines surrounding 
the conflict are not in the base. I think that helps highlight the fact 
that we're in a diff2 section. EFG|=JH reads like an assertion that 
the base has EFH. Whereas EFG|JH avoids that.


So, anyway, commonality with zdiff3 would be good. Even if we can't 
share code, we should at least share the general style of result.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/3] git-merge-one-file: revise merge error reporting

2013-03-14 Thread Kevin Bracey

On 14/03/2013 19:31, Kevin Bracey wrote:

On 14/03/2013 16:56, Junio C Hamano wrote:


Well, yes, but I would assume that we would forcibly select normal 
diff here somehow, if we aren't already. We should be - turning 
ABCDEFGH vs ABCD into ABCDEFGH|EFGH= is silly.


Doh. But anyway, we don't want to waste space with |= markers, and make 
the same surrounding code is in the base suggestion. So we should be 
selecting diff.


Kevin

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/3] mergetools/p4merge: create a base if none available

2013-03-12 Thread Kevin Bracey
Originally, with no base, Git gave P4Merge $LOCAL as a dummy base:

   p4merge $LOCAL $LOCAL $REMOTE $MERGED

Commit 0a0ec7bd changed this to:

   p4merge empty file $LOCAL $REMOTE $MERGED

to avoid the problem of being unable to save in some circumstances with
similar inputs.

Unfortunately this approach produces much worse results on differing
inputs. P4Merge really regards the blank file as the base, and once you
have just a couple of differences between the two branches you end up
with one a massive full-file conflict. The 3-way diff is not readable,
and you have to invoke difftool MERGE_HEAD HEAD manually to get a
useful view.

The original approach appears to have invoked special 2-way merge
behaviour in P4Merge that occurs only if the base filename is  or
equal to the left input.  You get a good visual comparison, and it does
not auto-resolve differences. (Normally if one branch matched the base,
it would autoresolve to the other branch).

But there appears to be no way of getting this 2-way behaviour and being
able to reliably save. Having base==left appears to be triggering other
assumptions. There are tricks the user can use to force the save icon
on, but it's not intuitive.

So we now follow a suggestion given in the original patch's discussion:
generate a virtual base, consisting of the lines common to the two
branches. This is the same as the technique used in resolve and octopus
merges, so we relocate that code to a shared function.

Note that if there are no differences at the same location, this
technique can lead to automatic resolution without conflict, combining
everything from the 2 files.  As with the other merges using this
technique, we assume the user will inspect the result before saving.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 Documentation/git-sh-setup.txt |  6 ++
 git-merge-one-file.sh  | 18 +-
 git-sh-setup.sh| 12 
 mergetools/p4merge |  6 +-
 4 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-sh-setup.txt b/Documentation/git-sh-setup.txt
index 6a9f66d..5d709d0 100644
--- a/Documentation/git-sh-setup.txt
+++ b/Documentation/git-sh-setup.txt
@@ -82,6 +82,12 @@ get_author_ident_from_commit::
outputs code for use with eval to set the GIT_AUTHOR_NAME,
GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE variables for a given commit.
 
+create_virtual_base::
+   modifies the first file so only lines in common with the
+   second file remain. If there is insufficient common material,
+   then the first file is left empty. The result is suitable
+   as a virtual base input for a 3-way merge.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index f612cb8..0f164e5 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -104,30 +104,22 @@ case ${1:-.}${2:-.}${3:-.} in
;;
esac
 
-   src2=`git-unpack-file $3`
+   src1=$(git-unpack-file $2)
+   src2=$(git-unpack-file $3)
case $1 in
'')
echo Added $4 in both, but differently.
-   # This extracts OUR file in $orig, and uses git apply to
-   # remove lines that are unique to ours.
-   orig=`git-unpack-file $2`
-   sz0=`wc -c $orig`
-   @@DIFF@@ -u -La/$orig -Lb/$orig $orig $src2 | git apply --no-add
-   sz1=`wc -c $orig`
-
-   # If we do not have enough common material, it is not
-   # worth trying two-file merge using common subsections.
-   expr $sz0 \ $sz1 \* 2 /dev/null || : $orig
+   orig=$(git-unpack-file $2)
+   create_virtual_base $orig $src2
;;
*)
echo Auto-merging $4
-   orig=`git-unpack-file $1`
+   orig=$(git-unpack-file $1)
;;
esac
 
# Be careful for funny filename such as -L in $4, which
# would confuse merge greatly.
-   src1=`git-unpack-file $2`
git merge-file $src1 $orig $src2
ret=$?
msg=
diff --git a/git-sh-setup.sh b/git-sh-setup.sh
index 795edd2..349a5d4 100644
--- a/git-sh-setup.sh
+++ b/git-sh-setup.sh
@@ -249,6 +249,18 @@ clear_local_git_env() {
unset $(git rev-parse --local-env-vars)
 }
 
+# Generate a virtual base file for a two-file merge. Uses git apply to
+# remove lines from $1 that are not in $2, leaving only common lines.
+create_virtual_base() {
+   sz0=$(wc -c $1)
+   @@DIFF@@ -u -La/$1 -Lb/$1 $1 $2 | git apply --no-add
+   sz1=$(wc -c $1)
+
+   # If we do not have enough common material, it is not
+   # worth trying two-file merge using common subsections.
+   expr $sz0 \ $sz1 \* 2 /dev/null || : $1
+}
+
 
 # Platform specific tweaks to work around some commands
 case $(uname -s) in
diff --git a/mergetools/p4merge b/mergetools/p4merge
index 46b3a5a..5a608ab

[PATCH v3 3/3] git-merge-one-file: revise merge error reporting

2013-03-12 Thread Kevin Bracey
Commit 718135e improved the merge error reporting for the resolve
strategy's merge conflict and permission conflict cases, but led to a
malformed ERROR:  in myfile.c message in the case of a file added
differently.

This commit reverts that change, and uses an alternative approach without
this flaw.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 git-merge-one-file.sh | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
index 0f164e5..78b07a8 100755
--- a/git-merge-one-file.sh
+++ b/git-merge-one-file.sh
@@ -104,11 +104,13 @@ case ${1:-.}${2:-.}${3:-.} in
;;
esac
 
+   ret=0
src1=$(git-unpack-file $2)
src2=$(git-unpack-file $3)
case $1 in
'')
-   echo Added $4 in both, but differently.
+   echo ERROR: Added $4 in both, but differently.
+   ret=1
orig=$(git-unpack-file $2)
create_virtual_base $orig $src2
;;
@@ -121,10 +123,9 @@ case ${1:-.}${2:-.}${3:-.} in
# Be careful for funny filename such as -L in $4, which
# would confuse merge greatly.
git merge-file $src1 $orig $src2
-   ret=$?
-   msg=
-   if [ $ret -ne 0 ]; then
-   msg='content conflict'
+   if [ $? -ne 0 ]; then
+   echo ERROR: Content conflict in $4
+   ret=1
fi
 
# Create the working tree file, using our tree version from the
@@ -133,18 +134,11 @@ case ${1:-.}${2:-.}${3:-.} in
rm -f -- $orig $src1 $src2
 
if [ $6 != $7 ]; then
-   if [ -n $msg ]; then
-   msg=$msg, 
-   fi
-   msg=${msg}permissions conflict: $5-$6,$7
-   ret=1
-   fi
-   if [ $1 = '' ]; then
+   echo ERROR: Permissions conflict: $5-$6,$7
ret=1
fi
 
if [ $ret -ne 0 ]; then
-   echo ERROR: $msg in $4
exit 1
fi
exec git update-index -- $4
-- 
1.8.2.rc3.7.g1100d09.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/3] mergetools/p4merge: swap LOCAL and REMOTE

2013-03-12 Thread Kevin Bracey
Reverse LOCAL and REMOTE when invoking P4Merge as a mergetool, so that
the incoming branch is now in the left-hand, blue triangle pane, and the
current branch is in the right-hand, green circle pane.

This change makes use of P4Merge consistent with its built-in help, its
reference documentation, and Perforce itself. But most importantly, it
makes merge results clearer. P4Merge is not totally symmetrical between
left and right; despite changing a few text labels from theirs/ours to
left/right when invoked manually, it still retains its original
Perforce theirs/ours viewpoint.

Most obviously, in the result pane P4Merge shows changes that are common
to both branches in green. This is on the basis of the current branch
being green, as it is when invoked from Perforce; it means that lines in
the result are blue if and only if they are being changed by the merge,
making the resulting diff clearer.

Note that P4Merge now shows ours on the right for both diff and merge,
unlike other diff/mergetools, which always have REMOTE on the right.
But observe that REMOTE is the working tree (ie ours) for a diff,
while it's another branch (ie theirs) for a merge.

Ours and theirs are reversed for a rebase - see git help rebase.
However, this does produce the desired show the results of this commit
effect in P4Merge - changes that remain in the rebased commit (in your
branch, but not in the new base) appear in blue; changes that do not
appear in the rebased commit (from the new base, or common to both) are
in green. If Perforce had rebase, they'd probably not swap ours/theirs,
but make P4Merge show common changes in blue, picking out our changes in
green. We can't do that, so this is next best.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 mergetools/p4merge | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mergetools/p4merge b/mergetools/p4merge
index 8a36916..46b3a5a 100644
--- a/mergetools/p4merge
+++ b/mergetools/p4merge
@@ -22,7 +22,7 @@ diff_cmd () {
 merge_cmd () {
touch $BACKUP
$base_present || $BASE
-   $merge_tool_path $BASE $LOCAL $REMOTE $MERGED
+   $merge_tool_path $BASE $REMOTE $LOCAL $MERGED
check_unchanged
 }
 
-- 
1.8.2.rc3.7.g1100d09.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git.c: make usage match manual page

2013-03-11 Thread Kevin Bracey
Re-ordered option list in command-line usage to match the manual page.
Also makes it less than 80-characters wide.

Signed-off-by: Kevin Bracey ke...@bracey.fi
---
 git.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git.c b/git.c
index d33f9b3..2a98624 100644
--- a/git.c
+++ b/git.c
@@ -6,10 +6,10 @@
 #include run-command.h
 
 const char git_usage_string[] =
-   git [--version] [--exec-path[=path]] [--html-path] [--man-path] 
[--info-path]\n
+   git [--version] [--help] [-c name=value]\n
+  [--exec-path[=path]] [--html-path] [--man-path] 
[--info-path]\n
   [-p|--paginate|--no-pager] [--no-replace-objects] 
[--bare]\n
   [--git-dir=path] [--work-tree=path] 
[--namespace=name]\n
-  [-c name=value] [--help]\n
   command [args];
 
 const char git_more_info_string[] =
-- 
1.8.2.rc3.7.g1100d09.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] git.c: make usage match manual page

2013-03-11 Thread Kevin Bracey

On 11/03/2013 21:58, Junio C Hamano wrote:

Kevin Bracey ke...@bracey.fi writes:


Re-ordered option list in command-line usage to match the manual page.
Also makes it less than 80-characters wide.

Thanks (s/Re-ordered/reorder/ and s/makes/make/, though).


Got it. But I'm going to reword it, to follow the history of the manual 
change.



Is git.c the only one whose -h output does not match the manual
synopsis?

Generally, -h just puts options in the synopsis, and then prints a 
line per option, so most commands don't really match the manual show 
all options on one line style anyway. git.c is atypical. (Something 
else to look at for the whole git help thing? Should git -h print a 
option list in that style?)


But, yes, I've found a few others that are show almost the same thing as 
the manual but with subtle pointless differences. git remote, for 
example. That's a larger project, I feel; the 80-column thing is key here.


Kevin
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >