Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms

2017-03-29 Thread Alex Hoffman
Any news about this patch?

2017-03-21 22:24 GMT+01:00 Alex Hoffman <s...@gal.ro>:
> Hi, Brian,
>
> We definitely prefer the wrapper function oid_to_hex() to
> sha1_to_hex(). Thanks for feedback.
> Below is the updated patch:
>
> ---
>  bisect.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/bisect.c b/bisect.c
> index 30808cadf..7b65acbcd 100644
> --- a/bisect.c
> +++ b/bisect.c
> @@ -131,7 +131,7 @@ static void show_list(const char *debug, int
> counted, int nr,
> unsigned flags = commit->object.flags;
> enum object_type type;
> unsigned long size;
> -   char *buf = read_sha1_file(commit->object.sha1, , );
> +   char *buf = read_sha1_file(commit->object.oid.hash,
> , );
> const char *subject_start;
> int subject_len;
>
> @@ -143,10 +143,10 @@ static void show_list(const char *debug, int
> counted, int nr,
> fprintf(stderr, "%3d", weight(p));
> else
> fprintf(stderr, "---");
> -   fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1));
> +   fprintf(stderr, " %.*s", 8, oid_to_hex(>object.oid));
> for (pp = commit->parents; pp; pp = pp->next)
> fprintf(stderr, " %.*s", 8,
> -   sha1_to_hex(pp->item->object.sha1));
> +   oid_to_hex(>item->object.oid));
>
> subject_len = find_commit_subject(buf, _start);
> if (subject_len)
> --
> 2.12.0.400.g54ad2d445.dirty


Re: [PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms

2017-03-21 Thread Alex Hoffman
Hi, Brian,

We definitely prefer the wrapper function oid_to_hex() to
sha1_to_hex(). Thanks for feedback.
Below is the updated patch:

---
 bisect.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bisect.c b/bisect.c
index 30808cadf..7b65acbcd 100644
--- a/bisect.c
+++ b/bisect.c
@@ -131,7 +131,7 @@ static void show_list(const char *debug, int
counted, int nr,
unsigned flags = commit->object.flags;
enum object_type type;
unsigned long size;
-   char *buf = read_sha1_file(commit->object.sha1, , );
+   char *buf = read_sha1_file(commit->object.oid.hash,
, );
const char *subject_start;
int subject_len;

@@ -143,10 +143,10 @@ static void show_list(const char *debug, int
counted, int nr,
fprintf(stderr, "%3d", weight(p));
else
fprintf(stderr, "---");
-   fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1));
+   fprintf(stderr, " %.*s", 8, oid_to_hex(>object.oid));
for (pp = commit->parents; pp; pp = pp->next)
fprintf(stderr, " %.*s", 8,
-   sha1_to_hex(pp->item->object.sha1));
+   oid_to_hex(>item->object.oid));

subject_len = find_commit_subject(buf, _start);
if (subject_len)
-- 
2.12.0.400.g54ad2d445.dirty


[PATCH] Correct compile errors when DEBUG_BISECT=1 after supporting other hash algorithms

2017-03-19 Thread Alex Hoffman
---
 bisect.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bisect.c b/bisect.c
index 30808cadf..6feed8533 100644
--- a/bisect.c
+++ b/bisect.c
@@ -131,7 +131,7 @@ static void show_list(const char *debug, int
counted, int nr,
unsigned flags = commit->object.flags;
enum object_type type;
unsigned long size;
-   char *buf = read_sha1_file(commit->object.sha1, , );
+   char *buf = read_sha1_file(commit->object.oid.hash,
, );
const char *subject_start;
int subject_len;

@@ -143,10 +143,10 @@ static void show_list(const char *debug, int
counted, int nr,
fprintf(stderr, "%3d", weight(p));
else
fprintf(stderr, "---");
-   fprintf(stderr, " %.*s", 8, sha1_to_hex(commit->object.sha1));
+   fprintf(stderr, " %.*s", 8,
sha1_to_hex(commit->object.oid.hash));
for (pp = commit->parents; pp; pp = pp->next)
fprintf(stderr, " %.*s", 8,
-   sha1_to_hex(pp->item->object.sha1));
+   sha1_to_hex(pp->item->object.oid.hash));

subject_len = find_commit_subject(buf, _start);
if (subject_len)
-- 
2.12.0.399.g9d77b0405.dirty


Re: Git bisect does not find commit introducing the bug

2017-02-21 Thread Alex Hoffman
> isn't that spelt `--ancestry-path` ?
> (--ancestry-path has it's own issues such as needing an --first-parent-show
> option, but that's possibly a by the by)

Indeed it is spelled `--ancestry-path`. And interestingly enough you
may use it multiple times with the wanted effect in our case (e.g when
the user has multiple good commit and a single bad commit before
running the bisect itself). Also it is `--first-parent` (not
`--first-parent-show`), but I do not understand why do we need this
option? What kind of issues does `--ancestry-path` have?

Best regards,
VG


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Alex Hoffman
> If `git bisect` is/would be affected by `git log` history-related options
> then this is what `--strict-ancestor` option gives/would give.

Exactly my thoughts. All that needs to be changed in the 2nd problem
is the graph where to search.

But first we must agree about the usefulness of the 2nd problem. Any
thoughts/comments/votes for/against?


Re: Git bisect does not find commit introducing the bug

2017-02-20 Thread Alex Hoffman
I see two different problems each with a different assumption (see the
definition of "bisectable" in the email of Junio C Hamano):

1. (Current) Assume the entire history graph is bisectable. DO: Search
where in the entire graph the first 'trait'/transition occurs.
2. (New) Assume only the graph between one good commit and one bad
commit is bisectable. DO: Search where the first transition occurs in
the graph between these two commits.

It seems that the real world needs a solution also for the second
problem (example if the good commit is the FIRST good commit of a
feature or if the good commit is not the first good commit, but you
definitely know, that it broke first somewhere in between the good and
bad commit).

I find the way to go as Oleg proposed is gittish enough (with a new
parameter --strategy). Beside I would underline that also the second
problem is a bisect problem, just for another graph, thus it makes
perfect sense to extend 'git bisect' for this.

Does that look reasonably?


Re: Git bisect does not find commit introducing the bug

2017-02-19 Thread Alex Hoffman
> Then you must adjust your definition of "good": All commits that do not have
> the feature, yet, are "good": since they do not have the feature in the
> first place, they cannot have the breakage that you found in the feature.
>
> That is exactly the situation in your original example! But you constructed
> the condition of goodness in such a simplistic way (depending on the
> presence of a string), that it was impossible to distinguish between "does
> not have the feature at all" and "has the feature, but it is broken".

Johannes, thank you for correctly identifying the error in my logic.
Indeed I was using the term 'bad' also for the commit without the
feature. In order to find the commit introducing the bug in my example
a new state is needed, which would make 'git bisect' a bit more
complicated than the user 'most of the time' probably needs. Or do you
think, it would make sense to ask the user for this state (if e.g 'git
bisect' would be started with a new parameter)?


Re: Git bisect does not find commit introducing the bug

2017-02-19 Thread Alex Hoffman
Below is a correction of the first proposed algorithm:

>--o1--o2--o3--G --X1
>\\
> x1--x2--x3--x4--X2--B--
>  \  /
>   y1--y2--y3
>
Step 1a. (Unchanged) keep only the commits that:

a) are ancestor of the "bad" commit (including the "bad" commit itself),
b) are not ancestor of a "good" commit (excluding the "good" commits).

The following graph results:
  x1--x2--x3--x4--X2--B--
   \  /
y1--y2--y3

Step 1b. (New) Mark all commits of the resulting graph as unconfirmed
(unconfirmed=node without good ancestors).
Mark as confirmed all descendants of commits that user marked
explicitly as good (e.g if user already marked its parent/grand
parent/... as good right before starting bisect run).
Step 1c. From all unconfirmed root commits build a set SET_GOOD of
those with any good parents in the original graph (root commit =
commit without parents in the resulting graph) and one SET_BAD for
commits with only BAD parents. To build a set means to ask explicitly
the user (or the command passed in git bisect run) whether any of its
parents is good. In the example above find out whether x1 has any good
parents or no parent at all and if so add it to SET_GOOD, otherwise to
SET_BAD.
Mark as confirmed each commit in SET_GOOD and all its descendants.
For every commit in SET_BAD delete all paths from it until to the next
confirmed commit. In the example above if x1 is in SET_BAD delete the
path x1-x3 and x1-y3. If any new root commits result (commit x4), redo
step 1c with them.


Step2. Continue the existing algorithm.

If this improvement works (i.e you do not find any bugs in it and it
is feasible to implement, which seems to me) the following would be
its advantages:
1. An assumption less, as we explicitly check the assumption.
2. It might be quicker, because we delete parts of graph that cannot
contain transitions.
3. It returns more exact results.

VG


Re: Git bisect does not find commit introducing the bug

2017-02-19 Thread Alex Hoffman
> At the end of the git-bisect man page (in the SEE ALSO section) there
> is a link to 
> https://github.com/git/git/blob/master/Documentation/git-bisect-lk2009.txt
> which has a lot of details about how bisect works.
>

Thanks for pointing out the SEE ALSO section. I think it makes sense
to include/describe the entire algorithm in the man page itself,
although I am not sure whether the graphs would be always correctly
visually represented in the man page format.

> The goal is to find the first bad commit, which is a commit that has
> only good parents.

OK, bisect's mission is more exact than I thought, which is good. M

> As o1 is an ancestor of G, then o1 is considered good by the bisect algorithm.
> If it was bad, it would means that there is a transition from bad to
> good between o1 and G.
> But when a good commit is an ancestor of the bad commit, git bisect
> makes the assumption that there is no transition from bad to good in
> the graph.

The assumption that there is no transition from bad to good in the
graph did not hold in my example and it does not hold when a feature
was recently introduced and gets broken relative shortly afterwards.
But I consider it is easy to change the algorithm not to assume, but
rather to check it.

> git bisect makes some assumptions that are true most of the time, so
> in practice it works well most of the time.

Whatever the definition of "most of the time" everyone might have, I
think there is room for improvement. Below I am trying to make a small
change to the current algorithm in order to deal with the assumption
that sometimes does not hold (e.g in my example), by explicitly
validating the check.

> --o1--o2--o3--G--X1
> \\
>  x1--x2--x3--x4--X2--B--
>   \  /
>y1--y2--y3

Step 1a. (Unchanged) keep only the commits that:

a) are ancestor of the "bad" commit (including the "bad" commit itself),
b) are not ancestor of a "good" commit (excluding the "good" commits).

The following graph results:
  x1--x2--x3--x4--X2--B--
   \  /
y1--y2--y3

Step 1b. (New) Mark all root commits of the resulting graph (i.e
commits without parents) as unconfirmed (unconfirmed=node that has
only bad parents). Remove all root commits that user already confirmed
(e.g if user already marked its parent as good right before starting
bisect run). For every unconfirmed root commit check if it has any
good parents. In the example above check whether x1 has good parents.
 If the current root element has any parents and none of them is
good, we can delete all paths from it until to the next commit that
has a parent in the ancestors of GOOD. In the example above to delete
the path x1-x3 and x1-y3. Also add new resulting root commits to the
list of unconfirmed commits (commit x4).
 Otherwise mark it as confirmed.

Step2. Continue the existing algorithm.


If this improvement works (i.e you do not find any bugs in it and it
is feasible to implement, which seems to me) the following would be
its advantages:
1. An assumption less, as we explicitly check the assumption.
2. It might be quicker, because we delete parts of graph that cannot
contain transitions.
3. It returns more exact results.

VG


Re: Git bisect does not find commit introducing the bug

2017-02-18 Thread Alex Hoffman
> But this is not how Git works. Git computes graph differences, i.e., it
> subtracts from the commits reachable from v.bad those that are reachable
> from v.good. This leaves more than just those on some path from v.good to
> v.bad. And it should work this way. Consider this history:
>
> --o--o--o--G--X
>\   \
> x--x--x--x--X--B--
>
> When you find B bad and G good, than any one of the x or X may have
> introduced the breakage, not just one of the X.
>

Thank you for clarifying how git bisect works. How did you find that
out? Did you check the source code? If that is not documented in the
man page may be it worth documenting it in order to avoid future
confusion for other users.

Let's consider your example with distinct names for nodes:

--o1--o2--o3--G--X1
\\
 x1--x2--x3--x4--X1--B--

It makes sense that git bisect is expecting a single transition, as
this is a precondition for a binary search to work. My definition of
"the transition" is a commit with at least one of its parents as a
good version, but the commit itself a bad version. I hope we agree
that git bisect's mission is to search for this transition (as I
suppose that most of people need such a functionality from git, if not
exactly from git bisect). How could be x1 or x3 be the transition, if
chances are that o1 is not a good version? Of course it would make
sense to me if bisect would check o1 whether good and only then to
check also x1-x3, but this is not what git makes (at least not in my
initial example).

If you consider that git bisect's mission is different from finding
the transition, could you please explain what exact commit git bisect
is supposed to return (ideally with terms from the graph theory) and
when it makes sense to return that? Because I do not see any sense in
looking in the path x1-x3 without knowing that those commits may be a
transition.


> Oh, IMO git bisect was well thought through. If it considered just paths
> from good to bad, it would not given the correct answer. See the example
> history above. Bisect authors would not have deemed that sufficiently good

You definitely convinced me that git MUST search more than only in the
paths between good and bad commits, as the good commit G does not have
to be the first good commit (thank you for that). My problem/confusion
is that it returns something that does not make sense to me, because
it does not make sure it returns a transition.

VG

PS: thank you for continuing this discussion.


Re: Git bisect does not find commit introducing the bug

2017-02-18 Thread Alex Hoffman
>> First of all this is confusing, as this commit cannot be reached
>> starting from "v.good".

> Hm, IMHO it shows that your example is pretty artificial (although you
> might have come across it in a real-world scenario): you introduced a
> new feature in f4154e9 (and it worked) and you broke that feature by
> making the merge 671cec2. However, the feature (that broke in 671cec2)
> did not even exist in 04c6f4b; so a test on the feature would not fail
> (leading to "bisect bad" as in the example), it would not exist (leading
> to "bisect skip").

No one commented the fact, that I find this very confusing. Don't you
find this confusing? I will underline, that 'git bisect good v.good'
will fail if the commit 'v.good' is not a parent of the bad commit,
meaning there MUST be at least a path between 'v.good' and 'v.bad',
thus I would expect it looks on this path ONLY. Beside that, this is
what I understand by 'binary search' (to search on this commit path).
You might find this example artificial, but I doubt git is/was
intentionally designed to work with  'natural' examples only (no
matter how you define 'natural' and 'artificial').



>> In other words: bisect assumes that your repo is usually in a good state
>> and you have a commit that changes it to a bad state. In your case you
>> have a repo that is in a bad state and you have a commit that switches
>> it to a good state and later you merge a bad-state branch and you have a
>> bad state again. It is not made for that use-case, I think.

> Correct. The assumption of bisection is that there is only one transition 
> between GOOD and BAD. By violating that assumption, anything can happen.

I did not find that in the manpage or did I miss it? Why would someone
assume that the commit graph looks in a certain way? I assume, that
'git bisect' was not thought through and that it considers the first
directed path between v.good and v.bad, instead of all paths (in my
example graph there are two such paths). I will also underline that
git bisect was designed to work with multiple good commits and one bad
commit (also multiple paths), but probably NOT with multiple paths
between the same pair of good and bad commits.

VG


2017-02-18 10:12 GMT+01:00 Johannes Sixt <j...@kdbg.org>:
> Am 18.02.2017 um 00:21 schrieb Stephan Beyer:
>>
>> On 02/17/2017 11:29 PM, Alex Hoffman wrote:
>> *   7a9e952 (bisect bad) 
>> |\
>> | *   671cec2  <--- expected
>> | |\
>> | * | 04c6f4b  <--- found
>> * | |   3915157 
>> |\ \ \
>> | | |/
>> | |/|
>> | * | f4154e9 (bisect good) 
>> | * | 85855bf 
>> | |/
>> * | f1a36f5 
>> |/
>> * 1b7fb88 
>>
>> The  and  markers are set by your definition of what good and
>> what bad commits are.
>>
>> [...]
>> In other words: bisect assumes that your repo is usually in a good state
>> and you have a commit that changes it to a bad state. In your case you
>> have a repo that is in a bad state and you have a commit that switches
>> it to a good state and later you merge a bad-state branch and you have a
>> bad state again. It is not made for that use-case, I think.
>
>
> Correct. The assumption of bisection is that there is only one transition
> between GOOD and BAD. By violating that assumption, anything can happen.
>
> -- Hannes
>


Git bisect does not find commit introducing the bug

2017-02-17 Thread Alex Hoffman
Hi there,

According to the documentation "git bisect" is designed "to find the
commit that introduced a bug" .
I have found a situation in which it does not returns the commit I expected.
In order to reproduce the problem:

1. mkdir test; cd test;
git clone https://github.com/entbugger/git-bisect-issue.git
cd git-bisect-issue

The tag "v.bad" is one bad version and tag "v.good" is the first good
version. A good version is one having the line "FEATURE2" in
file1.txt, which was introduced in "v.good".

2. Copy test scripts to another folder to make sure they are not
overridden by 'git bisect'
cp *.sh ../
cd ..
./search-bug-git.sh

3. Run ./search-bug-git.sh to search for the commit introducing the
bug. It finds that the commit 04c6f4b9ed with the message "Feature 1"
is the first one introducing the bug.

First of all this is confusing, as this commit cannot be reached
starting from "v.good". Then I expected the commit with the message
"Introduce bug" to be returned by 'git bisect', as it is the first
commit  between "v.good" and "v.bad" that does not contain the line
"FEATURE2" in file1.txt, which is what I understand by the commit
"that introduced a bug" (cited from the manpage of git bisect).

Thanks for looking to this,

Best regards,
VG