Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
Junio C Hamanowrites: > Lars Schneider writes: > ... >> Following your explanation this patch looks good to me and this fixes the >> test failure. TBH I never thought about the difference of these commands >> before. "rev" and "ref" sound so similar although they denote completely >> different things. > > Thanks for testing. > ... > Hence, I would really prefer not to commit mine myself. I'd rather > see somebody from git-p4 circle to come up with a version that is > more in line with the way things are done in the existing code and > send a tested version for me to apply. Tentatively I queued my hack together with the name-rev thing on 'pu', and Travis seems to be OK with the result. It would be nice if we can "fix" the use of "name-rev HEAD" in "git p4" sooner rather than later. If the code truly uses it to figure out what branch we are currently on and acts on it, as the name of that function suggests, it may be easy to construct a testcase where the code without the fix does a wrong thing (e.g. have two branches pointing at the same commit, and tell "git p4" to act on the one that sorts later in the "git branch --list" output; the command would act as if it were working on the other branch), and shows that the patch fixes that problem. That would be a bug worth fixing quickly regardless of the "name-rev" change michael wanted to make. Thanks.
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
Lars Schneiderwrites: >> git-p4.py | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/git-p4.py b/git-p4.py >> index eab319d76e..351d1ab58e 100755 >> --- a/git-p4.py >> +++ b/git-p4.py >> @@ -582,7 +582,7 @@ def currentGitBranch(): >> # on a detached head >> return None >> else: >> -return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip() >> +return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:] >> >> def isValidGitDir(path): >> return git_dir(path) != None > > Following your explanation this patch looks good to me and this fixes the > test failure. TBH I never thought about the difference of these commands > before. "rev" and "ref" sound so similar although they denote completely > different things. Thanks for testing. The above was done merely to point out the problematic place and a possible solution. As I am not familiar with the code in git-p4.py, I didn't even try to check if the code already has a helper function that strips "refs/heads/" from the beginning of the string (iow, I am not happy with the [11:]). I didn't and don't like the fact that this function now runs "symbolic-ref HEAD" twice but I didn't try to see if there are more suitable and idiomatic ways to do this with a single invocation. Hence, I would really prefer not to commit mine myself. I'd rather see somebody from git-p4 circle to come up with a version that is more in line with the way things are done in the existing code and send a tested version for me to apply. Thanks.
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
> On 17 Mar 2017, at 13:56, Junio C Hamanowrote: > > Junio C Hamano writes: > >> Lars Schneider writes: >> >>> A quick bisect indicates that this patch might break >>> t9807-git-p4-submit.sh 8 and 13. I haven't looked into >>> it further, yet. >> >> As I do not do P4, help in diagnosing why it breaks is appreciated. >> If the test script expects... >> On the other hand, if git-p4 command internally uses name-rev and it >> is not prepared to properly handle commits that can be named in more >> than one way, the problem would be deeper, as it would mean it can >> misbehave even without the change to name-rev when multiple branches >> point at the same commit. > > Yikes. Perhaps something along this line? > > This function seems to want to learn which branch we are on, and > running "name-rev HEAD" is *NEVER* the right way to do so. If you > are on branch B which happens to point at the same commit as branch > A, "name-rev HEAD" can say either A or B (and it is likely it would > say A simply because it sorts earlier, and the logic seems to favor > the one that was discovered earlier when all else being equal). > > git-p4.py | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/git-p4.py b/git-p4.py > index eab319d76e..351d1ab58e 100755 > --- a/git-p4.py > +++ b/git-p4.py > @@ -582,7 +582,7 @@ def currentGitBranch(): > # on a detached head > return None > else: > -return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip() > +return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:] > > def isValidGitDir(path): > return git_dir(path) != None Following your explanation this patch looks good to me and this fixes the test failure. TBH I never thought about the difference of these commands before. "rev" and "ref" sound so similar although they denote completely different things. Thanks, Lars
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
Michael J Gruberwrites: >> +#define COMPARE(attribute, smaller_is_better)\ >> +if (name->attribute > attribute) \ >> +return smaller_is_better; \ >> +if (name->attribute < attribute) \ >> +return !smaller_is_better > > I find that define pretty confusing. On first reading, the "==" case > seems to be missing, but that is basically "pass" as one can see in the > later code. > > Also, the comparison ">" and "<" is used to switch between the cases > "tag vs. non-tag" and "non-tag vs. tag" which happen to be encoded by > the from_tag attribute beeing "1 vs. 0" resp "0 vs. 1" in the following: > ... > But in the next two instances you use it to do "actual" comparisons > between distances and dates: That is pretty much deliberate. The macro is designed to compare the attribute and return if one side is strictly better than the other side, and fall through to tie-breaker that follow (i.e. lack of "==" is very deliberate). Also it is not "return if one side is strictly smaller"---the second parameter decides if smaller means better or bigger means better (i.e. "from_tag" being 1 for a name based on tag and 0 for a name based on non-tag can use the macro by telling that the side that is strictly bigger is better). > How about something like > ... I did a sequence of COMPARE() macros that cascade to implement tie-breaking logic because I thought it was the easiest to read and understand, and I did "tag vs non-tag first decides, and then tiebreak to favor shorter hops and then tiebreak on date, while paying attention to committer dates even when we are comparing two commits", because I thought that would be one of the easier logic to explain to the users. But this topic is not my itch, and quite honestly I'd be happier if you took it over, improving both the implementation and the semantics it implements, following your best judgment. You are equipped with better motivation than I am to make these decisions. This patch has turned up a bug in git-p4 in that it seems to have misunderstood that "name-rev HEAD" is a good way to learn which branch we are currently on (it never is); as far as I am concerned, it has served its purpose ;-) Thanks.
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
> hops, without taking the "taggerdate" into account. As we are > taking over the "taggerdate" field to store the committer date for > tips with commits: > > (1) keep the original logic when comparing names based on two refs > both of which are from refs/tags/; > > (2) favoring a name based on a ref in refs/tags/ hierarchy over > a ref outside the hierarchy; > > (3) between two names based on a ref both outside refs/tags/, give > precedence to a name with shorter hops and use "taggerdate" > only to tie-break. > > A change to t4202 is a natural consequence. The test creates a > commit on a branch "side" and points at it with an unannotated tag > "refs/tags/side-2". The original code couldn't decide which one to > favor at all, and gave a name based on a branch (simply because > refs/heads/side sorts earlier than refs/tags/side-2). Because the > updated logic is taught to favor refs in refs/tags/ hierarchy, the > the test is updated to expect to see tags/side-2 instead. > > Signed-off-by: Junio C HamanoI think that strategy is fine and works as I expected in all tested cases. Thanks! > --- > > * I am sure others can add better heurisitics in is_better_name() >for comparing names based on non-tag refs, and I do not mind >people disagreeing with the logic that this patch happens to >implement. This is done primarily to illustrate the value of >using a separate helper function is_better_name() instead of >open-coding the selection logic in name_rev() function. > --- > builtin/name-rev.c | 57 > +- > t/t4202-log.sh | 2 +- > 2 files changed, 49 insertions(+), 10 deletions(-) > > diff --git a/builtin/name-rev.c b/builtin/name-rev.c > index f64c71d9bc..eac0180c62 100644 > --- a/builtin/name-rev.c > +++ b/builtin/name-rev.c > @@ -13,6 +13,7 @@ typedef struct rev_name { > unsigned long taggerdate; > int generation; > int distance; > + int from_tag; > } rev_name; > > static long cutoff = LONG_MAX; > @@ -24,16 +25,47 @@ static int is_better_name(struct rev_name *name, > const char *tip_name, > unsigned long taggerdate, > int generation, > - int distance) > + int distance, > + int from_tag) > { > - return (name->taggerdate > taggerdate || > - (name->taggerdate == taggerdate && > - name->distance > distance)); > + /* > + * When comparing names based on tags, prefer names > + * based on the older tag, even if it is farther away. > + */ > + if (from_tag && name->from_tag) > + return (name->taggerdate > taggerdate || > + (name->taggerdate == taggerdate && > + name->distance > distance)); > + > +#define COMPARE(attribute, smaller_is_better) \ > + if (name->attribute > attribute) \ > + return smaller_is_better; \ > + if (name->attribute < attribute) \ > + return !smaller_is_better I find that define pretty confusing. On first reading, the "==" case seems to be missing, but that is basically "pass" as one can see in the later code. Also, the comparison ">" and "<" is used to switch between the cases "tag vs. non-tag" and "non-tag vs. tag" which happen to be encoded by the from_tag attribute beeing "1 vs. 0" resp "0 vs. 1" in the following: > + > + /* > + * We know that at least one of them is a non-tag at this point. > + * favor a tag over a non-tag. > + */ > + COMPARE(from_tag, 0); > + But in the next two instances you use it to do "actual" comparisons between distances and dates: > + /* > + * We are now looking at two non-tags. Tiebreak to favor > + * shorter hops. > + */ > + COMPARE(distance, 1); > + > + /* ... or tiebreak to favor older date */ > + COMPARE(taggerdate, 1); > + > + /* keep the current one if we cannot decide */ > + return 0; > +#undef COMPARE > } So, while I do understand that now, I'm wondering whether I will next time ;) How about something like /* favor tag over non-tag */ if (name->from_tag != from_tag) return from_tag; at least for the first instance and possibly /* favor shorter hops for non-tags */ if (name->distance != distance) return name->distance > distance; /* tie-break by date */ if (name->taggerdate != taggerdate) return name->taggerdate > taggerdate; > > static void name_rev(struct commit *commit, > const char *tip_name, unsigned long taggerdate, > - int generation, int distance, > + int generation, int distance, int from_tag, > int deref) > { > struct rev_name *name = (struct rev_name *)commit->util; > @@ -57,12 +89,13 @@ static void name_rev(struct commit *commit, >
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
Junio C Hamanowrites: > Lars Schneider writes: > >> A quick bisect indicates that this patch might break >> t9807-git-p4-submit.sh 8 and 13. I haven't looked into >> it further, yet. > > As I do not do P4, help in diagnosing why it breaks is appreciated. > If the test script expects... > On the other hand, if git-p4 command internally uses name-rev and it > is not prepared to properly handle commits that can be named in more > than one way, the problem would be deeper, as it would mean it can > misbehave even without the change to name-rev when multiple branches > point at the same commit. Yikes. Perhaps something along this line? This function seems to want to learn which branch we are on, and running "name-rev HEAD" is *NEVER* the right way to do so. If you are on branch B which happens to point at the same commit as branch A, "name-rev HEAD" can say either A or B (and it is likely it would say A simply because it sorts earlier, and the logic seems to favor the one that was discovered earlier when all else being equal). git-p4.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-p4.py b/git-p4.py index eab319d76e..351d1ab58e 100755 --- a/git-p4.py +++ b/git-p4.py @@ -582,7 +582,7 @@ def currentGitBranch(): # on a detached head return None else: -return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip() +return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:] def isValidGitDir(path): return git_dir(path) != None
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
Lars Schneiderwrites: > A quick bisect indicates that this patch might break > t9807-git-p4-submit.sh 8 and 13. I haven't looked into > it further, yet. As I do not do P4, help in diagnosing why it breaks is appreciated. If the test script expects a commit, that can be described in two more more ways, is named one way (e.g. named after a branch) and the new "favor tags even if they are unannotated" behaviour breaks its expectation, and the only thing the test _should_ care about is what commit the result is, then clearly the test script needs to be fixed. On the other hand, if git-p4 command internally uses name-rev and it is not prepared to properly handle commits that can be named in more than one way, the problem would be deeper, as it would mean it can misbehave even without the change to name-rev when multiple branches point at the same commit. Thanks.
Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
> On 16 Mar 2017, at 06:50, Junio C Hamanowrote: > > "git name-rev" assigned a phony "far in the future" date to tips of > refs that are not pointing at tag objects, and favored names based > on a ref with the oldest date. This made it almost impossible for > an unannotated tags and branches to be counted as a viable base, > which was especially problematic when the command is run with the > "--tags" option. If an unannotated tag that points at an ancient > commit and an annotated tag that points at a much newer commit > reaches the commit that is being named, the old unannotated tag was > ignored. > > Update the "taggerdate" field of the rev-name structure, which is > initialized from the tip of ref, to have the committer date if the > object at the tip of ref is a commit, not a tag, so that we can > optionally take it into account when doing "is this name better?" > comparison logic. > > When "name-rev" is run without the "--tags" option, the general > expectation is still to name the commit based on a tag if possible, > but use non-tag refs as fallback, and tiebreak among these non-tag > refs by favoring names with shorter hops from the tip. The use of a > phony "far in the future" date in the original code was an effective > way to ensure this expectation is held: a non-tag tip gets the same > "far in the future" timestamp, giving precedence to tags, and among > non-tag tips, names with shorter hops are preferred over longer > hops, without taking the "taggerdate" into account. As we are > taking over the "taggerdate" field to store the committer date for > tips with commits: > > (1) keep the original logic when comparing names based on two refs > both of which are from refs/tags/; > > (2) favoring a name based on a ref in refs/tags/ hierarchy over > a ref outside the hierarchy; > > (3) between two names based on a ref both outside refs/tags/, give > precedence to a name with shorter hops and use "taggerdate" > only to tie-break. > > A change to t4202 is a natural consequence. The test creates a > commit on a branch "side" and points at it with an unannotated tag > "refs/tags/side-2". The original code couldn't decide which one to > favor at all, and gave a name based on a branch (simply because > refs/heads/side sorts earlier than refs/tags/side-2). Because the > updated logic is taught to favor refs in refs/tags/ hierarchy, the > the test is updated to expect to see tags/side-2 instead. > > Signed-off-by: Junio C Hamano > --- A quick bisect indicates that this patch might break t9807-git-p4-submit.sh 8 and 13. I haven't looked into it further, yet. https://travis-ci.org/git/git/jobs/211948406#L2152 https://travis-ci.org/git/git/jobs/211948406#L2460 Non-JS: https://s3.amazonaws.com/archive.travis-ci.org/jobs/211948406/log.txt - Lars
[PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak
"git name-rev" assigned a phony "far in the future" date to tips of refs that are not pointing at tag objects, and favored names based on a ref with the oldest date. This made it almost impossible for an unannotated tags and branches to be counted as a viable base, which was especially problematic when the command is run with the "--tags" option. If an unannotated tag that points at an ancient commit and an annotated tag that points at a much newer commit reaches the commit that is being named, the old unannotated tag was ignored. Update the "taggerdate" field of the rev-name structure, which is initialized from the tip of ref, to have the committer date if the object at the tip of ref is a commit, not a tag, so that we can optionally take it into account when doing "is this name better?" comparison logic. When "name-rev" is run without the "--tags" option, the general expectation is still to name the commit based on a tag if possible, but use non-tag refs as fallback, and tiebreak among these non-tag refs by favoring names with shorter hops from the tip. The use of a phony "far in the future" date in the original code was an effective way to ensure this expectation is held: a non-tag tip gets the same "far in the future" timestamp, giving precedence to tags, and among non-tag tips, names with shorter hops are preferred over longer hops, without taking the "taggerdate" into account. As we are taking over the "taggerdate" field to store the committer date for tips with commits: (1) keep the original logic when comparing names based on two refs both of which are from refs/tags/; (2) favoring a name based on a ref in refs/tags/ hierarchy over a ref outside the hierarchy; (3) between two names based on a ref both outside refs/tags/, give precedence to a name with shorter hops and use "taggerdate" only to tie-break. A change to t4202 is a natural consequence. The test creates a commit on a branch "side" and points at it with an unannotated tag "refs/tags/side-2". The original code couldn't decide which one to favor at all, and gave a name based on a branch (simply because refs/heads/side sorts earlier than refs/tags/side-2). Because the updated logic is taught to favor refs in refs/tags/ hierarchy, the the test is updated to expect to see tags/side-2 instead. Signed-off-by: Junio C Hamano--- * I am sure others can add better heurisitics in is_better_name() for comparing names based on non-tag refs, and I do not mind people disagreeing with the logic that this patch happens to implement. This is done primarily to illustrate the value of using a separate helper function is_better_name() instead of open-coding the selection logic in name_rev() function. --- builtin/name-rev.c | 57 +- t/t4202-log.sh | 2 +- 2 files changed, 49 insertions(+), 10 deletions(-) diff --git a/builtin/name-rev.c b/builtin/name-rev.c index f64c71d9bc..eac0180c62 100644 --- a/builtin/name-rev.c +++ b/builtin/name-rev.c @@ -13,6 +13,7 @@ typedef struct rev_name { unsigned long taggerdate; int generation; int distance; + int from_tag; } rev_name; static long cutoff = LONG_MAX; @@ -24,16 +25,47 @@ static int is_better_name(struct rev_name *name, const char *tip_name, unsigned long taggerdate, int generation, - int distance) + int distance, + int from_tag) { - return (name->taggerdate > taggerdate || - (name->taggerdate == taggerdate && -name->distance > distance)); + /* +* When comparing names based on tags, prefer names +* based on the older tag, even if it is farther away. +*/ + if (from_tag && name->from_tag) + return (name->taggerdate > taggerdate || + (name->taggerdate == taggerdate && +name->distance > distance)); + +#define COMPARE(attribute, smaller_is_better) \ + if (name->attribute > attribute) \ + return smaller_is_better; \ + if (name->attribute < attribute) \ + return !smaller_is_better + + /* +* We know that at least one of them is a non-tag at this point. +* favor a tag over a non-tag. +*/ + COMPARE(from_tag, 0); + + /* +* We are now looking at two non-tags. Tiebreak to favor +* shorter hops. +*/ + COMPARE(distance, 1); + + /* ... or tiebreak to favor older date */ + COMPARE(taggerdate, 1); + + /* keep the current one if we cannot decide */ + return 0; +#undef COMPARE } static void name_rev(struct commit *commit, const char *tip_name, unsigned long taggerdate, - int generation, int distance, + int