Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-17 Thread Junio C Hamano
Junio C Hamano  writes:

> Lars Schneider  writes:
> ...
>> Following your explanation this patch looks good to me and this fixes the
>> test failure. TBH I never thought about the difference of these commands
>> before. "rev" and "ref" sound so similar although they denote completely 
>> different things.
>
> Thanks for testing.
> ...
> Hence, I would really prefer not to commit mine myself.  I'd rather
> see somebody from git-p4 circle to come up with a version that is
> more in line with the way things are done in the existing code and
> send a tested version for me to apply.

Tentatively I queued my hack together with the name-rev thing on
'pu', and Travis seems to be OK with the result.  

It would be nice if we can "fix" the use of "name-rev HEAD" in "git
p4" sooner rather than later.  If the code truly uses it to figure
out what branch we are currently on and acts on it, as the name of
that function suggests, it may be easy to construct a testcase where
the code without the fix does a wrong thing (e.g. have two branches
pointing at the same commit, and tell "git p4" to act on the one
that sorts later in the "git branch --list" output; the command
would act as if it were working on the other branch), and shows that
the patch fixes that problem.  That would be a bug worth fixing
quickly regardless of the "name-rev" change michael wanted to make.

Thanks.



Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-17 Thread Junio C Hamano
Lars Schneider  writes:

>> git-p4.py | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/git-p4.py b/git-p4.py
>> index eab319d76e..351d1ab58e 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -582,7 +582,7 @@ def currentGitBranch():
>> # on a detached head
>> return None
>> else:
>> -return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip()
>> +return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:]
>> 
>> def isValidGitDir(path):
>> return git_dir(path) != None
>
> Following your explanation this patch looks good to me and this fixes the
> test failure. TBH I never thought about the difference of these commands
> before. "rev" and "ref" sound so similar although they denote completely 
> different things.

Thanks for testing.

The above was done merely to point out the problematic place and a
possible solution.  As I am not familiar with the code in git-p4.py,
I didn't even try to check if the code already has a helper function
that strips "refs/heads/" from the beginning of the string (iow, I
am not happy with the [11:]).  I didn't and don't like the fact that
this function now runs "symbolic-ref HEAD" twice but I didn't try to
see if there are more suitable and idiomatic ways to do this with a
single invocation.

Hence, I would really prefer not to commit mine myself.  I'd rather
see somebody from git-p4 circle to come up with a version that is
more in line with the way things are done in the existing code and
send a tested version for me to apply.

Thanks.




Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-17 Thread Lars Schneider

> On 17 Mar 2017, at 13:56, Junio C Hamano  wrote:
> 
> Junio C Hamano  writes:
> 
>> Lars Schneider  writes:
>> 
>>> A quick bisect indicates that this patch might break 
>>> t9807-git-p4-submit.sh 8 and 13. I haven't looked into
>>> it further, yet.
>> 
>> As I do not do P4, help in diagnosing why it breaks is appreciated.
>> If the test script expects...
>> On the other hand, if git-p4 command internally uses name-rev and it
>> is not prepared to properly handle commits that can be named in more
>> than one way, the problem would be deeper, as it would mean it can
>> misbehave even without the change to name-rev when multiple branches
>> point at the same commit.
> 
> Yikes.  Perhaps something along this line?  
> 
> This function seems to want to learn which branch we are on, and
> running "name-rev HEAD" is *NEVER* the right way to do so.  If you
> are on branch B which happens to point at the same commit as branch
> A, "name-rev HEAD" can say either A or B (and it is likely it would
> say A simply because it sorts earlier, and the logic seems to favor
> the one that was discovered earlier when all else being equal).
> 
> git-p4.py | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/git-p4.py b/git-p4.py
> index eab319d76e..351d1ab58e 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -582,7 +582,7 @@ def currentGitBranch():
> # on a detached head
> return None
> else:
> -return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip()
> +return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:]
> 
> def isValidGitDir(path):
> return git_dir(path) != None

Following your explanation this patch looks good to me and this fixes the
test failure. TBH I never thought about the difference of these commands
before. "rev" and "ref" sound so similar although they denote completely 
different things.

Thanks,
Lars

Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-17 Thread Junio C Hamano
Michael J Gruber  writes:

>> +#define COMPARE(attribute, smaller_is_better)\
>> +if (name->attribute > attribute) \
>> +return smaller_is_better; \
>> +if (name->attribute < attribute) \
>> +return !smaller_is_better
>
> I find that define pretty confusing. On first reading, the "==" case
> seems to be missing, but that is basically "pass" as one can see in the
> later code.
>
> Also, the comparison ">"  and "<" is used to switch between the cases
> "tag vs. non-tag" and "non-tag vs. tag" which happen to be encoded by
> the from_tag attribute beeing "1 vs. 0" resp "0 vs. 1" in the following:
> ...
> But in the next two instances you use it to do "actual" comparisons
> between distances and dates:

That is pretty much deliberate.  The macro is designed to compare
the attribute and return if one side is strictly better than the
other side, and fall through to tie-breaker that follow (i.e. lack
of "==" is very deliberate).  Also it is not "return if one side is
strictly smaller"---the second parameter decides if smaller means
better or bigger means better (i.e. "from_tag" being 1 for a name
based on tag and 0 for a name based on non-tag can use the macro by
telling that the side that is strictly bigger is better).

> How about something like
> ...

I did a sequence of COMPARE() macros that cascade to implement
tie-breaking logic because I thought it was the easiest to read and
understand, and I did "tag vs non-tag first decides, and then
tiebreak to favor shorter hops and then tiebreak on date, while
paying attention to committer dates even when we are comparing two
commits", because I thought that would be one of the easier logic to
explain to the users.  

But this topic is not my itch, and quite honestly I'd be happier if
you took it over, improving both the implementation and the
semantics it implements, following your best judgment.  You are
equipped with better motivation than I am to make these decisions.

This patch has turned up a bug in git-p4 in that it seems to have
misunderstood that "name-rev HEAD" is a good way to learn which
branch we are currently on (it never is); as far as I am concerned,
it has served its purpose ;-)

Thanks.


Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-17 Thread Michael J Gruber
> hops, without taking the "taggerdate" into account.  As we are
> taking over the "taggerdate" field to store the committer date for
> tips with commits:
> 
>  (1) keep the original logic when comparing names based on two refs
>  both of which are from refs/tags/;
> 
>  (2) favoring a name based on a ref in refs/tags/ hierarchy over
>  a ref outside the hierarchy;
> 
>  (3) between two names based on a ref both outside refs/tags/, give
>  precedence to a name with shorter hops and use "taggerdate"
>  only to tie-break.
> 
> A change to t4202 is a natural consequence.  The test creates a
> commit on a branch "side" and points at it with an unannotated tag
> "refs/tags/side-2".  The original code couldn't decide which one to
> favor at all, and gave a name based on a branch (simply because
> refs/heads/side sorts earlier than refs/tags/side-2).  Because the
> updated logic is taught to favor refs in refs/tags/ hierarchy, the
> the test is updated to expect to see tags/side-2 instead.
> 
> Signed-off-by: Junio C Hamano 

I think that strategy is fine and works as I expected in all tested
cases. Thanks!

> ---
> 
>  * I am sure others can add better heurisitics in is_better_name()
>for comparing names based on non-tag refs, and I do not mind
>people disagreeing with the logic that this patch happens to
>implement.  This is done primarily to illustrate the value of
>using a separate helper function is_better_name() instead of
>open-coding the selection logic in name_rev() function.
> ---
>  builtin/name-rev.c | 57 
> +-
>  t/t4202-log.sh |  2 +-
>  2 files changed, 49 insertions(+), 10 deletions(-)
> 
> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
> index f64c71d9bc..eac0180c62 100644
> --- a/builtin/name-rev.c
> +++ b/builtin/name-rev.c
> @@ -13,6 +13,7 @@ typedef struct rev_name {
>   unsigned long taggerdate;
>   int generation;
>   int distance;
> + int from_tag;
>  } rev_name;
>  
>  static long cutoff = LONG_MAX;
> @@ -24,16 +25,47 @@ static int is_better_name(struct rev_name *name,
> const char *tip_name,
> unsigned long taggerdate,
> int generation,
> -   int distance)
> +   int distance,
> +   int from_tag)
>  {
> - return (name->taggerdate > taggerdate ||
> - (name->taggerdate == taggerdate &&
> -  name->distance > distance));
> + /*
> +  * When comparing names based on tags, prefer names
> +  * based on the older tag, even if it is farther away.
> +  */
> + if (from_tag && name->from_tag)
> + return (name->taggerdate > taggerdate ||
> + (name->taggerdate == taggerdate &&
> +  name->distance > distance));
> +
> +#define COMPARE(attribute, smaller_is_better) \
> + if (name->attribute > attribute) \
> + return smaller_is_better; \
> + if (name->attribute < attribute) \
> + return !smaller_is_better

I find that define pretty confusing. On first reading, the "==" case
seems to be missing, but that is basically "pass" as one can see in the
later code.

Also, the comparison ">"  and "<" is used to switch between the cases
"tag vs. non-tag" and "non-tag vs. tag" which happen to be encoded by
the from_tag attribute beeing "1 vs. 0" resp "0 vs. 1" in the following:

> +
> + /*
> +  * We know that at least one of them is a non-tag at this point.
> +  * favor a tag over a non-tag.
> +  */
> + COMPARE(from_tag, 0);
> +

But in the next two instances you use it to do "actual" comparisons
between distances and dates:

> + /*
> +  * We are now looking at two non-tags.  Tiebreak to favor
> +  * shorter hops.
> +  */
> + COMPARE(distance, 1);
> +
> + /* ... or tiebreak to favor older date */
> + COMPARE(taggerdate, 1);
> +
> + /* keep the current one if we cannot decide */
> + return 0;
> +#undef COMPARE
>  }

So, while I do understand that now, I'm wondering whether I will next
time ;)

How about something like

/* favor tag over non-tag */
if (name->from_tag != from_tag)
return from_tag;

at least for the first instance and possibly

/* favor shorter hops for non-tags */
if (name->distance != distance)
return name->distance > distance;

/* tie-break by date */
if (name->taggerdate != taggerdate)
return name->taggerdate > taggerdate;

>  
>  static void name_rev(struct commit *commit,
>   const char *tip_name, unsigned long taggerdate,
> - int generation, int distance,
> + int generation, int distance, int from_tag,
>   int deref)
>  {
>   struct rev_name *name = (struct rev_name *)commit->util;
> @@ -57,12 +89,13 @@ static void name_rev(struct commit *commit,
>   

Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-16 Thread Junio C Hamano
Junio C Hamano  writes:

> Lars Schneider  writes:
>
>> A quick bisect indicates that this patch might break 
>> t9807-git-p4-submit.sh 8 and 13. I haven't looked into
>> it further, yet.
>
> As I do not do P4, help in diagnosing why it breaks is appreciated.
> If the test script expects...
> On the other hand, if git-p4 command internally uses name-rev and it
> is not prepared to properly handle commits that can be named in more
> than one way, the problem would be deeper, as it would mean it can
> misbehave even without the change to name-rev when multiple branches
> point at the same commit.

Yikes.  Perhaps something along this line?  

This function seems to want to learn which branch we are on, and
running "name-rev HEAD" is *NEVER* the right way to do so.  If you
are on branch B which happens to point at the same commit as branch
A, "name-rev HEAD" can say either A or B (and it is likely it would
say A simply because it sorts earlier, and the logic seems to favor
the one that was discovered earlier when all else being equal).

 git-p4.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-p4.py b/git-p4.py
index eab319d76e..351d1ab58e 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -582,7 +582,7 @@ def currentGitBranch():
 # on a detached head
 return None
 else:
-return read_pipe(["git", "name-rev", "HEAD"]).split(" ")[1].strip()
+return read_pipe(["git", "symbolic-ref", "HEAD"]).strip()[11:]
 
 def isValidGitDir(path):
 return git_dir(path) != None


Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-16 Thread Junio C Hamano
Lars Schneider  writes:

> A quick bisect indicates that this patch might break 
> t9807-git-p4-submit.sh 8 and 13. I haven't looked into
> it further, yet.

As I do not do P4, help in diagnosing why it breaks is appreciated.
If the test script expects a commit, that can be described in two
more more ways, is named one way (e.g. named after a branch) and the
new "favor tags even if they are unannotated" behaviour breaks its
expectation, and the only thing the test _should_ care about is what
commit the result is, then clearly the test script needs to be fixed.
On the other hand, if git-p4 command internally uses name-rev and it
is not prepared to properly handle commits that can be named in more
than one way, the problem would be deeper, as it would mean it can
misbehave even without the change to name-rev when multiple branches
point at the same commit.

Thanks.



Re: [PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-16 Thread Lars Schneider

> On 16 Mar 2017, at 06:50, Junio C Hamano  wrote:
> 
> "git name-rev" assigned a phony "far in the future" date to tips of
> refs that are not pointing at tag objects, and favored names based
> on a ref with the oldest date.  This made it almost impossible for
> an unannotated tags and branches to be counted as a viable base,
> which was especially problematic when the command is run with the
> "--tags" option.  If an unannotated tag that points at an ancient
> commit and an annotated tag that points at a much newer commit
> reaches the commit that is being named, the old unannotated tag was
> ignored.
> 
> Update the "taggerdate" field of the rev-name structure, which is
> initialized from the tip of ref, to have the committer date if the
> object at the tip of ref is a commit, not a tag, so that we can
> optionally take it into account when doing "is this name better?"
> comparison logic.
> 
> When "name-rev" is run without the "--tags" option, the general
> expectation is still to name the commit based on a tag if possible,
> but use non-tag refs as fallback, and tiebreak among these non-tag
> refs by favoring names with shorter hops from the tip.  The use of a
> phony "far in the future" date in the original code was an effective
> way to ensure this expectation is held: a non-tag tip gets the same
> "far in the future" timestamp, giving precedence to tags, and among
> non-tag tips, names with shorter hops are preferred over longer
> hops, without taking the "taggerdate" into account.  As we are
> taking over the "taggerdate" field to store the committer date for
> tips with commits:
> 
> (1) keep the original logic when comparing names based on two refs
> both of which are from refs/tags/;
> 
> (2) favoring a name based on a ref in refs/tags/ hierarchy over
> a ref outside the hierarchy;
> 
> (3) between two names based on a ref both outside refs/tags/, give
> precedence to a name with shorter hops and use "taggerdate"
> only to tie-break.
> 
> A change to t4202 is a natural consequence.  The test creates a
> commit on a branch "side" and points at it with an unannotated tag
> "refs/tags/side-2".  The original code couldn't decide which one to
> favor at all, and gave a name based on a branch (simply because
> refs/heads/side sorts earlier than refs/tags/side-2).  Because the
> updated logic is taught to favor refs in refs/tags/ hierarchy, the
> the test is updated to expect to see tags/side-2 instead.
> 
> Signed-off-by: Junio C Hamano 
> ---

A quick bisect indicates that this patch might break 
t9807-git-p4-submit.sh 8 and 13. I haven't looked into
it further, yet.

https://travis-ci.org/git/git/jobs/211948406#L2152
https://travis-ci.org/git/git/jobs/211948406#L2460

Non-JS: https://s3.amazonaws.com/archive.travis-ci.org/jobs/211948406/log.txt

- Lars


[PATCH 2/2] name-rev: favor describing with tags and use committer date to tiebreak

2017-03-15 Thread Junio C Hamano
"git name-rev" assigned a phony "far in the future" date to tips of
refs that are not pointing at tag objects, and favored names based
on a ref with the oldest date.  This made it almost impossible for
an unannotated tags and branches to be counted as a viable base,
which was especially problematic when the command is run with the
"--tags" option.  If an unannotated tag that points at an ancient
commit and an annotated tag that points at a much newer commit
reaches the commit that is being named, the old unannotated tag was
ignored.

Update the "taggerdate" field of the rev-name structure, which is
initialized from the tip of ref, to have the committer date if the
object at the tip of ref is a commit, not a tag, so that we can
optionally take it into account when doing "is this name better?"
comparison logic.

When "name-rev" is run without the "--tags" option, the general
expectation is still to name the commit based on a tag if possible,
but use non-tag refs as fallback, and tiebreak among these non-tag
refs by favoring names with shorter hops from the tip.  The use of a
phony "far in the future" date in the original code was an effective
way to ensure this expectation is held: a non-tag tip gets the same
"far in the future" timestamp, giving precedence to tags, and among
non-tag tips, names with shorter hops are preferred over longer
hops, without taking the "taggerdate" into account.  As we are
taking over the "taggerdate" field to store the committer date for
tips with commits:

 (1) keep the original logic when comparing names based on two refs
 both of which are from refs/tags/;

 (2) favoring a name based on a ref in refs/tags/ hierarchy over
 a ref outside the hierarchy;

 (3) between two names based on a ref both outside refs/tags/, give
 precedence to a name with shorter hops and use "taggerdate"
 only to tie-break.

A change to t4202 is a natural consequence.  The test creates a
commit on a branch "side" and points at it with an unannotated tag
"refs/tags/side-2".  The original code couldn't decide which one to
favor at all, and gave a name based on a branch (simply because
refs/heads/side sorts earlier than refs/tags/side-2).  Because the
updated logic is taught to favor refs in refs/tags/ hierarchy, the
the test is updated to expect to see tags/side-2 instead.

Signed-off-by: Junio C Hamano 
---

 * I am sure others can add better heurisitics in is_better_name()
   for comparing names based on non-tag refs, and I do not mind
   people disagreeing with the logic that this patch happens to
   implement.  This is done primarily to illustrate the value of
   using a separate helper function is_better_name() instead of
   open-coding the selection logic in name_rev() function.
---
 builtin/name-rev.c | 57 +-
 t/t4202-log.sh |  2 +-
 2 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/builtin/name-rev.c b/builtin/name-rev.c
index f64c71d9bc..eac0180c62 100644
--- a/builtin/name-rev.c
+++ b/builtin/name-rev.c
@@ -13,6 +13,7 @@ typedef struct rev_name {
unsigned long taggerdate;
int generation;
int distance;
+   int from_tag;
 } rev_name;
 
 static long cutoff = LONG_MAX;
@@ -24,16 +25,47 @@ static int is_better_name(struct rev_name *name,
  const char *tip_name,
  unsigned long taggerdate,
  int generation,
- int distance)
+ int distance,
+ int from_tag)
 {
-   return (name->taggerdate > taggerdate ||
-   (name->taggerdate == taggerdate &&
-name->distance > distance));
+   /*
+* When comparing names based on tags, prefer names
+* based on the older tag, even if it is farther away.
+*/
+   if (from_tag && name->from_tag)
+   return (name->taggerdate > taggerdate ||
+   (name->taggerdate == taggerdate &&
+name->distance > distance));
+
+#define COMPARE(attribute, smaller_is_better)   \
+   if (name->attribute > attribute) \
+   return smaller_is_better; \
+   if (name->attribute < attribute) \
+   return !smaller_is_better
+
+   /*
+* We know that at least one of them is a non-tag at this point.
+* favor a tag over a non-tag.
+*/
+   COMPARE(from_tag, 0);
+
+   /*
+* We are now looking at two non-tags.  Tiebreak to favor
+* shorter hops.
+*/
+   COMPARE(distance, 1);
+
+   /* ... or tiebreak to favor older date */
+   COMPARE(taggerdate, 1);
+
+   /* keep the current one if we cannot decide */
+   return 0;
+#undef COMPARE
 }
 
 static void name_rev(struct commit *commit,
const char *tip_name, unsigned long taggerdate,
-   int generation, int distance,
+   int