Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Stefan Beller writes: > On Tue, Dec 19, 2017 at 11:22 AM, Junio C Hamano wrote: >> I had to squash in the following to make 'pu' pass under >> gettext-poison build. Is this ready for 'next' otherwise? > > I saw that in pu, thanks for squashing. I should have spoken > up earlier confirming it. So is this ready for 'next' if it is squashed? >> With the "log --find-object" thing, it may be that this no longer is >> needed, but then again we haven't done anything with the other >> Jonathan's idea to unify the --find-object thing into the --pickaxe >> framework. > > I'll look into that after I finish looking at a submodule related bug. Thanks. >> It seems that we tend to open and then abandon new interests without >> seeing them through completion, leaving too many loose ends untied. >> This has to stop. > > I'll try to find my balance again. Sorry, but this was not really about you, but was more about me. I probably should more aggressively drop a topic that stalled and also merge a topic that reached the point of diminishing returns to 'next'.
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
On Tue, Dec 19, 2017 at 11:22 AM, Junio C Hamano wrote: > I had to squash in the following to make 'pu' pass under > gettext-poison build. Is this ready for 'next' otherwise? I saw that in pu, thanks for squashing. I should have spoken up earlier confirming it. > With the "log --find-object" thing, it may be that this no longer is > needed, but then again we haven't done anything with the other > Jonathan's idea to unify the --find-object thing into the --pickaxe > framework. I'll look into that after I finish looking at a submodule related bug. > It seems that we tend to open and then abandon new interests without > seeing them through completion, leaving too many loose ends untied. > This has to stop. I'll try to find my balance again. When working on too few topics at the same time, sometimes I get stalled waiting for the mailing list (or internal reviewers) to respond, so I start many topics, which then overwhelm me after a while once reviews catch up. In this specific case it was not clear what the best way is, i.e. the interest was rather broad: "Hey I have this blob sha1, how do I get more information?", which can be solved in many different ways, so I tried some of them. (Another alternative just now considered: I could have wrapped that script from stackoverflow into a new command and called it a day, though that has very poor benefits for the rest of ecosystem, an addition to an established powerful command gives more benefits IMHO) Thanks, Stefan
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
I had to squash in the following to make 'pu' pass under gettext-poison build. Is this ready for 'next' otherwise? With the "log --find-object" thing, it may be that this no longer is needed, but then again we haven't done anything with the other Jonathan's idea to unify the --find-object thing into the --pickaxe framework. It seems that we tend to open and then abandon new interests without seeing them through completion, leaving too many loose ends untied. This has to stop. diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 4668f0058e..3e3fb462a0 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -341,7 +341,7 @@ test_expect_success 'describe directly tagged blob' ' test_expect_success 'describe tag object' ' git tag test-blob-1 -a -m msg unique-file:file && test_must_fail git describe test-blob-1 2>actual && - grep "fatal: test-blob-1 is neither a commit nor blob" actual + test_i18ngrep "fatal: test-blob-1 is neither a commit nor blob" actual ' test_expect_failure ULIMIT_STACK_SIZE 'name-rev works in a deep repo' '
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Stefan Beller writes: >> I was reacting to your "fixed". So will we see a rerolled series or >> not? > > I was not planning on rerolling it. OK. Then I do not have to worry about this one and drop it perhaps. One less topic on 'pu' is always good, if it is not active. Enjoy your vacation. Thanks.
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
On Thu, Nov 23, 2017 at 11:18 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano wrote: >>> Stefan Beller writes: >>> ... fixed. ... fixed the text... ... I am not fully convinced all descriptions are in recent history, but I tend to agree that most are, so probably the trade off is a wash. >>> >>> So what do we want with this topic? I think the "teach 'git log' to >>> highlight commits whose changes involve the given blob" is a more or >>> less an orthogonal thing, >> >> Well, both of them solve our immediate needs, so I'd be fine with pursuing >> just one of them, but I do not oppose taking both. >> >>> and I suspect that it is something users >>> may (although I personally do not) find valuable to have a related >>> but different feature in "git describe". >> >> agreed. > > I was reacting to your "fixed". So will we see a rerolled series or > not? I was not planning on rerolling it. FYI: I am on vacation until next Monday, so if there is anything to be fixed here, I'd do it in a week, which may be enough time to warrant incremental fixes?
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Stefan Beller writes: > On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano wrote: >> Stefan Beller writes: >> >>> ... >>> fixed. >>> ... >>> fixed the text... >>> ... >>> I am not fully convinced all descriptions are in recent history, but I >>> tend to agree that most are, so probably the trade off is a wash. >> >> So what do we want with this topic? I think the "teach 'git log' to >> highlight commits whose changes involve the given blob" is a more or >> less an orthogonal thing, > > Well, both of them solve our immediate needs, so I'd be fine with pursuing > just one of them, but I do not oppose taking both. > >> and I suspect that it is something users >> may (although I personally do not) find valuable to have a related >> but different feature in "git describe". > > agreed. I was reacting to your "fixed". So will we see a rerolled series or not?
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
On Tue, Nov 21, 2017 at 11:55 PM, Junio C Hamano wrote: > Stefan Beller writes: > >> ... >> fixed. >> ... >> fixed the text... >> ... >> I am not fully convinced all descriptions are in recent history, but I >> tend to agree that most are, so probably the trade off is a wash. > > So what do we want with this topic? I think the "teach 'git log' to > highlight commits whose changes involve the given blob" is a more or > less an orthogonal thing, Well, both of them solve our immediate needs, so I'd be fine with pursuing just one of them, but I do not oppose taking both. > and I suspect that it is something users > may (although I personally do not) find valuable to have a related > but different feature in "git describe". agreed.
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Stefan Beller writes: > ... > fixed. > ... > fixed the text... > ... > I am not fully convinced all descriptions are in recent history, but I > tend to agree that most are, so probably the trade off is a wash. So what do we want with this topic? I think the "teach 'git log' to highlight commits whose changes involve the given blob" is a more or less an orthogonal thing, and I suspect that it is something users may (although I personally do not) find valuable to have a related but different feature in "git describe".
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
From: "Stefan Beller" Sent: Thursday, November 16, 2017 2:00 AM [in catch up mode..] Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller --- Documentation/git-describe.txt | 18 ++-- builtin/describe.c | 62 ++ t/t6120-describe.sh| 34 +++ 3 files changed, 107 insertions(+), 7 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..e027fb8c4b 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,14 +3,14 @@ git-describe(1) NAME -git-describe - Describe a commit using the most recent tag reachable from it - +git-describe - Give an object a human readable name based on an available ref SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=] [...] 'git describe' [--all] [--tags] [--contains] [--abbrev=] --dirty[=] +'git describe' DESCRIPTION --- @@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given object refers to a blob, it will be described +as `:`, such that the blob can be found +at `` in the ``, which itself describes the +first commit in which this blob occurs in a reverse revision walk +from HEAD. + OPTIONS --- ...:: @@ -186,6 +192,14 @@ selected and output. Here fewest commits different is defined as the number of commits which would be shown by `git log tag..input` will be the smallest number of commits possible. +BUGS + + +Tree objects as well as tag objects not pointing at commits, cannot be described. Is this true? Is it stand alone from the describing of a blob? If so should it be its own patchlet. - I thought I'd read that within the series there is now a tree / tag (of blob/trees) description capability. I'd prefer that we don't start with the "can't" view (relative to the subsequent sentences of the paragraph). It puts off the reader - we are about to say what can be described but in a limited way - the limitation being the bug. Maybe just swap the line to form a second paragraph. +When describing blobs, the lightweight tags pointing at blobs are ignored, +but the blob is still described as : despite the lightweight +tag being favorable. + -- Philip GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..5b4bfaba3f 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; + struct rev_info *revs; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data)
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
On Wed, Nov 15, 2017 at 7:24 PM, Junio C Hamano wrote: > I am not sure if "And if there is ..." is adding much value here (I > do not think it is even technically correct for that matter). If > there are more than one tag that point at the commit the user is > interested in, we use one of the tags, as tags conceptually sit at a > higher level. And we use a heuristic to use one or the other tag to > make up a name for the commit, so the same commit can have two valid > names. ---So what? Neither of these two valid names is "ambigous"; > the commit object the user wanted to name _is_ correctly identified > (I would assume that we are not discussing a hash collision). > > Lucikly, if we remove "And if...precisely", the logic still flows > nicely, if not more, to the next paragraph. fixed. >> When describing a blob, we want to describe the blob from a higher layer >> as well, which is a tuple of (commit, deep/path) as the tree objects >> involved are rather uninteresting. The same blob can be referenced by >> multiple commits, so how we decide which commit to use? This patch >> implements a rather naive approach on this: As there are no back pointers >> from blobs to commits in which the blob occurs, we'll start walking from >> any tips available, listing the blobs in-order of the commit and once we > > Is "any tips" still the case? I was wondering why you start your > traversal at HEAD and nothing else in this iteration. There seems > to be no mention of this design decision in the documentation and no > justification in the log. fixed the text. The design decision to reverse walk HEAD is tied to your observation below: > The reversing may improve the chance of an older commit to be chosen > rather than the newer one, but it does not even guarantee to show the > "introduction". This is what I realized when we started walking all refs including reflog. The introduction can only be found when we take the commit-parents into account and look at the diffs. I noticed you started origin/jc/diff-blobfind which may be helpful to find the introduction correctly, which I'll play around with that and see if I can get that working. > What this guarantees is that a long history will be traversed fully > before we start considering which commit has the blob of interest, I > am afraid. Is this a sensible trade-off? I am not fully convinced all descriptions are in recent history, but I tend to agree that most are, so probably the trade off is a wash.
Re: [PATCHv5 7/7] builtin/describe.c: describe a blob
Stefan Beller writes: > When describing commits, we try to anchor them to tags or refs, as these > are conceptually on a higher level than the commit. And if there is no ref > or tag that matches exactly, we're out of luck. So we employ a heuristic > to make up a name for the commit. These names are ambiguous, there might > be different tags or refs to anchor to, and there might be different > path in the DAG to travel to arrive at the commit precisely. I am not sure if "And if there is ..." is adding much value here (I do not think it is even technically correct for that matter). If there are more than one tag that point at the commit the user is interested in, we use one of the tags, as tags conceptually sit at a higher level. And we use a heuristic to use one or the other tag to make up a name for the commit, so the same commit can have two valid names. ---So what? Neither of these two valid names is "ambigous"; the commit object the user wanted to name _is_ correctly identified (I would assume that we are not discussing a hash collision). Lucikly, if we remove "And if...precisely", the logic still flows nicely, if not more, to the next paragraph. > When describing a blob, we want to describe the blob from a higher layer > as well, which is a tuple of (commit, deep/path) as the tree objects > involved are rather uninteresting. The same blob can be referenced by > multiple commits, so how we decide which commit to use? This patch > implements a rather naive approach on this: As there are no back pointers > from blobs to commits in which the blob occurs, we'll start walking from > any tips available, listing the blobs in-order of the commit and once we Is "any tips" still the case? I was wondering why you start your traversal at HEAD and nothing else in this iteration. There seems to be no mention of this design decision in the documentation and no justification in the log. > found the blob, we'll take the first commit that listed the blob. For > example > > git describe --tags v0.99:Makefile > conversion-901-g7672db20c2:Makefile > > tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. > > The walking is performed in reverse order to show the introduction of a > blob rather than its last occurrence. The reversing may improve the chance of an older commit to be chosen rather than the newer one, but it does not even guarantee to show the "introduction". What this guarantees is that a long history will be traversed fully before we start considering which commit has the blob of interest, I am afraid. Is this a sensible trade-off? > + argv_array_pushl(&args, "internal: The first arg is not parsed", > + "--objects", "--in-commit-order", "--reverse", "HEAD", > + NULL);
[PATCHv5 7/7] builtin/describe.c: describe a blob
Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) When describing commits, we try to anchor them to tags or refs, as these are conceptually on a higher level than the commit. And if there is no ref or tag that matches exactly, we're out of luck. So we employ a heuristic to make up a name for the commit. These names are ambiguous, there might be different tags or refs to anchor to, and there might be different path in the DAG to travel to arrive at the commit precisely. When describing a blob, we want to describe the blob from a higher layer as well, which is a tuple of (commit, deep/path) as the tree objects involved are rather uninteresting. The same blob can be referenced by multiple commits, so how we decide which commit to use? This patch implements a rather naive approach on this: As there are no back pointers from blobs to commits in which the blob occurs, we'll start walking from any tips available, listing the blobs in-order of the commit and once we found the blob, we'll take the first commit that listed the blob. For example git describe --tags v0.99:Makefile conversion-901-g7672db20c2:Makefile tells us the Makefile as it was in v0.99 was introduced in commit 7672db20. The walking is performed in reverse order to show the introduction of a blob rather than its last occurrence. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob Signed-off-by: Stefan Beller --- Documentation/git-describe.txt | 18 ++-- builtin/describe.c | 62 ++ t/t6120-describe.sh| 34 +++ 3 files changed, 107 insertions(+), 7 deletions(-) diff --git a/Documentation/git-describe.txt b/Documentation/git-describe.txt index c924c945ba..e027fb8c4b 100644 --- a/Documentation/git-describe.txt +++ b/Documentation/git-describe.txt @@ -3,14 +3,14 @@ git-describe(1) NAME -git-describe - Describe a commit using the most recent tag reachable from it - +git-describe - Give an object a human readable name based on an available ref SYNOPSIS [verse] 'git describe' [--all] [--tags] [--contains] [--abbrev=] [...] 'git describe' [--all] [--tags] [--contains] [--abbrev=] --dirty[=] +'git describe' DESCRIPTION --- @@ -24,6 +24,12 @@ By default (without --all or --tags) `git describe` only shows annotated tags. For more information about creating annotated tags see the -a and -s options to linkgit:git-tag[1]. +If the given object refers to a blob, it will be described +as `:`, such that the blob can be found +at `` in the ``, which itself describes the +first commit in which this blob occurs in a reverse revision walk +from HEAD. + OPTIONS --- ...:: @@ -186,6 +192,14 @@ selected and output. Here fewest commits different is defined as the number of commits which would be shown by `git log tag..input` will be the smallest number of commits possible. +BUGS + + +Tree objects as well as tag objects not pointing at commits, cannot be described. +When describing blobs, the lightweight tags pointing at blobs are ignored, +but the blob is still described as : despite the lightweight +tag being favorable. + GIT --- Part of the linkgit:git[1] suite diff --git a/builtin/describe.c b/builtin/describe.c index 9e9a5ed5d4..5b4bfaba3f 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -3,6 +3,7 @@ #include "lockfile.h" #include "commit.h" #include "tag.h" +#include "blob.h" #include "refs.h" #include "builtin.h" #include "exec_cmd.h" @@ -11,8 +12,9 @@ #include "hashmap.h" #include "argv-array.h" #include "run-command.h" +#include "revision.h" +#include "list-objects.h" -#define SEEN (1u << 0) #define MAX_TAGS (FLAG_BITS - 1) static const char * const describe_usage[] = { @@ -434,6 +436,53 @@ static void describe_commit(struct object_id *oid, struct strbuf *dst) strbuf_addstr(dst, suffix); } +struct process_commit_data { + struct object_id current_commit; + struct object_id looking_for; + struct strbuf *dst; + struct rev_info *revs; +}; + +static void process_commit(struct commit *commit, void *data) +{ + struct process_commit_data *pcd = data; + pcd->current_commit = commit->object.oid; +} + +static void process_object(struct object *obj, const char *path, void *data) +{ + struct process_commit_data *pcd = data; + + if (!oidcmp(&pcd->looking_for, &obj->oid) && !pcd->dst->len) { + reset_revision_walk(); + describe_commit(&pcd->current_commit, pcd->dst); + strbuf_addf(pcd->dst, ":%s", path); + free_commit_list(pcd->revs->commits); + pcd->revs->commits = NULL; + } +} + +static void describe_blob(struct object_id oid, struct strbuf *dst) +{ + struct rev_info revs; + struct argv_