Re: [PATCH 2/6] t1414: document some reflog-walk oddities
On Thu, Jul 06, 2017 at 08:42:45AM -0700, Junio C Hamano wrote: > >> I somehow feel that the "showing all entries from HEAD and then > >> showing all from side" is simply a buggy behaviour. I do not think > >> our users would terribly mind if we changed it later. But I may be > >> missing the reason why (sometimes?) the sequential behaviour may be > >> useful. > > > > If we think it's buggy, we can fix it now. But I'm not convinced that > > sequential iteration is that bad. It's at least _simple_ and easy to > > explain. > > Yes, I agree that sequential is easy to explain, but only when I > consider use of "log" family without "-n 30" or "--since 3.days". > It still is easy to explain---we show from one and then from the > other, but because we stop after showing 30 of them, and the first > one has more than that, you do not see any from the latter. Ah, right, I didn't think of limiting like that. I agree that makes a strong argument in favor of the date-ordered queue. I suspect that "--since 3.days" is still quite buggy (even with a single reflog) because it checks commit timestamps and stops traversing when we go too bar back. But in a reflog, the commits may be totally out of order. I'm not sure what it should do. Either: 1. During a reflog walk, --since and --until should respect reflog timestamps instead of commit timestamps. You can already do "--until" there by simply starting the traversal later, but there's no way to cut it off with --since. 2. Limit commits shown by --since/--until as usual, but skip the "stop traversing" optimization when we see too many "old" commits. I.e., omit a 4.days.ago commit, but keep walking to find other recent commits. -Peff
Honorable Barrister Paul Williams.
Attn: Sir/Madam I am Honorable Barrister Paul Williams the personal resident Attorney here in Burkina Faso to Late Mr. Muammar Muhammad Abu Minyar AL-Gaddafi of Libya c. 1942 – 20 October 2011. Late Mr. Muammar Muhammad Abu Minyar al-Gaddafi c. 1942 – 20 October 2011, commonly known as Colonel Gaddafi, was a Libyan former head of state, revolutionary and a politician, who died on 20 October 2011, was my client here in Burkina Faso Africa. My client Late Mr. Muammar Muhammad Abu Minyar AL-Gaddafi c. 1942 – 20 October 2011, was having a deposit sum of {thirty million four Hundred thousand united state dollars} only ($30.4M USD) with a security finance firm affiliated with African development bank here in Burkina Faso. With the above explanation’s I want to move this money from Burkina Faso to your country, affidavit on your name, but note that this is a deal between me and you and should not be related to anybody until the deal is over for security reasons, please if interested reply as soon as possible. Thanks, Honorable Barrister Paul Williams.
Re: [PATCH 6/6] reflog-walk: stop using fake parents
On Thu, Jul 06, 2017 at 11:02:24PM -0400, Jeff King wrote: > On Fri, Jul 07, 2017 at 12:32:39AM +, Eric Wong wrote: > > > I'm not sure why, but this is causing t1414.8 failures on 32-bit > > x86 with the latest pu with Debian jessie (oldstable). > > > > Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu > > seems to fix the test for me. > > > > +Cc: Ramsay since he also had a 32-bit environment. > > Thanks, I was able to reproduce with CFLAGS=-m32. > > > --- expect 2017-07-07 00:30:57.796325232 + > > +++ actual 2017-07-07 00:30:57.796325232 + > > @@ -3,6 +3,8 @@ > > HEAD@{2} checkout: moving from master to side > > HEAD@{3} commit: two > > HEAD@{4} commit (initial): one > > -side@{0} commit (merge): Merge branch 'master' into side > > -side@{1} commit: three > > -side@{2} branch: Created from HEAD^ > > +HEAD@{0} commit (merge): Merge branch 'master' into side > > +HEAD@{1} commit: three > > +HEAD@{2} checkout: moving from master to side > > +HEAD@{3} commit: two > > +HEAD@{4} commit (initial): one > > That's quite an unexpected error (to show the HEAD reflog twice). Given > that it triggers with 32-bit builds, it's like there's some funny memory > error. And indeed, running it under valgrind shows a problem in > add_reflog_for_walk. I'll try to dig into it. Thanks for reporting. Ah, it's a bug in the existing code. It inserts an entry into a string list and then frees the memory, but the string list wasn't asked to make a copy of the string. You can see the bug by running t1414 with --valgrind even before any of the code changes (though note that you'll have to look at the "-v" output, since it's already marked as expect-failure). This fixes it: diff --git a/reflog-walk.c b/reflog-walk.c index b7e489ad32..c1f61c524a 100644 --- a/reflog-walk.c +++ b/reflog-walk.c @@ -136,6 +136,7 @@ struct reflog_walk_info { void init_reflog_walk(struct reflog_walk_info **info) { *info = xcalloc(1, sizeof(struct reflog_walk_info)); + (*info)->complete_reflogs.strdup_strings = 1; } int add_reflog_for_walk(struct reflog_walk_info *info, I'll add that to the 'maint' portion of the series. -Peff
Re: [PATCH 6/6] reflog-walk: stop using fake parents
On Fri, Jul 07, 2017 at 12:32:39AM +, Eric Wong wrote: > I'm not sure why, but this is causing t1414.8 failures on 32-bit > x86 with the latest pu with Debian jessie (oldstable). > > Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu > seems to fix the test for me. > > +Cc: Ramsay since he also had a 32-bit environment. Thanks, I was able to reproduce with CFLAGS=-m32. > --- expect2017-07-07 00:30:57.796325232 + > +++ actual2017-07-07 00:30:57.796325232 + > @@ -3,6 +3,8 @@ > HEAD@{2} checkout: moving from master to side > HEAD@{3} commit: two > HEAD@{4} commit (initial): one > -side@{0} commit (merge): Merge branch 'master' into side > -side@{1} commit: three > -side@{2} branch: Created from HEAD^ > +HEAD@{0} commit (merge): Merge branch 'master' into side > +HEAD@{1} commit: three > +HEAD@{2} checkout: moving from master to side > +HEAD@{3} commit: two > +HEAD@{4} commit (initial): one That's quite an unexpected error (to show the HEAD reflog twice). Given that it triggers with 32-bit builds, it's like there's some funny memory error. And indeed, running it under valgrind shows a problem in add_reflog_for_walk. I'll try to dig into it. Thanks for reporting. -Peff
Re: [RFC/WIP PATCH] object store classification
Stefan Bellerwrites: > Subject: Re: [RFC/WIP PATCH] object store classification I thought you are writing different object-store backends and classifying them into many categories (e.g. local, networked, telepathic, etc.) It is a logical next step to put per-repository things into the_repository. I skimmed the patch and the changes smelled sane, but I didn't read it really carefully with fine toothed comb. Thanks. The remainder kept for reference, but no additional comments in there. > This continues the efforts of bw/repo-object, making our code base > more object oriented by adding the object store to the the repository struct. > > Long term goal of the series that would evolve from this discussion: > * Easier to implement submodule features as it can be in the same process. > > Short term goal: > * get rid of 'add_submodule_odb' in submodule.c > When fetching or pushing we need to determine if a submodule needs > processing > by finding out if certain commits are present. To do this we add the > submodule > objects as an alternate object database and do the processing in the same > process. In case of multiple submodules, this would pollute the object store > hence being slower and increasing memory usage, too. > > This patch only changes the object store code to be object oriented based > on the repository struct. > > To go for the short term goal we'd need to convert a few more places, mostly > all the construction (blob.c, tree.c, commit.c) > > This is marked RFC as I'd want to gather feedback if the approach, presented > in the header files is sound. > > Thanks! > > Signed-off-by: Stefan Beller > --- > > Peff, this area of the code seems performance sensitive for a variety of > use cases, specifically yours. So I cc'd you here. :) > > object.c | 77 > +++- > object.h | 51 > repository.h | 6 + > 3 files changed, 92 insertions(+), 42 deletions(-) > > diff --git a/object.c b/object.c > index 06ba3a11d8..b5ec0bb2f9 100644 > --- a/object.c > +++ b/object.c > @@ -5,17 +5,15 @@ > #include "commit.h" > #include "tag.h" > > -static struct object **obj_hash; > -static int nr_objs, obj_hash_size; > - > -unsigned int get_max_object_index(void) > +unsigned int object_store_get_max_index(struct repository *r) > { > - return obj_hash_size; > + return r->objects.obj_hash_size; > } > > -struct object *get_indexed_object(unsigned int idx) > +struct object *object_store_get_indexed(struct repository *r, > + unsigned int idx) > { > - return obj_hash[idx]; > + return r->objects.obj_hash[idx]; > } > > static const char *object_type_strings[] = { > @@ -82,11 +80,15 @@ static void insert_obj_hash(struct object *obj, struct > object **hash, unsigned i > * Look up the record for the given sha1 in the hash map stored in > * obj_hash. Return NULL if it was not found. > */ > -struct object *lookup_object(const unsigned char *sha1) > +struct object *object_store_lookup(struct repository *r, > +const unsigned char *sha1) > { > unsigned int i, first; > struct object *obj; > > + struct object **obj_hash = r->objects.obj_hash; > + int obj_hash_size = r->objects.obj_hash_size; > + > if (!obj_hash) > return NULL; > > @@ -114,29 +116,31 @@ struct object *lookup_object(const unsigned char *sha1) > * power of 2 (but at least 32). Copy the existing values to the new > * hash map. > */ > -static void grow_object_hash(void) > +static void grow_object_hash(struct repository *r) > { > int i; > /* >* Note that this size must always be power-of-2 to match hash_obj >* above. >*/ > - int new_hash_size = obj_hash_size < 32 ? 32 : 2 * obj_hash_size; > + int new_hash_size = r->objects.obj_hash_size < 32 ? > + 32 : 2 * r->objects.obj_hash_size; > struct object **new_hash; > > new_hash = xcalloc(new_hash_size, sizeof(struct object *)); > - for (i = 0; i < obj_hash_size; i++) { > - struct object *obj = obj_hash[i]; > + for (i = 0; i < r->objects.obj_hash_size; i++) { > + struct object *obj = r->objects.obj_hash[i]; > if (!obj) > continue; > insert_obj_hash(obj, new_hash, new_hash_size); > } > - free(obj_hash); > - obj_hash = new_hash; > - obj_hash_size = new_hash_size; > + free(r->objects.obj_hash); > + r->objects.obj_hash = new_hash; > + r->objects.obj_hash_size = new_hash_size; > } > > -void *create_object(const unsigned char *sha1, void *o) > +void *object_store_create(struct repository *r, > + const unsigned char *sha1, void *o) > { > struct object *obj = o; > > @@ -145,15 +149,17 @@ void
Re: What's cooking in git.git (Jul 2017, #01; Wed, 5)
Stefan Bellerwrites: > Speaking of submodules, It's not just features, but I also send bug fixes. ;) > https://public-inbox.org/git/20170630003851.17288-1-sbel...@google.com/ > (That patch is not related to this series, except for working in the submodule > area, but I consider that patch more important than e.g. this series.) I did not see the patch as fixing a bug, though. I do agree that overwriting the branch tips in the submodule repositories, possibly rewinding and discarding user's work done on the local branches, is indeed a problem. It however is unclear why detaching HEAD is a good solution to solve that problem. After all, there must have been a reason why the user had checked out a branch and had pointed it at a specific commit (presumably, so that further work would be done while on the branch, to make it easier and safer to eventually push the result back to the upstream of the submodule's project). So another solution that seems equally viable, if not even more so, could be to fail the recursive checkout saying why the checkout cannot be done, just like we fail a checkout when a local change interferes with updating the contents in the working tree and the index with an error message explaining which paths are problematic. I am *not* saying which one among the above two is better; I am not even saying that there could be only these two possible solutions. I just found the posted patch unsatisfactory because it did not make it clear why the chosen solution is a good one. Perhaps I misread the description; but that would mean the description was prone to be misread and has room for improvement ;-)
[RFPR] please ping me with materials for upcoming 2.14
As I want to start the pre-release stabilization period for the upcoming Git 2.14 sometime next week, please throw me a pull request soonish if you have updates meant for it. Thanks.
Re: [PATCH 6/6] reflog-walk: stop using fake parents
I'm not sure why, but this is causing t1414.8 failures on 32-bit x86 with the latest pu with Debian jessie (oldstable). Reverting this (beafb2c62947a6d4a97b9c3baf99fe62ec8e830f) in pu seems to fix the test for me. +Cc: Ramsay since he also had a 32-bit environment. --8<-- ok 7 - --parents shows true parents expecting success: { do_walk HEAD && do_walk side } >expect && do_walk HEAD side >actual && test_cmp expect actual --- expect 2017-07-07 00:30:57.796325232 + +++ actual 2017-07-07 00:30:57.796325232 + @@ -3,6 +3,8 @@ HEAD@{2} checkout: moving from master to side HEAD@{3} commit: two HEAD@{4} commit (initial): one -side@{0} commit (merge): Merge branch 'master' into side -side@{1} commit: three -side@{2} branch: Created from HEAD^ +HEAD@{0} commit (merge): Merge branch 'master' into side +HEAD@{1} commit: three +HEAD@{2} checkout: moving from master to side +HEAD@{3} commit: two +HEAD@{4} commit (initial): one not ok 8 - walking multiple reflogs shows both # # { # do_walk HEAD && # do_walk side # } >expect && # do_walk HEAD side >actual && # test_cmp expect actual # expecting success: head=$(git rev-parse HEAD) && one=$(git rev-parse one) && ident="$GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE" && echo "$head $one $ident broken reflog entry" >>.git/logs/HEAD && echo $one >expect && git log -g --format=%H -1 >actual && test_cmp expect actual ok 9 - walk prefers reflog to ref tip
Re: [PATCH] push: disable lazy --force-with-lease by default
On Thu, Jul 6, 2017 at 3:39 PM, Junio C Hamanowrote: >>> @@ -91,6 +91,7 @@ test_expect_success 'push to update (allowed)' ' >>> ( >>> cd dst && >>> test_commit D && >>> + git config push.allowLazyForceWithLease false && >> >> Here I thought >> >> test_config -C dst ... >> >> at the beginning might be useful, though .. > > I did not think test_config would work inside a subshell. It does not, it would need to be done outside the ( ), which is a bit subtle.
Re: [PATCH] push: disable lazy --force-with-lease by default
Stefan Bellerwrites: >> diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt >> index 0a639664fd..1fa01210a2 100644 >> --- a/Documentation/git-push.txt >> +++ b/Documentation/git-push.txt >> @@ -212,8 +212,9 @@ must not already exist. >> + >> Note that all forms other than `--force-with-lease=:` >> that specifies the expected current value of the ref explicitly are >> -still experimental and their semantics may change as we gain experience > > This indicates that this feature is not 'experimental' any more, but disabled > (for safety reasons as described below). This implies we will not change the > heuristic for push.allowLazyForceWithLease easily. I actually wanted to say it was a failed experiment, but I see your point. Let's leave the "still experimental" label. >> test_expect_success 'push to update (protected)' ' >> @@ -78,7 +79,6 @@ test_expect_success 'push to update (protected, tracking, >> forced)' ' >> ( >> cd dst && >> test_commit E && >> - git ls-remote . refs/remotes/origin/master >expect && > > This seems unrelated, so ideally it is a separate commit? Yes, I looked for other things that may be good to fix while at it, and planned to make a separate clean-up patch if I found enough, but this was the only one. >> @@ -91,6 +91,7 @@ test_expect_success 'push to update (allowed)' ' >> ( >> cd dst && >> test_commit D && >> + git config push.allowLazyForceWithLease false && > > Here I thought > > test_config -C dst ... > > at the beginning might be useful, though .. I did not think test_config would work inside a subshell.
Re: Should "head" also work for "HEAD" on case-insensitive FS?
Junio C Hamanowrites: > Once Michael's packed-refs backend stabilizes, we may have a nice > calm period in the refs subsystem and I expect that this will become > a good medium-sized project for a contributor who does not have to > be so experienced (but not a complete newbie). > > It needs to: > > - add icase-files-backend, preferrably sharing as much code as the >existing files-backend, in refs/. > > - design a mechanism to configure which refs backend to use at >runtime; as this has to be done fairly early in the control flow, >this will likely to use early configuration mechanism and will >probably need to be done in the set-up code, but doing it lazy >may even be nicer, as not all subcommands need access to refs. > > Thanks for a pointer to the archive. So here is an early WIP/illustration I did to see how involved such a change would be, which should apply cleanly on top of 'pu'. I was pleasantly surprised how cleanly refs/files-backend.c separates the notion of "path" and "refname". Only two functions, files_reflog_path() and files_ref_path(), are responsible for taking the refname and turning it to the pathname of an filesystem entity. One one function, loose_fill_ref_dir(), is responsible for running readdir() to find pathname, and turning the result into a refname. So in theory, these three are the only things that need to know about the "encoding". The exact detail of the encoding used here is immaterial, but I just used "encode uppercase letters and % as % followed by two hex", which was simple enough. Usual refs/heads/master and friends will not have to be touched when encoded this way. Perhaps the decoding side should be tweaked so that uppercase letters it sees needs to be downcased to avoid "refs/heads/Foo" getting returned as "Foo" branch, as a "Foo" branch should have been encoded as "refs/heads/%46oo". Having said that, this patch alone does not quite work yet. * In the repository discovery code, we have some logic that hard-codes the path in the directory (which is a candidate for being a repository) to check, like "refs/" and "HEAD". In the attached illustration patch, files_path_encode() special cases "HEAD" so that it is not munged, which is a bit of ugly workaround for this. * I haven't figured out why, but what refs.c calls "pseudo refs" bypasses the files backend layer for some but not all operations, which causes t1405-main-ref-store to fail. The test creates a "pseudo ref" FOO and then tries to remove it. Creation seems to follow the files-backend.c and thusly goes through the escaping; refs.::refs_delete_ref() however does not consult files-backend.c and fails to find and delete .git/FOO, because the creation side encoded it as ".git/%46%4F%4F". Michael is CC'ed as I thought it would be simpler to just ask about the latter bullet point than digging it further myself ;-) Thanks. refs/files-backend.c | 62 +++- 1 file changed, 56 insertions(+), 6 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 923e481e06..5bde77cbf8 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -23,6 +23,7 @@ struct ref_lock { struct files_ref_store { struct ref_store base; unsigned int store_flags; + int encode_names; char *gitdir; char *gitcommondir; @@ -54,6 +55,9 @@ static struct ref_store *files_ref_store_create(const char *gitdir, base_ref_store_init(ref_store, _be_files); refs->store_flags = flags; + /* git_config_get_bool("core.escapeLooseRefNames", >encode_names); */ + refs->encode_names = 1; + refs->gitdir = xstrdup(gitdir); get_common_dir_noenv(, gitdir); refs->gitcommondir = strbuf_detach(, NULL); @@ -102,6 +106,49 @@ static struct files_ref_store *files_downcast(struct ref_store *ref_store, return refs; } +static void files_path_encode(struct files_ref_store *refs, + struct strbuf *sb, const char *refname) +{ + if (!refs->encode_names || !strcmp(refname, "HEAD")) { + strbuf_addstr(sb, refname); + } else { + const char *cp; + + for (cp = refname; *cp; cp++) { + int ch = *cp; + if (('A' <= ch && ch <= 'Z') || (ch == '%')) + strbuf_addf(sb, "%%%02x", ch); + else + strbuf_addch(sb, ch); + } + } +} + +static int files_path_decode(struct files_ref_store *refs, +struct strbuf *sb, const char *name, int namelen) +{ + if (!refs->encode_names) { + strbuf_add(sb, name, namelen); + } else { + size_t origlen = sb->len; + + while (namelen--) { + int ch = *name++; + +
[RFC/WIP PATCH] object store classification
This continues the efforts of bw/repo-object, making our code base more object oriented by adding the object store to the the repository struct. Long term goal of the series that would evolve from this discussion: * Easier to implement submodule features as it can be in the same process. Short term goal: * get rid of 'add_submodule_odb' in submodule.c When fetching or pushing we need to determine if a submodule needs processing by finding out if certain commits are present. To do this we add the submodule objects as an alternate object database and do the processing in the same process. In case of multiple submodules, this would pollute the object store hence being slower and increasing memory usage, too. This patch only changes the object store code to be object oriented based on the repository struct. To go for the short term goal we'd need to convert a few more places, mostly all the construction (blob.c, tree.c, commit.c) This is marked RFC as I'd want to gather feedback if the approach, presented in the header files is sound. Thanks! Signed-off-by: Stefan Beller--- Peff, this area of the code seems performance sensitive for a variety of use cases, specifically yours. So I cc'd you here. :) object.c | 77 +++- object.h | 51 repository.h | 6 + 3 files changed, 92 insertions(+), 42 deletions(-) diff --git a/object.c b/object.c index 06ba3a11d8..b5ec0bb2f9 100644 --- a/object.c +++ b/object.c @@ -5,17 +5,15 @@ #include "commit.h" #include "tag.h" -static struct object **obj_hash; -static int nr_objs, obj_hash_size; - -unsigned int get_max_object_index(void) +unsigned int object_store_get_max_index(struct repository *r) { - return obj_hash_size; + return r->objects.obj_hash_size; } -struct object *get_indexed_object(unsigned int idx) +struct object *object_store_get_indexed(struct repository *r, + unsigned int idx) { - return obj_hash[idx]; + return r->objects.obj_hash[idx]; } static const char *object_type_strings[] = { @@ -82,11 +80,15 @@ static void insert_obj_hash(struct object *obj, struct object **hash, unsigned i * Look up the record for the given sha1 in the hash map stored in * obj_hash. Return NULL if it was not found. */ -struct object *lookup_object(const unsigned char *sha1) +struct object *object_store_lookup(struct repository *r, + const unsigned char *sha1) { unsigned int i, first; struct object *obj; + struct object **obj_hash = r->objects.obj_hash; + int obj_hash_size = r->objects.obj_hash_size; + if (!obj_hash) return NULL; @@ -114,29 +116,31 @@ struct object *lookup_object(const unsigned char *sha1) * power of 2 (but at least 32). Copy the existing values to the new * hash map. */ -static void grow_object_hash(void) +static void grow_object_hash(struct repository *r) { int i; /* * Note that this size must always be power-of-2 to match hash_obj * above. */ - int new_hash_size = obj_hash_size < 32 ? 32 : 2 * obj_hash_size; + int new_hash_size = r->objects.obj_hash_size < 32 ? + 32 : 2 * r->objects.obj_hash_size; struct object **new_hash; new_hash = xcalloc(new_hash_size, sizeof(struct object *)); - for (i = 0; i < obj_hash_size; i++) { - struct object *obj = obj_hash[i]; + for (i = 0; i < r->objects.obj_hash_size; i++) { + struct object *obj = r->objects.obj_hash[i]; if (!obj) continue; insert_obj_hash(obj, new_hash, new_hash_size); } - free(obj_hash); - obj_hash = new_hash; - obj_hash_size = new_hash_size; + free(r->objects.obj_hash); + r->objects.obj_hash = new_hash; + r->objects.obj_hash_size = new_hash_size; } -void *create_object(const unsigned char *sha1, void *o) +void *object_store_create(struct repository *r, + const unsigned char *sha1, void *o) { struct object *obj = o; @@ -145,15 +149,17 @@ void *create_object(const unsigned char *sha1, void *o) obj->flags = 0; hashcpy(obj->oid.hash, sha1); - if (obj_hash_size - 1 <= nr_objs * 2) - grow_object_hash(); + if (r->objects.obj_hash_size - 1 <= r->objects.nr_objs * 2) + grow_object_hash(r); - insert_obj_hash(obj, obj_hash, obj_hash_size); - nr_objs++; + insert_obj_hash(obj, r->objects.obj_hash, r->objects.obj_hash_size); + r->objects.nr_objs++; return obj; } -void *object_as_type(struct object *obj, enum object_type type, int quiet) +void *object_as_type(struct object *obj, +enum object_type type, +
Re: [PATCH] push: disable lazy --force-with-lease by default
On Thu, Jul 6, 2017 at 11:56 AM, Junio C Hamanowrote: > "git push --force-with-lease=:" makes sure that > there is no unexpected changes to the branch at the remote while you > prepare a rewrite based on the old state of the branch. This > feature came with an experimental option that allows : part > to be omitted by using the tip of remote-tracking branch that > corresponds to the . > > It turns out that some people use third-party tools that fetch from > remote and update the remote-tracking branches behind users' back, > defeating the safety relying on the stability of the remote-tracking > branches. We have some warning text that was meant to be scary > sounding in our documentation, but nevertheless people seem to be > bitten. cf. > https://public-inbox.org/git/1491617750.2149.10.ca...@mattmccutchen.net/ > for a recent example. > > Let's disable the form that relies on the stability of remote-tracking > branches by default, and allow users who _know_ their remote-tracking > branches are stable to enable it with a configuration variable. > > This problem was predicted from the very beginning; see 28f5d176 > (remote.c: add command line option parser for "--force-with-lease", > 2013-07-08). > > Signed-off-by: Junio C Hamano > --- > > * This is a bit overdue safety fix that we should have done long >time ago. If we had this, I do not think it makes it riskier to >forbid --force and tell people to use --force-with-lease. > > Documentation/config.txt | 5 + > Documentation/git-push.txt | 5 +++-- > builtin/send-pack.c| 5 + > remote.c | 16 > remote.h | 2 ++ > send-pack.c| 1 + > t/t5533-push-cas.sh| 19 +-- > transport-helper.c | 5 + > transport.c| 5 + > 9 files changed, 55 insertions(+), 8 deletions(-) > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 06898a7498..2f929315a2 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -2588,6 +2588,11 @@ new default). > > -- > > +push.allowLazyForceWithLease:: > + If set to true, allow the `--force-with-lease` option > + without the expected object name (i.e. expecting the objects > + at the tip of corresponding remote-tracking branches). > + > push.followTags:: > If set to true enable `--follow-tags` option by default. You > may override this configuration at time of push by specifying > diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt > index 0a639664fd..1fa01210a2 100644 > --- a/Documentation/git-push.txt > +++ b/Documentation/git-push.txt > @@ -212,8 +212,9 @@ must not already exist. > + > Note that all forms other than `--force-with-lease=:` > that specifies the expected current value of the ref explicitly are > -still experimental and their semantics may change as we gain experience This indicates that this feature is not 'experimental' any more, but disabled (for safety reasons as described below). This implies we will not change the heuristic for push.allowLazyForceWithLease easily. Upon reading this documentation and the safety issue, I thought one could have used the reflog to make it safer as well: "I have (manually) inspected the remote tracking branch just before lunch, so now I can safely push with lease after lunch" would translate to e.g. "--force-with-lease-safe--inspection-time=1h", which would make sure that no reflog entries for the given branches exist in the last hour. Just as food for thought. > test_expect_success 'push to update (protected)' ' > @@ -78,7 +79,6 @@ test_expect_success 'push to update (protected, tracking, > forced)' ' > ( > cd dst && > test_commit E && > - git ls-remote . refs/remotes/origin/master >expect && This seems unrelated, so ideally it is a separate commit? Fine with me though. > @@ -91,6 +91,7 @@ test_expect_success 'push to update (allowed)' ' > ( > cd dst && > test_commit D && > + git config push.allowLazyForceWithLease false && Here I thought test_config -C dst ... at the beginning might be useful, though .. > @@ -103,6 +104,10 @@ test_expect_success 'push to update (allowed, tracking)' > ' > ( > cd dst && > test_commit D && > + git config push.allowLazyForceWithLease false && > + test_must_fail git push --force-with-lease=master origin > master 2>err && > + grep "lazy force-with-lease" err && > + git config --unset push.allowLazyForceWithLease && .. here the -C is not useful, so just using it once above may not be a good idea. For more dense and readable tests (also faster?), have you considered using passing the option via -c instead of setting and unsetting it? Thanks,
Re: [PATCH] push: add config option to --force-with-lease by default.
Francesco Mazzoliwrites: > Moreover, it seems to me that the problem `--force-with-lease` is > just one of marketing. `--force-with-lease` is strictly more "safe" > than `--force` in the sense that it'll reject some pushes that `--force` > will let through. By that logic, a hypothetical update to `--force` that makes 1/3 of the attempted forced push randomly would make it safer than the current `--force`, wouldn't it? When third-party tools fetch and update remote-tracking branches behind the users' back, the safety based on the stability of remote-tracking branches are defeated. And the biggest problem is that the way `--force-with-lease` misbehaves---it is not like it randomly and mistakenly stops the push that could go through; it lets through what shouldn't. See the other patch I sent just now---with something like that patch that lets those like you, who know their remote-tracking branches are reliable, use the lazy form, while disabling it by default for others (until they examine their situation and perhaps disable the problematic auto-fetching) in place, I do not think it is a bad idea to advertise --force-with-lease a safer option than --force (because those for whom it is not safer will not be able to use it).
[PATCH] push: disable lazy --force-with-lease by default
"git push --force-with-lease=:" makes sure that there is no unexpected changes to the branch at the remote while you prepare a rewrite based on the old state of the branch. This feature came with an experimental option that allows : part to be omitted by using the tip of remote-tracking branch that corresponds to the . It turns out that some people use third-party tools that fetch from remote and update the remote-tracking branches behind users' back, defeating the safety relying on the stability of the remote-tracking branches. We have some warning text that was meant to be scary sounding in our documentation, but nevertheless people seem to be bitten. cf. https://public-inbox.org/git/1491617750.2149.10.ca...@mattmccutchen.net/ for a recent example. Let's disable the form that relies on the stability of remote-tracking branches by default, and allow users who _know_ their remote-tracking branches are stable to enable it with a configuration variable. This problem was predicted from the very beginning; see 28f5d176 (remote.c: add command line option parser for "--force-with-lease", 2013-07-08). Signed-off-by: Junio C Hamano--- * This is a bit overdue safety fix that we should have done long time ago. If we had this, I do not think it makes it riskier to forbid --force and tell people to use --force-with-lease. Documentation/config.txt | 5 + Documentation/git-push.txt | 5 +++-- builtin/send-pack.c| 5 + remote.c | 16 remote.h | 2 ++ send-pack.c| 1 + t/t5533-push-cas.sh| 19 +-- transport-helper.c | 5 + transport.c| 5 + 9 files changed, 55 insertions(+), 8 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 06898a7498..2f929315a2 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2588,6 +2588,11 @@ new default). -- +push.allowLazyForceWithLease:: + If set to true, allow the `--force-with-lease` option + without the expected object name (i.e. expecting the objects + at the tip of corresponding remote-tracking branches). + push.followTags:: If set to true enable `--follow-tags` option by default. You may override this configuration at time of push by specifying diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt index 0a639664fd..1fa01210a2 100644 --- a/Documentation/git-push.txt +++ b/Documentation/git-push.txt @@ -212,8 +212,9 @@ must not already exist. + Note that all forms other than `--force-with-lease=:` that specifies the expected current value of the ref explicitly are -still experimental and their semantics may change as we gain experience -with this feature. +disabled by default. Read the note on safety below, and if you think +you are not affected, enable it with the `push.allowLazyForceWithLease` +configuration option (cf. linkgit:git-config[1]). + "--no-force-with-lease" will cancel all the previous --force-with-lease on the command line. diff --git a/builtin/send-pack.c b/builtin/send-pack.c index 633e0c3cdd..c008f5b60f 100644 --- a/builtin/send-pack.c +++ b/builtin/send-pack.c @@ -68,6 +68,11 @@ static void print_helper_status(struct ref *ref) msg = "stale info"; break; + case REF_STATUS_REJECT_LAZY_CAS: + res = "error"; + msg = "lazy force-with-error"; + break; + case REF_STATUS_REJECT_ALREADY_EXISTS: res = "error"; msg = "already exists"; diff --git a/remote.c b/remote.c index d87482573d..2d3ee6020f 100644 --- a/remote.c +++ b/remote.c @@ -1517,7 +1517,9 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror, int force_update) { struct ref *ref; + int allow_lazy_cas = 0; + git_config_get_bool("push.allowLazyForceWithLease", _lazy_cas); for (ref = remote_refs; ref; ref = ref->next) { int force_ref_update = ref->force || force_update; int reject_reason = 0; @@ -1544,7 +1546,9 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror, * branch. */ if (ref->expect_old_sha1) { - if (oidcmp(>old_oid, >old_oid_expect)) + if (!allow_lazy_cas && ref->lazy_cas) + reject_reason = REF_STATUS_REJECT_LAZY_CAS; + else if (oidcmp(>old_oid, >old_oid_expect)) reject_reason = REF_STATUS_REJECT_STALE; else /* If the ref isn't stale then force the update. */ @@ -2341,10 +2345,13 @@ static void apply_cas(struct push_cas_option *cas, if
Re: name-rev: anchor pattern without --exclude?
Bryan Turnerwrites: [...] > If you don't need the ancestor traversals "git name-rev" provides, > "git for-each-ref --count 1 --format "%(refname:short)" --points-at > refs/heads/" might work. That only goes back to Git 2.7.0, > though; still quite a ways off your 1.9 target. ("git branch > --points-at" does the same thing, I should add, and only for branches, > but you can't directly limit its output like you can with > "for-each-ref".. Perhaps that doesn't matter for your use case.) If > you want names like "master~2", from your example, though, > "--points-at" won't do what you need. Thanks for the suggestion. Unfortunately, I do want to consider names like "master~2". I was just hoping that I had overlooked another way to achieve the git name-rev --refs="refs/heads/*" --exclude="*/refs/heads/*" example I gave in my initial post. -- Kyle
Re: bug report: dates on diff
Junio C Hamanowrites: > Imagine this scenario: > > - Contributor A writes a change on 2017-07-01 and send it in to me > - Contributor B writes a change on 2017-07-03 and send it in to me > - I apply change from B on 2017-07-04 on 'master' > - I apply change from A on 2017-07-05 on 'master' > - You clone the resulting repository from me on 2017-07-06 > > Now, you have at the tip of 'master' in your repository the commit > that records the change by contributor A. > > And there are three times that are relevant to your tip of 'master'. > > - When was the commit that sits at the tip of 'master' made? > - When was the change recorded in that commit made? > - When was the commit made at the tip of _your_ 'master'? > > and the answers are 2017-07-01, 2017-07-05 and 2017-07-06, respectively. > They are called "committer", "author" and "reflog" timestamps. Oops, obviously the dates have to be 2017-07-05, 2017-07-01 and 2017-07-06. I made the commit on the 5th (i.e. "committer" timestamp) that records a change written on the 1st (i.e. "author" timestamp).
Re: bug report: dates on diff
Todd Lewiswrites: > Trying not to sound snide, but, what's the point of "--date=" on commits if > you > can't use it later? Granted, things always seem harder until you understand > how > the work. Thanks again. You do not sound snide at all, at least to me ;-) Imagine this scenario: - Contributor A writes a change on 2017-07-01 and send it in to me - Contributor B writes a change on 2017-07-03 and send it in to me - I apply change from B on 2017-07-04 on 'master' - I apply change from A on 2017-07-05 on 'master' - You clone the resulting repository from me on 2017-07-06 Now, you have at the tip of 'master' in your repository the commit that records the change by contributor A. And there are three times that are relevant to your tip of 'master'. - When was the commit that sits at the tip of 'master' made? - When was the change recorded in that commit made? - When was the commit made at the tip of _your_ 'master'? and the answers are 2017-07-01, 2017-07-05 and 2017-07-06, respectively. They are called "committer", "author" and "reflog" timestamps. The 'master@{}' syntax is about the reflog timestamp. It never looks at the former two. In general whenever you see @{...}, that is talking about the information that is stored in "reflog", the record of when and in what order the in _your_ repository pointed at various objects. The "commit --date=" is about tweaking the "author" timestamp the commit records. It does not affect the "committer" timestamp. By definition (of what "reflog" is), it will not affect the reflog timestamp, because the reflog timestamp for a particular commit would be different across repositories. I had A's commit on _my_ master on 2017-07-05, but the time you had it on _your_ master was not until 2017-07-06. I think the best way to do this properly would be to extend the "^{...}" syntax so that we can say e.g. git show "master^{#author-time > 2017-01-01}" to mean "traverse from the tip of 'master' and find the first commit that satisfies the given expression, "author-time > 2017-01-01", i.e. has author timestamp that is later than the specified date. Which would be in line with the existing git show "master^{/my commit message}" that means "traverse from the tip of 'master' and find the first commit that has 'my commit message' in the log message". Note that "git log --since= --until= master" would be the thing that is closest to what you would want that already exists, but that limits by the "committer" timestamp. You _could_ lie about both author and committer timestamp when building the backdated history and use this mechanism, but we have author and committer timestamps that are distinct for a reason, so it is a rather poor workaround.
Re: name-rev: anchor pattern without --exclude?
On Thu, Jul 6, 2017 at 10:23 AM, Kyle Meyerwrote: > Junio C Hamano writes: >> Kyle Meyer writes: > >> What is the answer desired by your application when two or more >> branches point at the same commit you are interested in? Pick one at >> random? An error saying it cannot decide where to place the snapshot? > > Our use of name-rev is just to get a friendly name for a hash. If two > branches point to the same commit, we're fine with whatever decision > "git name-rev" makes; we just want to limit it to refs in the > "refs/heads/" namespace. If you don't need the ancestor traversals "git name-rev" provides, "git for-each-ref --count 1 --format "%(refname:short)" --points-at refs/heads/" might work. That only goes back to Git 2.7.0, though; still quite a ways off your 1.9 target. ("git branch --points-at" does the same thing, I should add, and only for branches, but you can't directly limit its output like you can with "for-each-ref".. Perhaps that doesn't matter for your use case.) If you want names like "master~2", from your example, though, "--points-at" won't do what you need. Bryan > > -- > Kyle
Re: [RFC/PATCH v4 00/49] Add initial experimental external ODB support
On 7/1/2017 3:41 PM, Christian Couder wrote: On Fri, Jun 23, 2017 at 8:24 PM, Ben Peartwrote: On 6/20/2017 3:54 AM, Christian Couder wrote: To be able to better handle some kind of objects, for example big blobs, it would be nice if Git could store its objects in other object databases (ODB). To do that, this patch series makes it possible to register commands, also called "helpers", using "odb..command" config variables, to access external ODBs where objects can be stored and retrieved. External ODBs should be able to tranfer information about the blobs they store. This patch series shows how this is possible using kind of replace refs. Great to see this making progress! My thoughts and questions are mostly about the overall design tradeoffs. Is your intention to enable the ODB to completely replace the regular object store or just to supplement it? It is to supplement it, as I think the regular object store works very well most of the time. I certainly understand the desire to restrict the scope of the patch series. I know full replacement is a much larger problem as it would touch much more of the codebase. I'd still like to see an object store that was thread safe, more robust (ie transactional) and hopefully faster so I am hoping we can design the ODB interface to eventually enable that. For example: it seems the ODB helpers need to be able to be called before the regular object store in the "put" case (so they can intercept large objects for example) and after in in the "get" case to enable "fault-in." Something like this: have/get git object store large object ODB helper put === large object ODB helper git object store It would be nice if that order wasn't hard coded but that the order or level of the "git object store" could be specified using the same mechanism as used for the ODB helpers so that some day you could do something like this: have/get "LMDB" ODB helper git object store put === "LMDB" ODB helper git object store (and even further out, drop the current git object store completely :)). I think it would be good to ensure the interface is robust and performant enough to actually replace the current object store interface (even if we don't actually do that just yet). I agree that it should be robust and performant, but I don't think it needs to be as performant in all cases as the current object store right now. Another way of asking this is: do the 3 verbs (have, get, put) and the 3 types of "get" enable you to wrap the current loose object and pack file code as ODBs and run completely via the external ODB interface? If not, what is missing and can it be added? One example of what I think is missing is a way to stream objects (ie get_stream, put_stream). This isn't used often in git but it did exist last I checked. I'm not saying this needs to be supported in the first version - more if we want to support total replacement. I also wonder if we'd need an "optimize" verb (for "git gc") or a "validate" verb (for "git fsck"). Again, only if/when we are looking at total replacement. Right now the "put" verb only send plain blobs, so the most logical way to run completely via the external ODB interface would be to use it to send and receive plain blobs. There are tests scripts (t0420, t0470 and t0480) that use an http server as the external ODB and all the blobs are stored in it. And yeah for now it works only for blobs. There is a temporary patch in the series that limits it to blobs. For the non RFC patch series, I think it should either use the attribute system to tell which objects should be run via the external ODB interface, or perhaps there should be a way to ask each external ODB helper which kind of objects and blobs it can handle. I should add that in the future work part. Sounds good. For GVFS we handle all object types (including commits and trees) so would need this to be enabled so that we can switch to using it. _Eventually_ it would be great to see the current object store(s) moved behind the new ODB interface. This is not one of my goals and I think it could be a problem if we want to keep the "fault in" mode. > In this mode the helper writes or reads directly to or from the > current object store, so it needs the current object store to be > available. > I think implementing "fault in" should be an option that the ODB handler can implement but should not be required by the design/interface. As you state above, this could be as simple as having the ODB handler write the object to the git object store on "get." Also I think compatibility with other git implementations is important and it is a good thing that they can all work on a common repository format. I agree this should be an option but I don't want to say we'll _never_ move to a better object store. When there are multiple ODB providers, what is the order they are called? The
Re: bug report: dates on diff
On 07/06/2017 12:22 PM, Junio C Hamano wrote: > If you didn't create this repository back in 2012, then the syntax > "master@{01-01-2012}" that asks "Back at the beginning of 2012, what > object did the master branch point at?" does not have a sensible > answer. That can be seen in the warning you got from Git. > > Hope this clarifies. Thanks; it does explain what I saw, and even makes some sense. However, it does leave me scratching my head about how to accomplish what I set out to do, or just how to reference commits by their dates as displayed by git log w/o either changing the system time/date when committing historic data (which seems like a total no-no), or making multiple passes over "get log" to determine how to ref. Perhaps there's a way to patch the log to match dates recorded in relevant objects? Trying not to sound snide, but, what's the point of "--date=" on commits if you can't use it later? Granted, things always seem harder until you understand how the work. Thanks again.
Re: name-rev: anchor pattern without --exclude?
Junio C Hamanowrites: > Kyle Meyer writes: > >> [*] A bit more information on why I'm trying to do this: In Magit, we >> have a work-in-progress feature that takes "snapshots" of changes >> before they are committed. These snapshots are stored as >> "refs/wip/{wtree,index}/". >> >> We want to use name-rev to map a commit to a name in "refs/heads/", >> ignoring these snapshot refs. > > What is the in the above supposed to represent? It's the current branch, as returned by "git symbolic-ref HEAD". > When a user sees two refs "refs/wip/{wtree,index}/", > does it mean: "These two represent a snapshot for changes while the > user was working on this branch"? Yes. > Isn't name-rev a wrong tool to find that information? Sorry for the confusion. We're using "symbolic-ref HEAD" to get the above information. I was just trying to explain why we're dealing with ref names that contain a "refs/heads" substring (like refs/wip/wtree/refs/heads/master in the example I gave). > What is the answer desired by your application when two or more > branches point at the same commit you are interested in? Pick one at > random? An error saying it cannot decide where to place the snapshot? Our use of name-rev is just to get a friendly name for a hash. If two branches point to the same commit, we're fine with whatever decision "git name-rev" makes; we just want to limit it to refs in the "refs/heads/" namespace. -- Kyle
Re: name-rev: anchor pattern without --exclude?
Kyle Meyerwrites: > [*] A bit more information on why I'm trying to do this: In Magit, we > have a work-in-progress feature that takes "snapshots" of changes > before they are committed. These snapshots are stored as > "refs/wip/{wtree,index}/". > > We want to use name-rev to map a commit to a name in "refs/heads/", > ignoring these snapshot refs. What is the in the above supposed to represent? When a user sees two refs "refs/wip/{wtree,index}/", does it mean: "These two represent a snapshot for changes while the user was working on this branch"? Isn't name-rev a wrong tool to find that information? What is the answer desired by your application when two or more branches point at the same commit you are interested in? Pick one at random? An error saying it cannot decide where to place the snapshot? I am wondering if you are looking for "symbolic-ref HEAD".
Re: Flurries of 'git reflog expire'
I'm one of the Bitbucket Server developers. My apologies; I just noticed this thread or I would have jumped in sooner! On Thu, Jul 6, 2017 at 6:31 AM, Andreas Kreywrote: > On Wed, 05 Jul 2017 04:20:27 +, Jeff King wrote: >> On Tue, Jul 04, 2017 at 09:57:58AM +0200, Andreas Krey wrote: > ... >> And what does the process tree look like? > > Lots (~ 10) of > > \_ /usr/bin/git receive-pack > /opt/apps/atlassian/bitbucket-data/shared/data/repositories/68 > | \_ git gc --auto --quiet > | \_ git reflog expire --all > > plus another dozen gc/expire pairs where the parent is already gone. > All with the same arguments - auto GC. Do you know what version of Bitbucket Server is in use? Based on the fact that it's "git gc --auto" triggered from a "git receive-pack", that implies two things: - You're on a 4.x version of Bitbucket Server - The repository (68) has never been forked Depending on your Bitbucket Server version (this being the reason I asked), there are a couple different fixes available: - Fork the repository. You don't need to _use_ the fork, but having a fork existing will trigger Bitbucket Server to disable auto GC and fully manage that itself. That includes managing both _concurrency_ and _frequency_ of GC. This works on all versions of Bitbucket Server. - Run "git config gc.auto 0" in /opt/apps/atlassian/bitbucket-data/shared/data/repositories/68 to disable auto GC yourself. This may be preferable to forking the repository, which, in addition to disabling auto GC, also disables object pruning. However, you must be running at least Bitbucket Server 4.6.0 for this approach to work. Otherwise auto GC will simply be reenabled the first time Bitbucket Server goes to trigger GC, when it detects that the repository has no forks. Assuming you're on 4.6.0 or newer, either approach should fix the issue. If you're on 4.5 or older, forking is the only viable approach unless you upgrade Bitbucket Server first. I also want to add that Bitbucket Server 5.x includes totally rewritten GC handling. 5.0.x automatically disables auto GC in all repositories and manages it explicitly, and 5.1.x fully removes use of "git gc" in favor of running relevant plumbing commands directly. We moved away from "git gc" specifically to avoid the "git reflog expire --all", because there's no config setting that _fully disables_ forking that process. By default our bare clones only have reflogs for pull request refs, and we've explicitly configured those to never expire, so all "git reflog expire --all" can do is use up I/O and, quite frequently, fail because refs are updated. Since we stopped running "git gc", we've not yet seen any GC failures on our internal Bitbucket Server clusters. Bitbucket Server 5.1.x also includes a new "gc.log" (not to be confused with the one Git itself writes) which retains a record of every GC-related process we run in each repository, and how long that process took to complete. That can be useful for getting clearer insight into both how often GC work is being done, and how long it's taking. Upgrading to 5.x can be a bit of an undertaking, since the major version brings API changes, so it's totally understandable that many organizations haven't upgraded yet. I'm just noting that these improvements are there for when such an upgrade becomes viable. Hope this helps! Bryan > > I'd wager that each push sees that a GC is in order, > and doesn't notice that there is one already running. > > - Andreas > > -- > "Totally trivial. Famous last words." > From: Linus Torvalds > Date: Fri, 22 Jan 2010 07:29:21 -0800
name-rev: anchor pattern without --exclude?
Hello, I'm trying to restrict name-rev's output to a ref whose name begins with the supplied pattern [*]. Looking at "man git-name-rev", builtin/name-rev.c, and wildmatch.c, I don't see a way to accomplish this with "--refs=". Say, for example, that I want to limit name-rev's output to a ref in the "refs/heads/" namespace. * cc8f7dc (master) c * ad6d6f0 b | * 49a2156 (refs/wip/wtree/refs/heads/master) d |/ * b91f97a a -- $ git name-rev b91f97a b91f97a wip/wtree/refs/heads/master~1 $ git name-rev --refs="refs/heads/*" b91f97a b91f97a wip/wtree/refs/heads/master~1 $ git name-rev --refs="^refs/heads/*" b91f97a b91f97a undefined Starting with v2.13.0, I can get my desired output using the --exclude option. $ git name-rev --refs="refs/heads/*" --exclude="*/refs/heads/*" b91f97a b91f97a master~2 But, unfortunately, I'm trying to find a solution that works with earlier Git versions (back to v1.9.4). Am I overlooking a way to do this without the --exclude option? [*] A bit more information on why I'm trying to do this: In Magit, we have a work-in-progress feature that takes "snapshots" of changes before they are committed. These snapshots are stored as "refs/wip/{wtree,index}/". We want to use name-rev to map a commit to a name in "refs/heads/", ignoring these snapshot refs. -- Kyle
Re: [PATCH] t5534: fix misleading grep invocation
Michael J Gruberwrites: > Junio C Hamano venit, vidit, dixit 05.07.2017 18:26: > >> The invocation this fixes is not just misleading but simply wrong. >> Nicely spotted. > > In addition, the patch makes sure to catch any rev-parse failures which > the original invocation shove under the rug. Yeah, good thing that this got fixed ;-)
Re: bug report: dates on diff
Todd Lewiswrites: > [utoddl@tarna gitbug]$ git diff master@{01-01-2012} charter.txt > warning: Log for 'master' only goes back to Thu, 6 Jul 2017 08:19:45 -0400. What you observed is how @{} syntax is designed to work, and is not limited to "git diff". Any Git command e.g. "git rev-parse master@{01-01-2012}", would and should behave the same way. The thing to note is that the syntax does not pay any attention to author or committer dates recorded in the commit objects. In fact, if you have a ref that points at an object that is not a commit-ish, you can still use the syntax. If you did this, for example $ git config core.logallrefupdates always $ one=$(echo one | git hash-object --stdin -w) $ git update-ref refs/my/blob $one ... time passes ... $ two=$(echo two | git hash-object --stdin -w) $ git update-ref refs/my/blob $two then "git show my/blob@{2.minutes.ago}" will show the blob object your refs/my/blob ref was pointing at 2 minutes ago. And as you may know, blobs do not record any timestamp. So how does this work? The reason why this works is because the time in @{$time} syntax is about asking what was pointed by the back in $time in your repository. It does not matter what timestamp the object that was pointed by the has (or does not have). If you didn't create this repository back in 2012, then the syntax "master@{01-01-2012}" that asks "Back at the beginning of 2012, what object did the master branch point at?" does not have a sensible answer. That can be seen in the warning you got from Git. Hope this clarifies.
Re: [PATCH] comment: fix a typo in the comment
SZEDER Gáborwrites: > Speaking of sharp eyes... That Subject: line really needs a > s/comment:/commit:/, doesn't it? :) Surely it does.
[PATCH] reflog-walk: mark a file-local symbol as static
Signed-off-by: Ramsay Jones--- Hi Jeff, When you re-roll your 'jk/reflog-walk' branch, could you please squash this into the relevant patch (commit beafb2c629, "reflog-walk: stop using fake parents", 05-07-2017). Thanks! ATB, Ramsay Jones reflog-walk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reflog-walk.c b/reflog-walk.c index a7644d944..8effe722b 100644 --- a/reflog-walk.c +++ b/reflog-walk.c @@ -284,7 +284,7 @@ int reflog_walk_empty(struct reflog_walk_info *info) return !info || !info->nr; } -struct commit *next_entry_in_log(struct commit_reflog *log) +static struct commit *next_entry_in_log(struct commit_reflog *log) { while (log->recno >= 0) { struct reflog_info *entry = >reflogs->items[log->recno--]; -- 2.13.0
Re: [PATCH 2/6] t1414: document some reflog-walk oddities
Jeff Kingwrites: > On Wed, Jul 05, 2017 at 03:45:03PM -0700, Junio C Hamano wrote: > >> > I did make the ordering intentional. We process each reflog sequentially >> > in turn. I don't think it would be wrong to interleave them by date, but >> > I actually think it's useful for somebody who switches that behavior to >> > consciously update the test. Maybe it's worth calling out in the commit >> > message. >> > >> > I stopped short of actually documenting and guaranteeing that behavior >> > to the user. I don't know how comfortable we would be changing it later. >> >> I somehow feel that the "showing all entries from HEAD and then >> showing all from side" is simply a buggy behaviour. I do not think >> our users would terribly mind if we changed it later. But I may be >> missing the reason why (sometimes?) the sequential behaviour may be >> useful. > > If we think it's buggy, we can fix it now. But I'm not convinced that > sequential iteration is that bad. It's at least _simple_ and easy to > explain. Yes, I agree that sequential is easy to explain, but only when I consider use of "log" family without "-n 30" or "--since 3.days". It still is easy to explain---we show from one and then from the other, but because we stop after showing 30 of them, and the first one has more than that, you do not see any from the latter. It's just the easy-to-explian behaviour is not very useful and very hard to defend. > The only other thing that would make sense to me is sorting the > entries by date (reflog date, not commit date) and showing them in that > order. But that gives rise to some funny corner cases. > > What do we do if there's clock skew within the reflog (e.g., somebody > fixed their skewed clock such that HEAD@{5} is less recent than > HEAD@{6}). Do we show them out of order then (even if only a single > reflog is being shown?). I would think the easiest-to-explain thing is handle clock skew just like how clock skew affects the traversal with commit date. At least inside a single reflog, there is no need to sort---its append-only nature gives a reliable "newer to older" order. > But maybe we could do a pseudo-list-merge, where we treat the reflogs as > queues and always pick from the queue with the most recent timestamp. Yes, that was what I had in mind. An entry with an artificially old timestamp due to clock skew would clog the output and prevent the entries behind it on the same reflog from getting shown before entries from other reflog, but that is the same as what happens in the normal traversal to sane parents that can only be reached from a child commit that has an artificially old timestamp. Thanks.
Charity
Hi, My name is Peggy Altman the personal assistant of Ms. Doris Buffett, She is a philanthropist and the founder of one of the largest private foundation in the world. She is on a mission to give it all away as she strongly believes in ‘giving while living.’ She always had the idea that never changed in her mind — that wealth should be used to help each other which has made her decide to donate to you. She has instructed I get in touch with you, kindly acknowledge this message and I will get back to you with more details. Visit the web page to know more about her: http://abcnews.go.com/GMA/Books/giving-dorris-buffett-story-michael-zitz/story?id=10827641 Regards, Peggy Altman.
Re: [PATCH] comment: fix a typo in the comment
> On Wed, 2017-07-05 at 11:18 -0700, Junio C Hamano wrote: > > Kaartic Sivaraamwrites: > > > > > --- > > > �Though very trivial, I wanted to correct this as I didn't > > > �want to ignore it after seeing it. > > > > Thanks for sharp eyes.��Sign-off?��(or Sign-of? ;-)) > > > I should also thank you for your sharp eyes! Speaking of sharp eyes... That Subject: line really needs a s/comment:/commit:/, doesn't it? :)
Re: [PATCH] hooks: add signature using "interpret-trailers"
On Wed, 2017-07-05 at 21:14 +0100, Ramsay Jones wrote: > > On 05/07/17 18:35, Kaartic Sivaraam wrote: > > The sample hook to prepare the commit message before > > a commit allows users to opt-in to add the signature > > to the commit message. The signature is added at a place > > that isn't consistent with the "-s" option of "git commit". > > Further, it could go out of view in certain cases. > > > > Add the signature in a way similar to "-s" option of > > "git commit" using git's interpret-trailers command. > > > > It works well in all cases except when the user invokes > > "git commit" without any arguments. In that case manually > > add a new line after the first line to ensure it's consistent > > with the output of "-s" option. > > > > While at it, name the input parameters to improve readability > > of script. > > I assume each occurrence of 'signature' in the commit message, > including the subject, should be 'sign-off' (or Signed-off-by) > instead. Yes? > Yes. Thanks for pointing out a possible way in which the message could be misinterpreted. -- Kaartic
Re: Flurries of 'git reflog expire'
On Wed, 05 Jul 2017 04:20:27 +, Jeff King wrote: > On Tue, Jul 04, 2017 at 09:57:58AM +0200, Andreas Krey wrote: ... > I seem to recall that using --stale-fix is also extremely expensive, > too. What do the command line arguments for the slow commands look like? The problem isn't that the expire is slow, it is that there are many of them, waiting for disk writes. > And what does the process tree look like? Lots (~ 10) of \_ /usr/bin/git receive-pack /opt/apps/atlassian/bitbucket-data/shared/data/repositories/68 | \_ git gc --auto --quiet | \_ git reflog expire --all plus another dozen gc/expire pairs where the parent is already gone. All with the same arguments - auto GC. I'd wager that each push sees that a GC is in order, and doesn't notice that there is one already running. - Andreas -- "Totally trivial. Famous last words." From: Linus TorvaldsDate: Fri, 22 Jan 2010 07:29:21 -0800
Re: Flurries of 'git reflog expire'
On Tue, 04 Jul 2017 11:43:33 +, Ævar Arnfjörð Bjarmason wrote: ... > You can set gc.auto=0 in the repo to disable auto-gc, and play with > e.g. the reflog expire values, see the git-gc manpage. > > But then you need to run your own gc, which is not a bad idea anyway > with a dedicated git server. Actually, bitbucket should be doing this. Although I can't quite rule out the possibility that we reenabled GC in this repo some time ago. > But it would be good to get to the bottom of this, we shouldn't be > running these concurrently. Indeed. Unfortunately this isn't easily reproduced in the test instance, so I will need to get a newer git under the production bitbucket. There are quite some of \_ /usr/bin/git receive-pack /opt/apps/atlassian/bitbucket-data/shared/data/repositories/68 | \_ git gc --auto --quiet | \_ git reflog expire --all in the process tree, apparently a new one gets started even though previous ones are still running. The number of running expires grew slowly, in the order of many minutes. Andreas -- "Totally trivial. Famous last words." From: Linus TorvaldsDate: Fri, 22 Jan 2010 07:29:21 -0800
bug report: dates on diff
I've run into what I think is a bug wrt date handling in "git diff". I have some historical data with which I'm attempting to populate a new git repo with back-dated commits. That appears to work. But referencing those commits by date with "git diff" does not. (I have no idea if the problem is limited to "git diff"; that's just where I've been banging my head.) I'm using git version 2.9.4. Below is a recreation of the problem from scratch using made up data. Note that the "git log" output shows the commits with the correct dates, but referencing "master@{any-date-before-git-init}" throws a "Warning: Log for 'master' only goes back to ..." and it erroneously selects the first commit (for 2010 in the example below, instead of the requested 2012 commit). Here's the terminal session demonstrating the problem: [utoddl@tarna tmp]$ mkdir gitbug [utoddl@tarna tmp]$ cd gitbug [utoddl@tarna gitbug]$ git init Initialized empty Git repository in /tmp/gitbug/.git/ [utoddl@tarna gitbug]$ for year in 2010 2011 2012 2013 2014 ; do > echo "Charter for year $year." > charter_01-01-$year.txt > touch -d"$year-01-01" charter_01-01-$year.txt > done [utoddl@tarna gitbug]$ ls -al total 20 drwxrwxr-x. 3 utoddl utoddl 160 Jul 6 08:14 . drwxrwxrwt. 19 root root 420 Jul 6 08:12 .. -rw-rw-r--. 1 utoddl utoddl 23 Jan 1 2010 charter_01-01-2010.txt -rw-rw-r--. 1 utoddl utoddl 23 Jan 1 2011 charter_01-01-2011.txt -rw-rw-r--. 1 utoddl utoddl 23 Jan 1 2012 charter_01-01-2012.txt -rw-rw-r--. 1 utoddl utoddl 23 Jan 1 2013 charter_01-01-2013.txt -rw-rw-r--. 1 utoddl utoddl 23 Jan 1 2014 charter_01-01-2014.txt drwxrwxr-x. 7 utoddl utoddl 200 Jul 6 08:12 .git [utoddl@tarna gitbug]$ for year in 2010 2011 2012 2013 2014 ; do > cp -p charter_01-01-$year.txt charter.txt > git add charter.txt > git commit --date="01-01-$year" -m "Committing charter for $year." charter.txt > done [master (root-commit) f5dc22c] Committing charter for 2010. Date: Fri Jan 1 07:19:45 2010 -0500 1 file changed, 1 insertion(+) create mode 100644 charter.txt cp: overwrite 'charter.txt'? y [master dec8944] Committing charter for 2011. Date: Sat Jan 1 07:19:47 2011 -0500 1 file changed, 1 insertion(+), 1 deletion(-) cp: overwrite 'charter.txt'? y [master cab72e0] Committing charter for 2012. Date: Sun Jan 1 07:19:48 2012 -0500 1 file changed, 1 insertion(+), 1 deletion(-) cp: overwrite 'charter.txt'? y [master a0cbf74] Committing charter for 2013. Date: Tue Jan 1 07:19:49 2013 -0500 1 file changed, 1 insertion(+), 1 deletion(-) cp: overwrite 'charter.txt'? y [master a75164c] Committing charter for 2014. Date: Wed Jan 1 07:19:53 2014 -0500 1 file changed, 1 insertion(+), 1 deletion(-) [utoddl@tarna gitbug]$ git log commit a75164cdf56b5f7b10e7575ee4aa4f653656e456 Author: Todd LewisDate: Wed Jan 1 07:19:53 2014 -0500 Committing charter for 2014. commit a0cbf74320f5b2309f697a48a8026328a998d787 Author: Todd Lewis Date: Tue Jan 1 07:19:49 2013 -0500 Committing charter for 2013. commit cab72e0902cb4130de49c5d898ef4cc2daf1e2d1 Author: Todd Lewis Date: Sun Jan 1 07:19:48 2012 -0500 Committing charter for 2012. commit dec8944f3c411f5f667cdfefee0609d38298489f Author: Todd Lewis Date: Sat Jan 1 07:19:47 2011 -0500 Committing charter for 2011. commit f5dc22c636dd66a51e74dc01935a4879bd4946d6 Author: Todd Lewis Date: Fri Jan 1 07:19:45 2010 -0500 Committing charter for 2010. [utoddl@tarna gitbug]$ git diff master@{01-01-2012} charter.txt warning: Log for 'master' only goes back to Thu, 6 Jul 2017 08:19:45 -0400. diff --git a/charter.txt b/charter.txt index a18dbc9..29fedf9 100644 --- a/charter.txt +++ b/charter.txt @@ -1 +1 @@ -Charter for year 2010. +Charter for year 2014. [utoddl@tarna gitbug]$ -- +--+ / todd_le...@unc.edu 919-445-0091 http://www.unc.edu/~utoddl / / A man's home is his castle, in a manor of speaking. / +--+
Re: [PATCH] merge-message: change meaning of "empty merge message"
On Thu, 2017-07-06 at 06:46 +0200, Kevin Daudt wrote: > > The function is called "rest_is_empty". > Thanks for correcting that! > But isn't it better that commit and merge use the same code, instead > of > duplicating it again? Otherwise one may be updated, and the other > forgotten, getting differences in behaviur, which is what you want to > solve. > Yes, I did think of that. It *seems* that neither "message_is_empty" or the "rest_is_empty" are exposed to other files. Have to work on that. -- Kaartic
Re: [PATCH] t5534: fix misleading grep invocation
Junio C Hamano venit, vidit, dixit 05.07.2017 18:26: > Johannes Schindelinwrites: > >> It seems to be a little-known feature of `grep` (and it certainly came >> as a surprise to this here developer who believed to know the Unix tools >> pretty well) that multiple patterns can be passed in the same >> command-line argument simply by separating them by newlines. Watch, and >> learn: >> >> $ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')" >> 1 >> 3 >> >> That behavior also extends to patterns passed via `-e`, and it is not >> modified by passing the option `-E` (but trying this with -P issues the >> error "grep: the -P option only supports a single pattern"). >> >> It seems that there are more old Unix hands who are surprised by this >> behavior, as grep invocations of the form >> >> grep "$(git rev-parse A B) C" file >> >> were introduced in a85b377d041 (push: the beginning of "git push >> --signed", 2014-09-12), and later faithfully copy-edited in b9459019bbb >> (push: heed user.signingkey for signed pushes, 2014-10-22). >> >> Please note that the output of `git rev-parse A B` separates the object >> IDs via *newlines*, not via spaces, and those newlines are preserved >> because the interpolation is enclosed in double quotes. >> >> As a consequence, these tests try to validate that the file contains >> either A's object ID, or B's object ID followed by C, or both. Clearly, >> however, what the test wanted to see is that there is a line that >> contains all of them. >> >> This is clearly unintended, and the grep invocations in question really >> match too many lines. >> >> Fix the test by avoiding the newlines in the patterns. >> >> Signed-off-by: Johannes Schindelin >> --- > > The invocation this fixes is not just misleading but simply wrong. > Nicely spotted. In addition, the patch makes sure to catch any rev-parse failures which the original invocation shove under the rug. > Thanks, will queue. Thanks from the faithful copy-editor ;) How did you spot this? Are there grep versions that behave differently? >> t/t5534-push-signed.sh | 14 ++ >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/t/t5534-push-signed.sh b/t/t5534-push-signed.sh >> index 5bcb288f5c4..464ffdd147a 100755 >> --- a/t/t5534-push-signed.sh >> +++ b/t/t5534-push-signed.sh >> @@ -119,8 +119,11 @@ test_expect_success GPG 'signed push sends push >> certificate' ' >> sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert >> ) >expect && >> >> -grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert && >> -grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert && >> +noop=$(git rev-parse noop) && >> +ff=$(git rev-parse ff) && >> +noff=$(git rev-parse noff) && >> +grep "$noop $ff refs/heads/ff" dst/push-cert && >> +grep "$noop $noff refs/heads/noff" dst/push-cert && >> test_cmp expect dst/push-cert-status >> ' >> >> @@ -200,8 +203,11 @@ test_expect_success GPG 'fail without key and heed >> user.signingkey' ' >> sed -n -e "s/^nonce /NONCE=/p" -e "/^$/q" dst/push-cert >> ) >expect && >> >> -grep "$(git rev-parse noop ff) refs/heads/ff" dst/push-cert && >> -grep "$(git rev-parse noop noff) refs/heads/noff" dst/push-cert && >> +noop=$(git rev-parse noop) && >> +ff=$(git rev-parse ff) && >> +noff=$(git rev-parse noff) && >> +grep "$noop $ff refs/heads/ff" dst/push-cert && >> +grep "$noop $noff refs/heads/noff" dst/push-cert && >> test_cmp expect dst/push-cert-status >> ' >> >> >> base-commit: 5116f791c12dda6b6c22fa85b600a8e30dfa168a
Re: [PATCH 2/6] t1414: document some reflog-walk oddities
On Thu, Jul 06, 2017 at 03:16:06AM -0400, Jeff King wrote: > If we think it's buggy, we can fix it now. But I'm not convinced that > sequential iteration is that bad. It's at least _simple_ and easy to > explain. The only other thing that would make sense to me is sorting the > entries by date (reflog date, not commit date) and showing them in that > order. But that gives rise to some funny corner cases. > > What do we do if there's clock skew within the reflog (e.g., somebody > fixed their skewed clock such that HEAD@{5} is less recent than > HEAD@{6}). Do we show them out of order then (even if only a single > reflog is being shown?). > > Or do we try some complicated sort where entries compare sequentially > within a given reflog, but use date comparisons between different logs? > I don't think that can work with a normal comparison sort, as it makes > the comparison non-transitive. > > But maybe we could do a pseudo-list-merge, where we treat the reflogs as > queues and always pick from the queue with the most recent timestamp. > > That would probably be pretty easy to retrofit on the iteration from the > current series. Something like the patch below. Looking over the output, the interleaving makes it look confusing (especially if you don't use --date, since then the interleaved numbers have no relation to each other). It would help if we actually had a use case where somebody really did want to show two reflogs from a single command. Then we'd know what they were expecting and which output would be most useful, at least for that case. But I can't really think of why you'd want to do so (which is probably why nobody noticed for years that the current behavior is so broken). --- diff --git a/reflog-walk.c b/reflog-walk.c index a7644d944e..6cdf7bfb0f 100644 --- a/reflog-walk.c +++ b/reflog-walk.c @@ -90,7 +90,7 @@ struct commit_reflog { struct reflog_walk_info { struct commit_reflog **logs; - size_t nr, alloc, cur; + size_t nr, alloc; struct string_list complete_reflogs; struct commit_reflog *last_commit_reflog; }; @@ -284,26 +284,43 @@ int reflog_walk_empty(struct reflog_walk_info *info) return !info || !info->nr; } -struct commit *next_entry_in_log(struct commit_reflog *log) +static void skip_unusable_entries(struct commit_reflog *log) { while (log->recno >= 0) { - struct reflog_info *entry = >reflogs->items[log->recno--]; + struct reflog_info *entry = >reflogs->items[log->recno]; struct object *obj = parse_object(>noid); if (obj && obj->type == OBJ_COMMIT) - return (struct commit *)obj; + return; + log->recno--; } - return NULL; +} + +static timestamp_t next_timestamp(struct commit_reflog *log) +{ + return log->reflogs->items[log->recno].timestamp; } struct commit *next_reflog_entry(struct reflog_walk_info *walk) { - for (; walk->cur < walk->nr; walk->cur++) { - struct commit *ret = next_entry_in_log(walk->logs[walk->cur]); - if (ret) { - walk->last_commit_reflog = walk->logs[walk->cur]; - return ret; - } + struct commit_reflog *best = NULL; + size_t i; + + for (i = 0; i < walk->nr; i++) { + struct commit_reflog *log = walk->logs[i]; + + skip_unusable_entries(log); + if (log->recno < 0) + continue; + + if (!best || next_timestamp(log) > next_timestamp(best)) + best = log; } + + if (best) { + walk->last_commit_reflog = best; + return lookup_commit(>reflogs->items[best->recno--].noid); + } + return NULL; } The implementation is fairly naive. It would be O(nr_reflogs * nr_entries) in the worst case. I wouldn't expect nr_reflogs to get large. But if we cared, we could fix it by keeping the reflogs in a heap. -Peff
Re: [PATCH 2/6] t1414: document some reflog-walk oddities
On Wed, Jul 05, 2017 at 03:45:03PM -0700, Junio C Hamano wrote: > > I did make the ordering intentional. We process each reflog sequentially > > in turn. I don't think it would be wrong to interleave them by date, but > > I actually think it's useful for somebody who switches that behavior to > > consciously update the test. Maybe it's worth calling out in the commit > > message. > > > > I stopped short of actually documenting and guaranteeing that behavior > > to the user. I don't know how comfortable we would be changing it later. > > I somehow feel that the "showing all entries from HEAD and then > showing all from side" is simply a buggy behaviour. I do not think > our users would terribly mind if we changed it later. But I may be > missing the reason why (sometimes?) the sequential behaviour may be > useful. If we think it's buggy, we can fix it now. But I'm not convinced that sequential iteration is that bad. It's at least _simple_ and easy to explain. The only other thing that would make sense to me is sorting the entries by date (reflog date, not commit date) and showing them in that order. But that gives rise to some funny corner cases. What do we do if there's clock skew within the reflog (e.g., somebody fixed their skewed clock such that HEAD@{5} is less recent than HEAD@{6}). Do we show them out of order then (even if only a single reflog is being shown?). Or do we try some complicated sort where entries compare sequentially within a given reflog, but use date comparisons between different logs? I don't think that can work with a normal comparison sort, as it makes the comparison non-transitive. But maybe we could do a pseudo-list-merge, where we treat the reflogs as queues and always pick from the queue with the most recent timestamp. That would probably be pretty easy to retrofit on the iteration from the current series. -Peff