Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-18 Thread Jeff King
On Mon, Oct 17, 2016 at 10:30:28AM -0700, Junio C Hamano wrote:

> > It looks like I _did_ look into optimizing this into a single stat()
> > call in the thread at [1]. I completely forgot about that. I did find
> > there that naively using stat_validity() on a directory is racy, though
> > I wonder if we could do something clever with gettimeofday() instead.
> 
> It feels funny to hear an idea to compare fs timestamp with gettimeofday
immediately after hearing the word NFS, though ;-).

Yeah, I had a funny feeling in my stomach as I wrote that.

What you really want to know is the current filesystem time. You'd
probably have to do something gross like creating a new file and then
comparing its timestamp. In theory you'd only have to do that _once_,
and then as long as the pack directory wasn't changing, you could say "I
don't know what time it is now, but I know it is at least time X, and I
know that X is greater than Y, the pack directory timestamp, therefore
the pack directory hasn't changed since I last looked".

That assumes monotonic clocks, but we basically already do so for the
racy-git checks, I think.
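The create-then-stat probe described above might look like this as a standalone
sketch (plain POSIX calls; `fs_current_time`, `pack_dir_unchanged`, and the
scratch path are made-up names for illustration, not git code):

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Ask the filesystem what time it thinks it is by creating a scratch
 * file and reading back its mtime. Gross, but it sidesteps any skew
 * between the local clock and the (possibly remote, e.g. NFS) server.
 */
static time_t fs_current_time(const char *scratch_path)
{
	struct stat st;
	int fd = open(scratch_path, O_CREAT | O_WRONLY | O_TRUNC, 0600);

	if (fd < 0)
		return (time_t)-1;
	close(fd);
	if (stat(scratch_path, &st) < 0) {
		unlink(scratch_path);
		return (time_t)-1;
	}
	unlink(scratch_path);
	return st.st_mtime;
}

/*
 * "I know it is at least time X, and X is greater than Y, the pack
 * directory timestamp, therefore the directory hasn't changed."
 */
static int pack_dir_unchanged(time_t probed_now, time_t recorded_dir_mtime)
{
	return probed_now > recorded_dir_mtime;
}
```

The probe only needs to succeed once per process; after that, comparing the
probed time against the recorded pack-directory mtime is a pure in-memory check.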

I dunno. It feels...complicated. And bad to require writing to the
repository for what would otherwise be a read-only operation. But I
don't see any fundamental reason it could not work.

-Peff


Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-17 Thread Junio C Hamano
Jeff King  writes:

> Still an impressive speedup as a percentage, but negligible in absolute
> terms. But that's on a local filesystem on a Linux machine. I'd worry
> much more about a system with a slow readdir(), e.g., due to NFS.
> Somebody's real-world NFS case[1] was what prompted us to do 0eeb077
> (index-pack: avoid excessive re-reading of pack directory, 2015-06-09).

Yes.

> It looks like I _did_ look into optimizing this into a single stat()
> call in the thread at [1]. I completely forgot about that. I did find
> there that naively using stat_validity() on a directory is racy, though
> I wonder if we could do something clever with gettimeofday() instead.

It feels funny to hear an idea to compare fs timestamp with gettimeofday
immediately after hearing the word NFS, though ;-).

>> I agree that the fallout from the inaccuracy of "quick" approach is
>> probably acceptable and the next "fetch" will correct it anyway, so
>> let's do the "quick but inaccurate" for now and perhaps cook it in
>> 'next' for a bit longer than other topics?
>
> I doubt that cooking in 'next' for longer will turn up anything useful.
> The case we care about is the race between a repack and a fetch. We
> lived with the "quick" version of has_sha1_file() everywhere for 8
> years.

A very convincing argument.  I stand corrected.

Thanks.




Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-14 Thread Jeff King
On Fri, Oct 14, 2016 at 10:39:52AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > So it's certainly better. But 7500 packs is just silly, and squeezing
> > out ~400ms there is hardly worth it. If you repack this same case into a
> > single pack, the command drops to 5ms. So yes, there's close to an order
> > of magnitude speedup here, but you get that _and_ another order of
> > magnitude just by repacking.
> 
> "7500 is silly" equally applies to the "quick" (and sloppy, if I am
> reading your "Failing in this direction doesn't make me feel great."
> correctly) approach, I think, which argues for not taking either
> change X-<.

I wouldn't quite agree that they're the same. 7500 packs is silly
because a bunch of _other_ things are going to get equally or more slow
before the prepare_packed_git() slowdown is noticeable. And it's in your
power to clean up, and we should encourage users to do so.

Whereas having a bunch of unfetched tags is a state that may linger
indefinitely, and be outside the user's control.

I _am_ open to the argument that calling reprepare_packed_git() over and
over doesn't really matter much if you have a sane number of packs. If
you tweak my perf test like so:

diff --git a/t/perf/p5550-fetch-tags.sh b/t/perf/p5550-fetch-tags.sh
index a5dc39f..7e7ae24 100755
--- a/t/perf/p5550-fetch-tags.sh
+++ b/t/perf/p5550-fetch-tags.sh
@@ -86,7 +86,7 @@ test_expect_success 'create child packs' '
cd child &&
git config gc.auto 0 &&
git config gc.autopacklimit 0 &&
-   create_packs 500
+   create_packs 10
)
 '
 

you get:

Test            origin            quick

5550.4: fetch   0.06(0.02+0.02)   0.02(0.01+0.00) -66.7%

Still an impressive speedup as a percentage, but negligible in absolute
terms. But that's on a local filesystem on a Linux machine. I'd worry
much more about a system with a slow readdir(), e.g., due to NFS.
Somebody's real-world NFS case[1] was what prompted us to do 0eeb077
(index-pack: avoid excessive re-reading of pack directory, 2015-06-09).

It looks like I _did_ look into optimizing this into a single stat()
call in the thread at [1]. I completely forgot about that. I did find
there that naively using stat_validity() on a directory is racy, though
I wonder if we could do something clever with gettimeofday() instead.
IOW, the patches there only bothered to re-read when they saw the mtime
on the directory jump, which suffers from 1-second precision problems.
But if we instead compared the mtime to the current time, we could err
in favor of re-reading the packs, and get false positives for at most 1
second.
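As a standalone sketch, that err-toward-re-reading rule might look like this
(the helper name is hypothetical; the real patches in the thread at [1] are
structured differently):

```c
#include <assert.h>
#include <time.h>

/*
 * Trust a cached pack-directory scan only when the recorded directory
 * mtime is older than "now" by more than the filesystem's 1-second
 * timestamp granularity. Anything within the window errs on the side
 * of re-reading the packs, so false positives last at most 1 second.
 */
static int pack_dir_scan_is_fresh(time_t dir_mtime, time_t now)
{
	return dir_mtime < now - 1;
}
```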

[1] 
http://public-inbox.org/git/7fae15f0a93c0144ad8b5fbd584e1c5519758...@c111kxtembx51.erf.thomson.com/

> I agree that the fallout from the inaccuracy of "quick" approach is
> probably acceptable and the next "fetch" will correct it anyway, so
> let's do the "quick but inaccurate" for now and perhaps cook it in
> 'next' for a bit longer than other topics?

I doubt that cooking in 'next' for longer will turn up anything useful.
The case we care about is the race between a repack and a fetch. We
lived with the "quick" version of has_sha1_file() everywhere for 8
years. I only noticed the race on a hosting cluster which puts through
millions of operations per day (I could in theory test the patch on that
same cluster, but we actually don't do very many fetches).

-Peff


Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-14 Thread Junio C Hamano
Jeff King  writes:

> So it's certainly better. But 7500 packs is just silly, and squeezing
> out ~400ms there is hardly worth it. If you repack this same case into a
> single pack, the command drops to 5ms. So yes, there's close to an order
> of magnitude speedup here, but you get that _and_ another order of
> magnitude just by repacking.

"7500 is silly" equally applies to the "quick" (and sloppy, if I am
reading your "Failing in this direction doesn't make me feel great."
correctly) approach, I think, which argues for not taking either
change X-<.

I agree that the fallout from the inaccuracy of "quick" approach is
probably acceptable and the next "fetch" will correct it anyway, so
let's do the "quick but inaccurate" for now and perhaps cook it in
'next' for a bit longer than other topics?


Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-13 Thread Jeff King
On Thu, Oct 13, 2016 at 01:04:43PM -0400, Jeff King wrote:

> > This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
> > accuracy for speed, in cases where we might be racy with a
> > simultaneous repack. This is similar to the fix in 0eeb077
> > (index-pack: avoid excessive re-reading of pack directory,
> > 2015-06-09). As with that case, it's OK for has_sha1_file()
> > to occasionally say "no I don't have it" when we do, because
> > the worst case is not a corruption, but simply that we may
> > fail to auto-follow a tag that points to it.
> 
> Failing in this direction doesn't make me feel great. I had hoped it
> would fail the _other_ way, and request an object that we might already
> have.
> 
> There are two alternatives that might be worth pursuing:
> 
>   1. The thing that makes this really painful is the quadratic
>  duplicate-search in prepare_packed_git_one(). We could speed that
>  up pretty easily by keeping a hash of the packs indexed by their
>  names, and looking up in that.
> 
>  That drops the quadratic factor, but it's still going to be
>  O(nr_tags * nr_packs), as we have to at the very least readdir()
>  all of objects/pack to see that nothing new showed up.

That hash patch would look something like what is below.

Here are numbers for the perf script ("quick" is the one that skips
reprepare entirely, "hash" is this hash table patch, and "quick+hash" is
the combination of the two):

origin              quick                    hash                     quick+hash

11.09(10.38+0.70)   0.08(0.06+0.01) -99.3%   1.41(0.65+0.75) -87.3%   0.07(0.05+0.01) -99.4%

So yes, it does improve this case, but not nearly as much as the "quick"
approach. Putting it on top of the "quick" approach is barely noticeable
(it is speeding up the initial prepare_packed_git() call, but it's just
not expensive enough in the first place to be worth it).

The "hash" patch does fix a potentially quadratic behavior, though, so I
guess in theory it is worth it. But I had to go up to quite a large
number of packs to make it worthwhile. Here it is at 7500 packs, running
"git cat-file -e $some_object_that_exists":

  [origin]
  real0m0.417s
  user0m0.376s
  sys 0m0.040s

  [hash]
  real0m0.054s
  user0m0.008s
  sys 0m0.044s

So it's certainly better. But 7500 packs is just silly, and squeezing
out ~400ms there is hardly worth it. If you repack this same case into a
single pack, the command drops to 5ms. So yes, there's close to an order
of magnitude speedup here, but you get that _and_ another order of
magnitude just by repacking.

So I think it's only worth pursuing if we wanted to scrap my original
patch, and continue to aggressively reprepare_packed_git(). I'd worry
that it's still expensive on some systems, even with the hash, because
the opendir/readdir might cost a lot (OTOH, we'll by definition have
just done a stat() on the loose version of the object, so there's a
limit to how fast we can make it).

I dunno.

---
 cache.h |  2 ++
 sha1_file.c | 44 +++-
 2 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/cache.h b/cache.h
index ec791a6..0d8b4e8 100644
--- a/cache.h
+++ b/cache.h
@@ -1411,6 +1411,7 @@ struct pack_window {
 };
 
 extern struct packed_git {
+   struct hashmap_entry name_entry;
struct packed_git *next;
struct pack_window *windows;
off_t pack_size;
@@ -1428,6 +1429,7 @@ extern struct packed_git {
 do_not_close:1;
unsigned char sha1[20];
struct revindex_entry *revindex;
+   size_t basename_len;
/* something like ".git/objects/pack/x.pack" */
char pack_name[FLEX_ARRAY]; /* more */
 } *packed_git;
diff --git a/sha1_file.c b/sha1_file.c
index c652cb6..eb6a5f3 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -602,6 +602,8 @@ struct packed_git *packed_git;
 static struct mru packed_git_mru_storage;
struct mru *packed_git_mru = &packed_git_mru_storage;
 
+static struct hashmap pack_name_index;
+
 void pack_report(void)
 {
fprintf(stderr,
@@ -1260,6 +1262,30 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
return p;
 }
 
+static int pack_name_hashcmp(const void *va, const void *vb, const void *name)
+{
+   const struct packed_git *a = va, *b = vb;
+
+   if (a->basename_len != b->basename_len)
+   return -1;
+   else if (name)
+   return memcmp(a->pack_name, name, a->basename_len);
+   else
+   return memcmp(a->pack_name, b->pack_name, a->basename_len);
+}
+
+static int pack_loaded(const char *name, size_t len)
+{
+   struct packed_git key;
+
+   if (!pack_name_index.tablesize)
+   return 0;
+
+   hashmap_entry_init(&key, memhash(name, len));
+   key.basename_len = len;
+   return 

Re: [PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-13 Thread Jeff King
On Thu, Oct 13, 2016 at 12:53:44PM -0400, Jeff King wrote:

> -- >8 --
> Subject: [PATCH] fetch: use "quick" has_sha1_file for tag following

A few comments on my own patch...

> This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
> accuracy for speed, in cases where we might be racy with a
> simultaneous repack. This is similar to the fix in 0eeb077
> (index-pack: avoid excessive re-reading of pack directory,
> 2015-06-09). As with that case, it's OK for has_sha1_file()
> to occasionally say "no I don't have it" when we do, because
> the worst case is not a corruption, but simply that we may
> fail to auto-follow a tag that points to it.

Failing in this direction doesn't make me feel great. I had hoped it
would fail the _other_ way, and request an object that we might already
have.

There are two alternatives that might be worth pursuing:

  1. The thing that makes this really painful is the quadratic
 duplicate-search in prepare_packed_git_one(). We could speed that
 up pretty easily by keeping a hash of the packs indexed by their
 names, and looking up in that.

 That drops the quadratic factor, but it's still going to be
 O(nr_tags * nr_packs), as we have to at the very least readdir()
 all of objects/pack to see that nothing new showed up.

 I wonder if we could trust the timestamp on the objects/pack
 directory to avoid re-reading it at all. That turns it into a
 single stat() call.

  2. We could stop bothering to reprepare_packed_git() only after the
 nth time it is called. This basically turns on the "sacrifice
 accuracy for speed" behavior automatically, but it means that most
 cases would never do so, because it wouldn't be creating a
 performance issue in the first place.

 It feels weird and flaky that git might behave differently based on
 the number of unfetched tags the remote happens to have, though.
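Alternative 2 above could be sketched as a simple counter (the limit and the
function names are invented for illustration):

```c
#include <assert.h>

/*
 * Re-scan the pack directory for the first few object misses, then
 * give up and accept the "quick" (possibly stale) answer for the
 * rest of the process. Most commands never hit the limit, so they
 * keep the accurate behavior.
 */
#define REPREPARE_LIMIT 5

static int reprepare_calls;

static int should_reprepare_packed_git(void)
{
	if (reprepare_calls >= REPREPARE_LIMIT)
		return 0;
	reprepare_calls++;
	return 1;
}
```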

> Here are results from the included perf script, which sets
> up a situation similar to the one described above:
> 
> Test            HEAD^               HEAD
> -------------------------------------------------------
> 5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

The situation in this perf script is obviously designed to show off this
one specific optimization. It feels a bit overly specific, and I doubt
anybody will be all that interested in running it again once the fix is
in. OTOH, I did want to document my reproduction steps, and this seemed
like the only reasonable place to do so. And as the perf suite is
already pretty expensive, perhaps nobody minds adding to it too much.

-Peff


[PATCH] fetch: use "quick" has_sha1_file for tag following

2016-10-13 Thread Jeff King
On Thu, Oct 13, 2016 at 11:26:32AM -0400, Jeff King wrote:

> On Thu, Oct 13, 2016 at 09:20:07AM +0200, Vegard Nossum wrote:
> 
> > > Does the patch below help?
> > 
> > Yes, ~2m10s -> ~1m25s when I test a git fetch this morning (the other
> > variation in time may be due to different CPU usage by other programs,
> > but I ran with/without the patch multiple times and the difference is
> > consistent).
> > [...]
> > There are some 20k refs on the remote, closer to 25k locally.
> 
> OK, that makes some sense. For whatever reason, your remote has a bunch
> of tags that point to objects you do not already have. That could
> happen, I think, if the remote added a lot of tags since you cloned
> (because clone will grab all of the tags), but those tags do not point
> to history that you are otherwise fetching (since fetch by default will
> "auto-follow" such tags).

Armed with this information, I was able to reproduce the issue locally.
However, once my patch is applied, it's now quite fast. So I still don't
know where your other 1m25s is going.

So here's that same patch wrapped up with a commit message. Note that I
converted one more call site to the "quick" form; it would trigger when
the candidate tags are real tag objects, not just pointers to commits.
That might improve your runtime more, depending on what is actually in
your repository.

-- >8 --
Subject: [PATCH] fetch: use "quick" has_sha1_file for tag following

When we auto-follow tags in a fetch, we look at all of the
tags advertised by the remote and fetch ones where we don't
already have the tag, but we do have the object it peels to.
This involves a lot of calls to has_sha1_file(), some of
which we can reasonably expect to fail. Since 45e8a74
(has_sha1_file: re-check pack directory before giving up,
2013-08-30), this may cause many calls to
reprepare_packed_git(), which is potentially expensive.

This has gone unnoticed for several years because it
requires a fairly unique setup to matter:

  1. You need to have a lot of packs on the client side to
 make reprepare_packed_git() expensive (the most
 expensive part is finding duplicates in an unsorted
 list, which is currently quadratic).

  2. You need a large number of tag refs on the server side
 that are candidates for auto-following (i.e., that the
 client doesn't have). Each one triggers a re-read of
 the pack directory.

  3. Under normal circumstances, the client would
 auto-follow those tags and after one large fetch, (2)
 would no longer be true. But if those tags point to
 history which is disconnected from what the client
 otherwise fetches, then it will never auto-follow, and
 those candidates will impact it on every fetch.

So when all three are true, each fetch pays an extra
O(nr_tags * nr_packs^2) cost, mostly in string comparisons
on the pack names. This was exacerbated by 47bf4b0
(prepare_packed_git_one: refactor duplicate-pack check,
2014-06-30) which uses a slightly more expensive string
check, under the assumption that the duplicate check doesn't
happen very often (and it shouldn't; the real problem here
is how often we are calling reprepare_packed_git()).

This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
accuracy for speed, in cases where we might be racy with a
simultaneous repack. This is similar to the fix in 0eeb077
(index-pack: avoid excessive re-reading of pack directory,
2015-06-09). As with that case, it's OK for has_sha1_file()
to occasionally say "no I don't have it" when we do, because
the worst case is not a corruption, but simply that we may
fail to auto-follow a tag that points to it.

Here are results from the included perf script, which sets
up a situation similar to the one described above:

Test            HEAD^               HEAD
-------------------------------------------------------
5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

Reported-by: Vegard Nossum <vegard.nos...@oracle.com>
Signed-off-by: Jeff King <p...@peff.net>
---
 builtin/fetch.c| 11 --
 cache.h|  1 +
 sha1_file.c|  5 +++
 t/perf/p5550-fetch-tags.sh | 99 ++
 4 files changed, 112 insertions(+), 4 deletions(-)
 create mode 100755 t/perf/p5550-fetch-tags.sh

diff --git a/builtin/fetch.c b/builtin/fetch.c
index d5329f9..74c0546 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -241,9 +241,10 @@ static void find_non_local_tags(struct transport *transport,
 * as one to ignore by setting util to NULL.
 */
if (ends_with(ref->name, "^{}")) {
-   if (item && !has_object_file(&ref->old_oid) &&
+   if (item &&
+   !ha