On 05.04.2015 at 03:06, Jeff King wrote:
> As I've mentioned before, I have some repositories with rather large
> numbers of refs. The worst one has ~13 million refs, for a 1.6GB
> packed-refs file. So I was saddened by this:
>
> $ time git.v2.0.0 rev-parse refs/heads/foo >/dev/null 2>&1
> real 0m6.840s
> user 0m6.404s
> sys 0m0.440s
>
> $ time git.v2.4.0-rc1 rev-parse refs/heads/foo >/dev/null 2>&1
> real 0m19.432s
> user 0m18.996s
> sys 0m0.456s
>
> The command isn't important; what I'm really measuring is loading the
> packed-refs file. And yes, of course this repository is absolutely
> ridiculous. But the slowdowns here are linear with the number of refs.
> So _every_ git command got a little bit slower, even in less crazy
> repositories. We just didn't notice it as much.
>
> Here are the numbers after this series:
>
> real 0m8.539s
> user 0m8.052s
> sys 0m0.496s
>
> Much better, but I'm frustrated that they are still 20% slower than the
> original.
>
> The main culprits seem to be d0f810f (which introduced some extra
> expensive code for each ref) and my 10c497a, which switched from fgets()
> to strbuf_getwholeline. It turns out that strbuf_getwholeline is really
> slow.

10c497a changed read_packed_refs(), which reads *all* packed refs.
Each one is checked for validity. That sounds expensive if the goal is
just to look up a single (nonexistent) ref.

Would it help to defer the checks until a ref is actually accessed?
Could a binary search be used instead of reading the whole file?

I wonder if pluggable reference backends could help here. Storing refs
in a database table indexed by refname should simplify things.
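
To make the binary search question concrete, here is a rough, untested-in-git
sketch of what a lookup over an mmap'd buffer might look like. It is not
part of the patch below. It assumes the file is sorted by refname in
byte order, that header lines start with '#' and appear only at the top,
and that a "^<sha1>" peeled line directly follows the ref it belongs to.
The names find_packed_ref(), line_start() and line_end() are made up for
this illustration and do not exist in refs.c.

```c
#include <stddef.h>
#include <string.h>

/* Walk back from p to the start of its line; never go before lo. */
static const char *line_start(const char *lo, const char *p)
{
	while (p > lo && p[-1] != '\n')
		p--;
	return p;
}

/* Return the position just past the line that starts at p. */
static const char *line_end(const char *p, const char *end)
{
	const char *nl = memchr(p, '\n', end - p);
	return nl ? nl + 1 : end;
}

/*
 * Hypothetical helper: return a pointer to the "<sha1> <refname>" line
 * for refname, or NULL if it is absent or a malformed line is hit.
 */
static const char *find_packed_ref(const char *buf, size_t len,
				   const char *refname)
{
	const char *end = buf + len;
	const char *lo = buf, *hi = end;
	size_t namelen = strlen(refname);

	/* Assumption: '#' header lines only occur at the top of the file. */
	while (lo < hi && *lo == '#')
		lo = line_end(lo, end);

	while (lo < hi) {
		const char *mid = line_start(lo, lo + (hi - lo) / 2);
		const char *next, *eol, *name;
		size_t n;
		int cmp;

		/* A peeled line belongs to the record just before it. */
		while (*mid == '^' && mid > lo)
			mid = line_start(lo, mid - 1);

		next = line_end(mid, end);
		eol = (next > mid && next[-1] == '\n') ? next - 1 : next;
		if (eol - mid <= 41)
			return NULL; /* too short for "<40 hex> <name>" */
		name = mid + 41;
		n = eol - name;

		cmp = memcmp(name, refname, n < namelen ? n : namelen);
		if (!cmp)
			cmp = (n > namelen) - (n < namelen);
		if (!cmp)
			return mid;
		if (cmp > 0) {
			hi = mid;
		} else {
			lo = next;
			/* Keep lo on a record line, not a peeled line. */
			while (lo < hi && *lo == '^')
				lo = line_end(lo, end);
		}
	}
	return NULL;
}
```

A caller would pass the mapped file and its size and then parse only the
one line that comes back, so validity checks would run for a single ref
instead of all of them. Whether the sort order of existing packed-refs
files can be relied upon is a separate question.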

In the short term, can we avoid the getc()/strbuf_grow() dance, e.g. by
mmap()ing the packed-refs file? What numbers do you get with the
following patch?
---
refs.c | 36 ++++++++++++++++++++++++++++--------
1 file changed, 28 insertions(+), 8 deletions(-)
diff --git a/refs.c b/refs.c
index 47e4e53..144255f 100644
--- a/refs.c
+++ b/refs.c
@@ -1153,16 +1153,35 @@ static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1)
* compatibility with older clients, but we do not require it
* (i.e., "peeled" is a no-op if "fully-peeled" is set).
*/
-static void read_packed_refs(FILE *f, struct ref_dir *dir)
+static void read_packed_refs(int fd, struct ref_dir *dir)
{
struct ref_entry *last = NULL;
struct strbuf line = STRBUF_INIT;
enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE;
+ struct stat st;
+ void *map;
+ size_t mapsz, len;
+ const char *p;
+
+ fstat(fd, &st);
+ mapsz = xsize_t(st.st_size);
+ if (!mapsz)
+ return;
+ map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0);
- while (strbuf_getwholeline(&line, f, '\n') != EOF) {
+ for (p = map, len = mapsz; len; ) {
unsigned char sha1[20];
const char *refname;
const char *traits;
+ const char *nl;
+ size_t linelen;
+
+ nl = memchr(p, '\n', len);
+ linelen = nl ? nl - p + 1 : len;
+ strbuf_reset(&line);
+ strbuf_add(&line, p, linelen);
+ p += linelen;
+ len -= linelen;
if (skip_prefix(line.buf, "# pack-refs with:", &traits)) {
if (strstr(traits, " fully-peeled "))
@@ -1204,6 +1223,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
}
strbuf_release(&line);
+ munmap(map, mapsz);
}
/*
@@ -1224,16 +1244,16 @@ static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs)
clear_packed_ref_cache(refs);
if (!refs->packed) {
- FILE *f;
+ int fd;
refs->packed = xcalloc(1, sizeof(*refs->packed));
acquire_packed_ref_cache(refs->packed);
refs->packed->root = create_dir_entry(refs, "", 0, 0);
- f = fopen(packed_refs_file, "r");
- if (f) {
- stat_validity_update(&refs->packed->validity, fileno(f));
- read_packed_refs(f, get_ref_dir(refs->packed->root));
- fclose(f);
+ fd = open(packed_refs_file, O_RDONLY);
+ if (fd >= 0) {
+ stat_validity_update(&refs->packed->validity, fd);
+ read_packed_refs(fd, get_ref_dir(refs->packed->root));
+ close(fd);
}
}
return refs->packed;
--
2.3.5
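
For anyone who wants to time the line splitting on its own, outside of
git, the memchr() loop from the patch can be lifted into a small
standalone helper along these lines. This is only a sketch:
for_each_line(), line_fn, struct line_stats and collect_stats() are names
invented here, and unlike the patch it hands each line to the callback
in place instead of copying it into a strbuf first.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: gets one line, including its '\n' if present. */
typedef void (*line_fn)(const char *line, size_t len, void *data);

/* Split buf into lines with memchr() and call fn for each of them. */
static void for_each_line(const char *buf, size_t len,
			  line_fn fn, void *data)
{
	const char *p = buf;

	while (len) {
		const char *nl = memchr(p, '\n', len);
		/* The last line may lack a trailing newline. */
		size_t linelen = nl ? (size_t)(nl - p) + 1 : len;

		fn(p, linelen, data);
		p += linelen;
		len -= linelen;
	}
}

/* Example callback state: line count and longest line seen. */
struct line_stats {
	size_t count;
	size_t longest;
};

static void collect_stats(const char *line, size_t len, void *data)
{
	struct line_stats *st = data;

	(void)line; /* contents are not needed for counting */
	st->count++;
	if (len > st->longest)
		st->longest = len;
}
```

Pointing buf at an mmap'd packed-refs file and passing collect_stats
should give a lower bound for how fast the file can be walked at all.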