Re: [PATCH] use a hashmap to make remotes faster

2014-07-29 Thread Matthieu Moy
patrick.reyno...@github.com patrick.reyno...@github.com writes:
^

It seems you mixed your name and email address in your config file. I
guess your name is Patrick Reynolds, not
patrick.reyno...@github.com.

> Remotes are stored as an array, so looking one up or adding one without
> duplication is an O(n) operation.  Reading an entire config file full of
> remotes is O(n^2) in the number of remotes.  For a repository with tens of
> thousands of remotes, the running time can hit multiple minutes.

Just being curious: in which scenario do you have tens of thousands of
remotes?

(not an objection, it's a good thing anyway)
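
To illustrate the complexity argument in the quoted commit message, here
is a minimal sketch in C of a name-keyed lookup that avoids the linear
scan. It is not the code from the patch and does not use git's hashmap
API; the names toy_remote, toy_remote_table, toy_hash and toy_make_remote
are hypothetical, and the fixed bucket count is a simplifying assumption
(a real table would grow as entries are added).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a remote; not git's struct remote. */
struct toy_remote {
	char *name;
	struct toy_remote *next; /* next entry in the same hash bucket */
};

/* Assumption: a fixed bucket count keeps the sketch short. */
#define TOY_BUCKETS 1024

struct toy_remote_table {
	struct toy_remote *buckets[TOY_BUCKETS];
	size_t count;
};

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t toy_hash(const char *s)
{
	uint32_t h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/*
 * Return the remote called "name", creating it if it does not exist yet.
 * Only one bucket is walked, so the cost does not depend on the total
 * number of remotes as long as the buckets stay short.
 */
static struct toy_remote *toy_make_remote(struct toy_remote_table *t,
					  const char *name)
{
	uint32_t b;
	struct toy_remote *r;

	assert(t != NULL && name != NULL);
	b = toy_hash(name) % TOY_BUCKETS;

	for (r = t->buckets[b]; r; r = r->next)
		if (!strcmp(r->name, name))
			return r; /* already known: no duplicate is added */

	r = malloc(sizeof(*r));
	if (!r)
		return NULL;
	r->name = malloc(strlen(name) + 1);
	if (!r->name) {
		free(r);
		return NULL;
	}
	strcpy(r->name, name);

	/* Push the new entry onto the front of its bucket. */
	r->next = t->buckets[b];
	t->buckets[b] = r;
	t->count++;
	return r;
}
```

With an array, each lookup-or-add compares against up to n existing
names, so loading n remotes costs on the order of n^2 comparisons. With
a table like the one above, each call only walks a single bucket.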

> +static inline void init_remotes_hash()

static inline void init_remotes_hash(void)

Not a detailed review, but the patch sounds good other than that.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] use a hashmap to make remotes faster

2014-07-29 Thread Jeff King
On Tue, Jul 29, 2014 at 09:57:45AM +0200, Matthieu Moy wrote:

> patrick.reyno...@github.com patrick.reyno...@github.com writes:
> ^
>
> It seems you mixed your name and email address in your config file. I
> guess your name is Patrick Reynolds, not
> patrick.reyno...@github.com.

Also, Patrick, please sign off your patch (format-patch -s).

> > Remotes are stored as an array, so looking one up or adding one without
> > duplication is an O(n) operation.  Reading an entire config file full of
> > remotes is O(n^2) in the number of remotes.  For a repository with tens of
> > thousands of remotes, the running time can hit multiple minutes.
>
> Just being curious: in which scenario do you have tens of thousands of
> remotes?
>
> (not an objection, it's a good thing anyway)

Whenever you fork a repository at GitHub, we give you a leaf repository
that points its info/alternates to a master network.git repository for
the fork network.  The network.git repo contains all of the objects, and
has a remote configured for each of the child repositories. You would
never want to gc in that repository without doing a fetch --all first.

Most networks have only a few dozen forks, but a few have a large number
(torvalds/linux has ~5K, and homebrew is close to 10K).  And then
sometimes a MOOC instructor tells an entire 50K-person class to fork a
hello-world project all at once. :)

-Peff