On Fri, Nov 13, 2015 at 04:29:15PM +0100, Andreas Krey wrote:
> > Likewise, I think dir.c:remove_dir_recurse is in a similar boat.
> > Grepping for resolve_gitlink_ref, it looks like there may be others,
> > too.
>
> Can't we handle this in resolve_gitlink_ref itself? As I understand it,
> it should resolve a ref (here "HEAD") when path points to a submodule.
> When there isn't one it should return -1, so:
I'm not sure. I think part of the change to git-clean was that
is_git_directory() is a _better_ check than "can we resolve HEAD?"
because it covers empty repos, too.
> diff --git a/refs.c b/refs.c
> index 132eff5..f8648c5 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1553,6 +1553,10 @@ int resolve_gitlink_ref(const char *path, const char
> *refname, unsigned char *sh
> if (!len)
> return -1;
> submodule = xstrndup(path, len);
> + if (!is_git_directory(submodule)) {
> + free(submodule);
> + return -1;
> + }
> refs = get_ref_cache(submodule);
> free(submodule);
>
> I'm way too little into the code to see what may this may get wrong.
I don't think it produces wrong outcomes, but I think it's sub-optimal.
In cases where we already have a ref cache, we'll hit the filesystem for
each lookup to re-confirm what we already know. That doesn't affect your
case, but it does when we actually _do_ have a submodule.
So if we were to follow this route, I think it would go better in
get_ref_cache itself (right after we determine there is no existing
cache, but before we call create_ref_cache()).
> But this, as well as the old hash-ref-cache patch speeds me
> up considerably, in this case a git ls-files -o from half a
> minute of mostly user CPU to a second.
Right, that makes sense to me.
> > All of these should be using the same test, I think. Doing that with
> > is_git_directory() is probably OK. It is a little more expensive than we
> > might want for mass-use (it actually opens and parses the HEAD file in
> > each directory),
>
> This happens as well when we let resolve_gitlink_ref run its old course.
> (It (ls-files) even seems to try to open .git and then .git/HEAD, even
> if the former fails with ENOENT.)
Yes, I think my earlier comment that you are quoting was just misguided.
We only do the extra work if the directory actually does look like a
gitdir, and the many-directories case we are optimizing here is the
opposite of that.
So summing up, I think:
1. We could get by with teaching get_ref_cache not to auto-create ref
caches for non-git-directories.
2. But for a little more work, pushing the is_git_directory() check
out to the call-sites gives us probably saner semantics overall.
-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html