On Tue, Apr 7, 2015 at 12:10 AM, Eric Sunshine <[email protected]> wrote:
> On Mon, Apr 6, 2015 at 7:48 AM, Erik Elfström <[email protected]> wrote:
>> Before this change, clean used resolve_gitlink_ref to check for the
>> presence of nested git repositories. This had the drawback of creating
>> a ref_cache entry for every directory that should potentially be
>> cleaned. The linear search through the ref_cache list caused a massive
>> performance hit for large number of directories.
>>
>> Teach clean.c:remove_dirs to use setup.c:is_git_directory
>> instead. is_git_directory will actually open HEAD and parse the HEAD
>> ref but this implies a nested git repository and should be rare when
>> cleaning.
>>
>> Using is_git_directory should give a more standardized check for what
>> is and what isn't a git repository but also gives a slight behavioral
>> change. We will now detect and respect bare and empty nested git
>> repositories (only init run). Update t7300 to reflect this.
>>
>> The time to clean an untracked directory containing 100000 sub
>> directories went from 61s to 1.7s after this change.
>
> Impressive.
>
>> Signed-off-by: Erik Elfström <[email protected]>
>> Helped-by: Jeff King <[email protected]>
>
> It is customary for your sign-off to be last.
>
> More below...
>
>> ---
>> diff --git a/builtin/clean.c b/builtin/clean.c
>> index 98c103f..e951bd9 100644
>> --- a/builtin/clean.c
>> +++ b/builtin/clean.c
>> @@ -148,6 +147,24 @@ static int exclude_cb(const struct option *opt, const
>> char *arg, int unset)
>> return 0;
>> }
>>
>> +static int is_git_repository(struct strbuf *path)
>> +{
>> + int ret = 0;
>> + if (is_git_directory(path->buf))
>> + ret = 1;
>> + else {
>> + int orig_path_len = path->len;
>> + if (path->buf[orig_path_len - 1] != '/')
>
> Minor: I don't know how others feel about it, but I always find it a
> bit disturbing to see a potential negative array access without a
> safety check that orig_path_len is not 0, either directly in the
> conditional or as a documenting assert().
>
I think I would prefer to accept empty input and return false rather
than assert. What do you think about:
static int is_git_repository(struct strbuf *path)
{
	int ret = 0;
	size_t orig_path_len = path->len;

	if (orig_path_len == 0)
		ret = 0;
	else if (is_git_directory(path->buf))
		ret = 1;
	else {
		if (path->buf[orig_path_len - 1] != '/')
			strbuf_addch(path, '/');
		strbuf_addstr(path, ".git");
		if (is_git_directory(path->buf))
			ret = 1;
		strbuf_setlen(path, orig_path_len);
	}

	return ret;
}
Also, I borrowed this pattern from remove_dirs, which has the same
problem. Should I add something like this as a separate commit?
diff --git a/builtin/clean.c b/builtin/clean.c
index ccffd8a..88850e3 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -173,7 +173,8 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 	DIR *dir;
 	struct strbuf quoted = STRBUF_INIT;
 	struct dirent *e;
-	int res = 0, ret = 0, gone = 1, original_len = path->len, len;
+	int res = 0, ret = 0, gone = 1;
+	size_t original_len = path->len, len;
 	struct string_list dels = STRING_LIST_INIT_DUP;
 	*dir_gone = 1;
@@ -201,6 +202,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag,
 		return res;
 	}
+	assert(original_len > 0 && "expects non-empty path");
 	if (path->buf[original_len - 1] != '/')
 		strbuf_addch(path, '/');
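To make the empty-input behavior easy to check outside of git, here is a
minimal standalone sketch of the same append/probe/restore pattern. It is
only an illustration: struct pathbuf, pathbuf_set, probe_dot_git and
fake_probe are hypothetical stand-ins I made up for this mail, not git's
strbuf API or is_git_directory, and the fixed-size buffer is an assumption
to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's struct strbuf (fixed capacity for brevity). */
struct pathbuf {
	char buf[256];
	size_t len;
};

/* Load a NUL-terminated string into the buffer. */
static void pathbuf_set(struct pathbuf *p, const char *s)
{
	size_t n = strlen(s);

	assert(n < sizeof(p->buf));
	memcpy(p->buf, s, n + 1);
	p->len = n;
}

/*
 * Append "/.git" (adding the slash only when it is missing), pass the
 * candidate to probe(), then restore the buffer to its original length.
 * An empty path returns 0 before buf[len - 1] is ever read, so the
 * size_t subtraction cannot wrap around.
 */
static int probe_dot_git(struct pathbuf *p, int (*probe)(const char *))
{
	size_t orig_len = p->len;
	int ret;

	if (orig_len == 0)
		return 0;

	/* Need room for an optional '/', ".git" and the trailing NUL. */
	assert(orig_len + 6 <= sizeof(p->buf));

	if (p->buf[orig_len - 1] != '/')
		p->buf[p->len++] = '/';
	memcpy(p->buf + p->len, ".git", 5);
	p->len += 4;

	ret = probe(p->buf);

	/* Undo the temporary suffix, as strbuf_setlen does in the patch. */
	p->buf[orig_len] = '\0';
	p->len = orig_len;
	return ret;
}

/* Hypothetical probe: accepts one fixed path so no filesystem is needed. */
static int fake_probe(const char *path)
{
	return strcmp(path, "repo/.git") == 0;
}
```

Calling probe_dot_git with fake_probe on "", "repo" and "repo/" exercises
the empty guard and both slash cases, and the buffer should come back
unchanged each time.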
>> + strbuf_addch(path, '/');
>> + strbuf_addstr(path, ".git");
>> + if (is_git_directory(path->buf))
>> + ret = 1;
>> + strbuf_setlen(path, orig_path_len);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> static int remove_dirs(struct strbuf *path, const char *prefix, int
>> force_flag,
>> int dry_run, int quiet, int *dir_gone)
>> {