On Mon, Feb 11, 2013 at 9:56 AM, Duy Nguyen <pclo...@gmail.com> wrote:
> Yeah, it did not just cut out syscall cost; I also cut a lot of
> user-space processing (plus .gitignore content access). From the
> timings I posted earlier,
>
>>         unmodified  dir.c
>> real    0m0.550s    0m0.287s
>> user    0m0.305s    0m0.201s
>> sys     0m0.240s    0m0.084s
>
> sys time is reduced from 0.24s to 0.08s, so readdir+opendir definitely
> has something to do with it (and perhaps reading .gitignore). But it
> also reduces user time from 0.305 to 0.201s. I don't think avoiding
> readdir+opendir will bring us this gain. It's probably the cost of
> matching .gitignore. I'll try to replace opendir+readdir with a
> no-syscall version. At this point "untracked caching" sounds more
> feasible (and less complex) than ".gitignore caching".

And this is read_directory's timing breakdown (again, "git status" on
gentoo-x86.git, built with -O2 on x86-64, in case I did not mention it
before):

opendir   = 0.030s
readdir   = 0.083s
closedir  = 0.020s
{open,read,close}dir = 0.132s
treat_path           = 0.094s (172534 times)
dir_add_name         = 0.050s (101917 times)
read_directory       = 0.292s
# On branch master
nothing to commit, working directory clean

real    0m0.511s
user    0m0.347s
sys     0m0.157s

Instrumentation is done with gettimeofday. Without gettimeofday calls
inside read_directory_recursive, read_directory takes 0.267s (iow,
gettimeofday cost is about 0.025s). {open,read,close}dir + treat_path +
dir_add_name + gettimeofday add up quite close to 0.292s (strbuf_*
takes just about 0.005s).
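
For anyone who wants to reproduce this kind of breakdown, here is a
minimal sketch of the gettimeofday approach. It is not the patch I
used on dir.c: struct dir_timing, elapsed() and time_one_dir() are
hypothetical names, and it scans a single directory without recursing.

```c
#include <dirent.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical accumulator; not a structure from git's dir.c. */
struct dir_timing {
	double opendir_s;
	double readdir_s;
	double closedir_s;
	unsigned long entries;
};

/* Seconds elapsed between two gettimeofday() samples. */
static double elapsed(const struct timeval *start, const struct timeval *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_usec - start->tv_usec) / 1e6;
}

/*
 * Scan one directory (no recursion) and charge the time spent in
 * each call to its own bucket. Returns 0 on success, -1 if opendir
 * fails. The caller zeroes *t before the first call.
 */
static int time_one_dir(const char *path, struct dir_timing *t)
{
	struct timeval a, b;
	struct dirent *de;
	DIR *dir;

	gettimeofday(&a, NULL);
	dir = opendir(path);
	gettimeofday(&b, NULL);
	t->opendir_s += elapsed(&a, &b);
	if (!dir)
		return -1;

	for (;;) {
		gettimeofday(&a, NULL);
		de = readdir(dir);
		gettimeofday(&b, NULL);
		t->readdir_s += elapsed(&a, &b);
		if (!de)
			break;
		t->entries++;
	}

	gettimeofday(&a, NULL);
	closedir(dir);
	gettimeofday(&b, NULL);
	t->closedir_s += elapsed(&a, &b);
	return 0;
}
```

Every measured call is bracketed by two gettimeofday calls, so the
instrumentation itself shows up in the total, which is why the
instrumented and uninstrumented read_directory timings differ.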

Eliminating the {open,read,close}dir syscalls may save us up to 0.132s
(less in practice, since we still need to pay to get the information
elsewhere).

Because my worktree is clean, dir_add_name spends all 0.05s in
cache_name_exists(). If we somehow know the input path is not a
tracked entry, we could avoid cache_name_exists() and save 0.05s.
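
A toy sketch of that shortcut, to make the idea concrete. None of this
is git code: toy_index, toy_name_exists() and toy_add_name() are
hypothetical stand-ins for the index, cache_name_exists() and
dir_add_name(), and the known_untracked flag is an assumption about
how such knowledge might be passed in.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the index: a sorted array of tracked paths. */
struct toy_index {
	const char **names;
	size_t nr;
	unsigned long lookups; /* number of searches performed */
};

static int cmp_name(const void *key, const void *elem)
{
	return strcmp((const char *)key, *(const char *const *)elem);
}

/* Toy analogue of cache_name_exists(): 1 if path is tracked. */
static int toy_name_exists(struct toy_index *idx, const char *path)
{
	idx->lookups++;
	return bsearch(path, idx->names, idx->nr,
		       sizeof(*idx->names), cmp_name) != NULL;
}

/*
 * Toy analogue of dir_add_name(): returns 1 if path should be
 * reported as untracked. If the caller already knows the path is
 * not tracked, the index lookup is skipped entirely.
 */
static int toy_add_name(struct toy_index *idx, const char *path,
			int known_untracked)
{
	if (known_untracked)
		return 1;
	return !toy_name_exists(idx, path);
}
```

The lookups counter only advances on the slow path, so it shows
directly how many index searches the flag avoids.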

If we do the "untracked cache", the number of treat_path calls should
be much lower. In this particular case of gentoo-x86, I'd expect no
more than a dozen untracked files, which cuts treat_path's and
dir_add_name's time down to near zero. On a normal repository like
git.git, with about 1075 untracked files against 2552 tracked files,
we should be able to save 1/2 to 2/3 of treat_path calls.
-- 
Duy