On 6/2/2017 7:06 PM, Ævar Arnfjörð Bjarmason wrote:

> I don't have time to update the perf test now or dig into it, but most
> of what you're describing in this mail doesn't at all match the
> ad-hoc tests I ran in
> https://public-inbox.org/git/CACBZZX5e58bWuf3NdDYTxu2KyZj29hHONzN=rp-7vxd8nur...@mail.gmail.com/
>
> There (at the very end of the e-mail) I'm running watchman in a tight
> loop while I flush the entire fs cache; its runtime is never longer
> than 600 ms, with 3 ms being the norm.

I added a perf trace around the entire query-fsmonitor hook process (patch below) to measure the total, actual impact of running the hook script, querying watchman, parsing the output with Perl, and passing the result back to git. On my machine, the total cost of the hook runs between 130 ms and 180 ms when there are zero changes to report (i.e., the best case).
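
For context on where that cost comes from: the hook invocation is a full process spawn. Roughly, git runs the configured hook with the interface version and the last-update time as arguments and reads the changed paths back from its stdout. Below is a minimal sketch of that call, modeled on query_fsmonitor() from the series (details may differ from the actual patch; it assumes git's run-command, argv-array and strbuf APIs):

static int query_fsmonitor(int version, uint64_t last_update,
                           struct strbuf *query_result)
{
        struct child_process cp = CHILD_PROCESS_INIT;

        if (!core_fsmonitor)    /* core.fsmonitor: path to the hook */
                return -1;

        /* argv: <hook> <interface version> <time of last update> */
        argv_array_push(&cp.args, core_fsmonitor);
        argv_array_pushf(&cp.args, "%d", version);
        argv_array_pushf(&cp.args, "%" PRIuMAX, (uintmax_t)last_update);
        cp.use_shell = 1;

        /*
         * Spawning the shell + perl script and round-tripping to
         * watchman is the fixed overhead measured by the patch below.
         */
        return capture_command(&cp, query_result, 1024);
}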

With short status times, the overhead of querying watchman simply outweighs any performance gains, especially when you have a warm file system cache, since that cancels out the biggest win: avoiding the I/O of scanning the working directory.


diff --git a/fsmonitor.c b/fsmonitor.c
index 763a8a3a3f..cb47f31863 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -210,9 +210,11 @@ void refresh_by_fsmonitor(struct index_state *istate)
         * If we have a last update time, call query-monitor for the set of
         * changes since that time.
         */
-       if (istate->fsmonitor_last_update)
+       if (istate->fsmonitor_last_update) {
                query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
                        istate->fsmonitor_last_update, &query_result);
+               trace_performance_since(last_update, "query-fsmonitor");
+       }

        if (query_success) {
                /* Mark all entries returned by the monitor as dirty */
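
With that patch applied, the hook cost shows up whenever performance tracing is enabled: running "GIT_TRACE_PERFORMANCE=1 git status" prints a line on stderr roughly like "performance: 0.151000000 s: query-fsmonitor" (the exact trace format varies a bit by git version).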




> I.e. flushing the cache doesn't slow things down much at all compared
> to how long a "git status" takes from a cold cache. Something else must
> be going on, and the smoking gun is the gprof output I posted in the
> follow-up e-mail:
> https://public-inbox.org/git/CACBZZX4eZ3G8LQ8O+_BkbkJ-ZXTOkUi9cW=qkyjfhktma3p...@mail.gmail.com/
>
> There with the fsmonitor we end up calling blk_SHA1_Block ~100K times
> during "status", but IIRC (I don't have the output in front of me,
> this is from memory) something like twenty times without the
> fsmonitor.

> It can't be a coincidence that with the fsmonitor:
>
> $ pwd; git ls-files | wc -l
> /home/avar/g/linux
> 59844
>
> And you can see that in the fsmonitor "git status" we make exactly
> that many calls to cache_entry_from_ondisk(), but those calls don't
> show up at all in the non-fsmonitor codepath.


I don't see how the gprof numbers for the non-fsmonitor case can be correct. They don't appear to contain any calls related to loading the index, while the fsmonitor gprof numbers do. Here is a typical call stack:

git.exe!cache_entry_from_ondisk()
git.exe!create_from_disk()
git.exe!do_read_index()
git.exe!read_index_from()
git.exe!read_index()

During read_index(), cache_entry_from_ondisk() gets called for every item in the index (which explains the 59K calls). How can the non-fsmonitor codepath not be loading the index?
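
For anyone not familiar with that path, here is a sketch of the loop in do_read_index(), simplified from read-cache.c (header validation, error handling and index extensions omitted; the local variables come from the enclosing function). create_from_disk() calls cache_entry_from_ondisk() once per entry, so the call count necessarily matches the number of files in the index:

        /* one cache_entry_from_ondisk() call per on-disk index entry */
        src_offset = sizeof(*hdr);
        for (i = 0; i < istate->cache_nr; i++) {
                struct ondisk_cache_entry *disk_ce;
                struct cache_entry *ce;
                unsigned long consumed;

                disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset);
                ce = create_from_disk(disk_ce, &consumed, previous_name);
                set_index_entry(istate, i, ce);
                src_offset += consumed;
        }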

> So, again, I haven't dug in and really must step away from the computer
> now, but this really looks like the fsmonitor saves us the recursive
> readdir() / lstat() etc., but in return we somehow fall through to a
> codepath where we re-read the entire on-disk state back into the
> index, which we don't do in the non-fsmonitor codepath.


I've run multiple profiles with fsmonitor on and off and compared them, and I have been unable to find any performance regression caused by fsmonitor (other than flagging the index as dirty at times when it isn't required, which I have fixed for the next patch series).

I have done many performance runs, and when I subtract the _actual_ time spent in the hook from the overall command time, the result comes in at slightly less than when status is run with fsmonitor off. This also leads me to believe there is no regression with fsmonitor on.

All this leads me back to my original conclusion: the reason status is slower in these specific cases is that the overhead of calling the hook exceeds the savings it provides. If your status calls already take less than a second, it just doesn't make sense to add the complexity and overhead of calling a file system watcher.
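
To put illustrative numbers on that (hypothetical, just to show the shape of the tradeoff, not measurements): if a warm-cache status spends, say, 100 ms on the readdir()/lstat() scan that the hook lets git skip, but the hook itself costs a fixed 130-180 ms, the fsmonitor run comes out 30-80 ms slower. The hook only pays off when the scan it avoids costs more than the hook itself, e.g. with a very large working directory or a cold file system cache.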

I'm working on an updated perf test that will demonstrate the best-case scenario (warm watchman, cold file system cache) in addition to the worst case (cold watchman, warm file system cache). In normal use, performance will fall somewhere between the two. I'll add that test to the next iteration of the patch series.
