Re: Some rough edges of core.fsmonitor

2018-02-01 Thread Duy Nguyen
On Tue, Jan 30, 2018 at 6:16 AM, Ben Peart  wrote:
>
>
> On 1/29/2018 4:40 AM, Duy Nguyen wrote:
>>
>> On Sat, Jan 27, 2018 at 12:43:41PM +0100, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> b) with fsmonitor
>>>
>>>  $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
>>>  12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s:
>>> read cache .git/index
>>
>>
>> This is sort of off topic but may be interesting for big repo guys. It
>> looks like read cache's time is partially dominated by malloc().
>>
>
> That is correct.  We have tried a few different ways to address this. First
> was my patch series [1] that would parallelize all of the read cache code.
>
> We quickly found that malloc() was the biggest culprit and by speeding that
> up, we got most of the wins.  At Peff's recommendation [2], we looked into
> using tcmalloc but found that 1) it has bugs on Windows and 2) it isn't
> being actively maintained so it didn't seem those bugs would ever get fixed.
>
> We are currently working on a patch that will use a refactored version of
> the mem_pool in fast-import.c to do block allocations of the cache entries
> which is giving us about a 22% improvement in "git status" times.  One

My apologies if this has been discussed in the second half of 2017
which I have no idea what happened.

I just wonder if it's possible to design a "file format" that is
basically a memory dump of lots of struct cache_entry? This "file"
will stay in a shared memory somewhere and never get written down to
disk. Since it's very close to the data structure we have in core, the
most we need to do after mmap'ing it (and keeping it mmap'd until the
end) is adjust some pointers. These entries are of course read-only.
When you modify/create new entries, they are created new, using the
old malloc(). We just need to make sure not free the read-only
cache_entry entries and munmap() the whole thing when we
discard_index().

This opens up another option to deal with the large UNTR and TREE
extensions in a similar way. These will be your next headache after
you have reduced parse time for main entries.

> challenge has been ensuring that cache entries are not passed from one
> index/mem_pool to another which could cause access after free bugs.

We kind of have something close to that, but not entirely the same.
When split index is used, the same cache_entry can appear in two
index_state structs. Of course you can free only one of them (and you
can only do so when you know both index_state are gone). I see some
code cleanup opportunity :)

>
> [1]
> https://public-inbox.org/git/20171109141737.47976-1-benpe...@microsoft.com/
> [2]
> https://public-inbox.org/git/20171120153846.v5b7ho42yzrzn...@sigill.intra.peff.net/
-- 
Duy


Re: Some rough edges of core.fsmonitor

2018-01-31 Thread Ben Peart



On 1/30/2018 6:16 PM, Ævar Arnfjörð Bjarmason wrote:


On Tue, Jan 30 2018, Ben Peart jotted:


While some of these issues have been discussed in other threads, I
thought I'd summarize my thoughts here.


Thanks for this & your other reply. I'm going to get to testing some of
Duy's patches soon, and if you have some other relevant WIP I'd be happy
to try them, but meanwhile replying to a few of these:


On 1/26/2018 7:28 PM, Ævar Arnfjörð Bjarmason wrote:

I just got around to testing this since it landed, for context some
previous poking of mine in [1].

Issues / stuff I've noticed:

1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:

  01:09:24.975524 fsmonitor.c:173 fsmonitor process 
'.git/hooks/fsmonitor-watchman' returned success
  01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
  01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/config'
  01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/index'
  01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
successful

Am I missing something or should we do something like:

  diff --git a/fsmonitor.c b/fsmonitor.c
  index 0af7c4edba..5067b89bda 100644
  --- a/fsmonitor.c
  +++ b/fsmonitor.c
  @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
last_update, struct strbuf *que

   static void fsmonitor_refresh_callback(struct index_state *istate, const 
char *name)
   {
  -   int pos = index_name_pos(istate, name, strlen(name));
  +   int pos;
  +
  +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
  +   return;
  +
  +   pos = index_name_pos(istate, name, strlen(name));

  if (pos >= 0) {
  struct cache_entry *ce = istate->cache[pos];

With that patch applied status on a large repo[2] goes from a consistent
~180-200ms to ~140-150ms, since we're not invalidating some of the UC
structure



I favor making this optimization by updating
untracked_cache_invalidate_path() so that it ignores paths under
get_git_dir() and doesn't invalidate the untracked cache or flag the
index as dirty.


*nod*


2) We re-write out the index even though we know nothing changed

When you first run with core.fsmonitor it needs to
mark_fsmonitor_clean() for every path, but is there a reason for why we
wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
marked and we know from the hook that nothing changed? Why write out the
index again?



Writing out the index when core.fsmonitor is first turned on is
necessary to get the index extension added with the current state of
the dirty flags.  Given it is a one time cost, I don't think we have
anything worth trying to optimize here.


Indeed, that makes sense. What I was showing here is even after the
initial setup we continue to write it out when we know nothing changed.

We do that anyway without fsmonitor, but this seemed like a worthwhile
thing to optimize.



There is already logic to avoid writing out the index unless there is 
something that requires it.  In my testing, it was often the untracked 
cache marking the index as dirty that was causing the unexpected writes.


The patch to make the untracked cache only flag the index as dirty when 
the feature is being turned on or off is pretty simple (see below).  The 
challenge was that many of the untracked cache tests assume all changes 
are saved to disk after every command so they fail after making this 
change.  The patch below does a simple hack of checking for a test 
environment variable and only forcing the index writes if it is set.


Internally, we simply turned off the untracked cache as it's usefulness 
in combination with GVFS is limited and without the patch, it actually 
slowed down common operations more than it sped them up.


Typically, changes to the untracked cache don't accumulate long before 
the user does some operation that requires the index to be written out 
at which point the untracked cache is updated as well.



diff --git a/dir.c b/dir.c
index 5e93a1350b..af1d33aae1 100644
--- a/dir.c
+++ b/dir.c
@@ -2256,7 +2256,8 @@ int read_directory(struct dir_struct *dir, struct 
index_state *istate,

 dir->untracked->gitignore_invalidated,
 dir->untracked->dir_invalidated,
 dir->untracked->dir_opened);
-   if (dir->untracked == istate->untracked &&
+   if (getenv("GIT_TEST_UNTRACKED_CACHE") &&
+   dir->untracked == istate->untracked &&
(dir->untracked->dir_opened ||
 dir->untracked->gitignore_invalidated ||
 dir->untracked->dir_invalidated))
diff --git a/t/t7063-status-untracked-cache.sh 
b/t/t7063-status-untracked-cache.sh

index 

Re: Some rough edges of core.fsmonitor

2018-01-31 Thread Duy Nguyen
On Tue, Jan 30, 2018 at 8:21 AM, Ben Peart  wrote:
>
>
> On 1/28/2018 5:28 PM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Sun, Jan 28, 2018 at 9:44 PM, Johannes Schindelin
>>  wrote:
>>>
>>> Hi,
>>>
>>> On Sat, 27 Jan 2018, Ævar Arnfjörð Bjarmason wrote:
>>>
 I just got around to testing this since it landed, for context some
 previous poking of mine in [1].

 Issues / stuff I've noticed:

 1) We end up invalidating the untracked cache because stuff in .git/
 changed. For example:

  01:09:24.975524 fsmonitor.c:173 fsmonitor process
 '.git/hooks/fsmonitor-watchman' returned success
  01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback
 '.git'
  01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback
 '.git/config'
  01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback
 '.git/index'
  01:09:25.122726 fsmonitor.c:91  write fsmonitor extension
 successful

 Am I missing something or should we do something like:

  diff --git a/fsmonitor.c b/fsmonitor.c
  index 0af7c4edba..5067b89bda 100644
  --- a/fsmonitor.c
  +++ b/fsmonitor.c
  @@ -118,7 +118,12 @@ static int query_fsmonitor(int version,
 uint64_t last_update, struct strbuf *que

   static void fsmonitor_refresh_callback(struct index_state *istate,
 const char *name)
   {
  -   int pos = index_name_pos(istate, name, strlen(name));
  +   int pos;
  +
  +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
  +   return;
  +
  +   pos = index_name_pos(istate, name, strlen(name));
>>>
>>>
>>> I would much rather have the fsmonitor hook already exclude those.
>>
>>
>> As documented the fsmonitor-watchman hook we ship doesn't work as
>> described in githooks(5), unless "files in the working directory" is
>> taken to include .git/, but I haven't seen that ever used.
>>
>> On the other hand relying on arbitrary user-provided hooks to do the
>> right thing at the cost of silent performance degradation is bad. If
>> we're going to expect the hook to remove these we should probably
>> warn/die here if it does send us .git/* files.
>>
>
> I'm not sure how often something is modified in the git directory when
> nothing was modified in the working directory but this seems like a nice
> optimization.
>
> We can't just blindly ignore changes under ".git" as the git directory may
> have been moved somewhere else.  Instead we'd need to use get_git_dir().

In theory. But we do blindly ignore changes under ".git" in some
cases, see treat_path() in dir.c for example.

> Rather than assuming the hook will optimize for this particular case, I
> think a better solution would be to update untracked_cache_invalidate_path()
> so that it doesn't invalidate the untracked cache and mark the index as
> dirty when it was asked to invalidate a path under GIT_DIR.  I can't think
> of a case when that would be the desired behavior.

You see, my only problem with this is tying the check with $GIT_DIR,
which may involve normalizing the path and all that stuff because the
current code base is not prepared to deal with that. Simply ignoring
anything in ".git" may work too. Not pretty but it's more in line with
what we have. Though I'm still not sure how it interacts with ".git"
from submodules which is why I still have not sent a patch to update
untracked_cache_invalidate_path() because it does make sense to fix it
in there.

> Somewhat off topic but related to the overall performance discussion: I've
> also thought the untracked cache shouldn't mark the index as dirty except in
> the case where the extension is being added or removed.  We've observed that
> it causes unnecessary index writes that actually slows down overall
> performance.
>
> Since it is a cache, it does not require the index to be written out for
> correctness, it can simply update the cache again the next time it is
> needed. This is typically faster than the cost of the index write so makes
> things faster overall.  I adopted this same model with the fsmonitor
> extension.

If you turn on split index, the write cost should be much much less
(but I think read cost increases a bit due to merging the two indexes
in core; I noticed this but haven't really dug down). You basically
pay writing modified index entries and extensions.

But yeah not writing is possible. The index's dirty flag can show that
only untracked cache extension is dirty, then it's a judgement call
whether to write it down or drop it. You still need to occasionally
write it down though. Dirty directories will fall back to slow path.
If you don't write it down, the set of dirty paths keeps increasing
and will start to slow git-status down. I don't know at what point we
should write it down though if 

Re: Some rough edges of core.fsmonitor

2018-01-30 Thread Ævar Arnfjörð Bjarmason

On Tue, Jan 30 2018, Ben Peart jotted:

> While some of these issues have been discussed in other threads, I
> thought I'd summarize my thoughts here.

Thanks for this & your other reply. I'm going to get to testing some of
Duy's patches soon, and if you have some other relevant WIP I'd be happy
to try them, but meanwhile replying to a few of these:

> On 1/26/2018 7:28 PM, Ævar Arnfjörð Bjarmason wrote:
>> I just got around to testing this since it landed, for context some
>> previous poking of mine in [1].
>>
>> Issues / stuff I've noticed:
>>
>> 1) We end up invalidating the untracked cache because stuff in .git/
>> changed. For example:
>>
>>  01:09:24.975524 fsmonitor.c:173 fsmonitor process 
>> '.git/hooks/fsmonitor-watchman' returned success
>>  01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback 
>> '.git'
>>  01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
>> '.git/config'
>>  01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
>> '.git/index'
>>  01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
>> successful
>>
>> Am I missing something or should we do something like:
>>
>>  diff --git a/fsmonitor.c b/fsmonitor.c
>>  index 0af7c4edba..5067b89bda 100644
>>  --- a/fsmonitor.c
>>  +++ b/fsmonitor.c
>>  @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
>> last_update, struct strbuf *que
>>
>>   static void fsmonitor_refresh_callback(struct index_state *istate, 
>> const char *name)
>>   {
>>  -   int pos = index_name_pos(istate, name, strlen(name));
>>  +   int pos;
>>  +
>>  +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
>>  +   return;
>>  +
>>  +   pos = index_name_pos(istate, name, strlen(name));
>>
>>  if (pos >= 0) {
>>  struct cache_entry *ce = istate->cache[pos];
>>
>> With that patch applied status on a large repo[2] goes from a consistent
>> ~180-200ms to ~140-150ms, since we're not invalidating some of the UC
>> structure
>>
>
> I favor making this optimization by updating
> untracked_cache_invalidate_path() so that it ignores paths under
> get_git_dir() and doesn't invalidate the untracked cache or flag the
> index as dirty.

*nod*

>> 2) We re-write out the index even though we know nothing changed
>>
>> When you first run with core.fsmonitor it needs to
>> mark_fsmonitor_clean() for every path, but is there a reason for why we
>> wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
>> marked and we know from the hook that nothing changed? Why write out the
>> index again?
>>
>
> Writing out the index when core.fsmonitor is first turned on is
> necessary to get the index extension added with the current state of
> the dirty flags.  Given it is a one time cost, I don't think we have
> anything worth trying to optimize here.

Indeed, that makes sense. What I was showing here is even after the
initial setup we continue to write it out when we know nothing changed.

We do that anyway without fsmonitor, but this seemed like a worthwhile
thing to optimize.

>> 3) A lot of time spend reading the index (or something..)
>>
>> While the hook itself takes ~20ms (and watchman itself 1/4 of that)
>> status as a whole takes much longer. gprof reveals:
>>
>>  Each sample counts as 0.01 seconds.
>>%   cumulative   self  self total
>>   time   seconds   secondscalls  ms/call  ms/call  name
>>   15.38  0.02 0.02   221690 0.00 0.00  memihash
>>   15.38  0.04 0.02   221689 0.00 0.00  create_from_disk
>>7.69  0.05 0.01  2216897 0.00 0.00  git_bswap32
>>7.69  0.06 0.01   222661 0.00 0.00  ce_path_match
>>7.69  0.07 0.01   221769 0.00 0.00  hashmap_add
>>7.69  0.08 0.0139941 0.00 0.00  prep_exclude
>>7.69  0.09 0.0139940 0.00 0.00  strbuf_addch
>>7.69  0.10 0.01110.0010.00  read_one
>>7.69  0.11 0.01110.0010.00  refresh_index
>>7.69  0.12 0.01110.0010.00  tweak_fsmonitor
>>7.69  0.13 0.01 preload_thread
>>
>> The index is 24M in this case, I guess it's unpacking it, but I wonder
>> if this couldn't be much faster if we saved away the result of the last
>> "status" in something that's quick to access, and then if nothing
>> changed we just report that, and no need to re-write the index (or just
>> write the "it was clean at this time" part).
>
> Yes, reading the index is slow.  We've made some improvements (not
> computing the SHA, not validating the sort order, etc) and have one
> more in progress that will reduce the malloc() cost.  I haven't found
> any other easy optimizations but it would be great if you 

Re: Some rough edges of core.fsmonitor

2018-01-30 Thread Ben Peart
While some of these issues have been discussed in other threads, I 
thought I'd summarize my thoughts here.


On 1/26/2018 7:28 PM, Ævar Arnfjörð Bjarmason wrote:

I just got around to testing this since it landed, for context some
previous poking of mine in [1].

Issues / stuff I've noticed:

1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:

 01:09:24.975524 fsmonitor.c:173 fsmonitor process 
'.git/hooks/fsmonitor-watchman' returned success
 01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
 01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/config'
 01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/index'
 01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
successful

Am I missing something or should we do something like:

 diff --git a/fsmonitor.c b/fsmonitor.c
 index 0af7c4edba..5067b89bda 100644
 --- a/fsmonitor.c
 +++ b/fsmonitor.c
 @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
last_update, struct strbuf *que

  static void fsmonitor_refresh_callback(struct index_state *istate, const 
char *name)
  {
 -   int pos = index_name_pos(istate, name, strlen(name));
 +   int pos;
 +
 +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
 +   return;
 +
 +   pos = index_name_pos(istate, name, strlen(name));

 if (pos >= 0) {
 struct cache_entry *ce = istate->cache[pos];

With that patch applied status on a large repo[2] goes from a consistent
~180-200ms to ~140-150ms, since we're not invalidating some of the UC
structure



I favor making this optimization by updating 
untracked_cache_invalidate_path() so that it ignores paths under 
get_git_dir() and doesn't invalidate the untracked cache or flag the 
index as dirty.



2) We re-write out the index even though we know nothing changed

When you first run with core.fsmonitor it needs to
mark_fsmonitor_clean() for every path, but is there a reason for why we
wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
marked and we know from the hook that nothing changed? Why write out the
index again?



Writing out the index when core.fsmonitor is first turned on is 
necessary to get the index extension added with the current state of the 
dirty flags.  Given it is a one time cost, I don't think we have 
anything worth trying to optimize here.



3) A lot of time spend reading the index (or something..)

While the hook itself takes ~20ms (and watchman itself 1/4 of that)
status as a whole takes much longer. gprof reveals:

 Each sample counts as 0.01 seconds.
   %   cumulative   self  self total
  time   seconds   secondscalls  ms/call  ms/call  name
  15.38  0.02 0.02   221690 0.00 0.00  memihash
  15.38  0.04 0.02   221689 0.00 0.00  create_from_disk
   7.69  0.05 0.01  2216897 0.00 0.00  git_bswap32
   7.69  0.06 0.01   222661 0.00 0.00  ce_path_match
   7.69  0.07 0.01   221769 0.00 0.00  hashmap_add
   7.69  0.08 0.0139941 0.00 0.00  prep_exclude
   7.69  0.09 0.0139940 0.00 0.00  strbuf_addch
   7.69  0.10 0.01110.0010.00  read_one
   7.69  0.11 0.01110.0010.00  refresh_index
   7.69  0.12 0.01110.0010.00  tweak_fsmonitor
   7.69  0.13 0.01 preload_thread

The index is 24M in this case, I guess it's unpacking it, but I wonder
if this couldn't be much faster if we saved away the result of the last
"status" in something that's quick to access, and then if nothing
changed we just report that, and no need to re-write the index (or just
write the "it was clean at this time" part).


Yes, reading the index is slow.  We've made some improvements (not 
computing the SHA, not validating the sort order, etc) and have one more 
in progress that will reduce the malloc() cost.  I haven't found any 
other easy optimizations but it would be great if you could find more! 
To make significant improvements, I'm afraid it will take more 
substantial changes to the in memory and on disk formats and updates to 
the code to take advantage of those changes.




4) core.fsmonitor=false behaves unexpectedly

The code that reads this variable just treats it as a string, so we do a
bunch of work for nothing (and nothing warns) if this is set and 'false'
is executed. Any reason we couldn't do our standard boolean parsing
here? You couldn't call your hook 0/1/true/false, but that doesn't seem
like a big loss.

1. 
https://public-inbox.org/git/cacbzzx5a6op7dh_g9wofbnejh2zgnk4b34ygxa8dandqvit...@mail.gmail.com/
2. https://github.com/avar/2015-04-03-1M-git



I'm torn on this 

Re: Some rough edges of core.fsmonitor

2018-01-30 Thread Ben Peart



On 1/27/2018 2:01 PM, Ævar Arnfjörð Bjarmason wrote:


On Sat, Jan 27 2018, Duy Nguyen jotted:


On Sat, Jan 27, 2018 at 07:39:27PM +0700, Duy Nguyen wrote:

On Sat, Jan 27, 2018 at 6:43 PM, Ævar Arnfjörð Bjarmason
 wrote:

a) no fsmonitor

 $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
 12:32:44.947651 read-cache.c:1890   performance: 0.053153609 s: read 
cache .git/index
 12:32:44.967943 preload-index.c:112 performance: 0.020161093 s: 
preload index
 12:32:44.974217 read-cache.c:1446   performance: 0.006230611 s: 
refresh index

...

b) with fsmonitor

 $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: read 
cache .git/index
 12:34:23.838622 preload-index.c:112 performance: 0.001221197 s: 
preload index
 12:34:23.858723 fsmonitor.c:170 performance: 0.020059647 s: 
fsmonitor process '.git/hooks/fsmonitor-watchman'
 12:34:23.871532 read-cache.c:1446   performance: 0.032870818 s: 
refresh index


Hmm.. why does refresh take longer with fsmonitor/watchman? With the
help from watchman, we know what files are modified. We don't need
manual stat()'ing and this line should be lower than the "no
fsmonitor" case, which is 0.006230611s.


Ahh.. my patch probably does not see that fsmonitor could be activated
lazily inside refresh_index() call. The patch below should fix it.


Will have to get those numbers to you later, or alternatively clone
https://github.com/avar/2015-04-03-1M-git (or some other test repo) and
test it yourself, sorry. Don't have time to follow-up much this weekend.


But between your normal refresh time (0.020 preload + 0.006 actual
refresh) and fsmonitor taking 0.020 just to talk to watchman, this
repo seems "too small" for fsmonitor/watchman to shine.


Surely that's an implementation limitation and not something inherent,
given that watchman itself returns in 5ms?

I.e. status could work like this, no?:

  1. At start, record the timestamp & find out canonical state via some
 expansive method.
  2. Print out xyz changed, abc added etc.
  3. Record *just* what status would report about xyz, abc etc.
  4. On subsequent git status, just amend that information, e.g. if
 watchman says nothing changed $(cat .git/last-status-output).

We shouldn't need to be reading the entire index in the common case
where just a few things change.



I agree that reading the entire index in the common case is rather 
expensive. It is, however, the model we have today and all the code in 
git assumes all cache entries are in memory.


We are interested in pursuing a patch series that would enable higher 
performance especially with large and/or sparse repos by making the 
index sparse, hierarchical, and incrementally readable/writable.  As you 
might expect, that is a lot of work and is far beyond what we can 
address in this patch series.



There's also a lot of things that use status to just check "are we
clean?", those would only need to record the last known timestamp when
the tree was clean, and then ask watchman if there were any changes, if
not we're done.


I'm still a bit curious that refresh index time, after excluding 0.020
for fsmonitor, is stil 0.012s. What does it do? It should really be
doing nothing. Either way, read index time seems to be the elephant in
the room now.

-- 8< --
diff --git a/read-cache.c b/read-cache.c
index eac74bc9f1..d60e0a8480 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1367,12 +1367,21 @@ int refresh_index(struct index_state *istate, unsigned 
int flags,
unsigned int options = (CE_MATCH_REFRESH |
(really ? CE_MATCH_IGNORE_VALID : 0) |
(not_new ? CE_MATCH_IGNORE_MISSING : 0));
+   int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
const char *modified_fmt;
const char *deleted_fmt;
const char *typechange_fmt;
const char *added_fmt;
const char *unmerged_fmt;
-   uint64_t start = getnanotime();
+   uint64_t start;
+
+   /*
+* If fsmonitor is used, force its communication early to
+* accurately measure how long this function takes without it.
+*/
+   if (!ignore_fsmonitor)
+   refresh_fsmonitor(istate);
+   start = getnanotime();

modified_fmt = (in_porcelain ? "M\t%s\n" : "%s: needs update\n");
deleted_fmt = (in_porcelain ? "D\t%s\n" : "%s: needs update\n");
-- 8< --


Re: Some rough edges of core.fsmonitor

2018-01-29 Thread Ben Peart



On 1/28/2018 5:28 PM, Ævar Arnfjörð Bjarmason wrote:

On Sun, Jan 28, 2018 at 9:44 PM, Johannes Schindelin
 wrote:

Hi,

On Sat, 27 Jan 2018, Ævar Arnfjörð Bjarmason wrote:


I just got around to testing this since it landed, for context some
previous poking of mine in [1].

Issues / stuff I've noticed:

1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:

 01:09:24.975524 fsmonitor.c:173 fsmonitor process 
'.git/hooks/fsmonitor-watchman' returned success
 01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
 01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/config'
 01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/index'
 01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
successful

Am I missing something or should we do something like:

 diff --git a/fsmonitor.c b/fsmonitor.c
 index 0af7c4edba..5067b89bda 100644
 --- a/fsmonitor.c
 +++ b/fsmonitor.c
 @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
last_update, struct strbuf *que

  static void fsmonitor_refresh_callback(struct index_state *istate, const 
char *name)
  {
 -   int pos = index_name_pos(istate, name, strlen(name));
 +   int pos;
 +
 +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
 +   return;
 +
 +   pos = index_name_pos(istate, name, strlen(name));


I would much rather have the fsmonitor hook already exclude those.


As documented the fsmonitor-watchman hook we ship doesn't work as
described in githooks(5), unless "files in the working directory" is
taken to include .git/, but I haven't seen that ever used.

On the other hand relying on arbitrary user-provided hooks to do the
right thing at the cost of silent performance degradation is bad. If
we're going to expect the hook to remove these we should probably
warn/die here if it does send us .git/* files.



I'm not sure how often something is modified in the git directory when 
nothing was modified in the working directory but this seems like a nice 
optimization.


We can't just blindly ignore changes under ".git" as the git directory 
may have been moved somewhere else.  Instead we'd need to use get_git_dir().


Rather than assuming the hook will optimize for this particular case, I 
think a better solution would be to update 
untracked_cache_invalidate_path() so that it doesn't invalidate the 
untracked cache and mark the index as dirty when it was asked to 
invalidate a path under GIT_DIR.  I can't think of a case when that 
would be the desired behavior.


Somewhat off topic but related to the overall performance discussion: 
I've also thought the untracked cache shouldn't mark the index as dirty 
except in the case where the extension is being added or removed.  We've 
observed that it causes unnecessary index writes that actually slows 
down overall performance.


Since it is a cache, it does not require the index to be written out for 
correctness, it can simply update the cache again the next time it is 
needed. This is typically faster than the cost of the index write so 
makes things faster overall.  I adopted this same model with the 
fsmonitor extension.



If you *must* add these comparisons, you have to use fspathcmp() and
fspathncmp() instead (because case-insensitivity).


Thanks.



Re: Some rough edges of core.fsmonitor

2018-01-29 Thread Ben Peart



On 1/29/2018 4:40 AM, Duy Nguyen wrote:

On Sat, Jan 27, 2018 at 12:43:41PM +0100, Ævar Arnfjörð Bjarmason wrote:

b) with fsmonitor

 $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: read 
cache .git/index


This is sort of off topic but may be interesting for big repo guys. It
looks like read cache's time is partially dominated by malloc().



That is correct.  We have tried a few different ways to address this. 
First was my patch series [1] that would parallelize all of the read 
cache code.


We quickly found that malloc() was the biggest culprit and by speeding 
that up, we got most of the wins.  At Peff's recommendation [2], we 
looked into using tcmalloc but found that 1) it has bugs on Windows and 
2) it isn't being actively maintained so it didn't seem those bugs would 
ever get fixed.


We are currently working on a patch that will use a refactored version 
of the mem_pool in fast-import.c to do block allocations of the cache 
entries which is giving us about a 22% improvement in "git status" 
times.  One challenge has been ensuring that cache entries are not 
passed from one index/mem_pool to another which could cause access after 
free bugs.


[1] 
https://public-inbox.org/git/20171109141737.47976-1-benpe...@microsoft.com/
[2] 
https://public-inbox.org/git/20171120153846.v5b7ho42yzrzn...@sigill.intra.peff.net/




This is the performance breakdown of do_read_index()

$ GIT_TRACE_PERFORMANCE=2 ~/w/git/t/helper/test-read-cache
0.78489 s: open/mmap/close
0.038915239 s: main entries
0.018983150 s: ext TREE
0.012667080 s: ext UNTR
0.05372 s: ext FSMN
0.001473470 s: munmap
0.072386911 s: read cache .git/index

Reading main index entries takes like half of the time (0.038 vs
0.072). With the below patch to take out hundred thousands of malloc()
we have this, loading main entries now only takes 0.012s:

$ GIT_TRACE_PERFORMANCE=2 ~/w/git/t/helper/test-read-cache
0.46587 s: open/mmap/close
0.012077300 s: main entries
0.020477683 s: ext TREE
0.010259998 s: ext UNTR
0.10250 s: ext FSMN
0.000753854 s: munmap
0.043906473 s: read cache .git/index

We used to do less malloc until debed2a629 (read-cache.c: allocate
index entries individually - 2011-10-24) but I don't think we can
simply revert that (not worth the extra complexity of the old way).

Now "TREE" and "UNTR" extensions become a bigger problem.

-- 8< --
diff --git a/read-cache.c b/read-cache.c
index d60e0a8480..88f4213c99 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1622,7 +1622,12 @@ static struct cache_entry 
*cache_entry_from_ondisk(struct ondisk_cache_entry *on
   const char *name,
   size_t len)
  {
+#if 0
struct cache_entry *ce = xmalloc(cache_entry_size(len));
+#else
+   static char buf[1024];
+   struct cache_entry *ce = (struct cache_entry *)buf;
+#endif
  
  	ce->ce_stat_data.sd_ctime.sec = get_be32(>ctime.sec);

ce->ce_stat_data.sd_mtime.sec = get_be32(>mtime.sec);
diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index 48255eef31..e1d21d17a3 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -8,7 +8,9 @@ int cmd_main(int argc, const char **argv)
setup_git_directory();
for (i = 0; i < cnt; i++) {
read_cache();
+#if 0
discard_cache();
+#endif
}
return 0;
  }
-- 8< --



Re: Some rough edges of core.fsmonitor

2018-01-29 Thread Duy Nguyen
On Sat, Jan 27, 2018 at 12:43:41PM +0100, Ævar Arnfjörð Bjarmason wrote:
> b) with fsmonitor
> 
> $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
> 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: read 
> cache .git/index

This is sort of off topic but may be interesting for big repo guys. It
looks like read cache's time is partially dominated by malloc().

This is the performance breakdown of do_read_index()

$ GIT_TRACE_PERFORMANCE=2 ~/w/git/t/helper/test-read-cache
0.78489 s: open/mmap/close
0.038915239 s: main entries
0.018983150 s: ext TREE
0.012667080 s: ext UNTR
0.05372 s: ext FSMN
0.001473470 s: munmap
0.072386911 s: read cache .git/index

Reading main index entries takes like half of the time (0.038 vs
0.072). With the below patch to take out hundred thousands of malloc()
we have this, loading main entries now only takes 0.012s:

$ GIT_TRACE_PERFORMANCE=2 ~/w/git/t/helper/test-read-cache
0.46587 s: open/mmap/close
0.012077300 s: main entries
0.020477683 s: ext TREE
0.010259998 s: ext UNTR
0.10250 s: ext FSMN
0.000753854 s: munmap
0.043906473 s: read cache .git/index

We used to do less malloc until debed2a629 (read-cache.c: allocate
index entries individually - 2011-10-24) but I don't think we can
simply revert that (not worth the extra complexity of the old way).

Now "TREE" and "UNTR" extensions become a bigger problem.

-- 8< --
diff --git a/read-cache.c b/read-cache.c
index d60e0a8480..88f4213c99 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1622,7 +1622,12 @@ static struct cache_entry 
*cache_entry_from_ondisk(struct ondisk_cache_entry *on
   const char *name,
   size_t len)
 {
+#if 0
struct cache_entry *ce = xmalloc(cache_entry_size(len));
+#else
+   static char buf[1024];
+   struct cache_entry *ce = (struct cache_entry *)buf;
+#endif
 
ce->ce_stat_data.sd_ctime.sec = get_be32(>ctime.sec);
ce->ce_stat_data.sd_mtime.sec = get_be32(>mtime.sec);
diff --git a/t/helper/test-read-cache.c b/t/helper/test-read-cache.c
index 48255eef31..e1d21d17a3 100644
--- a/t/helper/test-read-cache.c
+++ b/t/helper/test-read-cache.c
@@ -8,7 +8,9 @@ int cmd_main(int argc, const char **argv)
setup_git_directory();
for (i = 0; i < cnt; i++) {
read_cache();
+#if 0
discard_cache();
+#endif
}
return 0;
 }
-- 8< --


Re: Some rough edges of core.fsmonitor

2018-01-28 Thread Ævar Arnfjörð Bjarmason
On Sun, Jan 28, 2018 at 9:44 PM, Johannes Schindelin
 wrote:
> Hi,
>
> On Sat, 27 Jan 2018, Ævar Arnfjörð Bjarmason wrote:
>
>> I just got around to testing this since it landed, for context some
>> previous poking of mine in [1].
>>
>> Issues / stuff I've noticed:
>>
>> 1) We end up invalidating the untracked cache because stuff in .git/
>> changed. For example:
>>
>> 01:09:24.975524 fsmonitor.c:173 fsmonitor process 
>> '.git/hooks/fsmonitor-watchman' returned success
>> 01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
>> 01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
>> '.git/config'
>> 01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
>> '.git/index'
>> 01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
>> successful
>>
>> Am I missing something or should we do something like:
>>
>> diff --git a/fsmonitor.c b/fsmonitor.c
>> index 0af7c4edba..5067b89bda 100644
>> --- a/fsmonitor.c
>> +++ b/fsmonitor.c
>> @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
>> last_update, struct strbuf *que
>>
>>  static void fsmonitor_refresh_callback(struct index_state *istate, 
>> const char *name)
>>  {
>> -   int pos = index_name_pos(istate, name, strlen(name));
>> +   int pos;
>> +
>> +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
>> +   return;
>> +
>> +   pos = index_name_pos(istate, name, strlen(name));
>
> I would much rather have the fsmonitor hook already exclude those.

As documented the fsmonitor-watchman hook we ship doesn't work as
described in githooks(5), unless "files in the working directory" is
taken to include .git/, but I haven't seen that ever used.

On the other hand relying on arbitrary user-provided hooks to do the
right thing at the cost of silent performance degradation is bad. If
we're going to expect the hook to remove these we should probably
warn/die here if it does send us .git/* files.

> If you *must* add these comparisons, you have to use fspathcmp() and
> fspathncmp() instead (because case-insensitivity).

Thanks.


Re: Some rough edges of core.fsmonitor

2018-01-28 Thread Johannes Schindelin
Hi,

On Sat, 27 Jan 2018, Ævar Arnfjörð Bjarmason wrote:

> I just got around to testing this since it landed, for context some
> previous poking of mine in [1].
> 
> Issues / stuff I've noticed:
> 
> 1) We end up invalidating the untracked cache because stuff in .git/
> changed. For example:
> 
> 01:09:24.975524 fsmonitor.c:173 fsmonitor process 
> '.git/hooks/fsmonitor-watchman' returned success
> 01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
> 01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
> '.git/config'
> 01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
> '.git/index'
> 01:09:25.122726 fsmonitor.c:91  write fsmonitor extension 
> successful
> 
> Am I missing something or should we do something like:
> 
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 0af7c4edba..5067b89bda 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
> last_update, struct strbuf *que
> 
>  static void fsmonitor_refresh_callback(struct index_state *istate, const 
> char *name)
>  {
> -   int pos = index_name_pos(istate, name, strlen(name));
> +   int pos;
> +
> +   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
> +   return;
> +
> +   pos = index_name_pos(istate, name, strlen(name));

I would much rather have the fsmonitor hook already exclude those.

If you *must* add these comparisons, you have to use fspathcmp() and
fspathncmp() instead (because case-insensitivity).

Ciao,
Dscho

Re: Some rough edges of core.fsmonitor

2018-01-27 Thread Ævar Arnfjörð Bjarmason

On Sat, Jan 27 2018, Duy Nguyen jotted:

> On Sat, Jan 27, 2018 at 07:39:27PM +0700, Duy Nguyen wrote:
>> On Sat, Jan 27, 2018 at 6:43 PM, Ævar Arnfjörð Bjarmason
>>  wrote:
>> > a) no fsmonitor
>> >
>> > $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
>> > 12:32:44.947651 read-cache.c:1890   performance: 0.053153609 s: 
>> > read cache .git/index
>> > 12:32:44.967943 preload-index.c:112 performance: 0.020161093 s: 
>> > preload index
>> > 12:32:44.974217 read-cache.c:1446   performance: 0.006230611 s: 
>> > refresh index
>> >
>> > ...
>> >
>> > b) with fsmonitor
>> >
>> > $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
>> > 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: 
>> > read cache .git/index
>> > 12:34:23.838622 preload-index.c:112 performance: 0.001221197 s: 
>> > preload index
>> > 12:34:23.858723 fsmonitor.c:170 performance: 0.020059647 s: 
>> > fsmonitor process '.git/hooks/fsmonitor-watchman'
>> > 12:34:23.871532 read-cache.c:1446   performance: 0.032870818 s: 
>> > refresh index
>>
>> Hmm.. why does refresh take longer with fsmonitor/watchman? With the
>> help from watchman, we know what files are modified. We don't need
>> manual stat()'ing and this line should be lower than the "no
>> fsmonitor" case, which is 0.006230611s.
>
> Ahh.. my patch probably does not see that fsmonitor could be activated
> lazily inside refresh_index() call. The patch below should fix it.

Will have to get those numbers to you later, or alternatively clone
https://github.com/avar/2015-04-03-1M-git (or some other test repo) and
test it yourself, sorry. Don't have time to follow-up much this weekend.

> But between your normal refresh time (0.020 preload + 0.006 actual
> refresh) and fsmonitor taking 0.020 just to talk to watchman, this
> repo seems "too small" for fsmonitor/watchman to shine.

Surely that's an implementation limitation and not something inherent,
given that watchman itself returns in 5ms?

I.e. status could work like this, no?:

 1. At start, record the timestamp & find out canonical state via some
expansive method.
 2. Print out xyz changed, abc added etc.
 3. Record *just* what status would report about xyz, abc etc.
 4. On subsequent git status, just amend that information, e.g. if
watchman says nothing changed $(cat .git/last-status-output).

We shouldn't need to be reading the entire index in the common case
where just a few things change.

There's also a lot of things that use status to just check "are we
clean?", those would only need to record the last known timestamp when
the tree was clean, and then ask watchman if there were any changes, if
not we're done.

> I'm still a bit curious that refresh index time, after excluding 0.020
> for fsmonitor, is stil 0.012s. What does it do? It should really be
> doing nothing. Either way, read index time seems to be the elephant in
> the room now.
>
> -- 8< --
> diff --git a/read-cache.c b/read-cache.c
> index eac74bc9f1..d60e0a8480 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1367,12 +1367,21 @@ int refresh_index(struct index_state *istate, 
> unsigned int flags,
>   unsigned int options = (CE_MATCH_REFRESH |
>   (really ? CE_MATCH_IGNORE_VALID : 0) |
>   (not_new ? CE_MATCH_IGNORE_MISSING : 0));
> + int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
>   const char *modified_fmt;
>   const char *deleted_fmt;
>   const char *typechange_fmt;
>   const char *added_fmt;
>   const char *unmerged_fmt;
> - uint64_t start = getnanotime();
> + uint64_t start;
> +
> + /*
> +  * If fsmonitor is used, force its communication early to
> +  * accurately measure how long this function takes without it.
> +  */
> + if (!ignore_fsmonitor)
> + refresh_fsmonitor(istate);
> + start = getnanotime();
>
>   modified_fmt = (in_porcelain ? "M\t%s\n" : "%s: needs update\n");
>   deleted_fmt = (in_porcelain ? "D\t%s\n" : "%s: needs update\n");
> -- 8< --


Re: Some rough edges of core.fsmonitor

2018-01-27 Thread Duy Nguyen
On Sat, Jan 27, 2018 at 07:39:27PM +0700, Duy Nguyen wrote:
> On Sat, Jan 27, 2018 at 6:43 PM, Ævar Arnfjörð Bjarmason
>  wrote:
> > a) no fsmonitor
> >
> > $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
> > 12:32:44.947651 read-cache.c:1890   performance: 0.053153609 s: 
> > read cache .git/index
> > 12:32:44.967943 preload-index.c:112 performance: 0.020161093 s: 
> > preload index
> > 12:32:44.974217 read-cache.c:1446   performance: 0.006230611 s: 
> > refresh index
> >
> > ...
> >
> > b) with fsmonitor
> >
> > $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
> > 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: 
> > read cache .git/index
> > 12:34:23.838622 preload-index.c:112 performance: 0.001221197 s: 
> > preload index
> > 12:34:23.858723 fsmonitor.c:170 performance: 0.020059647 s: 
> > fsmonitor process '.git/hooks/fsmonitor-watchman'
> > 12:34:23.871532 read-cache.c:1446   performance: 0.032870818 s: 
> > refresh index
> 
> Hmm.. why does refresh take longer with fsmonitor/watchman? With the
> help from watchman, we know what files are modified. We don't need
> manual stat()'ing and this line should be lower than the "no
> fsmonitor" case, which is 0.006230611s.

Ahh.. my patch probably does not see that fsmonitor could be activated
lazily inside refresh_index() call. The patch below should fix it.

But between your normal refresh time (0.020 preload + 0.006 actual
refresh) and fsmonitor taking 0.020 just to talk to watchman, this
repo seems "too small" for fsmonitor/watchman to shine.

I'm still a bit curious that refresh index time, after excluding 0.020
for fsmonitor, is stil 0.012s. What does it do? It should really be
doing nothing. Either way, read index time seems to be the elephant in
the room now.

-- 8< --
diff --git a/read-cache.c b/read-cache.c
index eac74bc9f1..d60e0a8480 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1367,12 +1367,21 @@ int refresh_index(struct index_state *istate, unsigned 
int flags,
unsigned int options = (CE_MATCH_REFRESH |
(really ? CE_MATCH_IGNORE_VALID : 0) |
(not_new ? CE_MATCH_IGNORE_MISSING : 0));
+   int ignore_fsmonitor = options & CE_MATCH_IGNORE_FSMONITOR;
const char *modified_fmt;
const char *deleted_fmt;
const char *typechange_fmt;
const char *added_fmt;
const char *unmerged_fmt;
-   uint64_t start = getnanotime();
+   uint64_t start;
+
+   /*
+* If fsmonitor is used, force its communication early to
+* accurately measure how long this function takes without it.
+*/
+   if (!ignore_fsmonitor)
+   refresh_fsmonitor(istate);
+   start = getnanotime();
 
modified_fmt = (in_porcelain ? "M\t%s\n" : "%s: needs update\n");
deleted_fmt = (in_porcelain ? "D\t%s\n" : "%s: needs update\n");
-- 8< --


Re: Some rough edges of core.fsmonitor

2018-01-27 Thread Duy Nguyen
On Sat, Jan 27, 2018 at 6:43 PM, Ævar Arnfjörð Bjarmason
 wrote:
> a) no fsmonitor
>
> $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
> 12:32:44.947651 read-cache.c:1890   performance: 0.053153609 s: read 
> cache .git/index
> 12:32:44.967943 preload-index.c:112 performance: 0.020161093 s: 
> preload index
> 12:32:44.974217 read-cache.c:1446   performance: 0.006230611 s: 
> refresh index
>
> ...
>
> b) with fsmonitor
>
> $ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
> 12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: read 
> cache .git/index
> 12:34:23.838622 preload-index.c:112 performance: 0.001221197 s: 
> preload index
> 12:34:23.858723 fsmonitor.c:170 performance: 0.020059647 s: 
> fsmonitor process '.git/hooks/fsmonitor-watchman'
> 12:34:23.871532 read-cache.c:1446   performance: 0.032870818 s: 
> refresh index

Hmm.. why does refresh take longer with fsmonitor/watchman? With the
help from watchman, we know what files are modified. We don't need
manual stat()'ing and this line should be lower than the "no
fsmonitor" case, which is 0.006230611s.

> 12:34:23.876427 diff-lib.c:250  performance: 0.004731427 s: 
> diff-files
> 12:34:23.880669 diff-lib.c:527  performance: 0.003944422 s: 
> diff-index
> 12:34:23.899225 dir.c:2290  performance: 0.018509066 s: read 
> directory
> 12:34:23.901914 trace.c:417 performance: 0.118250995 s: git 
> command: '/home/aearnfjord/g/git/git-status'

I don't see any "write index" line here, which is interesting since
your case c) is about "don't write index".

> c) with fsmonitor + don't write index
-- 
Duy


Re: Some rough edges of core.fsmonitor

2018-01-27 Thread Ævar Arnfjörð Bjarmason

On Sat, Jan 27 2018, Duy Nguyen jotted:

> On Sat, Jan 27, 2018 at 7:28 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> 3) A lot of time spend reading the index (or something..)
>
> I'm resending a patch from my old index-helper series. It should
> measure all big time consuming blocks in git. Maybe we should get it
> merged...
>
>> While the hook itself takes ~20ms (and watchman itself 1/4 of that)
>> status as a whole takes much longer. gprof reveals:
>>
>> Each sample counts as 0.01 seconds.
>>   %   cumulative   self  self total
>>  time   seconds   secondscalls  ms/call  ms/call  name
>>  15.38  0.02 0.02   221690 0.00 0.00  memihash
>
> This sounds like name-hash to me.
>
>>  15.38  0.04 0.02   221689 0.00 0.00  create_from_disk
>>   7.69  0.05 0.01  2216897 0.00 0.00  git_bswap32
>>   7.69  0.06 0.01   222661 0.00 0.00  ce_path_match
>>   7.69  0.07 0.01   221769 0.00 0.00  hashmap_add
>>   7.69  0.08 0.0139941 0.00 0.00  prep_exclude
>>   7.69  0.09 0.0139940 0.00 0.00  strbuf_addch
>>   7.69  0.10 0.01110.0010.00  read_one
>>   7.69  0.11 0.01110.0010.00  refresh_index
>>   7.69  0.12 0.01110.0010.00  tweak_fsmonitor
>>   7.69  0.13 0.01 preload_thread
>>
>> The index is 24M in this case, I guess it's unpacking it, but I wonder
>> if this couldn't be much faster if we saved away the result of the last
>> "status" in something that's quick to access, and then if nothing
>
> No we could do better, we could cache parsed index content so
> everybody benefits. I demonstrated it with my "index v254" patch a
> while back:
>
> https://public-inbox.org/git/1399980019-8706-1-git-send-email-pclo...@gmail.com/
>
> With the patch I'm sending soon, we can see how much time reading an
> index take out of that ~140-150ms (and we probably can cut down index
> read time to like 10-20% when cached).
>
>> changed we just report that, and no need to re-write the index (or just
>> write the "it was clean at this time" part).
>
> Hmm.. does an index write increase that much time?

Your patch is very useful. Here's (with gcc -03) some runtimes. This
also includes my .git exclusion patch.

These are all best out of 5, and with the top (until <0.5% time) of
strace -c output (run as another invocation, timing not done with
strace)::

a) no fsmonitor

$ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
12:32:44.947651 read-cache.c:1890   performance: 0.053153609 s: read 
cache .git/index
12:32:44.967943 preload-index.c:112 performance: 0.020161093 s: preload 
index
12:32:44.974217 read-cache.c:1446   performance: 0.006230611 s: refresh 
index
12:32:44.979083 diff-lib.c:250  performance: 0.004649994 s: 
diff-files
12:32:44.982511 diff-lib.c:527  performance: 0.002918416 s: 
diff-index
12:32:45.037880 dir.c:2290  performance: 0.055331063 s: read 
directory
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
12:32:45.040666 trace.c:417 performance: 0.146724289 s: git 
command: '/home/aearnfjord/g/git/git-status'

real0m0.153s
user0m0.110s
sys 0m0.354s
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 59.930.031924   1 39978 9 stat
 35.860.0191046368 3   futex
  0.840.000446  1236   mprotect
  0.730.000389  1329   munmap
  0.660.000349   662   mmap
  0.530.000285  1420   clone

b) with fsmonitor

$ time GIT_TRACE_PERFORMANCE=1 ~/g/git/git-status
12:34:23.833625 read-cache.c:1890   performance: 0.049485685 s: read 
cache .git/index
12:34:23.838622 preload-index.c:112 performance: 0.001221197 s: preload 
index
12:34:23.858723 fsmonitor.c:170 performance: 0.020059647 s: 
fsmonitor process '.git/hooks/fsmonitor-watchman'
12:34:23.871532 read-cache.c:1446   performance: 0.032870818 s: refresh 
index
12:34:23.876427 diff-lib.c:250  performance: 0.004731427 s: 
diff-files
12:34:23.880669 diff-lib.c:527  performance: 0.003944422 s: 
diff-index
12:34:23.899225 dir.c:2290  performance: 0.018509066 s: read 
directory
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
12:34:23.901914 trace.c:417 performance: 0.118250995 s: git 
command: '/home/aearnfjord/g/git/git-status'

real0m0.125s
user0m0.086s
sys 0m0.043s
  

Re: Some rough edges of core.fsmonitor

2018-01-26 Thread Duy Nguyen
On Sat, Jan 27, 2018 at 7:28 AM, Ævar Arnfjörð Bjarmason
 wrote:
> 3) A lot of time spend reading the index (or something..)

I'm resending a patch from my old index-helper series. It should
measure all big time consuming blocks in git. Maybe we should get it
merged...

> While the hook itself takes ~20ms (and watchman itself 1/4 of that)
> status as a whole takes much longer. gprof reveals:
>
> Each sample counts as 0.01 seconds.
>   %   cumulative   self  self total
>  time   seconds   secondscalls  ms/call  ms/call  name
>  15.38  0.02 0.02   221690 0.00 0.00  memihash

This sounds like name-hash to me.

>  15.38  0.04 0.02   221689 0.00 0.00  create_from_disk
>   7.69  0.05 0.01  2216897 0.00 0.00  git_bswap32
>   7.69  0.06 0.01   222661 0.00 0.00  ce_path_match
>   7.69  0.07 0.01   221769 0.00 0.00  hashmap_add
>   7.69  0.08 0.0139941 0.00 0.00  prep_exclude
>   7.69  0.09 0.0139940 0.00 0.00  strbuf_addch
>   7.69  0.10 0.01110.0010.00  read_one
>   7.69  0.11 0.01110.0010.00  refresh_index
>   7.69  0.12 0.01110.0010.00  tweak_fsmonitor
>   7.69  0.13 0.01 preload_thread
>
> The index is 24M in this case, I guess it's unpacking it, but I wonder
> if this couldn't be much faster if we saved away the result of the last
> "status" in something that's quick to access, and then if nothing

No we could do better, we could cache parsed index content so
everybody benefits. I demonstrated it with my "index v254" patch a
while back:

https://public-inbox.org/git/1399980019-8706-1-git-send-email-pclo...@gmail.com/

With the patch I'm sending soon, we can see how much time reading an
index take out of that ~140-150ms (and we probably can cut down index
read time to like 10-20% when cached).

> changed we just report that, and no need to re-write the index (or just
> write the "it was clean at this time" part).

Hmm.. does an index write increase that much time?
-- 
Duy


Some rough edges of core.fsmonitor

2018-01-26 Thread Ævar Arnfjörð Bjarmason
I just got around to testing this since it landed, for context some
previous poking of mine in [1].

Issues / stuff I've noticed:

1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:

01:09:24.975524 fsmonitor.c:173 fsmonitor process 
'.git/hooks/fsmonitor-watchman' returned success
01:09:24.975548 fsmonitor.c:138 fsmonitor_refresh_callback '.git'
01:09:24.975556 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/config'
01:09:24.975568 fsmonitor.c:138 fsmonitor_refresh_callback 
'.git/index'
01:09:25.122726 fsmonitor.c:91  write fsmonitor extension successful

Am I missing something or should we do something like:

diff --git a/fsmonitor.c b/fsmonitor.c
index 0af7c4edba..5067b89bda 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t 
last_update, struct strbuf *que

 static void fsmonitor_refresh_callback(struct index_state *istate, const 
char *name)
 {
-   int pos = index_name_pos(istate, name, strlen(name));
+   int pos;
+
+   if (!strcmp(name, ".git") || starts_with(name, ".git/"))
+   return;
+
+   pos = index_name_pos(istate, name, strlen(name));

if (pos >= 0) {
struct cache_entry *ce = istate->cache[pos];

With that patch applied status on a large repo[2] goes from a consistent
~180-200ms to ~140-150ms, since we're not invalidating some of the UC
structure

2) We re-write out the index even though we know nothing changed

When you first run with core.fsmonitor it needs to
mark_fsmonitor_clean() for every path, but is there a reason for why we
wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
marked and we know from the hook that nothing changed? Why write out the
index again?

3) A lot of time spend reading the index (or something..)

While the hook itself takes ~20ms (and watchman itself 1/4 of that)
status as a whole takes much longer. gprof reveals:

Each sample counts as 0.01 seconds.
  %   cumulative   self  self total
 time   seconds   secondscalls  ms/call  ms/call  name
 15.38  0.02 0.02   221690 0.00 0.00  memihash
 15.38  0.04 0.02   221689 0.00 0.00  create_from_disk
  7.69  0.05 0.01  2216897 0.00 0.00  git_bswap32
  7.69  0.06 0.01   222661 0.00 0.00  ce_path_match
  7.69  0.07 0.01   221769 0.00 0.00  hashmap_add
  7.69  0.08 0.0139941 0.00 0.00  prep_exclude
  7.69  0.09 0.0139940 0.00 0.00  strbuf_addch
  7.69  0.10 0.01110.0010.00  read_one
  7.69  0.11 0.01110.0010.00  refresh_index
  7.69  0.12 0.01110.0010.00  tweak_fsmonitor
  7.69  0.13 0.01 preload_thread

The index is 24M in this case, I guess it's unpacking it, but I wonder
if this couldn't be much faster if we saved away the result of the last
"status" in something that's quick to access, and then if nothing
changed we just report that, and no need to re-write the index (or just
write the "it was clean at this time" part).

4) core.fsmonitor=false behaves unexpectedly

The code that reads this variable just treats it as a string, so we do a
bunch of work for nothing (and nothing warns) if this is set and 'false'
is executed. Any reason we couldn't do our standard boolean parsing
here? You couldn't call your hook 0/1/true/false, but that doesn't seem
like a big loss.

1. 
https://public-inbox.org/git/cacbzzx5a6op7dh_g9wofbnejh2zgnk4b34ygxa8dandqvit...@mail.gmail.com/
2. https://github.com/avar/2015-04-03-1M-git