Re: [PATCH v5 0/7] Fast git status via a file system watcher

2017-07-10 Thread Ben Peart



On 7/10/2017 9:36 AM, Ben Peart wrote:



On 6/28/2017 1:11 AM, Christian Couder wrote:

On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart  wrote:

Changes from V4 include:

...

I took a look at this patch series except the last patch ([PATCH v5
7/7] fsmonitor: add a performance test) as Junio reviewed it already,
and had only a few comments on patches 3/7 and 4/7.

I am still not convinced by the discussions following v2
(http://public-inbox.org/git/20170518201333.13088-1-benpe...@microsoft.com/) 


about using a hook instead of for example a "core.fsmonitorcommand".

I think using a hook is not necessary and might not be a good match
for later optimizations. For example people might want to use a
library or some OS specific system calls to do what the hook does.

AEvar previously reported some not so great performance numbers on
some big Booking.com boxes with a big monorepo and it seems that using
taskset for example to make sure that the hook is run on the same CPU
improves these numbers significantly. So avoiding to run a separate
process can be important in some cases.



Using a hook is the only pattern I've seen in git that provides a way to 
enable OS specific calls.  I used a hook so that different file 
monitoring services could be plugged in depending on the OS or tools 
available.  The Watchman integration script was mostly intended as a 
sample that could be used where Watchman is available and works well.


I had not heard about the taskset issues you mention above.  If there is 
something else that can be done to make this work better, please let me 
know the details.


If I'm understanding you correctly, you are suggesting that someone 
should be able to configure a setting (core.fsmonitorcommand) that gives 
a custom command line that would be run instead of running the 
query-fsmonitor hook.


I'm not entirely sure how that should work.  There are command line 
options that need to be passed (currently the interface version as well 
as the current clock in nanoseconds).  How would those passed when using 
the custom command?


Is it OK to just append them to the given command line?  Does there need 
to be some substitution token to indicate where they should be inserted 
(ie "mycustomcommand --custom --options %version% --more-options 
%timestamp%").  Are there any other tokens that should be supported (ie 
PID or processor mask?).


Is there a design pattern already used somewhere in git that I can 
follow or is this all blazing a new trail?




My co-worker reminded me about git difftool - is this what you had in mind?


Thanks for continuing to look into this. Feedback is good!




Re: [PATCH v5 0/7] Fast git status via a file system watcher

2017-07-10 Thread Ben Peart



On 6/28/2017 1:11 AM, Christian Couder wrote:

On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart  wrote:

Changes from V4 include:

...

I took a look at this patch series except the last patch ([PATCH v5
7/7] fsmonitor: add a performance test) as Junio reviewed it already,
and had only a few comments on patches 3/7 and 4/7.

I am still not convinced by the discussions following v2
(http://public-inbox.org/git/20170518201333.13088-1-benpe...@microsoft.com/)
about using a hook instead of for example a "core.fsmonitorcommand".

I think using a hook is not necessary and might not be a good match
for later optimizations. For example people might want to use a
library or some OS specific system calls to do what the hook does.

AEvar previously reported some not so great performance numbers on
some big Booking.com boxes with a big monorepo and it seems that using
taskset for example to make sure that the hook is run on the same CPU
improves these numbers significantly. So avoiding to run a separate
process can be important in some cases.



Using a hook is the only pattern I've seen in git that provides a way to 
enable OS specific calls.  I used a hook so that different file 
monitoring services could be plugged in depending on the OS or tools 
available.  The Watchman integration script was mostly intended as a 
sample that could be used where Watchman is available and works well.


I had not heard about the taskset issues you mention above.  If there is 
something else that can be done to make this work better, please let me 
know the details.


If I'm understanding you correctly, you are suggesting that someone 
should be able to configure a setting (core.fsmonitorcommand) that gives 
a custom command line that would be run instead of running the 
query-fsmonitor hook.


I'm not entirely sure how that should work.  There are command line 
options that need to be passed (currently the interface version as well 
as the current clock in nanoseconds).  How would those passed when using 
the custom command?


Is it OK to just append them to the given command line?  Does there need 
to be some substitution token to indicate where they should be inserted 
(ie "mycustomcommand --custom --options %version% --more-options 
%timestamp%").  Are there any other tokens that should be supported (ie 
PID or processor mask?).


Is there a design pattern already used somewhere in git that I can 
follow or is this all blazing a new trail?


Thanks for continuing to look into this. Feedback is good!




Re: [PATCH v5 0/7] Fast git status via a file system watcher

2017-06-27 Thread Christian Couder
On Sat, Jun 10, 2017 at 3:40 PM, Ben Peart  wrote:
> Changes from V4 include:
...

I took a look at this patch series except the last patch ([PATCH v5
7/7] fsmonitor: add a performance test) as Junio reviewed it already,
and had only a few comments on patches 3/7 and 4/7.

I am still not convinced by the discussions following v2
(http://public-inbox.org/git/20170518201333.13088-1-benpe...@microsoft.com/)
about using a hook instead of for example a "core.fsmonitorcommand".

I think using a hook is not necessary and might not be a good match
for later optimizations. For example people might want to use a
library or some OS specific system calls to do what the hook does.

AEvar previously reported some not so great performance numbers on
some big Booking.com boxes with a big monorepo and it seems that using
taskset for example to make sure that the hook is run on the same CPU
improves these numbers significantly. So avoiding to run a separate
process can be important in some cases.


[PATCH v5 0/7] Fast git status via a file system watcher

2017-06-10 Thread Ben Peart
Changes from V4 include:

fsmonitor.c:
Only flag index dirty if fsmonitor extension is added or removed
or if new cache or untracked cache entries are marked dirty

dir.c:
If an untracked cache entry is flagged as fsmonitor_dirty,
fall back to existing logic to stat the file and check for
changes instead of assuming it is dirty.

hooks--query-fsmonitor.sample: 
Optimize query to exclude transitory files (files that were
created and deleted since the last call).

Update path mangling on Windows to match Watchman beta.

test-drop-caches.c:
Add perf helper to drop the disk cache on Windows.

p7519-fsmonitor.sh:
Add perf test for fsmonitor changes

Ben Peart (7):
  bswap: add 64 bit endianness helper get_be64
  dir: make lookup_untracked() available outside of dir.c
  fsmonitor: teach git to optionally utilize a file system monitor to
speed up detecting new or changed files.
  fsmonitor: add test cases for fsmonitor extension
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: add a sample query-fsmonitor hook script for Watchman
  fsmonitor: add a performance test

 Documentation/config.txt |   7 +
 Documentation/githooks.txt   |  23 +++
 Documentation/technical/index-format.txt |  19 +++
 Makefile |   2 +
 builtin/update-index.c   |   1 +
 cache.h  |   5 +
 compat/bswap.h   |   4 +
 config.c |   4 +
 dir.c|  29 ++--
 dir.h|   5 +
 entry.c  |   1 +
 environment.c|   1 +
 fsmonitor.c  | 261 +++
 fsmonitor.h  |   9 ++
 read-cache.c |  28 +++-
 t/helper/test-drop-caches.c  | 107 +
 t/perf/p7519-fsmonitor.sh| 161 +++
 t/t7519-status-fsmonitor.sh  | 173 
 templates/hooks--query-fsmonitor.sample  |  76 +
 unpack-trees.c   |   1 +
 20 files changed, 904 insertions(+), 13 deletions(-)
 create mode 100644 fsmonitor.c
 create mode 100644 fsmonitor.h
 create mode 100644 t/helper/test-drop-caches.c
 create mode 100755 t/perf/p7519-fsmonitor.sh
 create mode 100755 t/t7519-status-fsmonitor.sh
 create mode 100755 templates/hooks--query-fsmonitor.sample

-- 
2.13.0