[PATCH 19/24] update-index: manually enable or disable untracked cache

2015-03-08 Thread Nguyễn Thái Ngọc Duy
Overall time saving on git status is about 40% in the best case
scenario, which removes wt_status_collect_untracked() as the most
time-consuming function. Reading and refreshing the index are now at
the top (which should drop when index-helper and/or watchman support
is added). More numbers and analysis below.

webkit.git
==========

169k files. 6k dirs. Lots of test data (i.e. not touched most of the
time)

Base status
-----------

Index version 4 in split index mode, with cache-tree populated and no
untracked cache. This shows where git status spends its time. The
same settings are used for the other repos below.
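For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of how such a trace could be collected. It assumes the timings come from Git's GIT_TRACE_PERFORMANCE facility, and that the option added by this patch is spelled --untracked-cache as in released Git. The helper names (trace_env, timed_status, enable_untracked_cache) are made up for illustration and are not part of the series.

```python
import os
import subprocess


def trace_env(base_env=None):
    """Return a copy of the environment with performance tracing switched on."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_TRACE_PERFORMANCE"] = "1"  # "1" sends the trace to stderr
    return env


def timed_status(repo_dir):
    """Run `git status` in repo_dir and return its performance trace lines."""
    result = subprocess.run(
        ["git", "status"],
        cwd=repo_dir,
        env=trace_env(),
        stdout=subprocess.DEVNULL,  # the status output itself is not needed
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    return [line for line in result.stderr.splitlines() if "performance:" in line]


def enable_untracked_cache(repo_dir):
    """Turn the untracked cache on for the index of repo_dir."""
    # Assumption: the option name is --untracked-cache, as in released Git.
    subprocess.run(
        ["git", "update-index", "--untracked-cache"],
        cwd=repo_dir,
        check=True,
    )
```

Calling timed_status() once before enable_untracked_cache() and twice after it would correspond to the base, populating and populated runs reported here.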

18:28:10.199679 builtin/commit.c:1394   performance: 0.00451 s: cmd_status:setup
18:28:10.474847 read-cache.c:1407   performance: 0.274873831 s: read_index
18:28:10.475295 read-cache.c:1407   performance: 0.00656 s: read_index
18:28:10.728443 preload-index.c:131 performance: 0.253147487 s: read_index_preload
18:28:10.741422 read-cache.c:1254   performance: 0.012868340 s: refresh_index
18:28:10.752300 wt-status.c:623 performance: 0.010421357 s: wt_status_collect_changes_worktree
18:28:10.762069 wt-status.c:629 performance: 0.009644748 s: wt_status_collect_changes_index
18:28:11.601019 wt-status.c:632 performance: 0.838859547 s: wt_status_collect_untracked
18:28:11.605939 builtin/commit.c:1421   performance: 0.004835004 s: cmd_status:update_index
18:28:11.606580 trace.c:415 performance: 1.407878388 s: git command: 'git' 'status'
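
To see at a glance where the time goes in a trace like the one above, the entries can be summed per label and ranked. The following is only a sketch: it assumes each trace entry sits on a single line in the "performance: <seconds> s: <label>" form, and the function names are hypothetical. The sample lines are copied from the base webkit.git run.

```python
import re
from collections import defaultdict

# Matches the "performance: <seconds> s: <label>" tail of a trace line.
PERF_RE = re.compile(r"performance:\s+([0-9.]+)\s+s:\s+(.*)$")

SAMPLE = [
    "18:28:10.474847 read-cache.c:1407   performance: 0.274873831 s: read_index",
    "18:28:10.475295 read-cache.c:1407   performance: 0.00656 s: read_index",
    "18:28:10.728443 preload-index.c:131 performance: 0.253147487 s: read_index_preload",
    "18:28:11.601019 wt-status.c:632 performance: 0.838859547 s: wt_status_collect_untracked",
    "18:28:11.606580 trace.c:415 performance: 1.407878388 s: git command: 'git' 'status'",
]


def parse_perf_lines(lines):
    """Sum the reported seconds per label, skipping the whole-command total."""
    totals = defaultdict(float)
    for line in lines:
        match = PERF_RE.search(line)
        if not match:
            continue
        label = match.group(2).strip()
        if label.startswith("git command:"):
            continue  # overall wall time, not an individual function
        totals[label] += float(match.group(1))
    return dict(totals)


def rank(totals):
    """Return (label, seconds) pairs, most expensive first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, seconds in rank(parse_perf_lines(SAMPLE)):
        print(f"{seconds:.6f} s  {label}")
```

The two read_index entries are added together, so a function that is called more than once is ranked by its combined cost.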

Populating status
-----------------

This is the first run after enabling the untracked cache, while the
cache is still empty. There is a slight increase in
wt_status_collect_untracked() and cmd_status:update_index (because
the new cache has to be written to $GIT_DIR/index).

18:28:18.915213 builtin/commit.c:1394   performance: 0.00326 s: cmd_status:setup
18:28:19.197364 read-cache.c:1407   performance: 0.281901416 s: read_index
18:28:19.197754 read-cache.c:1407   performance: 0.00546 s: read_index
18:28:19.451355 preload-index.c:131 performance: 0.253599607 s: read_index_preload
18:28:19.464400 read-cache.c:1254   performance: 0.012935336 s: refresh_index
18:28:19.475115 wt-status.c:623 performance: 0.010236920 s: wt_status_collect_changes_worktree
18:28:19.486022 wt-status.c:629 performance: 0.010801685 s: wt_status_collect_changes_index
18:28:20.362660 wt-status.c:632 performance: 0.876551366 s: wt_status_collect_untracked
18:28:20.396199 builtin/commit.c:1421   performance: 0.033447969 s: cmd_status:update_index
18:28:20.396939 trace.c:415 performance: 1.482695902 s: git command: 'git' 'status'

Populated status
----------------

After the cache is populated, wt_status_collect_untracked() drops 82%
from 0.838s to 0.144s. Overall time drops 45%. Top offenders are now
read_index() and read_index_preload().

18:28:20.408605 builtin/commit.c:1394   performance: 0.00457 s: cmd_status:setup
18:28:20.692864 read-cache.c:1407   performance: 0.283980458 s: read_index
18:28:20.693273 read-cache.c:1407   performance: 0.00661 s: read_index
18:28:20.958814 preload-index.c:131 performance: 0.265540254 s: read_index_preload
18:28:20.972375 read-cache.c:1254   performance: 0.013437429 s: refresh_index
18:28:20.983959 wt-status.c:623 performance: 0.011146646 s: wt_status_collect_changes_worktree
18:28:20.993948 wt-status.c:629 performance: 0.009879094 s: wt_status_collect_changes_index
18:28:21.138125 wt-status.c:632 performance: 0.144084737 s: wt_status_collect_untracked
18:28:21.173678 builtin/commit.c:1421   performance: 0.035463949 s: cmd_status:update_index
18:28:21.174251 trace.c:415 performance: 0.766707355 s: git command: 'git' 'status'
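
The percentages quoted for webkit.git can be rechecked from the three traces with a few lines of arithmetic. This is just a sketch: the figures are copied from the base, populating and populated runs above, and percent_drop is a hypothetical helper name.

```python
def percent_drop(before, after):
    """Relative reduction from before to after, in percent (negative = increase)."""
    return (before - after) / before * 100.0


# Seconds reported for wt_status_collect_untracked in each run.
UNTRACKED_BASE = 0.838859547
UNTRACKED_POPULATING = 0.876551366
UNTRACKED_POPULATED = 0.144084737

# Seconds reported for the whole 'git status' command.
TOTAL_BASE = 1.407878388
TOTAL_POPULATED = 0.766707355

if __name__ == "__main__":
    # Populated cache versus no cache: roughly 82.8% for the function alone.
    print(f"untracked: {percent_drop(UNTRACKED_BASE, UNTRACKED_POPULATED):.1f}%")
    # Whole command: roughly 45.5%.
    print(f"total:     {percent_drop(TOTAL_BASE, TOTAL_POPULATED):.1f}%")
    # First run with an empty cache is slightly slower, so this is negative.
    print(f"populating: {percent_drop(UNTRACKED_BASE, UNTRACKED_POPULATING):.1f}%")
```

The last figure is the one-off cost of filling the cache, which is small compared with the saving on every later run.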

gentoo-x86.git
==============

This repository is a strange one with a balanced, wide and shallow
worktree (about 100k files and 23k dirs) and no .gitignore in the
worktree. wt_status_collect_untracked() time drops 88%, total time
drops 56%.

Base status
-----------

18:20:40.828642 builtin/commit.c:1394   performance: 0.00496 s: cmd_status:setup
18:20:41.027233 read-cache.c:1407   performance: 0.198130532 s: read_index
18:20:41.027670 read-cache.c:1407   performance: 0.00581 s: read_index
18:20:41.171716 preload-index.c:131 performance: 0.144045594 s: read_index_preload
18:20:41.179171 read-cache.c:1254   performance: 0.007320424 s: refresh_index
18:20:41.185785 wt-status.c:623 performance: 0.006144638 s: wt_status_collect_changes_worktree
18:20:41.192701 wt-status.c:629 performance: 0.006780184 s: wt_status_collect_changes_index
18:20:41.991723 wt-status.c:632 performance: 0.798927029 s: wt_status_collect_untracked
18:20:41.994664 builtin/commit.c:1421   performance: 0.002852772 s: cmd_status:update_index
18:20:41.995458 trace.c:415 performance: 1.168427502 s: git command: 'git' 'status'

Populating status
-----------------

18:20:48.968848 

[PATCH 19/24] update-index: manually enable or disable untracked cache

2015-02-08 Thread Nguyễn Thái Ngọc Duy

[PATCH 19/24] update-index: manually enable or disable untracked cache

2015-01-20 Thread Nguyễn Thái Ngọc Duy