Re: [PATCH v1 10/19] read-cache: regenerate shared index if necessary

2016-10-29 Thread Christian Couder
On Tue, Oct 25, 2016 at 12:16 PM, Duy Nguyen  wrote:
> On Sun, Oct 23, 2016 at 4:26 PM, Christian Couder
>  wrote:
>> @@ -2233,7 +2263,8 @@ int write_locked_index(struct index_state *istate, 
>> struct lock_file *lock,
>> if ((v & 15) < 6)
>> istate->cache_changed |= SPLIT_INDEX_ORDERED;
>> }
>> -   if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
>> +   if (istate->cache_changed & SPLIT_INDEX_ORDERED ||
>> +   too_many_not_shared_entries(istate)) {
>
> It's probably safer to keep this piece unchanged and add this
> somewhere before it
>
> if (too_many_not_shared_entries(istate))
> istate->cache_changed |= SPLIT_INDEX_ORDERED;
>
> We could keep cache_changed consistent until the end this way.

Ok, it will be in the next version.

>>  test_expect_success 'enable split index' '
>> +   git config splitIndex.maxPercentChange 100 &&
>
> An alternative name might be splitThreshold. I don't know, maybe
> maxPercentChange is better.

I think it is important to say that it is a percent in the name, so I
prefer maxPercentChange.

Thanks,
Christian.


Re: [PATCH v1 10/19] read-cache: regenerate shared index if necessary

2016-10-29 Thread Christian Couder
On Sun, Oct 23, 2016 at 6:07 PM, Ramsay Jones
 wrote:
>>
>> +int too_many_not_shared_entries(struct index_state *istate)
>
> This function is a file-loacal symbol; could you please make it
> a static function.

Ok, it will be in the next version.

Thanks,
Christian.


Re: [PATCH v1 10/19] read-cache: regenerate shared index if necessary

2016-10-25 Thread Duy Nguyen
On Sun, Oct 23, 2016 at 4:26 PM, Christian Couder
 wrote:
> @@ -2233,7 +2263,8 @@ int write_locked_index(struct index_state *istate, 
> struct lock_file *lock,
> if ((v & 15) < 6)
> istate->cache_changed |= SPLIT_INDEX_ORDERED;
> }
> -   if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
> +   if (istate->cache_changed & SPLIT_INDEX_ORDERED ||
> +   too_many_not_shared_entries(istate)) {

It's probably safer to keep this piece unchanged and add this
somewhere before it

if (too_many_not_shared_entries(istate))
istate->cache_changed |= SPLIT_INDEX_ORDERED;

We could keep cache_changed consistent until the end this way.

> int ret = write_shared_index(istate, lock, flags);
> if (ret)
> return ret;
> diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
> index db8c39f..507a1dd 100755
> --- a/t/t1700-split-index.sh
> +++ b/t/t1700-split-index.sh
> @@ -8,6 +8,7 @@ test_description='split index mode tests'
>  sane_unset GIT_TEST_SPLIT_INDEX
>
>  test_expect_success 'enable split index' '
> +   git config splitIndex.maxPercentChange 100 &&

An alternative name might be splitThreshold. I don't know, maybe
maxPercentChange is better.

> git update-index --split-index &&
> test-dump-split-index .git/index >actual &&
> indexversion=$(test-index-version <.git/index) &&
> --
> 2.10.1.462.g7e1e03a
>



-- 
Duy


Re: [PATCH v1 10/19] read-cache: regenerate shared index if necessary

2016-10-23 Thread Ramsay Jones


On 23/10/16 10:26, Christian Couder wrote:
> When writing a new split-index and there is a big number of cache
> entries in the split-index compared to the shared index, it is a
> good idea to regenerate the shared index.
> 
> By default when the ratio reaches 20%, we will push back all
> the entries from the split-index into a new shared index file
> instead of just creating a new split-index file.
> 
> The threshold can be configured using the
> "splitIndex.maxPercentChange" config variable.
> 
> We need to adjust the existing tests in t1700 by setting
> "splitIndex.maxPercentChange" to 100 at the beginning of t1700,
> as the existing tests are assuming that the shared index is
> regenerated only when `git update-index --split-index` is used.
> 
> Signed-off-by: Christian Couder 
> ---
>  read-cache.c   | 33 -
>  t/t1700-split-index.sh |  1 +
>  2 files changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index bb53823..a91fabe 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2216,6 +2216,36 @@ static int write_shared_index(struct index_state 
> *istate,
>   return ret;
>  }
>  
> +static const int default_max_percent_split_change = 20;
> +
> +int too_many_not_shared_entries(struct index_state *istate)

This function is a file-loacal symbol; could you please make it
a static function.

Thanks.

ATB,
Ramsay Jones


[PATCH v1 10/19] read-cache: regenerate shared index if necessary

2016-10-23 Thread Christian Couder
When writing a new split-index and there is a big number of cache
entries in the split-index compared to the shared index, it is a
good idea to regenerate the shared index.

By default when the ratio reaches 20%, we will push back all
the entries from the split-index into a new shared index file
instead of just creating a new split-index file.

The threshold can be configured using the
"splitIndex.maxPercentChange" config variable.

We need to adjust the existing tests in t1700 by setting
"splitIndex.maxPercentChange" to 100 at the beginning of t1700,
as the existing tests are assuming that the shared index is
regenerated only when `git update-index --split-index` is used.

Signed-off-by: Christian Couder 
---
 read-cache.c   | 33 -
 t/t1700-split-index.sh |  1 +
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index bb53823..a91fabe 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2216,6 +2216,36 @@ static int write_shared_index(struct index_state *istate,
return ret;
 }
 
+static const int default_max_percent_split_change = 20;
+
+int too_many_not_shared_entries(struct index_state *istate)
+{
+   int i, not_shared = 0;
+   int max_split = git_config_get_max_percent_split_change();
+
+   switch (max_split) {
+   case -1:
+   /* not or badly configured: use the default value */
+   max_split = default_max_percent_split_change;
+   break;
+   case 0:
+   return 1; /* 0% means always write a new shared index */
+   case 100:
+   return 0; /* 100% means never write a new shared index */
+   default:
+   ; /* do nothing: just use the configured value */
+   }
+
+   /* Count not shared entries */
+   for (i = 0; i < istate->cache_nr; i++) {
+   struct cache_entry *ce = istate->cache[i];
+   if (!ce->index)
+   not_shared++;
+   }
+
+   return istate->cache_nr * max_split < not_shared * 100;
+}
+
 int write_locked_index(struct index_state *istate, struct lock_file *lock,
   unsigned flags)
 {
@@ -2233,7 +2263,8 @@ int write_locked_index(struct index_state *istate, struct 
lock_file *lock,
if ((v & 15) < 6)
istate->cache_changed |= SPLIT_INDEX_ORDERED;
}
-   if (istate->cache_changed & SPLIT_INDEX_ORDERED) {
+   if (istate->cache_changed & SPLIT_INDEX_ORDERED ||
+   too_many_not_shared_entries(istate)) {
int ret = write_shared_index(istate, lock, flags);
if (ret)
return ret;
diff --git a/t/t1700-split-index.sh b/t/t1700-split-index.sh
index db8c39f..507a1dd 100755
--- a/t/t1700-split-index.sh
+++ b/t/t1700-split-index.sh
@@ -8,6 +8,7 @@ test_description='split index mode tests'
 sane_unset GIT_TEST_SPLIT_INDEX
 
 test_expect_success 'enable split index' '
+   git config splitIndex.maxPercentChange 100 &&
git update-index --split-index &&
test-dump-split-index .git/index >actual &&
indexversion=$(test-index-version <.git/index) &&
-- 
2.10.1.462.g7e1e03a