On 28/09/18 20:41, Ben Peart wrote:
> 
> 
> On 9/28/2018 1:07 PM, Junio C Hamano wrote:
>> Ben Peart <peart...@gmail.com> writes:
>>
>>>> Why does multithreading have to be disabled in this test?
>>>
>>> If multi-threading is enabled, it will write out the IEOT extension
>>> which changes the SHA and causes the test to fail.
>>
>> I think it is a design mistake to let the writing process's
>> capability decide what is written in the file to be read later by a
>> different process, which possibly may have different capability.  If
>> you are not writing with multiple threads, it should not matter if
>> that writer process is capable of and configured to spawn 8 threads
>> if the process were reading the file---as it is not reading the file
>> it is writing right now.
>>
>> I can understand if the design is to write IEOT only if the
>> resulting index is expected to become large enough (above an
>> arbitrary threshold like 100k entries) to matter.  I also can
>> understand if IEOT is omitted when the repository configuration says
>> that no process is allowed to read the index with multi-threaded
>> codepath in that repository.
>>
> 
> There are two different paths which determine how many blocks are written to 
> the IEOT.  The first is the default path.  On this path, the number of blocks 
> is determined by the number of cache entries divided by the THREAD_COST.  If 
> there are sufficient entries to make it faster to use threading, then it will 
> automatically use enough blocks to optimize the performance of reading the 
> entries across multiple threads.
> 
> I currently cap the maximum number of blocks to be the number of cores that 
> would be available to process them on that same machine purely as an 
> optimization.  The majority of the time, the index will be read from the same 
> machine that it was written on so this works well.  Before I added that 
> logic, you would usually end up with more blocks than available threads which 
> meant some threads had more to do than the other threads and resulted in 
> worse performance.  For example, 4 blocks across 3 threads results in the 1st 
> thread having twice as much work to do as the other threads.
> 
> If the index is copied to a machine with a different number of cores, 
> everything will still work - it just may not be optimal for that machine.  
> This is self-correcting: as soon as the index is written out again, it 
> will be optimized for that machine.
> 
> If the "automatically try to make it perform optimally" logic doesn't work 
> for some reason, we have path #2.
> 
> The second path is when the user specifies a specific number of blocks via 
> the GIT_TEST_INDEX_THREADS=<n> environment variable or the index.threads=<n> 
> config setting.  If they ask for n blocks, they will get n blocks.  This is 
> the "I know what I'm doing and want to control the behavior" path.
> 
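For reference, the config-file spelling of path #2 (this is just the
standard git config syntax for the index.threads setting mentioned
above; the environment variable form is shown in a comment):

```
# .git/config (or via: git config index.threads 4)
[index]
	threads = 4

# equivalent override for the test suite:
#   GIT_TEST_INDEX_THREADS=4
```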
> I just added one additional test (see patch below) to avoid a divide-by-zero 
> bug and simplify things a bit.  With this change, if there are fewer than two 
> blocks, the IEOT extension is not written out as it isn't needed.  The load 
> would be single-threaded anyway, so there is no reason to write out an IEOT 
> extension that won't be used.
> 
> 
> 
> diff --git a/read-cache.c b/read-cache.c
> index f5d766088d..a1006fa824 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -2751,18 +2751,23 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
>                  */
>                 if (!nr) {
>                         ieot_blocks = istate->cache_nr / THREAD_COST;
> -                       if (ieot_blocks < 1)
> -                               ieot_blocks = 1;
>                         cpus = online_cpus();
>                         if (ieot_blocks > cpus - 1)
>                                 ieot_blocks = cpus - 1;

So, am I reading this correctly - you need cpus > 2 before an
IEOT extension block is written out?

OK.

ATB,
Ramsay Jones

>                 } else {
>                         ieot_blocks = nr;
>                 }
> -               ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
> -                       + (ieot_blocks * sizeof(struct index_entry_offset)));
> -               ieot->nr = 0;
> -               ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +
> +               /*
> +                * no reason to write out the IEOT extension if we don't
> +                * have enough blocks to utilize multi-threading
> +                */
> +               if (ieot_blocks > 1) {
> +                       ieot = xcalloc(1, sizeof(struct index_entry_offset_table)
> +                               + (ieot_blocks * sizeof(struct index_entry_offset)));
> +                       ieot->nr = 0;
> +                       ieot_work = DIV_ROUND_UP(entries, ieot_blocks);
> +               }
>         }
>  #endif
> 
> 
