On 8/29/2018 1:14 PM, Junio C Hamano wrote:
Ben Peart <ben.pe...@microsoft.com> writes:

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1c42364988..79f8296d9c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2391,6 +2391,12 @@ imap::
        The configuration variables in the 'imap' section are described
        in linkgit:git-imap-send[1].

Adding something like

        You can disable multi-threaded code by setting this variable
        to 'false' (or 1).

may reduce the risk of a similar "Huh?" reaction by other readers.


Will do

+struct load_cache_entries_thread_data
+{
+       pthread_t pthread;
+       struct index_state *istate;
+       struct mem_pool *ce_mem_pool;
+       int offset, nr;
+       void *mmap;
+       unsigned long start_offset;
+       struct strbuf previous_name_buf;
+       struct strbuf *previous_name;
+       unsigned long consumed; /* return # of bytes in index file processed */
+};

We saw that Duy's "let's not use strbuf to remember the previous
name but instead use the previous ce" approach gave us a nice
performance boost; I wonder if we can build on that idea here?

One possible approach might be to create one ce per "block" in the
pre-scanning thread and use that ce as the "previous one" in the
per-thread data before spawning a worker.


Yes, I believe this can be done. I was planning to wait until both patches settled down a bit before adapting it to threads. It's a little trickier because the previous ce doesn't yet exist but I believe one can be fabricated enough to make the optimization work.

+static unsigned long load_cache_entries(struct index_state *istate,
+                       void *mmap, size_t mmap_size, unsigned long src_offset)
+{
+       struct strbuf previous_name_buf = STRBUF_INIT, *previous_name;
+       struct load_cache_entries_thread_data *data;
+       int nr_threads, cpus, ce_per_thread;
+       unsigned long consumed;
+       int i, thread;
+
+       nr_threads = git_config_get_index_threads();
+       if (!nr_threads) {
+               cpus = online_cpus();
+               nr_threads = istate->cache_nr / THREAD_COST;

Here, nr_threads could become 0 with a small index, but any value
below 2 makes us call load_all_cache_entries() by the main thread
(and the value of nr_thread is not used anyore), it is fine.  Of
course, forced test will set it to 2 so there is no problem, either.

OK.

+       /* a little sanity checking */
+       if (istate->name_hash_initialized)
+               die("the name hash isn't thread safe");

If it is a programming error to call into this codepath without
initializing the name_hash, which I think is the case, this is
better done with BUG("").


Will do

The remainder of the patch looked good.  Thanks.

Reply via email to