On 20.03.2014 22:56, Stefan Zager wrote:
> On Thu, Mar 20, 2014 at 2:35 PM, Karsten Blees wrote:
>> On 20.03.2014 17:08, Stefan Zager wrote:
>>> Going forward, there is still a lot of performance that gets left on
>>> the table when you rule out threaded file access.  There are not so
>>> many calls to read, mmap, and pread in the code; it should be possible
>>> to rationalize them and make them thread-safe -- at least, thread-safe
>>> for posix-compliant systems and msysgit, which covers the great
>>> majority of git users, I would hope.
>> IMO a "mostly" XSI compliant pread (or even the git_pread() emulation) is 
>> still better than forbidding the use of read() entirely. Switching from read 
>> to pread everywhere requires that all callers have to keep track of the file 
>> position, which means a _lot_ of code changes (read/xread/strbuf_read is 
>> used in ~70 places throughout git). And how do you plan to deal with 
>> platforms that don't have a thread-safe pread (HP, Cygwin)?
>> Considering all that, Duy's solution of opening separate file descriptors 
>> per thread seems to be the best pattern for future multi-threaded work.
> Does that mean you would endorse the (N threads) * (M pack files)
> approach to threading checkout and status?  That seems kind of
> crazy-town to me.  Not to mention that pack windows are not shared, so
> this approach to multi-threading can have the side-effect of blowing
> out memory consumption.  We have already had to dial back settings for
> pack.threads and core.deltaBaseCacheLimit, because threaded index-pack
> was causing OOM errors on 32-bit platforms.

Opening more file descriptors doesn't significantly increase the memory 
footprint, so it shouldn't matter whether the threads read data via shared or 
private descriptors.
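
To illustrate the pattern, here is a minimal sketch (not git code) in which
each thread opens its own descriptor and uses plain lseek()/read(), so no
file position is shared between threads. The names chunk_job, sum_chunk and
sum_file_threaded are made up for this example, and summing bytes is just a
stand-in for real work. EINTR and short-file handling are deliberately
simplistic.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 16

/* Hypothetical per-thread work item; not a git structure. */
struct chunk_job {
	const char *path;  /* file that each thread opens for itself */
	off_t start;       /* first byte of this thread's range */
	off_t len;         /* number of bytes in the range */
	uint64_t sum;      /* result: sum of the bytes read */
	int err;           /* nonzero if open/lseek/read failed */
};

static void *sum_chunk(void *arg)
{
	struct chunk_job *job = arg;
	unsigned char buf[4096];
	off_t left = job->len;
	int fd = open(job->path, O_RDONLY);  /* private descriptor */

	if (fd < 0) {
		job->err = 1;
		return NULL;
	}
	/* The file position belongs to this fd only, so lseek() is safe. */
	if (lseek(fd, job->start, SEEK_SET) < 0)
		job->err = 1;
	while (!job->err && left > 0) {
		size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
		ssize_t n = read(fd, buf, want);
		ssize_t i;

		if (n <= 0) {  /* error or unexpected EOF */
			job->err = 1;
			break;
		}
		for (i = 0; i < n; i++)
			job->sum += buf[i];
		left -= n;
	}
	close(fd);
	return NULL;
}

/* Returns 0 on success and stores the sum of all bytes in *total. */
int sum_file_threaded(const char *path, off_t size, int nthreads,
		      uint64_t *total)
{
	pthread_t tid[MAX_THREADS];
	struct chunk_job job[MAX_THREADS];
	off_t per;
	int i, started = 0, err = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS)
		return -1;
	per = size / nthreads;
	for (i = 0; i < nthreads; i++) {
		job[i].path = path;
		job[i].start = per * i;
		/* The last thread also takes the remainder. */
		job[i].len = (i == nthreads - 1) ? size - per * i : per;
		job[i].sum = 0;
		job[i].err = 0;
		if (pthread_create(&tid[i], NULL, sum_chunk, &job[i])) {
			err = 1;
			break;
		}
		started++;
	}
	*total = 0;
	for (i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		err |= job[i].err;
		*total += job[i].sum;
	}
	return err ? -1 : 0;
}
```

The cost per thread is one extra descriptor; the 4k buffer would be needed
with a shared descriptor and pread() as well.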

git-status with core.preloadindex is already multithreaded (at least the first 
part), and AFAIK doesn't read pack files at all.

I'm still not convinced that multi-threaded git-checkout is a good idea. 
According to my tests, it is actually slower than sequential checkout. You'd 
have to be very careful to multi-thread only the parts that don't do any I/O, 
such as unpacking / undeltifying.
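
Roughly what I have in mind, as a sketch only (not git code): one thread
does all reading and writing, and the worker threads only ever touch memory.
The names item, worker, transform and transform_all are invented for this
example, and the XOR in transform() is merely a placeholder for the real
CPU-bound step (inflate / delta application).

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8

/* Hypothetical work item; stands in for one packed object. */
struct item {
	const unsigned char *in;  /* already read from disk by the caller */
	unsigned char *out;       /* caller-provided buffer of len bytes */
	size_t len;
};

struct worker {
	struct item *items;
	size_t count;   /* total number of items */
	size_t first;   /* index this worker starts at */
	size_t stride;  /* number of workers */
};

/* Placeholder for the CPU-bound step; does no I/O. */
static void transform(struct item *it)
{
	size_t i;

	for (i = 0; i < it->len; i++)
		it->out[i] = it->in[i] ^ 0x5a;
}

static void *run_worker(void *arg)
{
	struct worker *w = arg;
	size_t i;

	/* Each worker handles items first, first+stride, ... */
	for (i = w->first; i < w->count; i += w->stride)
		transform(&w->items[i]);
	return NULL;
}

/*
 * Transforms all items in parallel. The caller reads the inputs before
 * this call and writes the outputs after it, both sequentially.
 * Returns 0 on success, -1 if the arguments are bad or a thread could
 * not be started (in which case some items are left untransformed).
 */
int transform_all(struct item *items, size_t count, int nworkers)
{
	pthread_t tid[MAX_WORKERS];
	struct worker w[MAX_WORKERS];
	int i, started = 0;

	if (nworkers < 1 || nworkers > MAX_WORKERS)
		return -1;
	for (i = 0; i < nworkers; i++) {
		w[i].items = items;
		w[i].count = count;
		w[i].first = (size_t)i;
		w[i].stride = (size_t)nworkers;
		if (pthread_create(&tid[i], NULL, run_worker, &w[i]))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	return started == nworkers ? 0 : -1;
}
```

Whether this split pays off for checkout depends on how much of the time is
spent in the CPU-bound part, which would need measuring.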

