Hi George V. Reilly, you wrote:

> 
> How did you measure the time in EnterCriticalSection and 
> LeaveCriticalSection?
>

I discovered the problem with a performance tuning tool that uses a sampling 
approach to collect statistical profile data. Tools of this class are only 
minimally intrusive. I verified that the problem exists by compiling Vim with 
an unlocked getc() and measuring the difference (without any tool, just with 
$ time ...).
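
To make the comparison concrete, here is a minimal sketch of the two read loops, not the actual Vim change. The function names and the GETC_NOLOCK macro are hypothetical; _getc_nolock is the Microsoft CRT call, and getc_unlocked is assumed as its POSIX counterpart so the sketch also builds elsewhere.

```c
/* Ask POSIX libcs to declare getc_unlocked(); not needed on Windows. */
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: pick the unlocked getc of the platform. */
#ifdef _WIN32
#define GETC_NOLOCK(fp) _getc_nolock(fp)
#else
#define GETC_NOLOCK(fp) getc_unlocked(fp)
#endif

/* Reads to EOF with the locking getc(): one lock/unlock pair per byte. */
long count_bytes_locked(FILE *fp)
{
    long n = 0;
    assert(fp != NULL);
    while (getc(fp) != EOF)
        n++;
    return n;
}

/* Same loop with the unlocked variant. This is only safe when no other
   thread can touch fp. */
long count_bytes_nolock(FILE *fp)
{
    long n = 0;
    assert(fp != NULL);
    while (GETC_NOLOCK(fp) != EOF)
        n++;
    return n;
}
```

Running each function over the same large file under $ time is one way to see the difference; both loops must return the same byte count.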
>
> If there's no lock contention, these routines are little more than 
> InterlockedIncrement and InterlockedDecrement, without a kernel transition
> or blocking.
>
You're absolutely right, no context switch is needed when the CS is free. But 
InterlockedIncrement/Decrement is not that cheap. On a uniprocessor machine 
it takes about 200 CPU cycles per atomic operation. Thus, an EnterCS/LeaveCS 
pair takes about 400 cycles. The program which confirms these numbers is 
attached. This means that to read 1 Mbyte of data with a locking getc (this is 
roughly the size of the Russian + English spl files) we pay 400e6 cycles for 
these useless attempts to synchronize. My machine runs at 2.4 GHz = 2400e6 
cycles per second, so 400e6 cycles is 1/6 of a second, or about 0.17 seconds - 
exactly the speedup I observed.
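
The estimate above can be written out as a small helper. This is only a sketch of the arithmetic, with a hypothetical function name; the inputs (1e6 bytes, 400 cycles per lock/unlock pair, 2.4e9 cycles per second) are the figures quoted in this message.

```c
#include <assert.h>

/* Seconds spent on locking when a file of `bytes` bytes is read one
   getc() at a time, assuming one Enter/Leave pair per byte. */
double lock_overhead_seconds(double bytes, double cycles_per_pair,
                             double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return bytes * cycles_per_pair / cpu_hz;
}
```

Calling lock_overhead_seconds(1e6, 400.0, 2.4e9) gives 1/6 of a second, i.e. about 0.17 s.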

And remember that you need to pay this price on every Vim startup.
>
> 
> In other words, if you're seeing significant time in Enter/LeaveCS, I 
> can think of two causes. Either your measurement tool has perturbed the 
> results, or there really is some multithreaded lock contention. The 
> former seems more likely, as Vim is single-threaded, but who knows what 
> some DLLs in the Vim process might be doing.
>
No, there isn't any contention. The critical section in the Microsoft 
multi-threaded CRT is per-FILE*, so nobody can compete with you for it 
unless you give them the FILE *. As far as I can tell, the FILE * of the 
opened spell file is absolutely private to spell.c.

Also, the numbers above show that the uncontended cost alone accounts for the 
overhead. If there were contention, the overhead would be much bigger.
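
For readers without the attachment, here is a rough stand-in for that kind of measurement, not the attached program itself. It assumes a C11 compiler and uses atomic_fetch_add/atomic_fetch_sub from <stdatomic.h> in place of InterlockedIncrement/InterlockedDecrement; the function name is hypothetical. It runs on a single thread, so whatever it reports is the cost with no contention at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Average nanoseconds per uncontended atomic increment/decrement pair.
   Returns -1.0 if iterations is not positive. */
double ns_per_atomic_pair(long iterations)
{
    atomic_long counter;
    clock_t start, end;
    long i;

    if (iterations <= 0)
        return -1.0;
    atomic_init(&counter, 0);

    start = clock();
    for (i = 0; i < iterations; i++) {
        atomic_fetch_add(&counter, 1);  /* stands in for InterlockedIncrement */
        atomic_fetch_sub(&counter, 1);  /* stands in for InterlockedDecrement */
    }
    end = clock();

    /* Every increment was matched by a decrement. */
    assert(atomic_load(&counter) == 0);
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iterations;
}
```

Multiplying the result by the CPU frequency in GHz converts it to cycles per pair. The figure will vary by CPU and compiler, and clock() is coarse, so use a large iteration count.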
>
> 
> I would be very wary of using the _getc_nolock macro until we understand 
> why you are seeing those results.
> 
>
-- 
Alexei Alexandrov
