Well, it works for me with 32 characters in the string (Linux, 4 CPUs). So it seems to depend a bit on the OS and, therefore, on the scheduler.
> I think this shouldn't be the case. I'll try to explain what might happen. I do not know all the internals involved, so it may or may not be accurate.
>
> Suppose all the threads have been started and we are at some point _x_ where one thread has just finished writing its character. That thread releases the lock, and the scheduler sees: "Ah, the lock has been released, I have a bunch of threads waiting for that, let's wake the next in line. And because I have multiple CPUs, let's directly start multiple threads."
>
> The next thread in line, let's call it `T1`, executes and acquires the lock. Unfortunately, it is not the thread that can write the next character. So it releases the lock and goes to sleep for 1ms. Meanwhile, another thread (`T2`) has been started and tries to acquire the lock, but `T1` is still holding it. So `T2` tells the scheduler it is still blocked by the lock and goes to sleep again. The scheduler moves it to the back of the queue of blocked threads.
>
> Now, unfortunately, `T2` is actually the thread that can write the next character. But the scheduler will now execute all the other threads first. Then, if you're lucky, `T2` will finally be executed because all the other threads are currently sleeping. This, however, will only happen if the scheduler can make all the necessary context switches between threads, all the executions, sleeps and so on, in less than 1ms (the time the other threads sleep). Depending on your system, that may or may not be enough time (I do not have the metrics on context switches at hand, so I can only guess).
>
> If a round trip through all the threads takes more than 1ms, it can happen that `T2` is blocked again by another thread when it is its turn to execute, and the same cycle repeats. So it cannot be guaranteed that `T2` will ever be executed while the lock is acquirable by it. You may try to increase the sleep time.
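
To make the scenario concrete, here is a minimal Python sketch of the kind of polling loop I am describing. It is my own reconstruction from the description above, not the code from the question: the function name `write_in_order_by_polling`, the `state` dictionary and the `poll_interval` parameter are all made up for illustration. Each thread owns one character and keeps grabbing the lock until the shared counter says it is its turn.

```python
import threading
import time


def write_in_order_by_polling(text, poll_interval=0.001):
    """Hypothetical reconstruction: one thread per character, turn-taking by polling."""
    lock = threading.Lock()
    state = {"next": 0}   # index of the character that may be written next
    out = []

    def worker(position):
        while True:
            with lock:
                if state["next"] == position:
                    # It is our turn: write the character and hand over.
                    out.append(text[position])
                    state["next"] += 1
                    return
            # Not our turn. The lock is already released at this point;
            # sleep for the poll interval (1ms by default) and try again.
            time.sleep(poll_interval)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(text))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return "".join(out)
```

Calling `write_in_order_by_polling("hello")` returns `"hello"` once all threads have finished, but how long that takes depends on how often the wrong thread wins the lock, which is exactly the part the scheduler decides.
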
But all in all, this code is outright horrible because of those interdependencies, and you should never do something like that in production code.
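
If you really need threads to take turns like this, one common way to avoid the lock-and-sleep polling is a condition variable, so that a thread only wakes up when the turn counter has changed. The following is just a sketch in Python under my own assumptions (the names `write_in_order_with_condition` and `state` are hypothetical, and your language's equivalent primitive may behave differently), not a drop-in fix for the original code.

```python
import threading


def write_in_order_with_condition(text):
    """Sketch: one thread per character, turn-taking via a condition variable."""
    cond = threading.Condition()
    state = {"next": 0}   # index of the character that may be written next
    out = []

    def worker(position):
        with cond:
            # Block until it is this thread's turn; no sleep/retry loop needed.
            cond.wait_for(lambda: state["next"] == position)
            out.append(text[position])
            state["next"] += 1
            # Wake the waiting threads so the next one in order can proceed.
            cond.notify_all()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(text))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return "".join(out)
```

Here the result no longer depends on whether a round trip through all threads fits into the sleep interval, because no thread ever holds the lock just to find out that it is not its turn yet and then sleeps blindly.
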
