Alex A. Granovsky wrote:
Isn't it evident from the theory of random processes and probability theory that, in the limit of an infinitely large cluster and parallel process, the probability of deadlock with the current implementation is unfortunately a finite quantity, and in the limit approaches unity regardless of any particular details of the program?
No, not at all.  Consider simulating a physical volume.  Each process is assigned some small subvolume.  It updates conditions locally, but on the surface of its subvolume it needs information from "nearby" processes.  It cannot proceed along the surface until it has that neighboring information.  Its neighbors, in turn, cannot proceed until their neighbors have reached some point.  Two distant processes can be quite out of step with one another, but only by some bounded amount.  At some point, a leading process has to wait for information from a laggard to propagate to it.  All processes proceed together, in a loose lock-step fashion.  Many applications behave this way.  In fact, in many applications the synchronization is even tighter, because the "physics" is made to propagate faster than neighbor-by-neighbor exchange.
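
To make the pattern concrete, here is a minimal sketch (not from the original post) of that neighbor-exchange structure in C with MPI: a 1-D domain split across ranks, where each rank advances its local cells and swaps boundary ("halo") values with its immediate neighbors every step.  The array size, step count, and update rule are arbitrary placeholders; only the communication structure matters.  Because MPI_Sendrecv pairs each send with the matching receive, no rank can run more than a bounded amount ahead of its neighbors, and the loose lock-step never deadlocks.

#include <mpi.h>

#define N     64      /* local cells per rank (placeholder)  */
#define STEPS 100     /* number of time steps (placeholder)  */

int main(int argc, char **argv)
{
    int rank, size;
    double u[N + 2];              /* local cells plus two halo cells */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Neighbors; MPI_PROC_NULL at the ends turns the exchange into a no-op. */
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    for (int i = 0; i < N + 2; i++)
        u[i] = (double)rank;      /* arbitrary initial data */

    for (int step = 0; step < STEPS; step++) {
        /* Exchange surface values with both neighbors.  Each rank blocks
         * here until its neighbors have also reached this step: this is the
         * bounded skew described above, not a deadlock. */
        MPI_Sendrecv(&u[1],     1, MPI_DOUBLE, left,  0,
                     &u[N + 1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&u[N],     1, MPI_DOUBLE, right, 1,
                     &u[0],     1, MPI_DOUBLE, left,  1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Local update using the neighbor information (simple averaging). */
        double next[N + 2];
        for (int i = 1; i <= N; i++)
            next[i] = 0.5 * (u[i - 1] + u[i + 1]);
        for (int i = 1; i <= N; i++)
            u[i] = next[i];
    }

    MPI_Finalize();
    return 0;
}

Scaling this to more ranks only lengthens the chain of neighbors; it does not introduce any cycle of processes waiting on each other, which is why the deadlock probability does not grow toward unity with cluster size.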

As the number of processes increases, the laggard may appear relatively slower, but that isn't deadlock.

As the size of the cluster increases, the chances of a system component failure increase, but that also is a different matter.
