Hello Edward,

thanks for the paper. In Section 3 they describe different kinds of parallelism. The method they call "exemplar parallelism" is what I tried to explain in my recent email. They conclude that it is the preferred technique, which is not very surprising: the communication overhead of the other methods is too high.
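Just to make sure we mean the same thing, here is a tiny sketch of what I understand by exemplar parallelism (the function backprop_delta is a placeholder I made up, not something from the paper):

    import numpy as np

    def backprop_delta(weights, examples, targets):
        # Placeholder for one backprop pass over a batch of examples;
        # it would return the weight change for that batch.
        return np.zeros_like(weights)

    def exemplar_parallel_step(weights, examples, targets, n_workers=4):
        # Split the training examples across the workers; in a real setup
        # each chunk would live on a different node.
        example_chunks = np.array_split(examples, n_workers)
        target_chunks = np.array_split(targets, n_workers)

        # Every worker runs backprop on its own examples only, all of them
        # starting from the same copy of the net.
        deltas = [backprop_delta(weights, ex, tg)
                  for ex, tg in zip(example_chunks, target_chunks)]

        # The weight changes are summed and applied once per step.
        return weights + sum(deltas)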
I also tried to understand the BSP model. What I didn't completely catch is how the communication step works. Is it possible for each node to send a message to every other node? That is exactly what would be needed for MLP learning.

In Jeff Dean's talk they have a single master for this. The master receives the messages from all the worker nodes, adds the weight changes to the current net, and distributes the updated net back to the workers. That should be more efficient than sending messages from each node to every other node (roughly 2P messages per step instead of P*(P-1) for P nodes). But for a first distributed version the simple node-to-node scheme should be OK.

Cheers
Christian
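PS: To check whether I got the master scheme right, I wrote down a little sketch. I'm assuming mpi4py here just for illustration, and local_weight_change is again a made-up placeholder for the backprop on each node's share of the examples:

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    MASTER = 0

    def local_weight_change(weights, my_examples, my_targets):
        # Placeholder: backprop over this node's share of the examples.
        return np.zeros_like(weights)

    def train_round(weights, my_examples, my_targets):
        # Every node (the master included, in this sketch) computes a
        # weight change on its own data.
        delta = local_weight_change(weights, my_examples, my_targets)

        # Instead of node-to-node messages, everyone sends only to the master.
        deltas = comm.gather(delta, root=MASTER)

        # The master adds all weight changes to the current net ...
        if rank == MASTER:
            weights = weights + sum(deltas)

        # ... and distributes the updated net back to every node.
        weights = comm.bcast(weights, root=MASTER)
        return weights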
