Hello,

If you are running Ray (or plan to) on a large number of cores, you might be interested in a new feature available in the development tree of Ray.
This feature is a new option called -route-messages.

In Ray, any core can send a message directly to any other core, including itself. For example, if you run Ray on 512 cores (say 64 computers with 8 cores each), then each core has 511 connections, one with each other core. This means that each core has to check for incoming messages in a round-robin fashion across all 512 cores (this includes itself). In this setting, the communication network is a complete graph with 512 cores and 130816 connections (512 * 511 / 2).

One way to avoid such a huge number of connections is to allow each core to communicate directly with only a few others. To do so, we can take the base-2 logarithm of the number of cores to get the average number of connections per core: log2(512) = 9. Since we want each core to have 9 connections on average, we need to randomly select 512 * 9 / 2 = 2304 connections from the 130816 possible ones in order to build the random graph. Such a random graph has 512 cores, exactly 2304 edges, and an average of 9 connections per core. There are many such graphs, but it is easy to pick one.

In this case, each core only has to check for incoming messages in a round-robin fashion across its ~9+1 connections (+1 to include itself). Less memory is also used for incoming buffers. A message for a core that is not a direct neighbor is routed through intermediate cores.

The length of the shortest route between any pair of cores in this random graph is, on average, about 3 connections. This is because a core has roughly 9 first neighbors, 81 (9^2) second neighbors and 729 (9^3) third neighbors. Since 729 exceeds 512, some of the third neighbors are redundant and three hops are enough to reach most cores.

But the main motivation is that the latency is reduced by about 60 %:

Latency without random-graph routing: 386 microseconds (standard deviation: 9)
Latency with random-graph routing: 158 microseconds (standard deviation: 15)

If anyone would like to share their experience with Ray on a large number of cores, go ahead.
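For anyone who wants to check the numbers above, here is a minimal Python sketch. It is not Ray's code: the function names (build_random_graph, average_route_length), the fixed seed and the uniform sampling of connections are all assumptions made for illustration. It picks 2304 random connections among 512 cores, then measures the average degree and the average shortest-route length with a breadth-first search.

```python
import math
import random
from collections import deque
from itertools import combinations


def build_random_graph(cores, seed=0):
    """Pick cores * log2(cores) / 2 random connections out of the complete graph."""
    degree = round(math.log2(cores))                 # 9 for 512 cores
    all_pairs = list(combinations(range(cores), 2))  # 512 * 511 / 2 = 130816 pairs
    edge_count = cores * degree // 2                 # 512 * 9 / 2 = 2304 edges
    # Assumption: connections are sampled uniformly, without replacement.
    edges = random.Random(seed).sample(all_pairs, edge_count)
    neighbors = {core: set() for core in range(cores)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors, len(all_pairs), edge_count


def average_route_length(neighbors):
    """Mean shortest-route length (in connections) over all reachable ordered pairs."""
    total = 0
    pairs = 0
    for source in neighbors:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            current = queue.popleft()
            for nxt in neighbors[current]:
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
        total += sum(distance.values())
        pairs += len(distance) - 1  # exclude the source itself
    return total / pairs


neighbors, total_pairs, edge_count = build_random_graph(512)
average_degree = sum(len(n) for n in neighbors.values()) / len(neighbors)
print(total_pairs, edge_count, average_degree)  # 130816 2304 9.0
print(round(average_route_length(neighbors), 2))
```

The second printed value should land close to 3 connections. Note that uniform sampling alone does not guarantee that every core is reachable, so a real implementation would presumably need to check connectivity. The sketch simply ignores unreachable pairs.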
More detailed (and more technical) post on the Open-MPI list:
http://www.open-mpi.org/community/lists/users/2011/11/17737.php

Happy assembly.

Sébastien Boisvert
http://boisvert.info

_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users