Hi,

Although I did my due diligence in searching for this question, I apologise if this is a repeat.

From an architectural point of view, does it make sense to use MPI in the following scenario (for the purposes of resilience as much as parallelization):
Each process is long-running (it runs uninterrupted for weeks, months or even years), collecting and crunching streaming data, for example temperature readings, and the data is replicated to R nodes. Because this is a departure from the normal modus operandi (i.e. all data is immediately available), are there any obvious MPI issues that I am not considering in designing such an application?

Here is a more detailed description of the app:

A master receives the data and dispatches it according to some function such that each tuple is replicated R times to R of the N nodes (with R<=N). Suppose that there are K regions from which temperature readings stream in, in the form <k,T>, where k is the region id and T is the temperature reading. The master sends <k,T> to R of the N nodes. These nodes maintain a long-term state of, say, the min/max readings. If R=N=2, the system is basically duplicated, and if one of the two nodes dies unexpectedly, the other one has still accounted for all the data.

Here is some pseudo-code:

    int main(argc, argv)
    {
        int N=10, R=3, K=200;
        Init(argc,argv);
        int rank=COMM_WORLD.Get_rank();
        if(rank==0) {
            // master: rank 0; workers are ranks 1..N-1
            int lastnode = 0;
            while(read <k,T> from socket)
                for(i in 0:R)
                    // round-robin over the workers, never back to the master
                    COMM_WORLD.Send(<k,T>,1,tuple,1+(lastnode++%(N-1)),tag);
        } else {
            // worker: keep receiving for the lifetime of the process
            while(true) {
                COMM_WORLD.Recv(<k,T>,1,tuple,any,tag,Info);
                process_message(<k,T>);
            }
        }
    }

Many thanks for your time!

Regards,
Dok
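
P.S. To make the dispatch rule and the per-node state concrete, here is a minimal C++ sketch of just the non-MPI parts. It assumes rank 0 is the master and that N counts all ranks, so R can be at most N-1. `Reading`, `MinMax` and `pick_replicas` are names I made up for illustration. Nothing here is MPI API, and the actual send/receive calls are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <map>
#include <vector>

// Hypothetical tuple type for <k,T>: region id and temperature reading.
struct Reading {
    int region;
    double temp;
};

// Picks R distinct worker ranks in round-robin order.
// Assumption: rank 0 is the master, so the workers are ranks 1..N-1.
// `cursor` carries the round-robin position from one tuple to the next.
std::vector<int> pick_replicas(int& cursor, int N, int R) {
    const int workers = N - 1;
    assert(R >= 1 && R <= workers);  // cannot replicate to more workers than exist
    std::vector<int> dests;
    for (int i = 0; i < R; ++i) {
        dests.push_back(1 + cursor);      // shift by 1 to skip the master
        cursor = (cursor + 1) % workers;  // wrap around the worker ranks
    }
    return dests;
}

// Long-term per-region state kept by each worker.
struct MinMax {
    double min = std::numeric_limits<double>::infinity();
    double max = -std::numeric_limits<double>::infinity();
};

// Folds one reading into the worker's min/max state for that region.
void process_message(std::map<int, MinMax>& state, const Reading& r) {
    MinMax& m = state[r.region];
    m.min = std::min(m.min, r.temp);
    m.max = std::max(m.max, r.temp);
}
```

The idea is that the master would call pick_replicas once per incoming tuple and send the tuple to each returned rank, while every worker would call process_message on each tuple it receives.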