Hi,

Pooja and I are working on a course project whose main aim is to schedule MPI and non-MPI calls, giving MPI calls priority over non-MPI ones.

To keep things simple, we are making this scheduling static to some extent. By static I mean the following: we know that our clusters use InfiniBand for MPI (from our study of the Open MPI source code, this specifically uses 'mca_btl_openib_send()' from ompi/mca/btl/openib/btl_openib.c), so all non-MPI communication can be assumed to be TCP communication using 'mca_btl_tcp_send()' from ompi/mca/btl/tcp/btl_tcp.c.

To implement this we plan to use the following simple algorithm:

- before calling 'mca_btl_openib_send()': lock0(X);
- before calling 'mca_btl_tcp_send()': lock1(X);

Algo:
1. Allow Lock0(x) -> Lock0(x), meaning Lock0(x) may be followed by Lock0(x).
2. Allow Lock1(x) -> Lock1(x).
3. Do not allow Lock0(x) -> Lock1(x).
4. If Lock1(x) -> Lock0(x): since MPI calls are to have higher priority than non-MPI ones, the non-MPI communication should be paused and its state saved in a queue. All other non-MPI communications newer than this one should also be added to the same queue. The MPI process trying to perform Lock0(x) should then be allowed to complete, and only when all the MPI communications are complete should the non-MPI communication be allowed to proceed.

Currently we are working on a simple scheduling algorithm that does not give any priority to the 'MPI_Send' calls. However, to implement the project fully, we have the following queries:

- Can we abort or pause the non-MPI/TCP communication in any way?
- Given the assumption that the non-MPI communication is TCP, can we make use of the built-in structures (I mean the buffers already used) in mca_btl_tcp_send() to implement point 4 of the algorithm above? And, more importantly, how?

Regards,
Chaitali
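
P.S. In case it makes the four rules easier to discuss, here is a rough sketch of the locking we have in mind. It is only an illustration under several assumptions: it uses POSIX threads, it is not Open MPI code, and the names prio_lock_t, prio_lock_init, unlock0, unlock1 and trylock1 are hypothetical helpers we made up around lock0/lock1. It also simplifies point 4: an in-flight non-MPI send is not actually paused, and waiting non-MPI callers simply block on a condition variable instead of being saved in an explicit queue.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical two-class lock: class 0 = MPI (openib), class 1 = non-MPI (tcp). */
typedef struct {
    pthread_mutex_t mu;        /* protects the two counters below */
    pthread_cond_t  mpi_idle;  /* broadcast when mpi_active drops to 0 */
    int mpi_active;            /* current lock0 holders */
    int tcp_active;            /* current lock1 holders (bookkeeping only) */
} prio_lock_t;

static void prio_lock_init(prio_lock_t *x) {
    pthread_mutex_init(&x->mu, NULL);
    pthread_cond_init(&x->mpi_idle, NULL);
    x->mpi_active = 0;
    x->tcp_active = 0;
}

/* Rules 1 and 4: an MPI caller never waits, whoever currently holds x. */
static void lock0(prio_lock_t *x) {
    pthread_mutex_lock(&x->mu);
    x->mpi_active++;
    pthread_mutex_unlock(&x->mu);
}

static void unlock0(prio_lock_t *x) {
    pthread_mutex_lock(&x->mu);
    assert(x->mpi_active > 0);
    if (--x->mpi_active == 0)
        pthread_cond_broadcast(&x->mpi_idle);  /* wake waiting non-MPI callers */
    pthread_mutex_unlock(&x->mu);
}

/* Rules 2 and 3: non-MPI callers may overlap each other,
 * but must wait while any MPI caller holds x. */
static void lock1(prio_lock_t *x) {
    pthread_mutex_lock(&x->mu);
    while (x->mpi_active > 0)
        pthread_cond_wait(&x->mpi_idle, &x->mu);
    x->tcp_active++;
    pthread_mutex_unlock(&x->mu);
}

/* Non-blocking variant of lock1: returns 0 on success, -1 if MPI is active. */
static int trylock1(prio_lock_t *x) {
    int rc = -1;
    pthread_mutex_lock(&x->mu);
    if (x->mpi_active == 0) {
        x->tcp_active++;
        rc = 0;
    }
    pthread_mutex_unlock(&x->mu);
    return rc;
}

static void unlock1(prio_lock_t *x) {
    pthread_mutex_lock(&x->mu);
    assert(x->tcp_active > 0);
    x->tcp_active--;
    pthread_mutex_unlock(&x->mu);
}
```

The idea would be to call lock0(&x)/unlock0(&x) around 'mca_btl_openib_send()' and lock1(&x)/unlock1(&x) around 'mca_btl_tcp_send()'. What this sketch cannot do is interrupt a TCP send that has already started, which is exactly what our first question is about.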