On Wed, Mar 25, 2020 at 4:49 AM Raut, S Biplab <biplab.r...@amd.com> wrote:
> Dear George,
>
> Thank you for the reply. But my question is more particularly on the
> message size from the application side.
>
> Let’s say the application is running with 128 ranks.
>
> Each rank is doing send() msg to the rest of the 127 ranks, where the msg
> length sent is under question.
>
> Now after all the sends are completed, each rank will recv() msg from the
> rest of the 127 ranks.
>
> Unless the msg length in the sending part is within eager_limit (4K size),
> this program will hang.
>

This is definitely not true; one can imagine many communication patterns
that will ensure correctness for your all-to-all communications. As an
example, you can place your processes in a virtual ring, and at each step
send and recv to/from process (my_rank + step) % comm_size. This
communication pattern will always be correct, independent of the eager size
(for as long as you correctly order the send/recv for each pair).

> So, based on the above scenario, my questions are:
>
> 1. Can each of the ranks send a message up to 4K size successfully, i.e.
> all 128 ranks sending (128 * 4K) bytes simultaneously?
>

Potentially yes, but there are physical constraints (number of network
links, switch capabilities, ...) and memory limits. If you have enough
memory, this could potentially work. I'm not saying this is correct and
should be done.

> 2. If the application has a bigger msg to be sent by each rank, then how to
> derive the send message size? Is it equal to eager_limit and each rank
> needs to send multiple chunks of this size?
>

Definitely not! You should never rely on the eager size to fix a complex
communication pattern. The rule of thumb should be: is my application
working correctly if the MPI forces a zero-byte eager size? As suggested
above, the most suitable approach is to define a communication scheme that
would never deadlock.

George.

> With Regards,
>
> S. Biplab Raut
>
> *From:* George Bosilca <bosi...@icl.utk.edu>
> *Sent:* Tuesday, March 24, 2020 9:01 PM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Raut, S Biplab <biplab.r...@amd.com>
> *Subject:* Re: [OMPI users] Regarding eager limit relationship to send
> message size
>
> Biplab,
>
> The eager is a constant for each BTL, and it represents the data that is
> sent eagerly with the matching information out of the entire message. So,
> if the question is how much memory is needed to store all the eager
> messages, then the answer will depend on the communication pattern of
> your application:
>
> - applications using only blocking messages might only have 1 pending
> communication per peer, so in the worst case any process will need at
> most P * eager_size memory for local storage of the eager.
>
> - applications using non-blocking communications: there is basically no
> limit.
>
> However, the good news is that you can change this limit to adapt to the
> needs of your application(s).
>
> Hope this answers your question,
>
> George.
>
> On Tue, Mar 24, 2020 at 1:46 AM Raut, S Biplab via users <
> users@lists.open-mpi.org> wrote:
>
> Dear Experts,
>
> I would like to derive/calculate the maximum MPI send message size
> possible given the known details of btl_vader_eager_limit and number of
> ranks.
>
> Can anybody explain and confirm on this?
>
> With Regards,
>
> S. Biplab Raut
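
For reference, the virtual-ring scheme described in the reply can be sketched without any MPI at all, just to check the pairing. This is a minimal illustration, not code from the thread: the function names are hypothetical, and it assumes one reading of "send and recv to/from", namely that at each step a rank sends to (my_rank + step) % comm_size and receives from (my_rank - step) % comm_size. In a real MPI program each step would typically be one combined send/receive (for example MPI_Sendrecv), which is what keeps the exchange independent of the eager limit.

```python
def ring_partners(rank, size, step):
    """Return (dest, source) for one step of the ring exchange.

    Assumption: send "forward" by `step`, receive from `step` behind.
    """
    dest = (rank + step) % size
    source = (rank - step) % size  # Python's % is non-negative for size > 0
    return dest, source


def schedule_is_matched(size):
    """Check that every send in a step is awaited by its target in that step."""
    for step in range(1, size):
        for rank in range(size):
            dest, _ = ring_partners(rank, size, step)
            _, source_expected_by_dest = ring_partners(dest, size, step)
            if source_expected_by_dest != rank:
                return False
    return True


def all_pairs_covered(size):
    """Check that size-1 steps deliver one message for every ordered pair."""
    sent = {
        (rank, ring_partners(rank, size, step)[0])
        for step in range(1, size)
        for rank in range(size)
    }
    expected = {(a, b) for a in range(size) for b in range(size) if a != b}
    return sent == expected


if __name__ == "__main__":
    size = 128  # the rank count used in the question
    print("matched:", schedule_is_matched(size))
    print("all pairs covered:", all_pairs_covered(size))
```

Because each rank's send at a given step is paired with a receive posted by its target in that same step, the schedule does not depend on any message being buffered eagerly, which is the "zero-byte eager size" test suggested above.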