Re: [OMPI devel] OpenIB BTL and SRQs
Jeff Squyres wrote:
> On Jul 12, 2007, at 1:18 PM, Don Kerr wrote:
>
>>> - So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviously mainly helps apps with lots of small messages; it may not help in many other cases.
>>
>> Is there any distinction by the size of the message? If the "M" parameter is set high, does the openib btl post this many recv buffers for the SRQ on both QPs? Or are SRQs only created on one of the QPs?
>
> Keep in mind that the SRQs are only for send/receive messages, not RDMA messages.

That is obvious enough, but isn't there a window for MPI messages that are greater than the eager limit but less than where the RDMA protocol kicks in, and don't messages of this size use fragments larger than the eager size? Maybe this is where openib's high and low priority QPs differ from udapl, which chooses which endpoint to use based on the size of the fragment. That is why I was curious whether openib was using SRQs on both queue pairs.

> Each receive buffer has a max size (the eager limit, IIRC). So if the message is larger than that, we'll fragment per the pipeline protocol, possibly subject to doing RDMA if the message is large enough, yadda yadda yadda. More specifically, the size of the buffer is not dependent upon an individual message that is being sent or received (since they're pre-posted -- we have no idea what the message sizes will be).
>
> As for whether the SRQ is on both QP's, this is a Galen/George/Gleb (G^3) question...
Re: [OMPI devel] OpenIB BTL and SRQs
On Jul 12, 2007, at 1:18 PM, Don Kerr wrote:

>> - So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviously mainly helps apps with lots of small messages; it may not help in many other cases.
>
> Is there any distinction by the size of the message? If the "M" parameter is set high, does the openib btl post this many recv buffers for the SRQ on both QPs? Or are SRQs only created on one of the QPs?

Keep in mind that the SRQs are only for send/receive messages, not RDMA messages.

Each receive buffer has a max size (the eager limit, IIRC). So if the message is larger than that, we'll fragment per the pipeline protocol, possibly subject to doing RDMA if the message is large enough, yadda yadda yadda. More specifically, the size of the buffer is not dependent upon an individual message that is being sent or received (since they're pre-posted -- we have no idea what the message sizes will be).

As for whether the SRQ is on both QP's, this is a Galen/George/Gleb (G^3) question...

--
Jeff Squyres
Cisco Systems
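A rough sketch of the point about fixed-size, pre-posted receive buffers, in Python. This only illustrates the arithmetic: a message larger than one buffer is split into pieces that each fit a buffer. The function name, the 12 KiB buffer size, and the idea that every fragment has the same maximum size are all assumptions made for the example; it is not how the openib BTL or the pipeline protocol is actually coded, and it ignores headers and the RDMA path entirely.

```python
def fragment_sizes(message_len, buffer_size):
    """Split message_len payload bytes into pieces that each fit one receive buffer.

    Simplification (assumption): every fragment may use the full buffer_size,
    and per-fragment headers are ignored.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    sizes = []
    remaining = message_len
    while remaining > 0:
        chunk = min(remaining, buffer_size)  # never exceed one pre-posted buffer
        sizes.append(chunk)
        remaining -= chunk
    return sizes


# Hypothetical buffer size chosen for illustration only.
HYPOTHETICAL_BUFFER_SIZE = 12 * 1024

if __name__ == "__main__":
    for length in (100, 12 * 1024, 30000):
        pieces = fragment_sizes(length, HYPOTHETICAL_BUFFER_SIZE)
        print(length, "bytes ->", len(pieces), "buffer(s):", pieces)
```

The buffer size is fixed before any message arrives, so the number of buffers a message consumes depends only on its length, which is why the receiver cannot size buffers per message.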
Re: [OMPI devel] OpenIB BTL and SRQs
Jeff Squyres wrote:
> There's a few benefits:
>
> - Remember that you post a big pool of buffers instead of num_peers individual sets of receive buffers. Hence, if you post M buffers for each of N peers, each peer -- due to flow control -- can only have M outstanding sends at a time. So if you have apps sending lots of small messages, you can get better utilization of buffer space because a single peer has more than M buffers to receive into.
>
> - You can also post less than M*N buffers by playing the statistics of your app -- if you know that you won't have more than M*N messages outstanding at any given time, you can post fewer receive buffers.
>
> - At the same time, there's a problem with flow control (meaning that there is none): how can a sender know when they have overflowed the receiver (other than an RNR)? So it's not necessarily as safe.
>
> - So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviously mainly helps apps with lots of small messages; it may not help in many other cases.

Is there any distinction by the size of the message? If the "M" parameter is set high, does the openib btl post this many recv buffers for the SRQ on both QPs? Or are SRQs only created on one of the QPs?

> On Jul 12, 2007, at 12:29 PM, Don Kerr wrote:
>
>> Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default?
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
Re: [OMPI devel] OpenIB BTL and SRQs
Interesting. So with SRQs there is no flow control. I am guessing the btl sets some reasonable default but essentially relies on the user to adjust other parameters so that the buffers are not overrun.

And yes, Galen, I would like to read your paper.

Jeff Squyres wrote:
> There's a few benefits:
>
> - Remember that you post a big pool of buffers instead of num_peers individual sets of receive buffers. Hence, if you post M buffers for each of N peers, each peer -- due to flow control -- can only have M outstanding sends at a time. So if you have apps sending lots of small messages, you can get better utilization of buffer space because a single peer has more than M buffers to receive into.
>
> - You can also post less than M*N buffers by playing the statistics of your app -- if you know that you won't have more than M*N messages outstanding at any given time, you can post fewer receive buffers.
>
> - At the same time, there's a problem with flow control (meaning that there is none): how can a sender know when they have overflowed the receiver (other than an RNR)? So it's not necessarily as safe.
>
> - So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviously mainly helps apps with lots of small messages; it may not help in many other cases.
>
> On Jul 12, 2007, at 12:29 PM, Don Kerr wrote:
>
>> Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default?
Re: [OMPI devel] OpenIB BTL and SRQs
There's a few benefits:

- Remember that you post a big pool of buffers instead of num_peers individual sets of receive buffers. Hence, if you post M buffers for each of N peers, each peer -- due to flow control -- can only have M outstanding sends at a time. So if you have apps sending lots of small messages, you can get better utilization of buffer space because a single peer has more than M buffers to receive into.

- You can also post less than M*N buffers by playing the statistics of your app -- if you know that you won't have more than M*N messages outstanding at any given time, you can post fewer receive buffers.

- At the same time, there's a problem with flow control (meaning that there is none): how can a sender know when they have overflowed the receiver (other than an RNR)? So it's not necessarily as safe.

- So if you want to simply eliminate the flow control, choose M high enough (or just a total number of receive buffers to post to the SRQ) that you won't ever run out of resources and you should see some speedup from lack of flow control. This obviously mainly helps apps with lots of small messages; it may not help in many other cases.

On Jul 12, 2007, at 12:29 PM, Don Kerr wrote:

> Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default?

--
Jeff Squyres
Cisco Systems
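To make the M-buffers-per-peer versus shared-pool trade-off above concrete, here is a small back-of-the-envelope model in Python. It is only a counting exercise under stated assumptions: each message fits in one buffer, all bursts arrive before the receiver reposts anything, and the function names and numbers are made up for illustration. It does not reflect the actual openib BTL code.

```python
def per_peer_queues(bursts, m):
    """Each peer owns m receive buffers; flow control holds back anything beyond m.

    bursts: list with one entry per peer, the number of sends that peer wants
    outstanding at once. Returns (received, held_back_by_flow_control).
    """
    received = sum(min(burst, m) for burst in bursts)
    return received, sum(bursts) - received


def shared_queue(bursts, pool_size):
    """All peers draw from one pool (the SRQ model) with no flow control.

    Returns (received, overflowed); overflowed sends would find no posted
    buffer at the receiver, which is the RNR case mentioned above.
    """
    total = sum(bursts)
    received = min(total, pool_size)
    return received, total - received


if __name__ == "__main__":
    m, bursts = 8, [20, 1, 1, 1]  # one chatty peer, three quiet ones (hypothetical)
    print("per-peer, M=8:     ", per_peer_queues(bursts, m))
    print("shared, M*N=32:    ", shared_queue(bursts, m * len(bursts)))
    print("shared, only 24:   ", shared_queue(bursts, 24))
    print("shared, two chatty:", shared_queue([20, 20, 1, 1], 32))
```

In this model the chatty peer is capped at M with per-peer buffers but is fully absorbed by the shared pool, even one smaller than M*N. The last case shows the downside: once the total exceeds the pool, nothing tells the senders to stop.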
[OMPI devel] OpenIB BTL and SRQs
Through MCA parameters one can select the use of shared receive queues in the openib btl. Other than having fewer queues, I am wondering what the benefits of using this option are. Can anyone elaborate on using them vs. the default?