Re: [OMPI devel] calling sendi earlier in the PML

Eugene Loh Wed, 4 Mar 2009 18:40:47 -0500

George Bosilca wrote:

On Mar 4, 2009, at 14:44 , Eugene Loh wrote:
Let me try another thought here. Why do we have BTL sendi functions at all? I'll make an assertion and would appreciate feedback: a BTL sendi function contributes nothing to optimizing send latency. To optimize send latency in the "immediate" case, we need *ONLY* PML work.
Because otherwise you will have to make 2 BTL calls instead of one

What's wrong with that? It's "just a function call", and OMPI is littered with those.

plus one extra memcpy (or not depending on your network).

Just to check my understanding: sm does not benefit here. Right? So, it makes no sense really for sm to have a sendi function.

I remain close to giving up, but still am trying to understand the PML somewhat. I guess there are several send-latency optimizations one could consider:

1) Not populating the PML send request if the send can be completed "immediately". This makes sense for BTLs (like sm) that are very fast (so that shaving off instructions and memory operations makes a difference) and for messages that are very, very short (one or few cachelines). It does not rely on having a sendi function, which knows nothing about PML send requests anyhow.

2) Not copying the user data into a buffer. This makes sense for BTLs (unlike sm) that can move the data "directly" and for "intermediate" message sizes (below the eager limit, but still big enough that saving a memcpy makes a difference). It does rely on having a sendi function.

I guess I've been focussing on #1. I thought that was related to sendi, but now it seems sendi is related more to #2. So, if I understand correctly (unclear if this is the case), we're both right. I'm right in that pruning the PML stack (specifically, populating the send request) is unrelated to the existence of sendi. You're right in that there is still a legitimate role for sendi for some BTLs (sm not among them).

First you will have to call btl_alloc to get back a descriptor with some BTL memory attached to it. Then you will put your data (including the header) in this memory and once ready call btl_send. With sendi there is only one call from the PML into the BTL, but this time it is the BTL's responsibility to prepare the data that will be sent.
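To make the difference between the two paths concrete, here is a toy sketch in C. It is not Open MPI code: `toy_btl_t`, `toy_btl_alloc`, `toy_btl_send`, `toy_btl_sendi`, and the two `pml_*` helpers are hypothetical stand-ins, and the "network" is just a byte array. It assumes a BTL that can gather the header and user payload straight from their original locations, which is the case where sendi saves the staging copies; a BTL like sm would still have to copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FRAG_SIZE 256

/* Hypothetical toy BTL; none of these names are the real Open MPI API. */
typedef struct {
    unsigned char frag[TOY_FRAG_SIZE]; /* BTL-owned staging memory */
    size_t frag_len;
    unsigned char wire[TOY_FRAG_SIZE]; /* bytes that "went out" on the network */
    size_t wire_len;
    int calls;         /* number of PML -> BTL calls */
    int staged_copies; /* memcpy calls into BTL staging memory */
} toy_btl_t;

/* Models the network moving bytes; not counted as a staging copy. */
static int toy_wire_put(toy_btl_t *btl, const void *src, size_t len) {
    if (btl->wire_len + len > TOY_FRAG_SIZE) return -1;
    memcpy(btl->wire + btl->wire_len, src, len);
    btl->wire_len += len;
    return 0;
}

/* Two-call path, call 1: hand BTL memory back to the PML. */
static unsigned char *toy_btl_alloc(toy_btl_t *btl, size_t len) {
    btl->calls++;
    if (len > TOY_FRAG_SIZE) return NULL;
    btl->frag_len = len;
    return btl->frag;
}

/* Two-call path, call 2: send whatever the PML packed into the fragment. */
static int toy_btl_send(toy_btl_t *btl) {
    btl->calls++;
    return toy_wire_put(btl, btl->frag, btl->frag_len);
}

/* One-call path: the BTL prepares the data itself. This toy network can
 * gather header and payload directly, so nothing is staged. */
static int toy_btl_sendi(toy_btl_t *btl, const void *hdr, size_t hdr_len,
                         const void *data, size_t len) {
    btl->calls++;
    if (btl->wire_len + hdr_len + len > TOY_FRAG_SIZE) return -1;
    if (toy_wire_put(btl, hdr, hdr_len) != 0) return -1;
    return toy_wire_put(btl, data, len);
}

/* PML side, two calls: alloc, pack header and payload, then send. */
static int pml_send_two_calls(toy_btl_t *btl, const void *hdr, size_t hdr_len,
                              const void *data, size_t len) {
    unsigned char *buf = toy_btl_alloc(btl, hdr_len + len);
    if (buf == NULL) return -1;
    memcpy(buf, hdr, hdr_len);        /* staging copy of the header */
    memcpy(buf + hdr_len, data, len); /* staging copy of the user data */
    btl->staged_copies += 2;
    return toy_btl_send(btl);
}

/* PML side, one call: pass pointers and let the BTL do the rest. */
static int pml_send_immediate(toy_btl_t *btl, const void *hdr, size_t hdr_len,
                              const void *data, size_t len) {
    assert(hdr != NULL && data != NULL);
    return toy_btl_sendi(btl, hdr, hdr_len, data, len);
}
```

Both helpers put the same bytes on the toy wire; they differ only in the number of calls into the BTL and the number of copies into BTL-owned memory. Note that neither helper touches a PML send request, which is the separate optimization labeled #1 above.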
I'm churning a lot and not making much progress, but I'll try chewing on that idea (unless someone points out it's utterly ridiculous). I'll look into having PML ignore sendi functions altogether and just make the "send-immediate" path work fast with normal send functions. If that works, then we can get rid of sendi functions and hopefully have a solution that makes sense for everyone.
This is utterly ridiculous (I hope you really expect someone to say it).

It's fine to say that. As you can see, I'll push back when I don't understand why it's ridiculous.

As I said before, SM is only one of the networks supported by Open MPI. Independent of how much I would like to have better shared memory performance, I will not agree with any PML modifications that are SM oriented. We did that in the past with other BTLs and it turned out to be a bad idea, so I'm clearly not in favor of making the same mistake twice.
Regarding the sendi there are at least 3 networks that can take advantage of it: Portals, MX and Sicortex. Some of them do this right now, some others in the near future. Moreover, for these particular networks there is no way to avoid extra overhead without this feature (for very obscure reasons such as non-contiguous pieces of memory only known by the BTL that can decrease the number of network operations).
