Re: [Pvfs2-developers] bmi testcontext/testunexpected

Sam Lang Tue, 06 Jan 2009 14:58:50 -0800

Changing the API as you describe would actually bring back the original problem. As is, the BMI_tcp_testcontext call knows that there are unexpected messages waiting, so it returns immediately (expecting a call to testunexpected to follow). This is a specific policy hard-coded in the tcp method.

With just a single testcontext call and all expected and unexpected messages going to that context, the tcp code would have to put all the unexpected messages at the top of the context to give them priority. This would fix the particular problem that Nawab has, but it's still dictating policy (which messages get priority) from within the particular BMI method.

I agree that forcing the application to define the policy (with threads or timeouts) is moving the problem elsewhere, but it's moving the problem to where it belongs. It's our pvfs server that wants unexpected messages to have priority; the bmi code itself shouldn't dictate that priority. We could define interfaces to BMI that allow the policy to be set, but that's even further from where we are now.
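
As an illustration of the point above (priority decided by the consumer rather than by the BMI method), here is a minimal sketch in C. It is a toy model, not BMI code: `struct toy_bmi`, `toy_testunexpected`, `toy_testcontext`, `server_poll_once`, and `client_poll_once` are all hypothetical names, and the two counters merely stand in for real completion queues.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a BMI method: two independent completion counters.
 * All names here are hypothetical and are not part of the real BMI API. */
struct toy_bmi {
    int unexpected_ready;  /* completed unexpected receives */
    int expected_ready;    /* completed expected operations */
};

/* Hands back up to max unexpected messages and returns how many. */
static int toy_testunexpected(struct toy_bmi *b, int max)
{
    assert(b != NULL && max >= 0);
    int n = b->unexpected_ready < max ? b->unexpected_ready : max;
    b->unexpected_ready -= n;
    return n;
}

/* Hands back up to max expected completions and returns how many.
 * It never looks at the unexpected counter, so it encodes no priority. */
static int toy_testcontext(struct toy_bmi *b, int max)
{
    assert(b != NULL && max >= 0);
    int n = b->expected_ready < max ? b->expected_ready : max;
    b->expected_ready -= n;
    return n;
}

struct poll_result { int unexpected; int expected; };

/* Policy lives in the consumer: a server that wants unexpected messages
 * to have priority simply polls for them first. */
static struct poll_result server_poll_once(struct toy_bmi *b, int max)
{
    struct poll_result r;
    r.unexpected = toy_testunexpected(b, max);
    r.expected = toy_testcontext(b, max - r.unexpected);
    return r;
}

/* A client-style consumer that only waits on expected operations. */
static int client_poll_once(struct toy_bmi *b, int max)
{
    return toy_testcontext(b, max);
}
```

In this model the method stays neutral: the server-style caller gets its priority by calling order alone, and the client-style caller is never affected by unexpected messages it does not intend to collect.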


-sam

On Jan 6, 2009, at 2:52 PM, Rob Ross wrote:

Yeah, a special named context for unexpected messages would be a clean way to have done things... -- Rob
On Jan 6, 2009, at 2:49 PM, Phil Carns wrote:
Yeah, I don't particularly like adding special cases either.
I feel like making the consumer play with timeouts or use an extra thread would be just as much of a hack/workaround, though. It's just moving the problem elsewhere.
Fundamentally it seems more like a BMI API flaw. It would have made more sense (for example) if unexpected messages were assigned to a specific context and the testunexpected() and testcontext() functions were combined. The consumer could then use a single test call to retrieve both unexpected and normal messages at once if they are in the same context (as in the pvfs2-server use case). Testing on a different context would ignore the presence of unexpected messages (as in the problem-triggering use case here).
There are other ways to deal with it, that's just an example. We just need the API to better express the intention of the caller (preferably in one function) so that BMI doesn't have to optimize by guessing about what else is going on.
That is more work than just adding a flag, though :) It probably depends on if we think the use case is going to be around long enough to justify tweaking the API.
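
The combined-call idea in this message can be sketched roughly as follows. This is a toy model under stated assumptions, not a proposal for the actual BMI signatures: `struct toy_ctx_bmi`, `struct toy_counts`, `toy_test`, and `TOY_MAX_CONTEXTS` are hypothetical, and contexts are reduced to small integer indices.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CONTEXTS 4  /* hypothetical limit for this sketch */

/* Toy model: unexpected messages belong to exactly one context. */
struct toy_ctx_bmi {
    int unexpected_ctx;                    /* context that owns unexpected messages */
    int unexpected_ready;                  /* completed unexpected receives */
    int expected_ready[TOY_MAX_CONTEXTS];  /* completed expected ops per context */
};

struct toy_counts { int unexpected; int expected; };

/* Single test call covering both kinds of completion. Unexpected messages
 * are only visible when the caller tests the context that owns them.
 * Handing them back before expected completions is an arbitrary choice
 * made by this sketch. */
static struct toy_counts toy_test(struct toy_ctx_bmi *b, int ctx, int max)
{
    assert(b != NULL && ctx >= 0 && ctx < TOY_MAX_CONTEXTS && max >= 0);
    struct toy_counts out = { 0, 0 };

    if (ctx == b->unexpected_ctx) {
        out.unexpected = b->unexpected_ready < max ? b->unexpected_ready : max;
        b->unexpected_ready -= out.unexpected;
    }

    int room = max - out.unexpected;
    out.expected = b->expected_ready[ctx] < room ? b->expected_ready[ctx] : room;
    b->expected_ready[ctx] -= out.expected;
    return out;
}
```

A caller testing the owning context gets both kinds of message from one call, while a caller testing any other context sees only its own expected completions and leaves pending unexpected messages untouched.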
-Phil

Sam Lang wrote:
I've committed the set_info fix for this. I'm not crazy about it, but it should work for now. In the long term, we should probably move away from method-specific hacks like this, i.e. it should be up to the API consumer (our server) to adjust timeouts or call testunexpected in a separate thread.
Nawab, in the zoidfs init code after initializing BMI you need to call:
int check = 0;
BMI_set_info(0, BMI_TCP_CHECK_UNEXPECTED, &check);
-sam
On Dec 23, 2008, at 2:01 PM, Phil Carns wrote:
Sam Lang wrote:
Hi All,
I think Nawab has found a bug (or untested code path) in the BMI tcp method. He's running a daemon that both receives unexpected requests (as a server), and receives expected responses (as a client).
In the BMI_testcontext call, if there aren't any completed (expected) operations, and there are completed unexpected receives, we return immediately, assuming that BMI_testunexpected will be called in turn. I think the idea here is that we want to keep our latency down for unexpected messages, instead of doing work on expected messages while unexpected messages are waiting in the hopper. But the daemon is single threaded, and making blocking PVFS_sys_* calls, so we essentially spin forever calling BMI_testcontext over and over.
I'm not sure of the best way to fix this. Easy fixes would be to remove the check for completed unexpected receives, and/or do tcp_do_work for a shorter timeout.
It seems like we have a special case for blocking PVFS_sys_* calls. We want to ignore unexpected receives just in that case, and actually call tcp_do_work. In other contexts, I think we want the behavior that we have now, where we assume that a BMI_testunexpected call will follow a BMI_testcontext call. We could modify the testcontext call to take a separate parameter, but that seems messy. We might also be able to handle this with separate BMI contexts somehow...
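
The spin described in this message can be reproduced in a small toy model. The sketch below is not the tcp method's code: `struct toy_tcp`, `toy_do_work`, `toy_tcp_testcontext`, and `toy_blocking_wait` are hypothetical names, and the `check_unexpected` field only imitates the kind of toggle discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the situation; none of this is real BMI code. */
struct toy_tcp {
    int check_unexpected;  /* imitates the toggle discussed in the thread */
    int unexpected_ready;  /* completed unexpected receives nobody collects */
    int expected_ready;    /* completed expected operations */
    int work_until_done;   /* toy_do_work calls left before the reply lands */
};

/* Stand-in for network progress: each call moves the pending reply closer. */
static void toy_do_work(struct toy_tcp *t)
{
    if (t->work_until_done > 0 && --t->work_until_done == 0)
        t->expected_ready++;
}

/* Models the described behavior: with no completed expected operations and
 * unexpected messages waiting, return at once without doing any work. */
static int toy_tcp_testcontext(struct toy_tcp *t)
{
    assert(t != NULL);
    if (t->expected_ready == 0) {
        if (t->check_unexpected && t->unexpected_ready > 0)
            return 0;
        toy_do_work(t);
    }
    if (t->expected_ready > 0) {
        t->expected_ready--;
        return 1;
    }
    return 0;
}

/* A single-threaded blocking call: polls until its reply completes.
 * Returns the number of polls used, or -1 if max_polls ran out. */
static int toy_blocking_wait(struct toy_tcp *t, int max_polls)
{
    for (int i = 1; i <= max_polls; i++)
        if (toy_tcp_testcontext(t) == 1)
            return i;
    return -1;
}
```

With the check enabled and one uncollected unexpected message, the wait loop exhausts its poll budget without `toy_do_work` ever running; with the check disabled, the same loop makes progress and completes.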
I haven't dug in the code yet to see if I see any more elegant way to handle it, but I wanted to mention that if you want to add a special flag to toggle the behavior, it might be better to just set it globally with the set_info() function rather than modifying the testcontext() api. That way you don't have to change any of the other BMI methods. There are already a couple of similar set_info() calls to toggle BMI behavior for different use cases.
-Phil
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers


