On Thu, Jan 21, 2010 at 10:37 AM, Simon Laws <[email protected]> wrote:
> At the moment the code follows approach 2. However it's in no way
> configurable. It was initially introduced in this way because of a
> number of test cases that were failing when following these steps.
>
> test with multiple nodes starts
> client node fires a message into a component using SCA API
> In parallel the node with the service is coming up and populating the registry
> reference is asked to send a message to the service before the service
> endpoint has been replicated around to all nodes
>
> Hence there is code in the tribes version of the replicated endpoint registry.
>
> The nature of working in this distributed world is that we will always
> encounter these failure cases. I'd like to maintain some configurable
> resilience for the most common failures. Without it you may (or may
> not) need to include retry logic around every use of a reference
> depending on how that reference is wired which doesn't seem ideal.
>
> Simon
>
IMHO it would be far better to have the delay in the test case, as that is where it is known what the test is trying to do. Nodes are going to come and go over the lifetime of a domain, and all the different bindings handle timeouts or non-existent services in their own way. If a service isn't available for a reference at the time of a request then the request should just fail; I don't see much success in having the runtime try to paper over that.

Something which would be useful is having the runtime be more in control of the target endpoint being used, so that each binding invoker is kept up-to-date as endpoints are updated in the endpoint registry and bindings don't make requests to out-of-date endpoints.

   ...ant
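
PS: a rough sketch of the "invoker kept up-to-date by the registry" idea, in case it helps the discussion. Everything here is hypothetical and only for illustration: `EndpointListener`, `Registry` and `TrackingInvoker` are made-up names, not existing Tuscany interfaces, and endpoints are reduced to plain URI strings. The point is only that the invoker fails fast when no endpoint is known, and otherwise always uses the most recent one the registry has told it about.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;

public class EndpointTrackingSketch {

    /** Hypothetical callback; not an existing Tuscany interface. */
    interface EndpointListener {
        void endpointUpdated(String serviceName, String uri);
        void endpointRemoved(String serviceName);
    }

    /** Hypothetical stand-in for the (replicated) endpoint registry. */
    static class Registry {
        private final Map<String, String> endpoints = new ConcurrentHashMap<>();
        private final List<EndpointListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(EndpointListener listener) {
            listeners.add(listener);
        }

        void put(String serviceName, String uri) {
            endpoints.put(serviceName, uri);
            for (EndpointListener listener : listeners) {
                listener.endpointUpdated(serviceName, uri);
            }
        }

        void remove(String serviceName) {
            endpoints.remove(serviceName);
            for (EndpointListener listener : listeners) {
                listener.endpointRemoved(serviceName);
            }
        }

        String find(String serviceName) {
            return endpoints.get(serviceName);
        }
    }

    /** Hypothetical binding invoker that tracks the latest endpoint for one service. */
    static class TrackingInvoker implements EndpointListener {
        private final String serviceName;
        private final AtomicReference<String> target = new AtomicReference<>();

        TrackingInvoker(String serviceName, Registry registry) {
            this.serviceName = serviceName;
            // Register first, then seed from the registry only if no update has
            // arrived yet. A real implementation would need stricter ordering.
            registry.addListener(this);
            target.compareAndSet(null, registry.find(serviceName));
        }

        @Override
        public void endpointUpdated(String name, String uri) {
            if (serviceName.equals(name)) {
                target.set(uri);
            }
        }

        @Override
        public void endpointRemoved(String name) {
            if (serviceName.equals(name)) {
                target.set(null);
            }
        }

        /** No retry and no waiting: if there is no endpoint, the request fails. */
        String invoke(String message) {
            String uri = target.get();
            if (uri == null) {
                throw new IllegalStateException("No endpoint available for " + serviceName);
            }
            return "sent '" + message + "' to " + uri;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        TrackingInvoker invoker = new TrackingInvoker("Calculator", registry);
        registry.put("Calculator", "http://node1/Calculator");
        System.out.println(invoker.invoke("add"));
        registry.put("Calculator", "http://node2/Calculator");
        System.out.println(invoker.invoke("add"));
    }
}
```

With something along these lines, any delay or retry stays in the test case (or in the application), and the runtime's job is just to make sure the invoker never holds an out-of-date endpoint.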
