On Thu, Jan 21, 2010 at 10:57 AM, ant elder <[email protected]> wrote:
> On Thu, Jan 21, 2010 at 10:37 AM, Simon Laws <[email protected]> 
> wrote:
>> At the moment the code follows approach 2. However it's in no way
>> configurable. It was initially introduced in this way because of a
>> number of test cases that were failing when following these steps.
>>
>> 1. test with multiple nodes starts
>> 2. client node fires a message into a component using the SCA API
>> 3. in parallel, the node with the service is coming up and populating
>>    the registry
>> 4. the reference is asked to send a message to the service before the
>>    service endpoint has been replicated around to all nodes
>>
>> Hence there is code in the tribes version of the replicated endpoint 
>> registry.
>>
>> The nature of working in this distributed world is that we will always
>> encounter these failure cases. I'd like to maintain some configurable
>> resilience for the most common failures. Without it you may (or may
>> not) need to include retry logic around every use of a reference,
>> depending on how that reference is wired, which doesn't seem ideal.
>>
>> Simon
>>
>
> IMHO it would be far better to have the delay in the testcase, as that
> is where it knows what it's trying to do. Nodes are going to come and
> go over the lifetime of a domain, and all the different bindings handle
> timeouts or non-existent services in their own way. If a service isn't
> available for a reference at the time of a request then the request
> should just fail; I don't see much success in having the runtime try
> to paper over that. Something which would be useful is having the
> runtime be more in control of the target endpoint being used, so that
> each binding invoker is kept up-to-date as endpoints are updated in
> the endpoint registry and bindings don't make requests to
> out-of-date endpoints.
>

That said... the Tuscany itest for the calculator-rmi sample has
started failing a lot on Hudson because, when Hudson is heavily loaded,
it can take so long to start up the service and reference
contributions that even with a 25 second delay the service still isn't
always up by the time the reference is invoked. I guess some way to
configure retries would help with this, though as it's supposed to be a
trivial introduction sample it wouldn't be great to have to introduce
a lot of policy and intents into it if that was the approach taken to
configure the retries.
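
For illustration only, here is a rough sketch of what bounded retries
around a reference invocation could look like in a test or client. It is
plain Java and uses no Tuscany API; the class name RetrySketch, the
method invokeWithRetry and its maxAttempts/delayMillis parameters are all
hypothetical names made up for this example, not anything in the runtime.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Tuscany: retries a call a bounded
// number of times with a fixed pause between attempts.
final class RetrySketch {

    static <T> T invokeWithRetry(Callable<T> call, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // e.g. the call through the SCA reference proxy
                return call.call();
            } catch (Exception e) {
                // Assumption: any exception means "service not there yet".
                // Real code would likely only retry specific failures.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // All attempts failed: surface the last failure to the caller.
        throw last;
    }
}
```

A test could then wrap the first call through the reference in
invokeWithRetry(...) with a small attempt count instead of sleeping for a
fixed 25 seconds, which keeps the sample's composite free of extra policy
and intents.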

   ...ant
