I think it's possible that my terminology is letting me down. If I appear to be getting defensive over my example, then I apologise; it's not intended. I just haven't experienced the problems that you describe.

> For instance in the transport layer. A server can detect that an ack/nack is overdue and start a
> retransmission.

But will this not only be appropriate if the remote service remains discoverable and in a (business context) working state?

The following is a question that I genuinely do not know the answer to: if there is a transmission problem, which is quicker? To "start a retransmission" and possibly have to wait for another timeout to occur, or to drop the remote reference and discover a new service, on the assumption that there won't be any transmission problems between the client and the new service? Obviously the second option might be problematic if you are relying on your business services holding a certain amount of state.

> I haven't seen any self healing behaviour in the jeri transports, or the layers between the actual
> java.lang.Proxy of the service and its transport

Neither have I, which is why, outside of the low-level transport layers, recovery and self-healing are all down to the client.

> retrieves a new RemoteReference for every transport error

Actually, the service could throw a business-context exception (as long as it extends RemoteException) and that would also prompt the retrieval of a new remote reference.

> (not registered, but exported) remote reference gets serialized as for instance a return value from a call to
> the service, and this reference is passed through the system, it will still experience transport errors.

Can you explain this for me, please? If I have wrapped a remote proxy and call the "sayHello()" method on it, it returns a String which gets deserialised on the client side, i.e. in the wrapper, into a "normal" Java String which can be read and have all the wonderful String methods called on it without risking a RemoteException. Right? Or have I got the wrong end of the stick?

Also, can you point me to somewhere that I can read up on what "not registered, but exported" means, please? That bit went over my head.
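
To make the wrapping idea concrete, here is a minimal sketch of a client-side wrapper that fetches a new reference whenever a call fails with a RemoteException. It is only an illustration under assumptions, not the ServiceWrapper discussed in this thread: the names Hello, FailoverHandler and wrap are hypothetical, and the Supplier stands in for whatever lookup/discovery a real client would perform. It uses java.lang.reflect.Proxy from the JDK and no Jini classes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ServiceWrapperSketch {

    /** Hypothetical remote interface standing in for Service_B. */
    public interface Hello extends Remote {
        String sayHello() throws RemoteException;
    }

    /** Forwards calls to the current delegate; on RemoteException it drops the delegate and looks up a new one. */
    static final class FailoverHandler<T> implements InvocationHandler {
        private final Supplier<T> discovery; // assumption: stands in for a real service lookup
        private final int maxAttempts;
        private T delegate;

        FailoverHandler(Supplier<T> discovery, int maxAttempts) {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be at least 1");
            }
            this.discovery = discovery;
            this.maxAttempts = maxAttempts;
        }

        @Override
        public synchronized Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            RemoteException last = null;
            for (int attempt = 0; attempt < maxAttempts; attempt++) {
                if (delegate == null) {
                    delegate = discovery.get();
                }
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    Throwable cause = e.getCause();
                    if (cause instanceof RemoteException) {
                        last = (RemoteException) cause;
                        delegate = null; // treat the reference as defunct and rediscover
                    } else {
                        throw cause; // anything else goes straight back to the caller
                    }
                }
            }
            throw last; // every attempt failed with a RemoteException
        }
    }

    /** Wraps a service interface so that callers never hold the raw remote reference. */
    static <T> T wrap(Class<T> type, Supplier<T> discovery, int maxAttempts) {
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type},
                new FailoverHandler<>(discovery, maxAttempts));
        return type.cast(proxy);
    }

    public static void main(String[] args) throws RemoteException {
        AtomicInteger lookups = new AtomicInteger();
        Supplier<Hello> discovery = () -> {
            int n = lookups.incrementAndGet();
            if (n == 1) {
                // The first B found always fails, simulating a crashed service.
                return () -> { throw new RemoteException("first B is gone"); };
            }
            return () -> "hello from B #" + n;
        };
        Hello hello = wrap(Hello.class, discovery, 2);
        System.out.println(hello.sayHello()); // the second lookup supplies a working B
    }
}
```

In this sketch the String that comes back is an ordinary local object, so it is unaffected by later failures. If a method returned another Remote reference instead, that reference would not be wrapped by this code, which may be the case Sim is describing.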

> i'm thinking about: during the change reducing the number of cluster members, upgrading the freed cluster
> members, and a hot switchover for the 2 groups

This is indeed what this service wrapping method can be used for. Suppose Service_A needs Service_B to do its job and there are multiple Service_Bs on the subnet. If the machine hosting the B that A is using explodes, or its JVM crashes, or the service abruptly shuts down, stops responding, is purposely shut down, etc., then when A next calls a method on the wrapped B, it will automatically find a different B (assuming one is available), all without A having to understand that its original B reference is defunct and needs to be replaced.

In my experience, all the return values that A got from its original B reference are still valid and usable. Which is why I'm struggling to understand your comment about "a return value from a call to the service...it will still experience transport errors". But I can fully believe that's because I'm misunderstanding you, which is probably my fault. :-)

I'm trying to understand exactly what you and Peter are trying to solve (and whether this example helps at all). Then maybe I can start contributing more to helping you guys.

Sorry if the answers to these questions are in the specs somewhere and I just missed them.

Cheers,

Tom

On Mon, Feb 15, 2010 at 5:43 PM, Sim IJskes - QCG <[email protected]> wrote:

> Tom Hobbs wrote:
>
>> Certainly in my experience, detecting errors and recovering is always the
>> job of the client. To use a daft example; why would a web page detect that
>> a browser has unexpectedly disappeared and try to find a new browser to
>> display itself on? But in the event of a web server going down, it's always
>> the browser/etc that needs to go and find another copy of the page
>> somewhere.
>
> This is not always the case. For instance in the transport layer. A server
> can detect that an ack/nack is overdue and start a retransmission.
>
> But that's not what i tried to express. In that specific email i meant a
> client of the service. I haven't seen any self healing behaviour in the jeri
> transports, or the layers between the actual java.lang.Proxy of the service
> and its transport, so any hiccup there will lead to a RemoteException. So i
> guess, with the current state of affairs, the only place for self-healing
> (with keeping the RemoteReference the same) is for a SmartProxy.
>
> What you have done, is created a ServiceWrapper which does the
> wrapping/proxying on the client's initiative, and retrieves a new
> RemoteReference for every transport error. This is also a perfectly valid
> approach.
>
> The only problem i see, is (in both scenarios) that when an anonymous (not
> registered, but exported) remote reference gets serialized as for instance a
> return value from a call to the service, and this reference is passed
> through the system, it will still experience transport errors. So this
> remote reference needs to be wrapped also, either at the server or the
> client side.
>
> While writing this, i'm thinking this might be also fixed in the invocation
> layer. Although it still only guards against transport errors, and not
> against dropping a member of a server cluster.
>
>> This style of service wrapping worked very well in a complex trading
>> platform that I was previously involved in. It enabled us, with the
>> provision of some additional business rules - especially regarding state,
>> to take down services at random and have the system automatically recover
>> without interrupting the client. It truly was a self-healing system.
>
> Indeed, i can see this. And very practical for dynamic cluster scaling
> issues, for instance during a deployment of a new version. (i'm thinking
> about: during the change reducing the number of cluster members, upgrading
> the freed cluster members, and a hot switchover for the 2 groups).
>
> Gr. Sim
>
> --
> QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
> Quality Consultancy Group b.v., Leiderdorp, Kvk Leiden: 28088397
