On Fri, Jul 5, 2013 at 12:24 AM, Pulasthi Supun <[email protected]> wrote:
> Hi,
>
> On Thu, Jul 4, 2013 at 8:00 PM, Paul Fremantle <[email protected]> wrote:
>
>> Some systems (e.g. MQTT) allow for the last message on any given topic to
>> be retained and then replayed to any new subscriber. This model can work
>> well: it encourages the design of a topic space where you just need to know
>> the last message (i.e. the latest state of that particular resource).
>>
>> Paul
>>
>> On 4 July 2013 13:14, Afkham Azeez <[email protected]> wrote:
>>
>>> There could be a situation where when a cluster message is sent, a
>>> member momentarily leaves the cluster, but joins immediately. This
>>> generally could happen when the nodes slow down under load, or due to
>>> intermittent network failures. However, this could lead to failures because
>>> crucial cluster messages may not be received by members.
>>>
>>> To overcome this, or reduce the probability of loss of such messages, we
>>> can replay a certain number of messages when a member joins. On the sender
>>> side, messages over a particular time period can be buffered, and then
>>> replayed when new members join. However, we should ensure that the messages
>>> are idempotent, and messages should declare whether they are idempotent or
>>> not. If a message is not idempotent, we will not replay it. All the
>>> messages we have at the moment are idempotent, AFAIK.
>>>
>>> How does this approach sound?
>>>
>>
> +1 . In the new Hazelcast based implementation is a member
> that momentarily leaves the cluster and joins back treated as new member
> like in the tribes based implementation?
>

Yeah, that is how unreliable failure detection normally works, and all real
world group management systems use unreliable failure detection.
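To make the sender-side proposal quoted above concrete, here is a minimal sketch of a time-windowed replay buffer. It is only an illustration under stated assumptions: `ClusterMessage`, `ReplayBuffer`, the `msg_id` field, and the 30-second window are hypothetical names and values, not part of Carbon, Hazelcast, or Tribes. It is written in Python for brevity.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterMessage:
    """Hypothetical message type; each message declares its own idempotency."""
    msg_id: int
    payload: str
    idempotent: bool


class ReplayBuffer:
    """Hypothetical sender-side buffer of messages sent within a time window."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window_seconds = window_seconds
        self._clock = clock      # injectable so the window can be tested
        self._sent = deque()     # (sent_at, message) pairs, oldest first

    def _evict_expired(self):
        # Drop everything that was sent before the start of the window.
        cutoff = self._clock() - self.window_seconds
        while self._sent and self._sent[0][0] < cutoff:
            self._sent.popleft()

    def record(self, message):
        """Remember a message at the moment it is sent to the cluster."""
        self._evict_expired()
        self._sent.append((self._clock(), message))

    def messages_to_replay(self):
        """Messages to resend when a member joins: idempotent ones only."""
        self._evict_expired()
        return [m for _, m in self._sent if m.idempotent]
```

The sender would call `record()` for every cluster message it sends and `messages_to_replay()` from whatever member-joined callback the clustering layer provides. Non-idempotent messages are never replayed, as proposed.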
>
> If this is not the case ( that is if there is some information kept about
> the member that leaves and joins back ), how about saving some information
> like the last message that was processed by the member that way i think the
> number of messages that need to be replayed will be reduced. not sure if
> this is worth the trouble performance wise even if its possible :) :)
>

If a Carbon member leaves and rejoins within about 5 seconds, we can assume
that it was a momentary failure, because Carbon startup normally takes at
least 10s. If it takes longer, we could assume that the member actually left
and rejoined. However, these assumptions don't hold under all conditions. On
a very fast computer, Carbon may start in under 5s, and there could be a
prolonged failure where it takes more than 30s to detect that a member has
rejoined. So the safest approach would be to replay the messages and handle
duplicates.

>
> Regards,
> Pulasthi
>
>> --
>> Paul Fremantle
>> CTO and Co-Founder, WSO2
>> OASIS WS-RX TC Co-chair, VP, Apache Synapse
>>
>> UK: +44 207 096 0336
>> US: +1 646 595 7614
>>
>> blog: http://pzf.fremantle.org
>> twitter.com/pzfreo
>> [email protected]
>>
>> wso2.com Lean Enterprise Middleware
>
> --
> Pulasthi Supun
> Software Engineer; WSO2 Inc.; http://wso2.com,
> Email: [email protected]
> Mobile: +94 (71) 9258281
> Blog : http://pulasthisupun.blogspot.com/
> Git hub profile: https://github.com/pulasthi
>

--
*Afkham Azeez*
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: [email protected]
cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

*Lean . Enterprise . Middleware*
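P.S. A rough sketch of what "replay the messages, and handle duplicates" could look like on the receiving side. It assumes every cluster message carries a unique ID, which the thread does not specify. `DeduplicatingReceiver` and `max_remembered` are hypothetical names chosen for illustration, not an existing Carbon or Hazelcast API.

```python
from collections import OrderedDict


class DeduplicatingReceiver:
    """Hypothetical receiver that applies each message ID at most once."""

    def __init__(self, handler, max_remembered=1000):
        self._handler = handler        # callable that applies a payload
        self._max = max_remembered     # bound on remembered message IDs
        self._seen = OrderedDict()     # insertion-ordered set of recent IDs

    def on_message(self, msg_id, payload):
        """Apply the payload unless msg_id was seen recently.

        Returns True if the message was applied, False if it was a duplicate.
        """
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = None
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)   # forget the oldest ID
        self._handler(payload)
        return True
```

Because only a bounded number of IDs is remembered, a very old message could be applied again once its ID has been forgotten. That is harmless as long as replayed messages are idempotent, which is why the two ideas go together.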
_______________________________________________ Architecture mailing list [email protected] https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
