Looking back at where this conversation and these concerns originated, it is clear that they came out of issues in the Tribes-based cluster messaging (primarily message losses on EC2 deployments). We knew that Tribes had issues, and there was not much support available for it. As I said before, Tribes was never designed for cloud scale. Even the Tomcat session replication guidance for EC2 talks about persisting sessions to a DB or Amazon ElastiCache, not about using the regular Tribes messaging. That is the reason why we switched to Hazelcast. Unlike Tribes, Hazelcast has been extensively tested and is supported on EC2 by its developers.
What we should be doing is thoroughly testing the Hazelcast-based implementation to see whether we still see the Tribes issues (which I doubt we would), before making such a major decision, which will drastically increase the deployment complexity of all of our products.

On Thu, Sep 19, 2013 at 6:00 AM, Afkham Azeez <[email protected]> wrote:

> On Thu, Sep 19, 2013 at 5:24 AM, Sanjiva Weerawarana <[email protected]> wrote:
>
>> I disagree - of course caches can get out of sync .. that's part of the
>> definition of being a distributed cache. However the values that you share
>> in a cache are typically for performance optimization, not for reliable
>> execution.
>
> I do not completely agree. Take our permission cache for example. You may
> revoke certain permissions from a role, but if the cache does not
> eventually sync up, there will be major issues. There will be a window
> where the system is vulnerable. If we assume that our caching is not
> reliable & MB is, then for operations such as permission changes, we would
> always need to send the cache invalidation message using MB as well.
>
>> On the question in your previous mail - I think we're asking this
>> backwards. The first question is whether reliable messaging is necessary
>> for deployment synchronization. I believe the answer to that is a firm yes,
>> as otherwise deployment and undeployment are not reliably propagated to all
>> nodes.
>>
>> The second question is whether you need reliable messaging for every
>> interaction between nodes of a cluster. The answer to that I believe is a
>> firm no. Distributed cache is a perfectly good example.
>>
>> Sanjiva.
>>
>> On Tuesday, September 17, 2013, Afkham Azeez wrote:
>>
>>> Based on the message loss argument, in that case even our new caching
>>> implementation has to be switched to use MB instead of Hazelcast. If Hz
>>> cannot recover from such losses, the caches will contain obsolete values.
>>> Caching is built on Hz maps. Cluster messaging is built on Hz topic. If you
>>> argue that one cannot scale on the cloud & handle message losses/cluster
>>> partitioning, then the other does not work either. As I mentioned earlier,
>>> depsync is only one type of cluster message.
>>>
>>> I would like to have a meeting next week when I return, before the final
>>> decision to move to MB is made, unless that has already been made and this
>>> input won't make any difference.
>>>
>>> Azeez
>>>
>>> On Tue, Sep 17, 2013 at 8:04 PM, Afkham Azeez <[email protected]> wrote:
>>>
>>> DepSync is only one type of cluster message. There are many other types
>>> of cluster messages. Are we proposing to use MB for those as well?
>>>
>>> On Tue, Sep 17, 2013 at 7:47 PM, Afkham Azeez <[email protected]> wrote:
>>>
>>> On Tue, Sep 17, 2013 at 11:42 AM, Eranda Sooriyabandara <[email protected]
>>> > wrote:
>>>
>>> Hi Azeez,
>>>
>>> On Tue, Sep 17, 2013 at 10:50 AM, Afkham Azeez <[email protected]> wrote:
>>>
>>> Unlike Tribes, Hazelcast has been designed to scale on the cloud. All
>>> the cluster messaging related issues we were seeing were due to Tribes, and
>>> Tribes was designed only for datacenter scale.
>>>
>>> The main problem with the Hazelcast cluster-message-based deployment
>>> synchronizer is reliability. What if one node doesn't get the update
>>> message? That node may not be updated until the next change.
>>>
>>> The same thing can happen even with MB. If there is a network partition,
>>> messages may not be received. But once the partitions are merged, the
>>> messages that were not received should be received. Unlike Tribes,
>>> Hazelcast has a good set of distributed collections, and I believe the
>>> messages posted to topics would be properly received. Adding MB just to
>>> send depsync messages is overkill IMO, and the decision has been based on
>>> the problems that were faced with the old Tribes-based implementation.
>>>
>>> thanks
>>> Eranda
>>>
>>> Azeez
>>>
>>> On Tue, Sep 17, 2013 at 9:49 AM, Sanjiva Weerawarana
>>> <[email protected]> wrote:
>>>
>>> I don't see the point of marrying into Hazelcast at that level. What is
>>> required here is a queuing solution, because we need it to scale
>>> from simple to very large installations involving multiple AZs etc. Many
>>> times persistent reliability is important (esp. for deployment messages).
>>> Why would we re-invent all of that on top of Hazelcast instead of using MB?
>>> Of course we need an embedded, in-memory, ultra-lightweight system too for
>>> the simple case, and MB can deliver that quite easily.
>>>
>>> Sanjiva.
>>>
>>> On Tue, Sep 17, 2013 at 12:07 AM, Afkham Azeez <[email protected]> wrote:
>>>
>>> On Mon, Sep 16, 2013 at 11:42 PM, Afkham Azeez <[email protected]> wrote:
>>>
>>> On Mon, Sep 16, 2013 at 11:20 PM, Isuru Perera <
>>>
>>
>> --
>> Sanjiva Weerawarana, Ph.D.
>> Founder, Chairman & CEO; WSO2, Inc.; http://wso2.com/
>> email: [email protected]; phone: +94 11 763 9614; cell: +94 77 787 6880 | +1 650 265 8311
>> blog: http://sanjiva.weerawarana.org/
>>
>> Lean . Enterprise . Middleware
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>
> --
> Afkham Azeez
> Director of Architecture; WSO2, Inc.; http://wso2.com
> Member; Apache Software Foundation; http://www.apache.org/
> email: [email protected] cell: +94 77 3320919
> blog: http://blog.afkham.org
> twitter: http://twitter.com/afkham_azeez
> linked-in: http://lk.linkedin.com/in/afkhamazeez
>
> Lean . Enterprise . Middleware

--
Afkham Azeez
Director of Architecture; WSO2, Inc.; http://wso2.com
Member; Apache Software Foundation; http://www.apache.org/
email: [email protected] cell: +94 77 3320919
blog: http://blog.afkham.org
twitter: http://twitter.com/afkham_azeez
linked-in: http://lk.linkedin.com/in/afkhamazeez

Lean . Enterprise . Middleware
