Inline.
On Sun, Apr 28, 2013 at 11:16 AM, kishore g <[email protected]> wrote:

> Its looks amazing, was looking for something like this. Couple of questions
>
> 1. How is the data replicated, is it writing synchronously to both primary
> and backups. What if backup is down
>

I think that the synchronous backup is configurable, but I am not sure. If the backup for a partition goes down, the partition is backed up elsewhere.

> 2. what happens in network partition ?
>

You get split-brain behavior. This is actually very good for some applications that are not the storage tier. For instance, with web servers or Drillbits, split brain is actually better than going tharn. That way, the machines keep working, and if the storage tier is, by some miracle, still functional, then all is well. If the storage tier is not well, then part of the split cluster will reflect that.

Another case where split brain is good is message queuing where the storage tier is local to the service receiving the messages. It is better to continue queuing messages during the split-brain episode than to fail to accept them.

Hazelcast has a mechanism for merging split clusters, but I would be very leery of expecting it to work correctly. If you care about consistency, then ZooKeeper is a better model.

>
> Looks like it dynamically distributes data based on the number of nodes in
> the system. I think in multicast it can discover other nodes, but what
> happens in tcp.
>

With TCP, you have to give a host name or address of at least one member of the cluster. You can configure things like IP address ranges and port ranges. Without multicast, HC will scan for live servers.

This isn't quite zero-conf, but the application I am building will use multicast if you don't specify anything. If you give a host option, it will take a comma-delimited list of hostnames or IP addresses and use all of them.

> Does not look like its following any consensus protocol like paxos/zab. I
> just skimmed through the doc, could not get the internal details. Would
> love to know more about how it ensures data consistency.
>

It doesn't do much of that. There is a way to define split/merge behavior, but there is no effort to provide strong consistency. As I mentioned before, that is actually really, really good for many applications.

The mission of HC is very different from the mission of ZK, and each does a different thing well. For providing a super simple out-of-the-box user experience, Hazelcast pretty much dominates any of the ZK-based approaches. For providing absolute consistency, ZK totally dominates HC.

Thus, for something like Drillbits, where I want them to do whatever they can under all circumstances and where the primary goal is read access, I would say Hazelcast is better. For providing the ground-truth information about where the CLDB master is in the MapR file system, I think that ZK is better.
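
To make the discovery rule above concrete (multicast when no host option is given, otherwise a comma-delimited list of hostnames or IP addresses), here is a rough sketch of how the option could be interpreted. The class name `JoinPlan` and the method `fromHostOption` are made up for illustration and are not part of Hazelcast. The sketch only decides which mode to use. Wiring the result into Hazelcast's join configuration is left out, since the exact calls depend on the version in use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical helper (not a Hazelcast class): turns an optional "host"
 * option into a decision between multicast discovery and an explicit
 * TCP member list.
 */
final class JoinPlan {
    /** True when no usable host was given, so multicast should be used. */
    final boolean useMulticast;
    /** Hostnames or IP addresses to contact over TCP (empty for multicast). */
    final List<String> members;

    private JoinPlan(boolean useMulticast, List<String> members) {
        this.useMulticast = useMulticast;
        this.members = Collections.unmodifiableList(members);
    }

    /**
     * Parses a comma-delimited host option. A null, empty, or blank option
     * falls back to multicast; otherwise every listed host is kept.
     */
    static JoinPlan fromHostOption(String hostOption) {
        List<String> members = new ArrayList<>();
        if (hostOption != null) {
            for (String part : hostOption.split(",")) {
                String host = part.trim();
                if (!host.isEmpty()) {
                    members.add(host);
                }
            }
        }
        // Assumption: an option with no usable entries is treated the same
        // as no option at all.
        return new JoinPlan(members.isEmpty(), members);
    }
}
```

For example, `JoinPlan.fromHostOption("node1, 10.0.0.5")` yields a TCP plan with two members, while `JoinPlan.fromHostOption(null)` falls back to multicast. In the TCP case, the idea would be to disable multicast and hand each member to Hazelcast's TCP/IP join settings.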
