On Mon, Feb 6, 2012 at 3:49 PM, Mircea Markus <[email protected]> wrote:
>> numOwners==2 is and will very likely remain the most common case,
>> particularly for small clusters.
>>
>> But if we have two sites, it makes sense to configure 2 owners per
>> site. If only one node goes down, the surviving owner will supply
>> state to the new owner. If both nodes go down, the new owners will
>> fetch the data from the other site. So while 2 nodes going down will
>> be quite costly, it should be infrequent enough that it's worth
>> optimizing for the more frequent "1 node goes down and then comes
>> back up" case.
>>
> Agreed; this mixed batching (leaves with joins) makes sense for non-site
> clusters as well.
>
Yup, even with numOwners == 2 you could say that 2 nodes dying within
1 minute is highly improbable. So you can delay rehashes triggered by
leaves by 1 minute, just in case the node comes back up.

>> > For total shutdown, I guess we can use other means than rehash,
>> > e.g. a specific command that would disable it and start flushing
>> > to the store.
>> >
>>
>> I think just stopping the cache is enough to get it to flush data to
>> the store with passivation enabled.
> ATM, wouldn't the shutdown of a cluster of servers trigger a rehash storm?
>

Right. I was commenting only on the cache store part, which should be
completely orthogonal to graceful shutdown. I think we need a mechanism
to better handle partial cluster shutdown, and that mechanism can be
used for full shutdown as well.

>> But for now any data saved to a
>> private store in distributed mode is useless after restart, because
>> we have no safe way to push data that we don't own to other nodes
>> (and by safe I mean avoiding overwriting newer data or resurrecting
>> deleted data).
> I think that should work with a clustered cache store though.
>

Yep, cluster shutdown with a shared cache store should work, even
without any changes to how the cache is flushed to the store. The
application (could be an AS instance or a HotRod server) just needs to
shut down cleanly on each node, because Infinispan is just a library
and can't decide to kill the entire application on its own.

Cheers
Dan
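
For illustration only, the "delay rehashes triggered by leaves" idea from this thread could be sketched roughly as below. This is not Infinispan code: the class `DelayedLeaveRehash`, its method names, and the explicit `nowMillis` timestamps are all hypothetical, chosen so the logic can be followed without a real cluster or timer. A leave is only recorded; the rehash is reported as due once the grace period has passed without the node rejoining.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not an Infinispan API): defers rehashes triggered by leaves. */
final class DelayedLeaveRehash {
    private final long graceMillis;
    // Node name -> timestamp at which its leave was observed.
    private final Map<String, Long> pendingLeaves = new LinkedHashMap<>();

    DelayedLeaveRehash(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    /** Record a leave instead of rehashing right away. */
    void nodeLeft(String node, long nowMillis) {
        pendingLeaves.putIfAbsent(node, nowMillis);
    }

    /**
     * Returns true if the node rejoined within the grace period, in which
     * case its pending leave is dropped and no leave rehash is needed.
     */
    boolean nodeJoined(String node, long nowMillis) {
        Long leftAt = pendingLeaves.get(node);
        if (leftAt != null && nowMillis - leftAt < graceMillis) {
            pendingLeaves.remove(node);
            return true;
        }
        return false;
    }

    /** Removes and returns the nodes whose grace period has expired; the caller rehashes for these. */
    List<String> expiredLeaves(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pendingLeaves.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() >= graceMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

With a 60-second grace period, a node that leaves and rejoins 30 seconds later never shows up in `expiredLeaves`, while a node that stays away is reported once the 60 seconds have elapsed. How the deferred leave would then be batched with joins is left open here, as it is in the thread.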
