[ https://issues.apache.org/jira/browse/OAK-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074482#comment-14074482 ]
Chetan Mehrotra commented on OAK-1453:
--------------------------------------

A couple of observations on the importance of config servers when sharding is involved, based on this [thread|https://groups.google.com/d/msg/mongodb-user/Q0yRpr-kNco/DLMtpjZq36IJ]:
* You cannot simply randomise the list of config servers per mongos. The --configdb string needs to be the same across all mongos instances.
* Traffic between mongos and the config servers can be very high (particularly to the first config server) while balancing is going on.
* On AWS it might be tempting to use t1.micro for the config servers, but that should not be done as they might require higher network throughput. Use m3.large instead.

> MongoMK failover support for replica sets (esp. shards)
> -------------------------------------------------------
>
>                 Key: OAK-1453
>                 URL: https://issues.apache.org/jira/browse/OAK-1453
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: mongomk
>            Reporter: Michael Marth
>            Assignee: Thomas Mueller
>              Labels: production, resilience
>             Fix For: 1.1
>
>
> With OAK-759 we have introduced replica support in MongoMK. I think we still
> need to address the resilience for failover from primary to secondary:
> Consider a case where Oak writes to the primary. Replication to the secondary
> is ongoing. During that period the primary goes down and the secondary becomes
> primary. There could be some "half-replicated" MVCC revisions, which need to
> be either discarded or ignored after the failover.
> This might not be an issue if there is only one shard, as the commit root is
> written last (and replicated last).
> But with 2 shards the replication state of these 2 shards could be
> inconsistent. Oak needs to handle such a situation without falling over.
> If we can detect a Mongo failover we could query Mongo which revisions are
> fully replicated to the new primary and discard the potentially
> half-replicated revisions.

--
This message was sent by Atlassian JIRA (v6.2#6252)
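
The failover idea in the issue description (find out which revisions are fully replicated to the new primaries and discard the rest) can be illustrated with a small, self-contained sketch. This is not Oak or MongoMK code and it issues no MongoDB queries: the function name `split_revisions`, the two input mappings, and the revision and shard identifiers are all hypothetical, and the sketch assumes that after a failover we already know which shards each revision wrote to and which revisions each shard's new primary holds.

```python
from typing import Dict, Set, Tuple


def split_revisions(
    touched_shards: Dict[str, Set[str]],
    present_on_primary: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Split revisions into (fully replicated, half-replicated).

    touched_shards:     revision id -> shards the revision wrote to (assumed known).
    present_on_primary: shard id -> revision ids found on that shard's new primary.
    """
    complete: Set[str] = set()
    partial: Set[str] = set()
    for rev, shards in touched_shards.items():
        # A revision counts as fully replicated only if every shard it
        # touched still has it after the failover.
        if all(rev in present_on_primary.get(shard, set()) for shard in shards):
            complete.add(rev)
        else:
            partial.add(rev)
    return complete, partial


if __name__ == "__main__":
    # Hypothetical state after a failover on shardA: r2 never reached
    # the secondary that has now become primary.
    touched = {
        "r1": {"shardA", "shardB"},
        "r2": {"shardA", "shardB"},
        "r3": {"shardB"},
    }
    present = {"shardA": {"r1"}, "shardB": {"r1", "r2", "r3"}}
    keep, discard = split_revisions(touched, present)
    print("keep:", sorted(keep))
    print("discard:", sorted(discard))
```

In this toy model, `r2` is the "half-replicated" revision: it is present on one shard but missing on the other, so it would be discarded or ignored. With a single shard the check degenerates to a per-revision presence test.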