On Mon, Jul 25, 2016 at 5:38 PM, Menno Smits <[email protected]> wrote:
> Regarding https://bugs.launchpad.net/juju-core/+bug/1597601 ...
>
> When "juju enable-ha" is used, new controller machines are started, each
> running a mongod instance which is connected to Juju's replicaset. As each
> new node joins the replicaset a MongoDB leader election is triggered which
> causes all mongod instances in the replicaset to drop their connections
> (this is by design). The workers in Juju's machine agents handle this
> correctly by aborting and restarting with fresh connections to MongoDB.
>
> The problem is that if an API request comes in at just the right time, it
> will be actioned just as the MongoDB connection goes down, resulting in the
> i/o timeout error being reported back to the client.
>
> This isn't a new problem but it's one that Juju's users regularly run
> into. A workaround is to wait for the new controller machines to come up
> after enable-ha is issued before doing anything else.
>
> IMHO it would be best if Juju could hide all this from the client as much
> as possible but I'm really not sure if that's feasible or what the best
> approach should be.
>
> The challenge is that unless we do some major rearchitecting, the API
> server needs to be restarted when the MongoDB connections drop. There's no
> way that the client's connection can stay up, making it difficult to
> hide this detail from the client.
>

It seems that mgo could handle this as a failover. Or we could see that the
replica set is starting, wait until it reports being up, and then refresh
the mgo session. I don't understand why the API server itself has to
restart, though I am sure there are good reasons.

>
> The most practical solution I can think of is that we introduce a new
> error type over the API which means "please retry the request". Errors such
> as an i/o timeout from the MongoDB layer could be converted into this
> error. Clients would obviously have to handle this error specially.
>

Barring handling it via the mgo session, this seems obvious and practical.

~ro

--
Reed O'Brien
✉ [email protected]
✆ 415-562-6797
--
Juju-dev mailing list
[email protected]
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
