Did you take a stack trace of your _server_ and see if the fragment I posted is the place a bunch of threads are stuck? If so, then it's what I mentioned, and the patch I pointed to should fix it up (when it's ready)...
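
For anyone following along, here is a rough sketch of one way to do that check. It assumes the dump was captured as plain text with the JDK's `jstack <pid>`, and that thread entries are separated by blank lines and begin with the quoted thread name. The function name `count_threads_at` is made up for illustration, and the fragment to search for is whatever frame you are looking for; none is filled in here.

```python
import re


def count_threads_at(dump_text, fragment):
    """Return (matching, total) thread entries in a jstack-style dump.

    A thread entry counts as matching when `fragment` (for example a
    fully qualified method name) appears anywhere in its stack.
    """
    # Assumption: entries are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", dump_text.strip())
    # Assumption: a thread entry starts with the quoted thread name;
    # other blocks (header, JNI summary, lock sections) are ignored.
    threads = [b for b in blocks if b.lstrip().startswith('"')]
    matching = [b for b in threads if fragment in b]
    return len(matching), len(threads)
```

If most of the threads come back as matching on the same frame, and a second dump taken a little later shows the same picture, that points to the threads being stuck there rather than just passing through.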
The fact that it hangs more frequently with replication > 1 is consistent with the JIRA.

Shawn: Thanks, you beat me to the punch for clarifying "replication"!

Best
Erick

On Sun, Jun 2, 2013 at 12:41 PM, Yago Riveiro <yago.rive...@gmail.com> wrote:
> Shawn:
>
> replicationFactor higher than one yes.
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
>
> On Sunday, June 2, 2013 at 4:07 PM, Shawn Heisey wrote:
>
>> On 6/2/2013 8:28 AM, Yago Riveiro wrote:
>> > Erick:
>> >
>> > In my case, when server hangs, no exception is thrown, the logs on both
>> > servers stop registering the update INFO messages. if a shutdown one node,
>> > immediately the log of the alive node register some update INFO messages
>> > that appears was stuck at some place on the update operation.
>> >
>> > Other thing that I notice is the fact that the cluster hangs more
>> > frequently when the collection has replication.
>>
>> Just to clarify, you are talking about a replicationFactor higher than
>> one, not old-style master-slave replication, correct? I'm pretty sure
>> that's the case, I'm just trying to keep this topic from getting derailed.
>>
>> Thanks,
>> Shawn