Hi,

Thanks for the prompt reply. The process probably died because some other application was occupying a large amount of memory.
Unfortunately, I deleted all the data and reset the cluster with ArangoDB 3.0.6, since it has a fix in ArangoImp that I was looking for. The only log message from the coordinator that I can remember on restart was something like "Cannot connect to primary DB server at IP:Port". It looked like it was not able to resolve it. If the error occurs again, I'll make sure to save the logs.

Thanks,
Praveen

On Tuesday, September 6, 2016 at 2:11:24 PM UTC+5:30, [email protected] wrote:
>
> Hi,
>
> 1. Did the process die for a good reason, meaning that you maybe limited
> memory via cgroups, or would you consider this a bug? If so, it would be
> very interesting to know more details.
>
> 2. Do you have any log output from the restart?
>
> What should happen:
>
> The DBServer restarts and connects to the agency. It tries to look up its
> internal ID via the user-supplied --cluster.my-local-info and should
> reintegrate itself into the cluster.
>
> This is an integral part of our resilience tests, which we execute very
> regularly. You seem to have hit an edge case or seem to have a problem with
> your setup. In any case, logs and startup options would be HIGHLY
> appreciated to resolve this issue :)
>
> Kind regards,
>
> Andreas Streichardt
>
> On Tuesday, 6 September 2016 at 08:13:58 UTC+2, [email protected] wrote:
>>
>> I've set up an ArangoDB cluster on three machines, following the
>> instructions here:
>> https://docs.arangodb.com/3.0/Manual/Deployment/Distributed.html
>>
>> Then I ingested documents using ArangoImp and it worked perfectly, with
>> good performance.
>>
>> After about a day, one of the primary DB servers went down with the
>> error below:
>>
>> 2016-09-06T02:44:24Z [29600] ERROR {threads} could not start thread 'DispatcherStd': Cannot allocate memory
>> 2016-09-06T02:44:24Z [29600] FATAL cannot start dispatcher thread
>>
>> When I restarted the processes on the DB server machine, it was
>> not detected by the coordinator machine.
>>
>> Are there any guidelines for bringing the server back up when one of the
>> machines goes down?
>>
>> However, if I remove all the agency, primary, and coordinator
>> directories and restart all the processes from scratch, ArangoDB works
>> properly.
>>
