Hi Zeeshan,

Please try v2.2.26, where we fixed many of these problems. Please let me know how it goes.

Best Regards,

Luca Garulli
Founder & CEO
OrientDB LTD <http://orientdb.com/>

On 14 August 2017 at 13:15, Zeeshan Ahmad <[email protected]> wrote:

> Hey,
>
> I had two running master nodes and had to add another master node. My
> distributed config:
>
> {
>   "replication": true,
>   "hotAlignment": false,
>   "autoDeploy": true,
>   "readQuorum": 1,
>   "writeQuorum": "majority",
>   "executionMode": "synchronous",
>   "readYourWrites": true,
>   "newNodeStrategy": "dynamic",
>   "servers": {
>     "orientdbMaster1": "master",
>     "orientdbMaster2": "master",
>     "orientdbMaster3": "master"
>   },
>   "clusters": {
>     "internal": {
>     },
>     "*": {
>       "servers": ["<NEW_NODE>"]
>     }
>   }
> }
>
> 2017-08-14 10:44:44:254 WARNI [orientdbMaster1] Timeout (20001ms) on
> waiting for synchronous responses from nodes=[orientdbMaster2,
> orientdbMaster3] responsesSoFar=[orientdbMaster3] request=(id=1.263
> task=gossip timestamp: 1502707464247 lockManagerServer: orientdbMaster1)
> [ODistributedDatabaseImpl]
>
> As soon as the new machine joined the cluster, the following chain of
> events happened:
>
> 1. Added orientdbMaster3.
> 2. orientdbMaster3 started synchronising the database with orientdbMaster2.
> 3. During this time orientdbMaster2 became unreachable for
>    orientdbMaster1. I got this in the log continuously:
>
>    WARNI [orientdbMaster1] Timeout (20001ms) on waiting for synchronous
>    responses from nodes=[orientdbMaster2, orientdbMaster3]
>    responsesSoFar=[orientdbMaster3]
>    request=(id=1.263 task=gossip timestamp: 1502707464247 lockManagerServer:
>    orientdbMaster1) [ODistributedDatabaseImpl]
>
> 4. Writes were not possible as the quorum of 2 was not reached. All the
>    writes failed.
> 5. After orientdbMaster3 was up, orientdbMaster2 started to rebuild
>    the indexes. (Took a lot of time.)
>
> This caused a huge downtime.
>
> The same issue happens whenever a node which was the lock manager is
> restarted. The machine starts to get the entire database.
>
> Questions:
>
> 1. Why does the entire database need to be fetched again on every restart
>    of the lockManager node?
> 2. How is the new lock manager elected in the beginning, and what is the
>    process of re-election?
> 3. Can I tell the new node to get the database from a specific node?
> 4. Why are writes not possible on the node which is helping with the
>    re-sync of the database?
> 5. Why are the indices rebuilt whenever there is a re-sync?
>
> I wasted a lot of time when I added a new machine, and this caused a
> huge downtime as well.
>
> Thanks,
> Zeeshan
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
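
For anyone reading this thread later: the write failures in step 4 of the quoted report follow from the quorum arithmetic. The sketch below is only an illustration in Python, not OrientDB code. It assumes "majority" means the usual floor(N/2) + 1 over the configured masters, which matches the "quorum of 2" mentioned for three masters. The function names `majority_quorum` and `can_write` are hypothetical, and the count of one acknowledging node is an assumption taken from the report that the quorum was not reached.

```python
def majority_quorum(total_masters: int) -> int:
    """Usual majority rule: more than half of the configured masters.

    Assumption: this mirrors what "writeQuorum": "majority" implies;
    it is not taken from the OrientDB source code.
    """
    if total_masters < 1:
        raise ValueError("need at least one master")
    return total_masters // 2 + 1


def can_write(total_masters: int, acknowledging: int) -> bool:
    """A synchronous write succeeds only if enough nodes acknowledge it."""
    return acknowledging >= majority_quorum(total_masters)


if __name__ == "__main__":
    masters = ["orientdbMaster1", "orientdbMaster2", "orientdbMaster3"]
    needed = majority_quorum(len(masters))
    print(f"write quorum for {len(masters)} masters: {needed}")

    # Assumed situation during the sync: only one master acknowledges writes.
    print("writes possible with 1 ack:", can_write(len(masters), 1))
    # Once a second master acknowledges, the quorum is met again.
    print("writes possible with 2 acks:", can_write(len(masters), 2))
```

With the third master already listed in the config, the quorum is 2 from the moment it joins, so writes stall whenever fewer than two masters can acknowledge them.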
