Hey,
I had two running master nodes and had to add a third master node. My
distributed config:
{
  "replication": true,
  "hotAlignment": false,
  "autoDeploy": true,
  "readQuorum": 1,
  "writeQuorum": "majority",
  "executionMode": "synchronous",
  "readYourWrites": true,
  "newNodeStrategy": "dynamic",
  "servers": {
    "orientdbMaster1": "master",
    "orientdbMaster2": "master",
    "orientdbMaster3": "master"
  },
  "clusters": {
    "internal": {
    },
    "*": {
      "servers": ["<NEW_NODE>"]
    }
  }
}
2017-08-14 10:44:44:254 WARNI [orientdbMaster1] Timeout (20001ms) on
waiting for synchronous responses from nodes=[orientdbMaster2,
orientdbMaster3] responsesSoFar=[orientdbMaster3] request=(id=1.263
task=gossip timestamp: 1502707464247 lockManagerServer: orientdbMaster1)
[ODistributedDatabaseImpl]
As soon as the new machine joined the cluster, the following chain of events
happened:
1. Added orientdbMaster3
2. orientdbMaster3 started synchronising the database with orientdbMaster2
3. During this time orientdbMaster2 became unreachable from orientdbMaster1.
This appeared in the log continuously:
WARNI [orientdbMaster1] Timeout (20001ms) on waiting for synchronous
responses from nodes=[orientdbMaster2, orientdbMaster3]
responsesSoFar=[orientdbMaster3] request=(id=1.263 task=gossip timestamp:
1502707464247 lockManagerServer: orientdbMaster1) [ODistributedDatabaseImpl]
4. Writes were not possible because the write quorum of 2 was not reached,
so all writes failed.
5. After orientdbMaster3 was up, orientdbMaster2 started to rebuild the
indexes, which took a long time.
This caused a huge downtime.
The same issue happens whenever the node that was the lock manager is
restarted: the machine starts fetching the entire database again.
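To make the quorum arithmetic in step 4 concrete, here is a minimal sketch. It assumes that "majority" means floor(N/2) + 1 of the configured master servers, which is how I understand the setting; the function names are hypothetical and are not part of any OrientDB API.

```python
def majority_quorum(total_masters: int) -> int:
    """Smallest number of masters that forms a strict majority (assumed rule)."""
    return total_masters // 2 + 1


def write_allowed(total_masters: int, online_masters: int) -> bool:
    """A write succeeds only if enough masters are online to meet the quorum."""
    return online_masters >= majority_quorum(total_masters)


# Three configured masters need 2 online for a write to go through.
print(majority_quorum(3))
# If only one master counts as online, the write is rejected.
print(write_allowed(3, 1))
# With two masters online, the quorum is met.
print(write_allowed(3, 2))
```

Under this assumption, registering the third master raises the number of configured servers to 3 while the quorum stays at 2, so writes fail as soon as fewer than two masters count as online.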
Questions:
1. Why does the entire database need to be fetched again on every restart
of the lockManager node?
2. How is the lock manager elected initially, and what is the process of
re-election?
3. Can I specify which node the new node fetches the database from?
4. Why are writes not possible on the node that is serving the re-sync of
the database?
5. Why are the indexes rebuilt whenever there is a re-sync?
I wasted a lot of time when I added the new machine, and it caused a
huge downtime as well.
Thanks,
Zeeshan