Re: [orientdb] question regarding synchronisation with version 2.2.25 on adding a new machine in a cluster

Luca Garulli Fri, 18 Aug 2017 15:46:55 -0700

Hi Zeeshan,

Please try v2.2.26 where we fixed many of these problems. Please let me
know.


Best Regards,

Luca Garulli
Founder & CEO
OrientDB LTD <http://orientdb.com/>

On 14 August 2017 at 13:15, Zeeshan Ahmad <[email protected]> wrote:

> Hey,
>
> I had two running master nodes and had to add another master node. My
> distributed config:
>
> {
>   "replication": true,
>   "hotAlignment" : false,
>   "autoDeploy": true,
>   "readQuorum": 1,
>   "writeQuorum": "majority",
>   "executionMode": "synchronous",
>   "readYourWrites": true,
>   "newNodeStrategy": "dynamic",
>   "servers": {
>     "orientdbMaster1": "master",
>     "orientdbMaster2": "master",
>     "orientdbMaster3": "master"
>   },
>   "clusters": {
>     "internal": {
>     },
>     "*": {
>       "servers": ["<NEW_NODE>"]
>     }
>   }
> }
>
>
> 2017-08-14 10:44:44:254 WARNI [orientdbMaster1] Timeout (20001ms) on
> waiting for synchronous responses from nodes=[orientdbMaster2,
> orientdbMaster3] responsesSoFar=[orientdbMaster3] request=(id=1.263
> task=gossip timestamp: 1502707464247 lockManagerServer: orientdbMaster1)
> [ODistributedDatabaseImpl]
>
> As soon as the new machine joined the cluster, the following chain of events
> happened:
>
> 1. Added orientdbMaster3
> 2. orientdbMaster3 started synchronising the database with orientdbMaster2
> 3. During this time orientdbMaster2 became unreachable for
> orientdbMaster1. This warning appeared in the log continuously:
>
> WARNI [orientdbMaster1] Timeout (20001ms) on waiting for synchronous
> responses from nodes=[orientdbMaster2, orientdbMaster3] 
> responsesSoFar=[orientdbMaster3]
> request=(id=1.263 task=gossip timestamp: 1502707464247 lockManagerServer:
> orientdbMaster1) [ODistributedDatabaseImpl]
>
> 4. Writes were not possible as the quorum of 2 was not reached. All the
> writes failed.
> 5. After orientdbMaster3 was up, orientdbMaster2 started to rebuild
> the indexes, which took a lot of time.
>
>
> This caused a huge downtime.
>
> The same issue happens whenever the node that is the lock manager is
> restarted: the machine starts fetching the entire database again.
>
> Questions:
>
> 1. Why does the entire database need to be fetched again on every restart
> of the lock manager node?
> 2. How is the lock manager elected initially, and what is the
> re-election process?
> 3. Can I make the new node fetch the database from a specific node?
> 4. Why are writes not possible on the node that is helping to re-sync the
> database?
> 5. Why are the indexes rebuilt whenever there is a re-sync?
>
> I lost a lot of time when I added the new machine, and it caused a
> huge downtime as well.
>
>
> Thanks,
> Zeeshan
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
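
For readers following the quorum arithmetic in the quoted message: with three
masters configured and "writeQuorum": "majority", the quorum of 2 mentioned in
step 4 follows from the usual majority rule. The sketch below is only an
illustration of that arithmetic. The function names are hypothetical, and it
assumes the majority is counted over all three configured masters; it is not
OrientDB's actual implementation.

```python
def majority_quorum(configured_nodes: int) -> int:
    """Smallest number of nodes that is more than half of the configured nodes."""
    return configured_nodes // 2 + 1


def write_can_succeed(configured_nodes: int, responding_nodes: int) -> bool:
    """A write is accepted only if enough nodes acknowledge it."""
    return responding_nodes >= majority_quorum(configured_nodes)


# Hypothetical scenario modelled on the report: three masters configured,
# but only one node able to acknowledge writes while the others were
# busy with the synchronisation.
configured = 3
print(majority_quorum(configured))        # quorum needed for 3 masters
print(write_can_succeed(configured, 1))   # only one node acknowledging
print(write_can_succeed(configured, 2))   # two nodes acknowledging
```

Under this assumption, a three-master cluster tolerates only one node being
unable to acknowledge writes at a time, which matches the write failures
described while orientdbMaster2 and orientdbMaster3 were both occupied.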

