Hi Alex,
We already have an issue on { hotAlignment:true } in distributed settings:

https://github.com/orientechnologies/orientdb/issues/2270

It's in our queue.

Lvc@



On 13 June 2014 18:18, alexpmorris <[email protected]> wrote:

> Been testing with the 1.7.4 snapshot, and I still can't get an OrientDB
> cluster to properly align itself after a node is removed and added back.
> I've tried on Windows, on Ubuntu, and on Ubuntu as root just to be sure.
> I've tried adjusting parameters in hazelcast.xml and
> default-distributed-db-config.json, still nothing. If I completely erase
> the db and let it recover, it generally will work. However, it will not
> properly sync if a record has been altered while the other node was down.
>
> Here is the log file of what happens (this was the 1.7.4 snapshot, on
> Ubuntu as root):
>
> 2014-06-13 12:02:07:505 INFO Loading configuration from:
> /home/test/orientdb/orientdb2/config/orientdb-dserver-config.xml...
> [OServerConfigurationLoaderXml]
> 2014-06-13 12:02:07:913 INFO OrientDB Server v1.7-SNAPSHOT (build UNKNOWN@r;
> 2014-06-12 18:25:56+0200) is starting up... [OServer]
> 2014-06-13 12:02:07:926 INFO Databases directory:
> /home/test/orientdb/orientdb2/databases [OServer]
> 2014-06-13 12:02:08:001 INFO Port 0.0.0.0:2424 busy, trying the next
> available... [OServerNetworkListener]
> 2014-06-13 12:02:08:002 INFO Listening binary connections on 0.0.0.0:2425
> (protocol v.21, socket=default) [OServerNetworkListener]
> 2014-06-13 12:02:08:002 INFO Port 0.0.0.0:2480 busy, trying the next
> available... [OServerNetworkListener]
> 2014-06-13 12:02:08:003 INFO Listening http connections on 0.0.0.0:2481
> (protocol v.10, socket=default) [OServerNetworkListener]
> 2014-06-13 12:02:08:015 INFO Installing dynamic plugin 'studio-1.7.zip'...
> [OServerPluginManager]
> 2014-06-13 12:02:08:146 INFO Installing GREMLIN language v.2.5.0 -
> graph.pool.max=50 [OGraphServerHandler]
> 2014-06-13 12:02:08:195 INFO Starting distributed server
> 'node1402673455127'... [OHazelcastPlugin]
> 2014-06-13 12:02:08:245 INFO Configuring Hazelcast from
> '/home/test/orientdb/orientdb2/config/hazelcast.xml'. [FileSystemXmlConfig]
> 2014-06-13 12:02:08:591 INFO null [orientdb] [3.2.1] Prefer IPv4 stack is
> true. [DefaultAddressPicker]
> 2014-06-13 12:02:08:623 INFO null [orientdb] [3.2.1] Picked
> Address[192.168.1.10]:2435, using socket
> ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=2435], bind any local is true
> [DefaultAddressPicker]
> 2014-06-13 12:02:08:775 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Hazelcast Community Edition 3.2.1 (20140428) starting at
> Address[192.168.1.10]:2435 [system]
> 2014-06-13 12:02:08:775 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Copyright (C) 2008-2014 Hazelcast.com [system]
> 2014-06-13 12:02:08:784 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Creating MulticastJoiner [Node]
> 2014-06-13 12:02:08:810 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Address[192.168.1.10]:2435 is STARTING [LifecycleService]
> 2014-06-13 12:02:09:010 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Connecting to /192.168.1.10:2434, timeout: 0, bind-any: true
> [SocketConnector]
> 2014-06-13 12:02:09:028 INFO [192.168.1.10]:2435 [orientdb] [3.2.1] 49736
> accepted socket connection from /192.168.1.10:2434
> [TcpIpConnectionManager]
> 2014-06-13 12:02:14:550 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
>
> Members [2] {
> Member [192.168.1.10]:2434
> Member [192.168.1.10]:2435 this
> }
>  [ClusterService]
> 2014-06-13 12:02:16:100 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Address[192.168.1.10]:2435 is STARTED [LifecycleService]
> 2014-06-13 12:02:16:117 INFO [node1402673455127] found no previous
> messages in queue orientdb.node.node1402673455127.response
> [OHazelcastDistributedMessageService]
> 2014-06-13 12:02:16:296 WARN [node1402673455127] opening database
> 'testdb'... [OHazelcastPlugin]
> 2014-06-13 12:02:16:302 INFO [node1402673455127] loaded database
> configuration from active cluster [OHazelcastPlugin]
> 2014-06-13 12:02:16:354 INFO updated distributed configuration for
> database: testdb:
> ----------
> {
>   "version":2,
>   "autoDeploy":true,
>   "hotAlignment":true,
>   "readQuorum":1,
>   "writeQuorum":2,
>   "failureAvailableNodesLessQuorum":false,
>   "readYourWrites":true,"clusters":{
>     "internal":null,
>     "index":null,
>     "*":{
>   "servers":["<NEW_NODE>","node1402673438702","node1402673455127"]
> }
>     }
> }
> ---------- [OHazelcastPlugin]
> 2014-06-13 12:02:16:375 WARN [node1402673455127] found 1 previous messages
> in queue orientdb.node.node1402673455127.testdb.request, aligning the
> database... [OHazelcastDistributedMessageService]
> 2014-06-13 12:02:18:854 WARN Storage testdb was not closed properly. Will
> try to restore from write ahead log. [OLocalPaginatedStorage]
> 2014-06-13 12:02:18:854 SEVE Restore is not possible because write ahead
> log is empty. [OLocalPaginatedStorage]
> 2014-06-13 12:02:18:927 INFO Storage data restore was completed
> [OLocalPaginatedStorage]
> 2014-06-13 12:02:22:321 WARN segment file 'database.ocf' was not closed
> correctly last time [OSingleFileSegment]
> 2014-06-13 12:02:22:334 WARN Can not restore 1 WAL master record for
> storage testdb [OWriteAheadLog]
> [node1402673455127]<-[node1402673438702] error on reading distributed
> request: record_update(#9:4 v.6)
> Error on creation of shared resource
> ->
> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> ->
> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> ->
> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> ->
> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> ->
> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> The record with id '#0:1' not found
> ->
> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> ->
> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> ->
> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> ->
> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> ->
> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> Storage testdb is not opened.
> ->
> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> ->
> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> ->
> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> ->
> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> ->
> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> ->
> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> ->
> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> 2014-06-13 12:02:22:850 INFO [node1402673455127] executed all pending
> tasks in queue, set restoringMessages=false and database 'testdb' as
> online... [OHazelcastDistributedDatabase$1]
> 2014-06-13 12:02:43:795 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 1/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:096 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 2/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:397 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 3/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:699 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 4/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:001 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 5/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:307 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 6/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:608 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 7/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:909 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 8/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:210 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 9/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:511 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 10/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:811 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 11/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:112 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 12/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:412 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 13/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:713 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 14/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:016 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 15/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:318 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 16/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:619 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 17/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:920 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 18/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:49:221 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 19/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:49:522 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 20/20 [ONetworkProtocolHttpDb]
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
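
For anyone trying to reproduce this, here is a minimal sketch (not part of OrientDB, and not from the original poster) that parses the distributed configuration exactly as it is printed in the log above and reports the settings that matter for realignment. The function name `summarize` and the `quorumReachable` check are illustrative assumptions, as is treating "<NEW_NODE>" as a placeholder rather than a real node name.

```python
import json

# Distributed configuration copied verbatim from the quoted log.
CONFIG_TEXT = """
{
  "version":2,
  "autoDeploy":true,
  "hotAlignment":true,
  "readQuorum":1,
  "writeQuorum":2,
  "failureAvailableNodesLessQuorum":false,
  "readYourWrites":true,"clusters":{
    "internal":null,
    "index":null,
    "*":{
  "servers":["<NEW_NODE>","node1402673438702","node1402673455127"]
}
    }
}
"""


def summarize(config_text):
    """Return the realignment-related settings from a config dump."""
    config = json.loads(config_text)
    servers = config["clusters"]["*"]["servers"]
    # Assumption: "<NEW_NODE>" is a placeholder entry, not a running node.
    named = [s for s in servers if s != "<NEW_NODE>"]
    return {
        "hotAlignment": config["hotAlignment"],
        "writeQuorum": config["writeQuorum"],
        "namedServers": named,
        # Hypothetical sanity check: can the write quorum be met
        # by the nodes that are explicitly named?
        "quorumReachable": config["writeQuorum"] <= len(named),
    }


if __name__ == "__main__":
    for key, value in summarize(CONFIG_TEXT).items():
        print(key, "=", value)
```

This only confirms that the configuration the node loaded has hotAlignment enabled and names both nodes; it says nothing about why the alignment itself fails.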
