I'm trying to set up OrientDB in distributed mode on AWS, behind an ELB, in 
an ASG.

So far so good, but the initial DB replication has only happened in one 
out of fifteen tries!
That is, after a new node has joined the cluster, only ONE DB is 
synchronized.

And the master becomes unresponsive! That is, OrientDB doesn't answer 
health checks from the ELB, and had I not taken the instance out of the 
ELB, it would have been destroyed.


So my two questions:
1. How do I initiate/start the replication? As in, make it synchronize ALL 
of the DBs?
    I can't, for several reasons, use "scp" to copy the database(s)!

2. How do I configure OrientDB to _NOT_ reject incoming queries while
    it syncs the DB(s)?


I've tried 2.1.7, 2.1.15 and 2.2.0-beta2, and in all three cases I had to 
upgrade the hazelcast-all.jar file to v3.6.2 (from Hazelcast.com) for the 
auto-discovery to work in the first place.

With the new jar file, "it just worked"! As in, the new instance joined the 
cluster as soon as it was started. However, it wouldn't get the database :(

My files:
========================
* default-distributed-db-config.json
{
  "replication": true,
  "autoDeploy": true,
  "hotAlignment": true,
  "resyncEvery": 15,
  "executionMode": "synchronous",
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "servers": {
    "*": "master"
  },
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "*": {
      "servers": ["<NEW_NODE>"]
    }
  }
}
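
To make it easier to see which servers each cluster entry actually names, here is a minimal Python sketch that loads the config above and lists the concrete server names per cluster. It assumes that "<NEW_NODE>" is only a placeholder for nodes that join later, and the function names (concrete_servers, clusters_below_write_quorum) are hypothetical helpers, not part of OrientDB. The quorum comparison is plain bookkeeping and does not claim to reproduce how OrientDB itself evaluates writeQuorum.

```python
import json

# The distributed config as posted above.
CONFIG_TEXT = """
{
  "replication": true,
  "autoDeploy": true,
  "hotAlignment": true,
  "resyncEvery": 15,
  "executionMode": "synchronous",
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "servers": {"*": "master"},
  "clusters": {
    "internal": {},
    "index": {},
    "*": {"servers": ["<NEW_NODE>"]}
  }
}
"""

# Assumption: this tag stands for "any node that joins later".
PLACEHOLDER = "<NEW_NODE>"


def concrete_servers(config):
    """Map each cluster that declares a server list to its non-placeholder names."""
    result = {}
    for name, cluster in config.get("clusters", {}).items():
        if "servers" in cluster:
            result[name] = [s for s in cluster["servers"] if s != PLACEHOLDER]
    return result


def clusters_below_write_quorum(config):
    """Names of clusters that list fewer concrete servers than writeQuorum."""
    quorum = config.get("writeQuorum", 1)
    return sorted(
        name
        for name, servers in concrete_servers(config).items()
        if len(servers) < quorum
    )


if __name__ == "__main__":
    cfg = json.loads(CONFIG_TEXT)
    print("concrete servers per cluster:", concrete_servers(cfg))
    print("clusters below writeQuorum:", clusters_below_write_quorum(cfg))
```

Running the same functions on the configuration that a node prints at startup shows which node names the cluster has actually registered for each cluster entry.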

========================
* hazelcast.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- !!! This file is autogenerated by Saltstack. All changes will be 
overridden -->

<!-- ~ Copyright (c) 2008-2012, Hazel Bilisim Ltd. All Rights Reserved. ~
        ~ Licensed under the Apache License, Version 2.0 (the "License"); ~ 
you may
        not use this file except in compliance with the License. ~ You may 
obtain
        a copy of the License at ~ ~ 
http://www.apache.org/licenses/LICENSE-2.0 ~
        ~ Unless required by applicable law or agreed to in writing, 
software ~ distributed
        under the License is distributed on an "AS IS" BASIS, ~ WITHOUT 
WARRANTIES
        OR CONDITIONS OF ANY KIND, either express or implied. ~ See the 
License for
        the specific language governing permissions and ~ limitations under 
the License. -->

<hazelcast
        xsi:schemaLocation="http://www.hazelcast.com/schema/config 
hazelcast-config-3.0.xsd"
        xmlns="http://www.hazelcast.com/schema/config" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <properties>
                <property name="hazelcast.icmp.enabled">true</property>
        </properties>
        <group>
                <name>orientdb</name>
                <password>orientdb</password>
        </group>
        <network>
                <port auto-increment="true">2434</port>
                <join>
                        <multicast enabled="false">
                                <multicast-group>235.1.1.1</multicast-group>
                                <multicast-port>2434</multicast-port>
                        </multicast>
                        <tcp-ip enabled="false">
                                <member>10.129.1.129:2434</member>
                        </tcp-ip>
                        <aws enabled="true">
                                <access-key>acCEss</access-key>
                                <secret-key>seCRet</secret-key>
                                <host-header>ec2.amazonaws.com</host-header>
                                <region>eu-west-1</region>

                                <!--
                                  
http://grepcode.com/file/repo1.maven.org/maven2/com.hazelcast/hazelcast-all/3.1.5/com/hazelcast/cluster/TcpIpJoinerOverAWS.java

                                  There are 2 mechanisms for filtering out 
AWS instances and
                                  these mechanisms can be combined (AND).

                                    1. If a securityGroup is configured 
only instances within
                                       that security group are selected.
                                    2. If a tag key/value is set only 
instances with that tag
                                       key/value will be selected.

                                  Once Hazelcast has figured out which 
instances are available,
                                  it will use the private ip addresses of 
these instances to
                                  create a tcp/ip-cluster.

                                
<security-group-name>sg-orientdb</security-group-name>
                                <tag-key>aws:autoscaling:groupName</tag-key>
                                <tag-value>orientdb-orientdb</tag-value>
                                -->
                        </aws>
                </join>
                <interfaces enabled="true">
                        <interface>10.129.1.129</interface>
                </interfaces>
        </network>
        <executor-service>
                <pool-size>16</pool-size>
        </executor-service>
</hazelcast>

========================
* orientdb-server-config.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orient-server>
    <handlers>
        <handler 
class="com.orientechnologies.orient.graph.handler.OGraphServerHandler">
            <parameters>
                <parameter value="true" name="enabled"/>
                <parameter value="50" name="graph.pool.max"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin">
            <parameters>
                <parameter value="true" name="enabled"/>
                <parameter value="orientdb-i-8d93ec05" name="nodeName"/>
                <parameter 
value="${ORIENTDB_HOME}/config/default-distributed-db-config.json" 
name="configuration.db.default"/>
                <parameter value="${ORIENTDB_HOME}/config/hazelcast.xml" 
name="configuration.hazelcast"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.handler.OJMXPlugin">
            <parameters>
                <parameter value="false" name="enabled"/>
                <parameter value="true" name="profilerManaged"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.handler.OAutomaticBackup">
            <parameters>
                <parameter value="false" name="enabled"/>
                <parameter value="4h" name="delay"/>
                <parameter value="23:00:00" name="firstTime"/>
                <parameter value="backup" name="target.directory"/>
                <parameter value="${DBNAME}-${DATE:yyyyMMddHHmmss}.zip" 
name="target.fileName"/>
                <parameter value="9" name="compressionLevel"/>
                <parameter value="1048576" name="bufferSize"/>
                <parameter value="" name="db.include"/>
                <parameter value="GratefulDeadConcerts" name="db.exclude"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.handler.OServerSideScriptInterpreter">
            <parameters>
                <parameter value="true" name="enabled"/>
                <parameter value="SQL" name="allowedLanguages"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.token.OrientTokenHandler">
            <parameters>
                <parameter value="false" name="enabled"/>
                <parameter value="" name="oAuth2Key"/>
                <parameter value="60" name="sessionLength"/>
                <parameter value="HmacSHA256" name="encryptionAlgorithm"/>
            </parameters>
        </handler>
        <handler 
class="com.orientechnologies.orient.server.plugin.livequery.OLiveQueryPlugin">
            <parameters>
                <parameter value="false" name="enabled"/>
            </parameters>
        </handler>
    </handlers>
    <network>
        <sockets>
            <socket 
implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory"
 
name="ssl">
                <parameters>
                    <parameter value="false" name="network.ssl.clientAuth"/>
                    <parameter value="config/cert/orientdb.ks" 
name="network.ssl.keyStore"/>
                    <parameter value="password" 
name="network.ssl.keyStorePassword"/>
                    <parameter value="config/cert/orientdb.ks" 
name="network.ssl.trustStore"/>
                    <parameter value="password" 
name="network.ssl.trustStorePassword"/>
                </parameters>
            </socket>
            <socket 
implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory"
 
name="https">
                <parameters>
                    <parameter value="false" name="network.ssl.clientAuth"/>
                    <parameter value="config/cert/orientdb.ks" 
name="network.ssl.keyStore"/>
                    <parameter value="password" 
name="network.ssl.keyStorePassword"/>
                    <parameter value="config/cert/orientdb.ks" 
name="network.ssl.trustStore"/>
                    <parameter value="password" 
name="network.ssl.trustStorePassword"/>
                </parameters>
            </socket>
        </sockets>
        <protocols>
            <protocol 
implementation="com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary"
 
name="binary"/>
            <protocol 
implementation="com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpDb"
 
name="http"/>
        </protocols>
        <listeners>
            <listener protocol="binary" socket="default" 
port-range="2424-2430" ip-address="0.0.0.0"/>
            <listener protocol="http" socket="default" 
port-range="2480-2490" ip-address="0.0.0.0">
                <commands>
                    <command 
implementation="com.orientechnologies.orient.server.network.protocol.http.command.get.OServerCommandGetStaticContent"
 
pattern="GET|www GET|studio/ GET| GET|*.htm GET|*.html 
GET|*.xml GET|*.jpeg GET|*.jpg GET|*.png GET|*.gif GET|*.js GET|*.css 
GET|*.swf GET|*.ico GET|*.txt GET|*.otf GET|*.pjs GET|*.svg GET|*.json 
GET|*.woff GET|*.woff2 GET|*.ttf GET|*.svgz" stateful="false">
                        <parameters>
                            <entry value="Cache-Control: no-cache, 
no-store, max-age=0, must-revalidate\r\nPragma: no-cache" 
name="http.cache:*.htm *.html"/>
                            <entry value="Cache-Control: max-age=120" 
name="http.cache:default"/>
                        </parameters>
                    </command>
                    <command 
implementation="com.orientechnologies.orient.graph.server.command.OServerCommandGetGephi"
 
pattern="GET|gephi/*" stateful="false"/>
                </commands>
                <parameters>
                    <parameter value="utf-8" name="network.http.charset"/>
                    <parameter value="true" 
name="network.http.jsonResponseError"/>
                </parameters>
            </listener>
        </listeners>
    </network>
    <storages/>
    <users>
        <user resources="*" password="secret" name="root"/>
        <user resources="connect,server.listDatabases,server.dblist" 
password="guest" name="guest"/>
        <user resources="database.passthrough" password="SeCrEt" 
name="replicator"/>
    </users>
    <properties>
        <entry value="1" name="db.pool.min"/>
        <entry value="50" name="db.pool.max"/>
        <entry value="true" name="profiler.enabled"/>
        <entry value="info" name="log.console.level"/>
        <entry value="fine" name="log.file.level"/>
    </properties>
</orient-server>


After adding a new node, I get this on the "master" (10.129.1.48):

2016-04-12 10:41:38:483 INFO  [10.129.1.48]:2434 [orientdb] [3.6.2] 
Accepting socket connection from /10.129.1.129:36103 [SocketAcceptorThread]
2016-04-12 10:41:38:485 INFO  [10.129.1.48]:2434 [orientdb] [3.6.2] 
Established socket connection between /10.129.1.48:2434 and 
/10.129.1.129:36103 [TcpIpConnectionManager]
2016-04-12 10:41:45:505 INFO  [10.129.1.48]:2434 [orientdb] [3.6.2]

Members [3] {
        Member [10.129.1.219]:2434
        Member [10.129.1.48]:2434 this
        Member [10.129.1.129]:2434
}
 [ClusterService]
2016-04-12 10:41:45:505 SEVER [orientdb-i-346e11bc] Cannot find node with 
id '3082da3b-c329-49f3-8561-2053ac5bbe21' [OHazelcastPlugin]
2016-04-12 10:41:45:505 WARNI [orientdb-i-346e11bc] added new node 
id=Member [10.129.1.129]:2434 name=ext:3082da3b-c329-49f3-8561-2053ac5bbe21 
[OHazelcastPlugin]
2016-04-12 10:41:47:558 INFO  [orientdb-i-346e11bc]<-[orientdb-i-8d93ec05] 
added node configuration id=Member [10.129.1.129]:2434 
name=orientdb-i-8d93ec05, now 3 nodes are configured [OHazelcastPlugin]
2016-04-12 10:41:47:569 INFO  [orientdb-i-346e11bc] Current node started as 
MASTER for database 'db_1' [OHazelcastPlugin]
2016-04-12 10:41:47:569 INFO  [orientdb-i-346e11bc] Current node started as 
MASTER for database 'db_2' [OHazelcastPlugin]


And on the new node:

2016-04-12 10:41:41:267 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Creating AWSJoiner [Node]
2016-04-12 10:41:41:272 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.129]:2434 is STARTING [LifecycleService]
2016-04-12 10:41:41:359 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
TcpIpConnectionManager configured with Non Blocking IO-threading model: 3 
input threads and 3 output threads [NonBlockingIOThreadingModel]
2016-04-12 10:41:41:855 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.129:2435, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:857 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.129:2435. Reason: SocketException[Connection 
refused to address /10.129.1.129:2435] [InitConnectionTask]
2016-04-12 10:41:41:857 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.129]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:858 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.129:2436, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:866 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.219:2434, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:868 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.129:2436. Reason: SocketException[Connection 
refused to address /10.129.1.129:2436] [InitConnectionTask]
2016-04-12 10:41:41:880 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.219:2435, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:881 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.0.14:2434, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:881 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Established socket connection between /10.129.1.129:59102 and 
/10.129.1.219:2434 [TcpIpConnectionManager]
2016-04-12 10:41:41:882 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.219:2436, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:883 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.129]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:885 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.0.14:2434. Reason: SocketException[Connection 
refused to address /10.129.0.14:2434] [InitConnectionTask]
2016-04-12 10:41:41:885 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.0.14]:2434 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:887 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.219:2436. Reason: SocketException[Connection 
refused to address /10.129.1.219:2436] [InitConnectionTask]
2016-04-12 10:41:41:887 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.219]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:890 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.219:2435. Reason: SocketException[Connection 
refused to address /10.129.1.219:2435] [InitConnectionTask]
2016-04-12 10:41:41:890 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.219]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:892 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.48:2434, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:893 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Established socket connection between /10.129.1.129:36103 and 
/10.129.1.48:2434 [TcpIpConnectionManager]
2016-04-12 10:41:41:896 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.48:2435, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:897 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.48:2435. Reason: SocketException[Connection 
refused to address /10.129.1.48:2435] [InitConnectionTask]
2016-04-12 10:41:41:897 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.48]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:899 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.0.14:2436, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:901 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.1.48:2436, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:902 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.1.48:2436. Reason: SocketException[Connection 
refused to address /10.129.1.48:2436] [InitConnectionTask]
2016-04-12 10:41:41:902 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.48]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:906 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Connecting to /10.129.0.14:2435, timeout: 0, bind-any: true 
[InitConnectionTask]
2016-04-12 10:41:41:906 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.0.14:2435. Reason: SocketException[Connection 
refused to address /10.129.0.14:2435] [InitConnectionTask]
2016-04-12 10:41:41:907 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.0.14]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:910 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] Could 
not connect to: /10.129.0.14:2436. Reason: SocketException[Connection 
refused to address /10.129.0.14:2436] [InitConnectionTask]
2016-04-12 10:41:41:910 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.0.14]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:48:921 WARNI [10.129.1.129]:2434 [orientdb] [3.6.2] 
Ignoring received partition table, startup is not completed yet. Sender: 
Address[10.129.1.219]:2434 [InternalPartitionService]
2016-04-12 10:41:48:924 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2]

Members [3] {
        Member [10.129.1.219]:2434
        Member [10.129.1.48]:2434
        Member [10.129.1.129]:2434 this
}
 [ClusterService]
2016-04-12 10:41:50:942 INFO  [10.129.1.129]:2434 [orientdb] [3.6.2] 
Address[10.129.1.129]:2434 is STARTED [LifecycleService]
2016-04-12 10:41:50:943 INFO  Starting distributed server 
'orientdb-i-8d93ec05' (hzID=3082da3b-c329-49f3-8561-2053ac5bbe21) 
dbDir='/opt/orientdb-enterprise-2.1.15/databases/'... [OHazelcastPlugin]
2016-04-12 10:41:50:985 INFO  [orientdb-i-8d93ec05] found no previous 
messages in queue orientdb.node.orientdb-i-8d93ec05.response 
[OHazelcastDistributedMessageService]
2016-04-12 10:41:50:992 INFO  [orientdb-i-8d93ec05] loaded database 
configuration from active cluster [OHazelcastPlugin]
2016-04-12 10:41:51:002 WARNI [orientdb-i-8d93ec05] updated distributed 
configuration for database: db_1:
----------
{
  "version": 1,
  "replication": true,
  "autoDeploy": true,
  "hotAlignment": true,
  "resyncEvery": 15,
  "executionMode": "synchronous",
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "servers": {
    "*": "master"
  },
  "clusters": {
    "internal": {
    },
    "index": {
    },
    "*": {
      "servers": ["orientdb-i-8d93ec05","<NEW_NODE>"]
    }
  }
}
---------- [OHazelcastPlugin]
2016-04-12 10:41:51:009 INFO  [orientdb-i-8d93ec05] Current node started as 
MASTER for database 'db_1' [OHazelcastPlugin]
2016-04-12 10:41:51:108 WARNI [orientdb-i-8d93ec05]->[[]] requesting deploy 
of database 'db_1' on local server... [OHazelcastPlugin]
2016-04-12 10:41:51:112 SEVER [orientdb-i-8d93ec05] No nodes configured for 
partition 'db_1.null' request: id=-1 from=orientdb-i-8d93ec05 
task=deploy_db [OHazelcastPlugin][orientdb-i-8d93ec05] Error on starting 
distributed plugin
com.orientechnologies.orient.server.distributed.ODistributedException: No 
nodes configured for partition 'db_1.null' request: id=-1 
from=orientdb-i-8d93ec05 task=deploy_db
        at 
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:347)
        at 
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.requestDatabase(OHazelcastPlugin.java:944)
        at 
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:906)
        at 
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:1484)
        at 
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:175)
        at 
com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:993)
        at 
com.orientechnologies.orient.server.OServer.activate(OServer.java:336)
        at 
com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:41)
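
Since the log excerpts above are long and wrapped by the mail client, a small Python sketch like the following could help pull out just the SEVER entries. It assumes every record starts with a timestamp of the form "YYYY-MM-DD HH:MM:SS:mmm" followed by a level word, as in the excerpts, and that any other non-empty line continues the previous record. The helper names (parse_records, severe) are made up for illustration.

```python
import re

# A record starts with a timestamp and a level, e.g.
# "2016-04-12 10:41:51:112 SEVER [node] message..."
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) (\w+)\s+(.*)$"
)


def parse_records(text):
    """Group wrapped log lines into records with timestamp, level and message."""
    records = []
    for line in text.splitlines():
        match = RECORD_START.match(line)
        if match:
            timestamp, level, message = match.groups()
            records.append(
                {"timestamp": timestamp, "level": level, "message": message.strip()}
            )
        elif records and line.strip():
            # Continuation of a wrapped line: append to the previous record.
            records[-1]["message"] += " " + line.strip()
    return records


def severe(records):
    """Keep only the records logged at the SEVER level."""
    return [r for r in records if r["level"] == "SEVER"]


# Three records copied from the new node's log above.
SAMPLE = """\
2016-04-12 10:41:51:009 INFO  [orientdb-i-8d93ec05] Current node started as
MASTER for database 'db_1' [OHazelcastPlugin]
2016-04-12 10:41:51:108 WARNI [orientdb-i-8d93ec05]->[[]] requesting deploy
of database 'db_1' on local server... [OHazelcastPlugin]
2016-04-12 10:41:51:112 SEVER [orientdb-i-8d93ec05] No nodes configured for
partition 'db_1.null' request: id=-1 from=orientdb-i-8d93ec05
task=deploy_db [OHazelcastPlugin]
"""

if __name__ == "__main__":
    for record in severe(parse_records(SAMPLE)):
        print(record["timestamp"], record["message"])
```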


PS: 10.129.1.193 and 10.129.1.213 are the ELBs, 10.129.0.14 is the NAT 
instance, 10.129.1.129/orientdb-i-8d93ec05 is the new node, and 
10.129.1.48/orientdb-i-346e11bc is (or is supposed to be) the "master" node.
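
To relate the addresses in this PS to the Hazelcast join log, the following Python sketch groups the "added to the blacklist" messages by IP address. It only assumes the message wording seen in the new node's log; the function name blacklisted_ports is hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "Address[10.129.0.14]:2434 is added to the blacklist."
BLACKLISTED = re.compile(
    r"Address\[(\d+\.\d+\.\d+\.\d+)\]:(\d+) is added to the blacklist"
)


def blacklisted_ports(log_text):
    """Return {ip: sorted list of ports} for every blacklisted address."""
    ports = defaultdict(list)
    for ip, port in BLACKLISTED.findall(log_text):
        ports[ip].append(int(port))
    return {ip: sorted(found) for ip, found in ports.items()}


# A few messages copied from the new node's log above.
SAMPLE = """\
Address[10.129.1.129]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
Address[10.129.0.14]:2434 is added to the blacklist. [TcpIpJoinerOverAWS]
Address[10.129.0.14]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
"""

if __name__ == "__main__":
    for ip, found in blacklisted_ports(SAMPLE).items():
        print(ip, found)
```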

What "worries" me is the "No nodes configured for partition 'db_1'..." part!
