I'm trying to set up OrientDB in distributed mode on AWS, behind an ELB, in
an ASG.
So far so good, but the initial DB replication has only happened in one
out of fifteen tries!
That is, after a new node has joined the cluster, only ONE DB is
synchronized.
And the master becomes unresponsive! I.e., OrientDB wouldn't answer a health
check from the ELB, and had I not taken the instance out of the ELB, it
would have been destroyed.
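To tell whether the node really refuses connections or is just slow to answer, the check can be reproduced by hand. A minimal sketch, assuming the ELB health check is a plain TCP connect (it may be configured as HTTP instead); `tcp_probe` is a hypothetical helper name and the two-second timeout is arbitrary:

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: this mimics a TCP-type health check; it says nothing about
    whether OrientDB would actually answer a query on that port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


# Example (not executed here): probe the "master" while a new node is syncing.
# for port in (2424, 2480):
#     print(port, tcp_probe("10.129.1.48", port))
```

Ports 2424 and 2480 are the first ports of the binary and HTTP listener ranges in my orientdb-server-config.xml below.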
So my two questions:
1. How do I initiate/start the replication? As in, make it synchronize ALL
of the DBs?
I can't, for several reasons, use "scp" to copy the database(s)!
2. How do I configure OrientDB to _NOT_ reject incoming queries while
it syncs the DB(s)?
I've tried 2.1.7, 2.1.15 and 2.2.0-beta2, and in all three cases I had to
upgrade the hazelcast-all.jar file to v3.6.2 (downloaded from Hazelcast.com)
for the auto-discovery to work in the first place.
With the new jar file, "it just worked"! As in, the new instance joined the
cluster as soon as it was started. However, it wouldn't get the database :(
My files:
========================
* default-distributed-db-config.json
{
    "replication": true,
    "autoDeploy": true,
    "hotAlignment": true,
    "resyncEvery": 15,
    "executionMode": "synchronous",
    "readQuorum": 1,
    "writeQuorum": 2,
    "failureAvailableNodesLessQuorum": false,
    "readYourWrites": true,
    "servers": {
        "*": "master"
    },
    "clusters": {
        "internal": {
        },
        "index": {
        },
        "*": {
            "servers": ["<NEW_NODE>"]
        }
    }
}
========================
* hazelcast.xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- !!! This file is autogenerated by Saltstack. All changes will be
overridden -->
<!-- ~ Copyright (c) 2008-2012, Hazel Bilisim Ltd. All Rights Reserved. ~
~ Licensed under the Apache License, Version 2.0 (the "License"); ~
you may
not use this file except in compliance with the License. ~ You may
obtain
a copy of the License at ~ ~
http://www.apache.org/licenses/LICENSE-2.0 ~
~ Unless required by applicable law or agreed to in writing,
software ~ distributed
under the License is distributed on an "AS IS" BASIS, ~ WITHOUT
WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. ~ See the
License for
the specific language governing permissions and ~ limitations under
the License. -->
<hazelcast
xsi:schemaLocation="http://www.hazelcast.com/schema/config
hazelcast-config-3.0.xsd"
xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<properties>
<property name="hazelcast.icmp.enabled">true</property>
</properties>
<group>
<name>orientdb</name>
<password>orientdb</password>
</group>
<network>
<port auto-increment="true">2434</port>
<join>
<multicast enabled="false">
<multicast-group>235.1.1.1</multicast-group>
<multicast-port>2434</multicast-port>
</multicast>
<tcp-ip enabled="false">
<member>10.129.1.129:2434</member>
</tcp-ip>
<aws enabled="true">
<access-key>acCEss</access-key>
<secret-key>seCRet</secret-key>
<host-header>ec2.amazonaws.com</host-header>
<region>eu-west-1</region>
<!--
http://grepcode.com/file/repo1.maven.org/maven2/com.hazelcast/hazelcast-all/3.1.5/com/hazelcast/cluster/TcpIpJoinerOverAWS.java
There are 2 mechanisms for filtering out
AWS instances and
these mechanisms can be combined (AND).
1. If a securityGroup is configured
only instanced within
that security group are selected.
2. If a tag key/value is set only
instances with that tag
key/value will be selected.
Once Hazelcast has figured out which
instances are available,
it will use the private ip addresses of
these instances to
create a tcp/ip-cluster.
<security-group-name>sg-orientdb</security-group-name>
<tag-key>aws:autoscaling:groupName</tag-key>
<tag-value>orientdb-orientdb</tag-value>
-->
</aws>
</join>
<interfaces enabled="true">
<interface>10.129.1.129</interface>
</interfaces>
</network>
<executor-service>
<pool-size>16</pool-size>
</executor-service>
</hazelcast>
========================
* orientdb-server-config.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<orient-server>
<handlers>
<handler
class="com.orientechnologies.orient.graph.handler.OGraphServerHandler">
<parameters>
<parameter value="true" name="enabled"/>
<parameter value="50" name="graph.pool.max"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin">
<parameters>
<parameter value="true" name="enabled"/>
<parameter value="orientdb-i-8d93ec05" name="nodeName"/>
<parameter
value="${ORIENTDB_HOME}/config/default-distributed-db-config.json"
name="configuration.db.default"/>
<parameter value="${ORIENTDB_HOME}/config/hazelcast.xml"
name="configuration.hazelcast"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.handler.OJMXPlugin">
<parameters>
<parameter value="false" name="enabled"/>
<parameter value="true" name="profilerManaged"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.handler.OAutomaticBackup">
<parameters>
<parameter value="false" name="enabled"/>
<parameter value="4h" name="delay"/>
<parameter value="23:00:00" name="firstTime"/>
<parameter value="backup" name="target.directory"/>
<parameter value="${DBNAME}-${DATE:yyyyMMddHHmmss}.zip"
name="target.fileName"/>
<parameter value="9" name="compressionLevel"/>
<parameter value="1048576" name="bufferSize"/>
<parameter value="" name="db.include"/>
<parameter value="GratefulDeadConcerts" name="db.exclude"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.handler.OServerSideScriptInterpreter">
<parameters>
<parameter value="true" name="enabled"/>
<parameter value="SQL" name="allowedLanguages"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.token.OrientTokenHandler">
<parameters>
<parameter value="false" name="enabled"/>
<parameter value="" name="oAuth2Key"/>
<parameter value="60" name="sessionLength"/>
<parameter value="HmacSHA256" name="encryptionAlgorithm"/>
</parameters>
</handler>
<handler
class="com.orientechnologies.orient.server.plugin.livequery.OLiveQueryPlugin">
<parameters>
<parameter value="false" name="enabled"/>
</parameters>
</handler>
</handlers>
<network>
<sockets>
<socket
implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory"
name="ssl">
<parameters>
<parameter value="false" name="network.ssl.clientAuth"/>
<parameter value="config/cert/orientdb.ks"
name="network.ssl.keyStore"/>
<parameter value="password"
name="network.ssl.keyStorePassword"/>
<parameter value="config/cert/orientdb.ks"
name="network.ssl.trustStore"/>
<parameter value="password"
name="network.ssl.trustStorePassword"/>
</parameters>
</socket>
<socket
implementation="com.orientechnologies.orient.server.network.OServerSSLSocketFactory"
name="https">
<parameters>
<parameter value="false" name="network.ssl.clientAuth"/>
<parameter value="config/cert/orientdb.ks"
name="network.ssl.keyStore"/>
<parameter value="password"
name="network.ssl.keyStorePassword"/>
<parameter value="config/cert/orientdb.ks"
name="network.ssl.trustStore"/>
<parameter value="password"
name="network.ssl.trustStorePassword"/>
</parameters>
</socket>
</sockets>
<protocols>
<protocol
implementation="com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary"
name="binary"/>
<protocol
implementation="com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpDb"
name="http"/>
</protocols>
<listeners>
<listener protocol="binary" socket="default"
port-range="2424-2430" ip-address="0.0.0.0"/>
<listener protocol="http" socket="default"
port-range="2480-2490" ip-address="0.0.0.0">
<commands>
<command
implementation="com.orientechnologies.orient.server.network.protocol.http.command.get.OServerCommandGetStaticContent"
pattern="GET|www GET|studio/ GET| GET|*.htm GET|*.html GET|*.xml
GET|*.jpeg GET|*.jpg GET|*.png GET|*.gif GET|*.js GET|*.css
GET|*.swf GET|*.ico GET|*.txt GET|*.otf GET|*.pjs GET|*.svg GET|*.json
GET|*.woff GET|*.woff2 GET|*.ttf GET|*.svgz" stateful="false">
<parameters>
<entry value="Cache-Control: no-cache,
no-store, max-age=0, must-revalidate\r\nPragma: no-cache"
name="http.cache:*.htm *.html"/>
<entry value="Cache-Control: max-age=120"
name="http.cache:default"/>
</parameters>
</command>
<command
implementation="com.orientechnologies.orient.graph.server.command.OServerCommandGetGephi"
pattern="GET|gephi/*" stateful="false"/>
</commands>
<parameters>
<parameter value="utf-8" name="network.http.charset"/>
<parameter value="true"
name="network.http.jsonResponseError"/>
</parameters>
</listener>
</listeners>
</network>
<storages/>
<users>
<user resources="*" password="secret" name="root"/>
<user resources="connect,server.listDatabases,server.dblist"
password="guest" name="guest"/>
<user resources="database.passthrough" password="SeCrEt"
name="replicator"/>
</users>
<properties>
<entry value="1" name="db.pool.min"/>
<entry value="50" name="db.pool.max"/>
<entry value="true" name="profiler.enabled"/>
<entry value="info" name="log.console.level"/>
<entry value="fine" name="log.file.level"/>
</properties>
</orient-server>
After adding a new node, I get this on the "master" (10.129.1.48):
2016-04-12 10:41:38:483 INFO [10.129.1.48]:2434 [orientdb] [3.6.2]
Accepting socket connection from /10.129.1.129:36103 [SocketAcceptorThread]
2016-04-12 10:41:38:485 INFO [10.129.1.48]:2434 [orientdb] [3.6.2]
Established socket connection between /10.129.1.48:2434 and
/10.129.1.129:36103 [TcpIpConnectionManager]
2016-04-12 10:41:45:505 INFO [10.129.1.48]:2434 [orientdb] [3.6.2]
Members [3] {
Member [10.129.1.219]:2434
Member [10.129.1.48]:2434 this
Member [10.129.1.129]:2434
}
[ClusterService]
2016-04-12 10:41:45:505 SEVER [orientdb-i-346e11bc] Cannot find node with
id '3082da3b-c329-49f3-8561-2053ac5bbe21' [OHazelcastPlugin]
2016-04-12 10:41:45:505 WARNI [orientdb-i-346e11bc] added new node
id=Member [10.129.1.129]:2434 name=ext:3082da3b-c329-49f3-8561-2053ac5bbe21
[OHazelcastPlugin]
2016-04-12 10:41:47:558 INFO [orientdb-i-346e11bc]<-[orientdb-i-8d93ec05]
added node configuration id=Member [10.129.1.129]:2434
name=orientdb-i-8d93ec05, now 3 nodes are configured [OHazelcastPlugin]
2016-04-12 10:41:47:569 INFO [orientdb-i-346e11bc] Current node started as
MASTER for database 'db_1' [OHazelcastPlugin]
2016-04-12 10:41:47:569 INFO [orientdb-i-346e11bc] Current node started as
MASTER for database 'db_2' [OHazelcastPlugin]
And on the new node:
2016-04-12 10:41:41:267 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Creating AWSJoiner [Node]
2016-04-12 10:41:41:272 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.129]:2434 is STARTING [LifecycleService]
2016-04-12 10:41:41:359 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
TcpIpConnectionManager configured with Non Blocking IO-threading model: 3
input threads and 3 output threads [NonBlockingIOThreadingModel]
2016-04-12 10:41:41:855 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.129:2435, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:857 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.129:2435. Reason: SocketException[Connection
refused to address /10.129.1.129:2435] [InitConnectionTask]
2016-04-12 10:41:41:857 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.129]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:858 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.129:2436, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:866 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.219:2434, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:868 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.129:2436. Reason: SocketException[Connection
refused to address /10.129.1.129:2436] [InitConnectionTask]
2016-04-12 10:41:41:880 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.219:2435, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:881 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.0.14:2434, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:881 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Established socket connection between /10.129.1.129:59102 and
/10.129.1.219:2434 [TcpIpConnectionManager]
2016-04-12 10:41:41:882 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.219:2436, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:883 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.129]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:885 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.0.14:2434. Reason: SocketException[Connection
refused to address /10.129.0.14:2434] [InitConnectionTask]
2016-04-12 10:41:41:885 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.0.14]:2434 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:887 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.219:2436. Reason: SocketException[Connection
refused to address /10.129.1.219:2436] [InitConnectionTask]
2016-04-12 10:41:41:887 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.219]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:890 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.219:2435. Reason: SocketException[Connection
refused to address /10.129.1.219:2435] [InitConnectionTask]
2016-04-12 10:41:41:890 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.219]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:892 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.48:2434, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:893 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Established socket connection between /10.129.1.129:36103 and
/10.129.1.48:2434 [TcpIpConnectionManager]
2016-04-12 10:41:41:896 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.48:2435, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:897 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.48:2435. Reason: SocketException[Connection
refused to address /10.129.1.48:2435] [InitConnectionTask]
2016-04-12 10:41:41:897 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.48]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:899 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.0.14:2436, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:901 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.1.48:2436, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:902 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.1.48:2436. Reason: SocketException[Connection
refused to address /10.129.1.48:2436] [InitConnectionTask]
2016-04-12 10:41:41:902 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.48]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:906 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Connecting to /10.129.0.14:2435, timeout: 0, bind-any: true
[InitConnectionTask]
2016-04-12 10:41:41:906 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.0.14:2435. Reason: SocketException[Connection
refused to address /10.129.0.14:2435] [InitConnectionTask]
2016-04-12 10:41:41:907 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.0.14]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:41:910 INFO [10.129.1.129]:2434 [orientdb] [3.6.2] Could
not connect to: /10.129.0.14:2436. Reason: SocketException[Connection
refused to address /10.129.0.14:2436] [InitConnectionTask]
2016-04-12 10:41:41:910 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.0.14]:2436 is added to the blacklist. [TcpIpJoinerOverAWS]
2016-04-12 10:41:48:921 WARNI [10.129.1.129]:2434 [orientdb] [3.6.2]
Ignoring received partition table, startup is not completed yet. Sender:
Address[10.129.1.219]:2434 [InternalPartitionService]
2016-04-12 10:41:48:924 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Members [3] {
Member [10.129.1.219]:2434
Member [10.129.1.48]:2434
Member [10.129.1.129]:2434 this
}
[ClusterService]
2016-04-12 10:41:50:942 INFO [10.129.1.129]:2434 [orientdb] [3.6.2]
Address[10.129.1.129]:2434 is STARTED [LifecycleService]
2016-04-12 10:41:50:943 INFO Starting distributed server
'orientdb-i-8d93ec05' (hzID=3082da3b-c329-49f3-8561-2053ac5bbe21)
dbDir='/opt/orientdb-enterprise-2.1.15/databases/'... [OHazelcastPlugin]
2016-04-12 10:41:50:985 INFO [orientdb-i-8d93ec05] found no previous
messages in queue orientdb.node.orientdb-i-8d93ec05.response
[OHazelcastDistributedMessageService]
2016-04-12 10:41:50:992 INFO [orientdb-i-8d93ec05] loaded database
configuration from active cluster [OHazelcastPlugin]
2016-04-12 10:41:51:002 WARNI [orientdb-i-8d93ec05] updated distributed
configuration for database: db_1:
----------
{
"version": 1,
"replication": true,
"autoDeploy": true,
"hotAlignment": true,
"resyncEvery": 15,
"executionMode": "synchronous",
"readQuorum": 1,
"writeQuorum": 2,
"failureAvailableNodesLessQuorum": false,
"readYourWrites": true,
"servers": {
"*": "master"
},
"clusters": {
"internal": {
},
"index": {
},
"*": {
"servers": ["orientdb-i-8d93ec05","<NEW_NODE>"]
}
}
}
---------- [OHazelcastPlugin]
2016-04-12 10:41:51:009 INFO [orientdb-i-8d93ec05] Current node started as
MASTER for database 'db_1' [OHazelcastPlugin]
2016-04-12 10:41:51:108 WARNI [orientdb-i-8d93ec05]->[[]] requesting deploy
of database 'db_1' on local server... [OHazelcastPlugin]
2016-04-12 10:41:51:112 SEVER [orientdb-i-8d93ec05] No nodes configured for
partition 'db_1.null' request: id=-1 from=orientdb-i-8d93ec05
task=deploy_db [OHazelcastPlugin][orientdb-i-8d93ec05] Error on starting
distributed plugin
com.orientechnologies.orient.server.distributed.ODistributedException: No
nodes configured for partition 'db_1.null' request: id=-1
from=orientdb-i-8d93ec05 task=deploy_db
at
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:347)
at
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.requestDatabase(OHazelcastPlugin.java:944)
at
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:906)
at
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:1484)
at
com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:175)
at
com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:993)
at
com.orientechnologies.orient.server.OServer.activate(OServer.java:336)
at
com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:41)
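Most of the INFO noise above is Hazelcast probing ports 2434-2436 on every instance it discovers. To see at a glance which addresses ended up blacklisted, the log can be summarized with a small script. This is only a sketch: `blacklisted_ports` is a hypothetical helper, and the regular expression assumes the message format shown in the log above.

```python
import re
from collections import defaultdict

# Matches e.g. "Address[10.129.0.14]:2434 is added to the blacklist."
BLACKLIST_RE = re.compile(r"Address\[([\d.]+)\]:(\d+) is added to the blacklist")


def blacklisted_ports(log_text):
    """Map each host to the sorted list of ports Hazelcast blacklisted on it."""
    ports = defaultdict(set)
    for host, port in BLACKLIST_RE.findall(log_text):
        ports[host].add(int(port))
    return {host: sorted(found) for host, found in ports.items()}


# A few lines copied from the new node's log above.
sample = """\
Address[10.129.1.129]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
Address[10.129.0.14]:2434 is added to the blacklist. [TcpIpJoinerOverAWS]
Address[10.129.0.14]:2435 is added to the blacklist. [TcpIpJoinerOverAWS]
"""
print(blacklisted_ports(sample))
```

In the full log, 10.129.0.14 is the only host with port 2434 itself blacklisted. It gets probed at all presumably because the security-group and tag filters in my hazelcast.xml are commented out, but that is a guess on my part.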
PS: 10.129.1.193 and 10.129.1.213 are the ELBs, 10.129.0.14 is the NAT
instance, 10.129.1.129/orientdb-i-8d93ec05 is the new node, and
10.129.1.48/orientdb-i-346e11bc is (or is supposed to be) the "master" node.
What "worries" me is the "No nodes configured for partition 'db_1.null'..."
part!
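The empty target list in "->[[]] requesting deploy" looks consistent with the db_1 configuration printed in the new node's log, which names only the new node itself. A minimal sketch of that reading, assuming the deploy request is sent to the servers listed for a cluster minus the local node and the `<NEW_NODE>` placeholder, and that a cluster without its own list falls back to "*". Both are my guesses and not confirmed from the OrientDB source; `deploy_candidates` is a hypothetical helper.

```python
import json

PLACEHOLDER = "<NEW_NODE>"


def deploy_candidates(config_text, cluster, local_node):
    """List the servers a node could ask for a database copy (assumed logic)."""
    clusters = json.loads(config_text).get("clusters", {})
    servers = clusters.get(cluster, {}).get("servers")
    if servers is None:
        # Assumption: clusters without a "servers" list inherit from "*".
        servers = clusters.get("*", {}).get("servers", [])
    return [s for s in servers if s not in (PLACEHOLDER, local_node)]


# The "clusters" part of the db_1 configuration as logged on the new node.
logged = """
{
  "clusters": {
    "internal": {},
    "index": {},
    "*": {"servers": ["orientdb-i-8d93ec05", "<NEW_NODE>"]}
  }
}
"""
print(deploy_candidates(logged, "*", "orientdb-i-8d93ec05"))
```

With this configuration the function returns an empty list, because orientdb-i-346e11bc (the "master") never appears in the "servers" list.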