For some reason it's trying to reach a quorum of 4.
Could you paste your database's distributed-config.json file please?
-Colin
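
For anyone following along, here is a minimal sketch of one way to check which quorum values each database's distributed-config.json actually contains before pasting it. It assumes the file has the same layout as the default-distributed-db-config.json quoted below (optional top-level "readQuorum"/"writeQuorum" keys plus a "clusters" map). The helper names `quorum_settings` and `report` and the databases directory passed on the command line are hypothetical and not part of OrientDB.

```python
import json
import sys
from pathlib import Path

QUORUM_KEYS = ("readQuorum", "writeQuorum")


def quorum_settings(config):
    """Return (scope, key, value) tuples for every quorum key in a parsed config.

    Assumes the layout quoted in this thread: optional top-level quorum keys
    and a "clusters" object whose values may carry their own quorum keys.
    """
    found = []
    for key in QUORUM_KEYS:
        if key in config:
            found.append(("<top level>", key, config[key]))
    for name, cluster in config.get("clusters", {}).items():
        if not isinstance(cluster, dict):
            continue  # skip anything that is not a per-cluster object
        for key in QUORUM_KEYS:
            if key in cluster:
                found.append((name, key, cluster[key]))
    return found


def report(databases_dir):
    """Print the quorum settings of each <database>/distributed-config.json."""
    for path in sorted(Path(databases_dir).glob("*/distributed-config.json")):
        with path.open(encoding="utf-8") as fh:
            config = json.load(fh)
        print(path)
        for scope, key, value in quorum_settings(config):
            print(f"  {scope}: {key} = {value}")


if __name__ == "__main__":
    # Hypothetical usage: python check_quorum.py /path/to/orientdb/databases
    if len(sys.argv) == 2:
        report(sys.argv[1])
```

Running it on every node makes it easy to spot a database whose file still carries older quorum values than the edited default.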
On Tuesday, March 24, 2015 at 12:40:15 PM UTC-5, Amir Khawaja wrote:
>
> The cluster is now online in US East2 and US West. I did the following:
>
> - Changed the default-distributed-db-config.json to:
>
> {
>   "replication": true,
>   "autoDeploy": true,
>   "hotAlignment": false,
>   "resyncEvery": 15,
>   "clusters": {
>     "internal": {
>       "replication": false
>     },
>     "index": {
>       "replication": false
>     },
>     "*": {
>       "replication": true,
>       "readQuorum": 1,
>       "writeQuorum": 1,
>       "failureAvailableNodesLessQuorum": false,
>       "readYourWrites": true,
>       "partitioning": {
>         "strategy": "round-robin",
>         "default": 0,
>         "partitions": [
>           [ "<NEW_NODE>" ]
>         ]
>       }
>     }
>   }
> }
>
> - Deleted the distributed-config.json file from each database folder and
> restarted each node in the cluster.
>
> Now, when I connect to one of the nodes and try to delete a vertex, I
> receive the following error:
>
> com.orientechnologies.orient.server.distributed.ODistributedException:
> Error on executing distributed request (id=141 from=odb02uw
> task=command_sql(delete vertex #42:2) userName=) against database 'vis.[]'
> to nodes [odb02ue2, odb02uw, odb01uw, odb01ue2] -->
> com.orientechnologies.orient.server.distributed.ODistributedException:
> Quorum 4 not reached for request (id=141 from=odb02uw
> task=command_sql(delete vertex #42:2) userName=). Timeout=407ms
> Servers in timeout/conflict are:
> - odb02ue2: com.orientechnologies.orient.core.exception.OCommandExecutionException:
>   Error on execution of command: sql.delete vertex #42:2
> - odb01ue2: com.orientechnologies.orient.core.exception.OCommandExecutionException:
>   Error on execution of command: sql.delete vertex #42:2
> - odb01uw: com.orientechnologies.orient.core.exception.OCommandExecutionException:
>   Error on execution of command: sql.delete vertex #42:2
> Received: {
>   odb02uw=com.orientechnologies.orient.core.exception.OCommandExecutionException:
>     Error on execution of command: sql.delete vertex #42:2,
>   odb01uw=com.orientechnologies.orient.core.exception.OCommandExecutionException:
>     Error on execution of command: sql.delete vertex #42:2,
>   odb02ue2=com.orientechnologies.orient.core.exception.OCommandExecutionException:
>     Error on execution of command: sql.delete vertex #42:2,
>   odb01ue2=com.orientechnologies.orient.core.exception.OCommandExecutionException:
>     Error on execution of command: sql.delete vertex #42:2}
>
> Why am I not able to delete a vertex?
>
> Amir.
>
>
> On Tuesday, March 24, 2015 at 12:20:37 PM UTC-5, Colin wrote:
>>
>> That latency should be fine so long as it's consistent.
>>
>> -Colin
>>
>> On Tuesday, March 24, 2015 at 11:52:58 AM UTC-5, Amir Khawaja wrote:
>>>
>>> Hi Colin,
>>>
>>> I checked the latency prior to posting and between regions it is about
>>> 65ms on average. What should I set the latency to for Hazelcast?
>>>
>>> Amir.
>>>
>>> On Tuesday, March 24, 2015 at 11:49:25 AM UTC-5, Colin wrote:
>>>>
>>>> Hi Amir,
>>>>
>>>> You might also do a ping and a traceroute between the machines and see
>>>> what kind of latency you're getting, just in case it's a timeout issue
>>>> with Hazelcast.
>>>>
>>>> -Colin
>>>>
>>>> On Tuesday, March 24, 2015 at 11:32:21 AM UTC-5, Amir Khawaja wrote:
>>>>>
>>>>> Hi Colin,
>>>>>
>>>>> Thank you for the prompt response.
>>>>>
>>>>>> I'm a little confused as you say "the US West node will not come
>>>>>> online telling me that the database is not yet online. At that point, I
>>>>>> kill the process and then eventually the database comes online."
>>>>>>
>>>>>> Do you mean you kill the database process and then restart it and then
>>>>>> it starts communicating?
>>>>>
>>>>>
>>>>> Yes. I kill the database process on the cluster node where OrientDB
>>>>> is not coming online.
>>>>>
>>>>>> Can you see on each machine when Hazelcast 'sees' all the members?
>>>>>> Are all the members showing up?
>>>>>
>>>>>
>>>>> Yes. I see the databases are talking to each other, as the IP addresses
>>>>> of the nodes show up in the log of each database server.
>>>>>
>>>>> I will try setting hotAlignment to false and report my results on this
>>>>> thread.
>>>>>
>>>>> Amir.
>>>>>
>>>>>
>>>>> On Tuesday, March 24, 2015 at 11:25:16 AM UTC-5, Colin wrote:
>>>>>>
>>>>>> Hi Amir,
>>>>>>
>>>>>> Is it consistently a problem between the same machines not seeing
>>>>>> each other?
>>>>>>
>>>>>> I'm a little confused as you say "the US West node will not come
>>>>>> online telling me that the database is not yet online. At that point, I
>>>>>> kill the process and then eventually the database comes online."
>>>>>>
>>>>>> Do you mean you kill the database process and then restart it and
>>>>>> then it starts communicating?
>>>>>>
>>>>>> In your distributed json file, try setting "hotAlignment" to false.
>>>>>>
>>>>>> Can you see on each machine when Hazelcast 'sees' all the members?
>>>>>> Are all the members showing up?
>>>>>>
>>>>>> -Colin
>>>>>>
>>>>>> Orient Technologies
>>>>>>
>>>>>> The Company behind OrientDB
>>>>>>
>>>>>> On Tuesday, March 24, 2015 at 11:19:05 AM UTC-5, Amir Khawaja wrote:
>>>>>>>
>>>>>>> Greetings, everyone. Has anyone had much success running an OrientDB
>>>>>>> 2.0.5 cluster in Azure? I created a cluster in Windows Azure with 4
>>>>>>> nodes using CentOS 7 and OrientDB Community 2.0.4 -- 2 nodes in US East2
>>>>>>> and 2 nodes in US West. There is a Site-to-Site VPN connection between
>>>>>>> the two regions in Azure and data is flowing between machines across the
>>>>>>> network. I have three databases that I have currently deployed and am
>>>>>>> testing. I find that many times the synchronization between databases
>>>>>>> does not occur. For instance, if I start up the first node in US East2
>>>>>>> and, once that comes online, fire up the second node in US West, the US
>>>>>>> West node will not come online telling me that the database is not yet
>>>>>>> online. At that point, I kill the process and then eventually the
>>>>>>> database comes online. I even have to go so far as to delete the
>>>>>>> databases in the database path folder. I do this a few times and
>>>>>>> eventually the server may start up. Sometimes, I will have three of the
>>>>>>> four nodes working and the fourth just refuses to come online.
>>>>>>>
>>>>>>> The VM size selected for each node in the cluster is a D4 (4 cores,
>>>>>>> 28GB RAM). This should be more than sufficient to handle most loads.
>>>>>>> Surely, I must be missing something as this is not acceptable
>>>>>>> production behavior. For reference, I am pasting the hazelcast.xml and
>>>>>>> default-distributed-db-config.json files here in hopes that someone has
>>>>>>> some pointers for me.
>>>>>>>
>>>>>>> *** hazelcast.xml ***
>>>>>>>
>>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>>> <!--
>>>>>>>   ~ Copyright (c) 2008-2012, Hazel Bilisim Ltd. All Rights Reserved.
>>>>>>>   ~
>>>>>>>   ~ Licensed under the Apache License, Version 2.0 (the "License");
>>>>>>>   ~ you may not use this file except in compliance with the License.
>>>>>>>   ~ You may obtain a copy of the License at
>>>>>>>   ~
>>>>>>>   ~ http://www.apache.org/licenses/LICENSE-2.0
>>>>>>>   ~
>>>>>>>   ~ Unless required by applicable law or agreed to in writing, software
>>>>>>>   ~ distributed under the License is distributed on an "AS IS" BASIS,
>>>>>>>   ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>>>>>>>   ~ See the License for the specific language governing permissions and
>>>>>>>   ~ limitations under the License.
>>>>>>>   -->
>>>>>>>
>>>>>>> <hazelcast
>>>>>>>     xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.0.xsd"
>>>>>>>     xmlns="http://www.hazelcast.com/schema/config"
>>>>>>>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>>>>>>>   <group>
>>>>>>>     <name>[name]</name>
>>>>>>>     <password>[password]</password>
>>>>>>>   </group>
>>>>>>>   <network>
>>>>>>>     <port auto-increment="true">2434</port>
>>>>>>>     <join>
>>>>>>>       <multicast enabled="false">
>>>>>>>         <multicast-group>235.1.1.1</multicast-group>
>>>>>>>         <multicast-port>2434</multicast-port>
>>>>>>>       </multicast>
>>>>>>>       <tcp-ip enabled="true">
>>>>>>>         <member>10.0.0.4</member>
>>>>>>>         <member>10.0.0.5</member>
>>>>>>>         <member>10.1.0.4</member>
>>>>>>>         <member>10.1.0.5</member>
>>>>>>>       </tcp-ip>
>>>>>>>     </join>
>>>>>>>   </network>
>>>>>>>   <executor-service>
>>>>>>>     <pool-size>16</pool-size>
>>>>>>>   </executor-service>
>>>>>>> </hazelcast>
>>>>>>>
>>>>>>>
>>>>>>> *** default-distributed-db-config.json ***
>>>>>>>
>>>>>>> {
>>>>>>>   "autoDeploy": true,
>>>>>>>   "hotAlignment": true,
>>>>>>>   "executionMode": "synchronous",
>>>>>>>   "readQuorum": 1,
>>>>>>>   "writeQuorum": 3,
>>>>>>>   "failureAvailableNodesLessQuorum": false,
>>>>>>>   "readYourWrites": true,
>>>>>>>   "clusters": {
>>>>>>>     "internal": {
>>>>>>>     },
>>>>>>>     "index": {
>>>>>>>     },
>>>>>>>     "*": {
>>>>>>>       "servers" : [ "<NEW_NODE>" ]
>>>>>>>     }
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> Thank you for any assistance you can offer.
>>>>>>>
>>>>>>> Amir.
>>>>>>>
>>>>>>