Hi Dennis,
I am facing a similar problem. In my case I have 2 machines as slaves and
1 master, but I cannot get them up and running. In your case, maybe you can
try setting up a third instance and starting the cluster with 3 machines, as
a 3-machine quorum is required for an HA cluster.
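
As a rough sketch of what the third instance's neo4j.conf could look like:
the setting names are the ones from your mail, but THIRD_INSTANCE_IP is only
a placeholder, and setting ha.host.coordination and ha.host.data explicitly
is an assumption on my side, not something I have confirmed.

```properties
# neo4j.conf on the third instance (sketch; THIRD_INSTANCE_IP is a placeholder)
dbms.mode=HA
# Must be unique per instance; 1 and 2 are already used in the quoted setup.
ha.server_id=3
# Same list on all three instances, extended with the new member.
ha.initial_hosts=172.31.35.147:5001,172.31.33.173:5001,THIRD_INSTANCE_IP:5001
# Assumption: bind to this instance's own private address
# instead of leaving these two settings commented out.
ha.host.coordination=THIRD_INSTANCE_IP:5001
ha.host.data=THIRD_INSTANCE_IP:6001
```

The quoted log shows the slave trying to reach the master on 127.0.0.1:6001,
so setting ha.host.data to each machine's own address on the two existing
instances may be worth a try as well. This is untested.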

Regards,
Yayati Sule
Associate Data Scientist
Innoplexus Consulting Services Pvt. Ltd.
www.innoplexus.com
Mob : +91-9527459407

Landline: +91-20-66527300


On Tue, May 17, 2016 at 10:22 AM, Dennis O <[email protected]> wrote:

> Hi,
>
> Please advise on required configuration for the 2-instance "HA" setup
> (AWS, neo4j Enterprise 3.0.1).
>
> Currently I have on both instances:
> - dbms.mode=HA
> - ha.initial_hosts=172.31.35.147:5001,172.31.33.173:5001
> - ha.host.coordination is commented out
> - ha.host.data is commented out
> Port 5001, 5002, 7474, 6001 open on both.
>
> Differences
> 1. One node (172.31.33.173) has ha.server_id=1, the other has
> ha.server_id=2
> 2. The node with id=1 runs Debian 8.4, the node with id=2 runs CentOS 7
>
>
> With this setup, the node with id=1 starts without problems and is elected
> as master; the second one, however, fails.
>
> Some log extracts:
>
>
> 2016-05-17 04:50:51.781+0000 INFO  [o.n.k.h.MasterClient214]
> MasterClient214 communication channel created towards /127.0.0.1:6001
> 2016-05-17 04:50:51.790+0000 INFO  [o.n.k.h.c.SwitchToSlave] Copying store
> from master
> 2016-05-17 04:50:51.791+0000 INFO  [o.n.k.h.MasterClient214] Thread[31, HA
> Mode switcher-1] Trying to open a new channel from /172.31.35.147:0 to /
> 127.0.0.1:6001
> 2016-05-17 04:50:51.791+0000 DEBUG [o.n.k.h.MasterClient214]
> MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
> 2016-05-17 04:50:51.796+0000 INFO  [o.n.k.h.MasterClient214]
> MasterClient214[/127.0.0.1:6001] shutdown
> 2016-05-17 04:50:51.796+0000 ERROR
> [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Error while trying to switch to
> slave MasterClient214 could not connect from /172.31.35.147:0 to /
> 127.0.0.1:6001
> org.neo4j.com.ComException: MasterClient214 could not connect from /
> 172.31.35.147:0 to /127.0.0.1:6001
> at org.neo4j.com.Client$2.create(Client.java:225)
> at org.neo4j.com.Client$2.create(Client.java:202)
> at org.neo4j.com.ResourcePool.acquire(ResourcePool.java:177)
> at org.neo4j.com.Client.acquireChannelContext(Client.java:390)
> at org.neo4j.com.Client.sendRequest(Client.java:296)
> at org.neo4j.com.Client.sendRequest(Client.java:289)
> at org.neo4j.kernel.ha.MasterClient210.copyStore(MasterClient210.java:311)
> at
> org.neo4j.kernel.ha.cluster.SwitchToSlave$1.copyStore(SwitchToSlave.java:531)
> at
> org.neo4j.com.storecopy.StoreCopyClient.copyStore(StoreCopyClient.java:191)
> at
> org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMaster(SwitchToSlave.java:525)
> at
> org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMasterIfNeeded(SwitchToSlave.java:348)
> at
> org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:272)
> at
> org.neo4j.kernel.ha.cluster.modeswitch.HighAvailabilityModeSwitcher$1.run(HighAvailabilityModeSwitcher.java:348)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:104)
> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at
> org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:148)
> at
> org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:104)
> at
> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:78)
> at
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
> at
> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:41)
> ... 4 more
> 2016-05-17 04:50:51.797+0000 INFO
>  [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Attempting to switch to slave
> in 7s
> 2016-05-17 04:50:58.799+0000 INFO  [o.n.k.i.f.CommunityFacadeFactory] No
> locking implementation specified, defaulting to 'forseti'
> 2016-05-17 04:50:58.799+0000 INFO  [o.n.k.h.c.SwitchToSlave] ServerId 2,
> moving to slave for master ha://0.0.0.0:6001?serverId=1
>
>
>
> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @
> 2016-05-17 04:30:57.534+0000]  ClusterState:
> discovery-[configurationTimeout]->discovery conversation-id:2/13#
> payload:ConfigurationTimeoutState{remainingPings=3}
> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.h.HeartbeatState$1] [AsyncLog
> @ 2016-05-17 04:30:57.535+0000]  HeartbeatState:
> start-[reset_send_heartbeat]->start conversation-id:2/13#
> 2016-05-17 04:30:57.538+0000 INFO  [o.n.c.c.NetworkSender] [AsyncLog @
> 2016-05-17 04:30:57.537+0000]  Attempting to connect from /172.31.35.147:0
> to /172.31.33.173:5001
> 2016-05-17 04:30:57.540+0000 INFO  [o.n.c.c.NetworkSender] [AsyncLog @
> 2016-05-17 04:30:57.540+0000]  Failed to connect to /172.31.33.173:5001
> due to: java.net.ConnectException: Connection refused
> 2016-05-17 04:30:57.540+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @
> 2016-05-17 04:30:57.540+0000]  ClusterState:
> discovery-[configurationRequest]->discovery from:cluster://
> 172.31.35.147:5001 conversation-id:2/13#
> payload:ConfigurationRequestState{joiningId=2, joiningUri=cluster://
> 172.31.35.147:5001}
> 2016-05-17 04:30:58.420+0000 INFO  [o.n.c.c.NetworkReceiver] [AsyncLog @
> 2016-05-17 04:30:58.420+0000]  cluster://172.31.35.147:47188 disconnected
> from me at cluster://172.31.35.147:5001
> 2016-05-17 04:30:58.420+0000 INFO  [o.n.c.c.NetworkReceiver] [AsyncLog @
> 2016-05-17 04:30:58.420+0000]  cluster://172.31.35.147:47188 disconnected
> from me at cluster://172.31.35.147:5001
> 2016-05-17 04:30:58.434+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Starting check pointing...
> 2016-05-17 04:30:58.438+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Starting store flush...
> 2016-05-17 04:30:58.443+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Store flush completed
> 2016-05-17 04:30:58.443+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Starting appending check
> point entry into the tx log...
> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Appending check point entry
> into the tx log completed
> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] Check
> Pointing triggered by database shutdown [1]:  Check pointing completed
> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log
> Rotation [0]:  Starting log pruning.
> 2016-05-17 04:30:58.447+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] Log
> Rotation [0]:  Log pruning complete.
> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] ---
> STOPPING diagnostics START ---
> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] High
> Availability diagnostics
> Member state:PENDING
> State machines:
>    AtomicBroadcastMessage:start
>    AcceptorMessage:start
>    ProposerMessage:start
>    LearnerMessage:start
>    HeartbeatMessage:start
>    ElectionMessage:start
>    SnapshotMessage:start
>    ClusterMessage:discovery
> Current timeouts:
> join:configurationTimeout{conversation-id=2/13#, timeout-count=29,
> created-by=2}
> 2016-05-17 04:30:58.475+0000 INFO  [o.n.k.i.DiagnosticsManager] ---
> STOPPING diagnostics END ---
> 2016-05-17 04:30:58.475+0000 INFO
>  [o.n.k.h.f.HighlyAvailableFacadeFactory] Shutdown started
>
> etc.
>
>
> Any insights are highly appreciated!!
>
> Thank you!
> Dennis
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

