Hi Dennis, I am facing a similar problem, but in my case I have two machines as slaves and one as master, and I cannot get them up and running either. In your case, you could try setting up a third instance and starting the cluster with all three: Neo4j HA elects a master based on a quorum (a majority of the cluster members), so a two-instance cluster cannot tolerate the loss of either member, and three machines is the recommended minimum for an HA cluster.
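For reference, a rough sketch of the per-instance settings for a three-instance cluster, as I understand them for Neo4j 3.0 (the third address is just a placeholder; adjust all addresses to your own hosts, and double-check against the official HA documentation):

    # conf/neo4j.conf -- identical on all three instances
    dbms.mode=HA
    ha.initial_hosts=172.31.33.173:5001,172.31.35.147:5001,<third-host-ip>:5001

    # unique per instance: 1, 2 and 3
    ha.server_id=1

    # advertise this instance's own address on the cluster (coordination)
    # and transaction (data) channels instead of leaving them commented out
    ha.host.coordination=<this-host-ip>:5001
    ha.host.data=<this-host-ip>:6001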
Regards,
Yayati Sule
Associate Data Scientist
Innoplexus Consulting Services Pvt. Ltd.
www.innoplexus.com
Mob: +91-9527459407 | Landline: +91-20-66527300

On Tue, May 17, 2016 at 10:22 AM, Dennis O <[email protected]> wrote:

> Hi,
>
> Please advise on the required configuration for a 2-instance "HA" setup
> (AWS, Neo4j Enterprise 3.0.1).
>
> Currently I have on both instances:
> - dbms.mode=HA
> - ha.initial_hosts=172.31.35.147:5001,172.31.33.173:5001
> - ha.host.coordination is commented out
> - ha.host.data is commented out
> Ports 5001, 5002, 7474 and 6001 are open on both.
>
> Differences:
> 1. One node has ha.server_id=1 (172.31.33.173), the other has ha.server_id=2.
> 2. The node with id=1 runs Debian 8.4; the node with id=2 runs CentOS 7.
>
> With this setup, the node with id=1 starts without problems and is elected
> as master; the second one, however, fails.
>
> Some log extracts:
>
> 2016-05-17 04:50:51.781+0000 INFO [o.n.k.h.MasterClient214] MasterClient214 communication channel created towards /127.0.0.1:6001
> 2016-05-17 04:50:51.790+0000 INFO [o.n.k.h.c.SwitchToSlave] Copying store from master
> 2016-05-17 04:50:51.791+0000 INFO [o.n.k.h.MasterClient214] Thread[31, HA Mode switcher-1] Trying to open a new channel from /172.31.35.147:0 to /127.0.0.1:6001
> 2016-05-17 04:50:51.791+0000 DEBUG [o.n.k.h.MasterClient214] MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
> 2016-05-17 04:50:51.796+0000 INFO [o.n.k.h.MasterClient214] MasterClient214[/127.0.0.1:6001] shutdown
> 2016-05-17 04:50:51.796+0000 ERROR [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Error while trying to switch to slave MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
> org.neo4j.com.ComException: MasterClient214 could not connect from /172.31.35.147:0 to /127.0.0.1:6001
>     at org.neo4j.com.Client$2.create(Client.java:225)
>     at org.neo4j.com.Client$2.create(Client.java:202)
>     at org.neo4j.com.ResourcePool.acquire(ResourcePool.java:177)
>     at org.neo4j.com.Client.acquireChannelContext(Client.java:390)
>     at org.neo4j.com.Client.sendRequest(Client.java:296)
>     at org.neo4j.com.Client.sendRequest(Client.java:289)
>     at org.neo4j.kernel.ha.MasterClient210.copyStore(MasterClient210.java:311)
>     at org.neo4j.kernel.ha.cluster.SwitchToSlave$1.copyStore(SwitchToSlave.java:531)
>     at org.neo4j.com.storecopy.StoreCopyClient.copyStore(StoreCopyClient.java:191)
>     at org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMaster(SwitchToSlave.java:525)
>     at org.neo4j.kernel.ha.cluster.SwitchToSlave.copyStoreFromMasterIfNeeded(SwitchToSlave.java:348)
>     at org.neo4j.kernel.ha.cluster.SwitchToSlave.switchToSlave(SwitchToSlave.java:272)
>     at org.neo4j.kernel.ha.cluster.modeswitch.HighAvailabilityModeSwitcher$1.run(HighAvailabilityModeSwitcher.java:348)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>     at org.neo4j.helpers.NamedThreadFactory$2.run(NamedThreadFactory.java:104)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>     at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:148)
>     at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:104)
>     at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:78)
>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>     at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:41)
>     ... 4 more
> 2016-05-17 04:50:51.797+0000 INFO [o.n.k.h.c.m.HighAvailabilityModeSwitcher] Attempting to switch to slave in 7s
> 2016-05-17 04:50:58.799+0000 INFO [o.n.k.i.f.CommunityFacadeFactory] No locking implementation specified, defaulting to 'forseti'
> 2016-05-17 04:50:58.799+0000 INFO [o.n.k.h.c.SwitchToSlave] ServerId 2, moving to slave for master ha://0.0.0.0:6001?serverId=1
>
> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @ 2016-05-17 04:30:57.534+0000] ClusterState: discovery-[configurationTimeout]->discovery conversation-id:2/13# payload:ConfigurationTimeoutState{remainingPings=3}
> 2016-05-17 04:30:57.535+0000 DEBUG [o.n.c.p.h.HeartbeatState$1] [AsyncLog @ 2016-05-17 04:30:57.535+0000] HeartbeatState: start-[reset_send_heartbeat]->start conversation-id:2/13#
> 2016-05-17 04:30:57.538+0000 INFO [o.n.c.c.NetworkSender] [AsyncLog @ 2016-05-17 04:30:57.537+0000] Attempting to connect from /172.31.35.147:0 to /172.31.33.173:5001
> 2016-05-17 04:30:57.540+0000 INFO [o.n.c.c.NetworkSender] [AsyncLog @ 2016-05-17 04:30:57.540+0000] Failed to connect to /172.31.33.173:5001 due to: java.net.ConnectException: Connection refused
> 2016-05-17 04:30:57.540+0000 DEBUG [o.n.c.p.c.ClusterState$2] [AsyncLog @ 2016-05-17 04:30:57.540+0000] ClusterState: discovery-[configurationRequest]->discovery from:cluster://172.31.35.147:5001 conversation-id:2/13# payload:ConfigurationRequestState{joiningId=2, joiningUri=cluster://172.31.35.147:5001}
> 2016-05-17 04:30:58.420+0000 INFO [o.n.c.c.NetworkReceiver] [AsyncLog @ 2016-05-17 04:30:58.420+0000] cluster://172.31.35.147:47188 disconnected from me at cluster://172.31.35.147:5001
> 2016-05-17 04:30:58.420+0000 INFO [o.n.c.c.NetworkReceiver] [AsyncLog @ 2016-05-17 04:30:58.420+0000] cluster://172.31.35.147:47188 disconnected from me at cluster://172.31.35.147:5001
> 2016-05-17 04:30:58.434+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Starting check pointing...
> 2016-05-17 04:30:58.438+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Starting store flush...
> 2016-05-17 04:30:58.443+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Store flush completed
> 2016-05-17 04:30:58.443+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Starting appending check point entry into the tx log...
> 2016-05-17 04:30:58.447+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Appending check point entry into the tx log completed
> 2016-05-17 04:30:58.447+0000 INFO [o.n.k.i.t.l.c.CheckPointerImpl] Check Pointing triggered by database shutdown [1]: Check pointing completed
> 2016-05-17 04:30:58.447+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [0]: Starting log pruning.
> 2016-05-17 04:30:58.447+0000 INFO [o.n.k.i.t.l.p.LogPruningImpl] Log Rotation [0]: Log pruning complete.
> 2016-05-17 04:30:58.475+0000 INFO [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics START ---
> 2016-05-17 04:30:58.475+0000 INFO [o.n.k.i.DiagnosticsManager] High Availability diagnostics
> Member state:PENDING
> State machines:
>   AtomicBroadcastMessage:start
>   AcceptorMessage:start
>   ProposerMessage:start
>   LearnerMessage:start
>   HeartbeatMessage:start
>   ElectionMessage:start
>   SnapshotMessage:start
>   ClusterMessage:discovery
> Current timeouts:
>   join:configurationTimeout{conversation-id=2/13#, timeout-count=29, created-by=2}
> 2016-05-17 04:30:58.475+0000 INFO [o.n.k.i.DiagnosticsManager] --- STOPPING diagnostics END ---
> 2016-05-17 04:30:58.475+0000 INFO [o.n.k.h.f.HighlyAvailableFacadeFactory] Shutdown started
>
> etc.
>
> Any insights are highly appreciated!!
>
> Thank you!
> Dennis
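
P.S. One detail in the quoted logs that may be worth a look, independent of the number of instances: the failing node reports "moving to slave for master ha://0.0.0.0:6001?serverId=1" and then tries to copy the store from 127.0.0.1:6001, i.e. its own loopback rather than the master's address. Since ha.host.coordination and ha.host.data are commented out on both instances, the master may be advertising a default/wildcard address. This is only a guess on my part, but a sketch of explicit per-instance bindings would be:

    # on 172.31.33.173 (ha.server_id=1)
    ha.host.coordination=172.31.33.173:5001
    ha.host.data=172.31.33.173:6001

    # on 172.31.35.147 (ha.server_id=2)
    ha.host.coordination=172.31.35.147:5001
    ha.host.data=172.31.35.147:6001

with ports 5001 and 6001 reachable between the two instances (e.g. in the AWS security groups).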
