Hi,

I have answered the following at March 26:

Hi Martin,

Not in a Docker. We are using the word container for one of our services.

Thanks,

Yuval




> On Apr 10, 2018, at 2:40 PM, Martin Gainty <mgai...@hotmail.com> wrote:
> 
> MG>waiting for Yuval to answer:
> 
> 
> in docker  container?
> if so how do you resolve hosts (.hosts file, kubernetes, swarm)?
> 
> 
> MG>?
> 
> ________________________________
> From: VALLASTER Stefan <stefan.vallas...@frequentis.com>
> Sent: Tuesday, April 10, 2018 4:46 AM
> To: user@zookeeper.apache.org
> Subject: RE: IllegalStateException: instance must be started before calling 
> this method
> 
> How can we get this topic forward again? It seems the mail thread is 
> currently stalled for no obvious reason. Anyone willing to help us out here 
> again?
> 
> Best regards
> Stefan
> 
> -----Original Message-----
> From: Martin Gainty [mailto:mgai...@hotmail.com]
> Sent: Sonntag, 25. März 2018 22:58
> To: user@zookeeper.apache.org
> Subject: Re: IllegalStateException: instance must be started before calling 
> this method
> 
> 
> 
> in docker  container?
> 
> if so how do you resolve hosts (.hosts file, kubernetes, swarm)?
> 
> M
> ________________________________
> From: Yuval Dori <yuval.d...@gigaspaces.com>
> Sent: Sunday, March 25, 2018 5:34 AM
> To: user@zookeeper.apache.org
> Cc: Yuval Dori
> Subject: IllegalStateException: instance must be started before calling this 
> method
> 
> Hi,
> 
> Our customer is using ZK 3.4.3 and got IllegalStateException: instance must 
> be started before calling this method
> 
> 
> with the following stack trace:
> 2018-02-15T09:58:12.094+0100; ERROR; WSOSTSLXWIT01/MANAGER; P3424/T194; 
> [SPACE/LearnerHandler-/10.17.46.142:49336/LearnerHandler<http://10.17.46.142:49336/LearnerHandler>];
>  Unexpected exception causing shutdown while sock still open
> java.net<http://java.net>.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.DataInputStream.readInt(DataInputStream.java:387)
> at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
> at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:542)
> 
> 
> This is the description:
> 
> At first ps-main.1 is started as primary on WSOSTSLXBMS01 GSC 3.
> 
> 24.01.2018 11:31:04,947 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/1 Space instance is Primary
> 
> After some relocation of ps-main instances a new backup instance is started 
> on the same GSC.
> 
> 24.01.2018 11:32:59,886 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T146 
> SPACE/GS-LRMI Connection-pool-1-thread-6/1 Space instance [ 
> frqMain_container1:frqMain ] has been selected as Backup
> 
> Later a failover test is performed by shutting down WSOSTSLXBMS02 and 
> WSOSTSLXMSS02.
> We would expect that ps-main.1 backup will be elected as primary because the 
> former primary was located at WSOSTSLXBMS02 but no election happend. In the 
> end in the whole cluster there was no primary of ps-main.1 running.
> 
> We found suspicious exception in the logs:
> 24.01.2018 11:32:44,987 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/frqMain_container1:frqMain Lost leadership 
> [sessionId=0x36128253b820001]
> 24.01.2018 11:32:44,987 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T98 
> SPACE/Curator-Framework-0/CuratorFrameworkImpl backgroundOperationsLoop 
> exiting
> 24.01.2018 11:32:44,990 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/1 Quiesce state set to SUSPENDED
> 24.01.2018 11:32:44,991 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/ZooKeeper Session: 0x36128253b820001 
> closed
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector The leader threw an exception
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector 
> java.lang.IllegalStateException: instance must be started before calling this 
> method
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.shaded.com<http://org.apache.curator.shaded.com>.google.common.base.Preconditions.checkState(Preconditions.java:176)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.delete(CuratorFrameworkImpl.java:359)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.locks.LockInternals.deleteOurPath(LockInternals.java:339)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.locks.LockInternals.releaseLock(LockInternals.java:123)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.locks.InterProcessMutex.release(InterProcessMutex.java:154)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:449)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:466)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:65)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:246)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:240)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector at 
> java.lang.Thread.run(Thread.java:748)
> 24.01.2018 11:32:44,991 +01:00 ERROR WSOSTSLXBMS01/GSC/3 P1280/T120 
> SPACE/Curator-LeaderSelector-0/LeaderSelector
> 24.01.2018 11:32:44,991 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T97 
> SPACE/GS-LRMI Connection-pool-1-thread-1-EventThread/ClientCnxn EventThread 
> shut down for session: 0x36128253b820001
> 24.01.2018 11:32:44,997 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/frqMain1_1 Channel is closed 
> [target=frqMain1_1, target 
> url=jini://*/frqMain_container1_1/frqMain?groups=POLBAY&ignoreValidation=true&schema=persistent&id=1&total_members=4,1&versioned=true&mirror=true&cluster_schema=partitioned&locators=WSOSTSLXMSS01,WSOSTSLXMSS02&state=started&timeout=5000,
>  target machine connection 
> url=NIO://WSOSTSLXBMS02:11002/pid[3756]/24795468496_3_2538032931939568226_details[class
>  
> com.gigaspaces.internal.cluster.node.impl.router.AbstractConnectionProxyBasedReplicationRouter$ConnectionEndpoint(frqMain_container1_1:frqMain)]]
> 24.01.2018 11:32:44,998 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/ps-main-mirror-service Channel is 
> closed [target=ps-main-mirror-service, target 
> url=jini://*/ps-main-mirror-service_container/ps-main-mirror-service?schema=persistent&id=1&total_members=4,1&versioned=true&mirror=true&groups=POLBAY&cluster_schema=partitioned&locators=WSOSTSLXMSS01,WSOSTSLXMSS02&state=started&timeout=5000,
>  target machine connection 
> url=NIO://WSOSTSLXBMS01:11002/pid[3220]/25906827215_3_7196393537085178059_details[class
>  
> com.gigaspaces.internal.cluster.node.impl.router.AbstractConnectionProxyBasedReplicationRouter$ConnectionEndpoint(ps-main-mirror-service_container:ps-main-mirror-service)]]
> 24.01.2018 11:32:45,767 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/1 Shutdown complete
> 24.01.2018 11:32:45,768 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/container Container 
> <frqMain_container1> shutdown completed
> 
> 24.01.2018 11:32:46,037 +01:00 WARN WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/PUServiceBeanImpl Failed to delete 
> deployed processing unit from [C:\Program Files\Frequentis\Frequentis LifeX 
> Platform\gigaspaces\work\processing-units\ps-main_1_2008048111]
> 24.01.2018 11:32:46,037 +01:00 INFO WSOSTSLXBMS01/GSC/3 P1280/T24 
> SPACE/GS-LRMI Connection-pool-1-thread-1/PUServiceBeanImpl Stopped
> 
> Attached thread dump. Failover happend around 24.01.2018 11:40:20,893 +01:00
> WARN WSOSTSLXBMS01/GSC/3 P1280/T59 SPACE/GS-Watchdog/watchdog The 
> ServerEndPoint [WSOSTSLXMSS02/10.17.46.142:11001<http://10.17.46.142:11001/>] 
> is not reachable or is reachable but with no matching invocation in progress 
> at the server peer, closing invalid connection with local address 
> [/10.17.46.136:49864<http://10.17.46.136:49864/>] [ remote invocation of: 
> net.jini.lookup.DiscoveryAdmin.getMemberGroups] 
> [java.net<http://java.net>.SocketTimeoutException]
> 
> 
> Could it be ZK bug?
> 
> Thanks,
> 
> Yuval
> 
> 

Reply via email to