Hello! Please make sure you're not reusing an IgniteConfiguration object, together with its SPI objects, across node starts. Some of them are not reusable: once they have been used to start a node, they should not be used to start another one.
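
As a rough illustration of the point above (this is a toy model, not Ignite code): the sketch below has a configuration that owns a single-use component, and shows why building a fresh configuration for every start avoids the problem. `SingleUseSpi`, `NodeConfig`, `newConfig` and `tryStart` are hypothetical names invented for this example. In a real client the factory method would construct and return a new `IgniteConfiguration`, with new SPI instances, each time it is called. The stand-in fails loudly on reuse, which is a simplification; a reused SPI in a real node may misbehave in less obvious ways.

```java
// Toy model only: SingleUseSpi and NodeConfig are hypothetical stand-ins,
// not Ignite classes.
class FreshConfigSketch {

    /** Stand-in for an SPI object that keeps state after its first start. */
    static final class SingleUseSpi {
        private boolean used;

        void start() {
            if (used)
                throw new IllegalStateException("SPI instance was already used by another node");
            used = true;
        }
    }

    /** Stand-in for a configuration object that owns an SPI instance. */
    static final class NodeConfig {
        final SingleUseSpi spi = new SingleUseSpi();
    }

    /** Factory: every call returns a new config with a new SPI instance. */
    static NodeConfig newConfig() {
        return new NodeConfig();
    }

    /** Returns true if a node could be started with the given config. */
    static boolean tryStart(NodeConfig cfg) {
        try {
            cfg.spi.start();
            return true;
        }
        catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        NodeConfig shared = newConfig();
        System.out.println("first start, shared config:  " + tryStart(shared));
        System.out.println("second start, shared config: " + tryStart(shared));
        System.out.println("second start, fresh config:  " + tryStart(newConfig()));
    }
}
```

In this model the second start with the shared object is rejected, while a start with a freshly built configuration succeeds. The same habit applies to a test harness: call the factory in each test run rather than keeping the configuration in a static field.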

Regards,
--
Ilya Kasnacheev

2018-07-31 3:41 GMT+03:00 ianhamilton_modelshop <ian.hamil...@modelshop.com>:

> Hello,
>
> I'm doing a POC to see if Ignite is suitable for my company's application.
> While doing this, I have created the following environment:
>
> Configuration:
>
> Ignite version: 2.6.0
> Java version used: Java(TM) SE Runtime Environment 1.8.0_171-b11 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.171-b11
> OS: Windows 10 (local dev env)
>
> Server: running an Ignite server via the $IGNITE_HOME/bin/ignite.bat script.
> Client: Junit session running in IntelliJ, using Ignite's Java API to attach to the server, run in client mode, and activate the cluster once initially connected.
> Configuration: see attached zip file, file ignitepoc.xml. Both the client and server use the same configuration.
> PartitionExchangeProblemWhenReconnecting.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/t1951/PartitionExchangeProblemWhenReconnecting.zip>
>
> What's Happening
>
> Initial client run - ok
>
> 1. Start server up - server starts ok
> 2. Run client - client is able to connect to server and run test to completion. Client also explicitly calls Ignite.close() to shutdown cleanly.
> During the client execution, it:
> * Destroys any existing copy of the test cache from prior runs
> * Creates a new test cache
> * Loads 100K items into that cache using a DataStreamer
> * reads all items in the cache using an Iterator obtained from the cache
> * reads 100K items at random using the cache's get() method
>
> Logs from this step are available in the attached zip file - file names ClientLog-FirstRun-Success.txt, ServerLog-FirstRun-Success.txt
>
> Second client run - trouble starts
>
> The server remains up and running from the first run.
> 3. Run client again.
>
> *The problem here is that the client never successfully connects to the server.*
> The server fails responding back to one of the messages sent from the client, and I see the following exception in the logs:
>
> 2018-07-30 18:08:05.494 [exchange-worker-#42] ERROR o.a.i.i.p.c.d.d.p.GridDhtPartitionsExchangeFuture.error:137 - Failed to reinitialize local partitions (preloading will be stopped):
>
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=207a9b5d-0305-405b-9aee-32b7cbee7163, addrs=[0:0:0:0:0:0:0:1, 127.0.0.1, 172.27.225.23, 192.168.52.92], sockAddrs=[/0:0:0:0:0:0:0:1:0, /127.0.0.1:0, ip-172-27-225-23.ec2.internal/172.27.225.23:0, ip-192-168-52-92.ec2.internal/192.168.52.92:0], discPort=0, order=4, intOrder=3, lastExchangeTime=1532988478915, loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=true], topVer=4, nodeId8=798ca779, msg=Node joined: TcpDiscoveryNode [id=207a9b5d-0305-405b-9aee-32b7cbee7163, addrs=[0:0:0:0:0:0:0:1, 127.0.0.1, 172.27.225.23, 192.168.52.92], sockAddrs=[/0:0:0:0:0:0:0:1:0, /127.0.0.1:0, ip-172-27-225-23.ec2.internal/172.27.225.23:0, ip-192-168-52-92.ec2.internal/192.168.52.92:0], discPort=0, order=4, intOrder=3, lastExchangeTime=1532988478915, loc=false, ver=2.6.0#20180710-sha1:669feacc, isClient=true], type=NODE_JOINED, tstamp=1532988478959], nodeId=207a9b5d, evt=NODE_JOINED]
>
> java.lang.NullPointerException: null
>     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$11.apply(GridCacheDatabaseSharedManager.java:1243) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$11.apply(GridCacheDatabaseSharedManager.java:1239) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:383) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:353) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.rebuildIndexesIfNeeded(GridCacheDatabaseSharedManager.java:1239) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1711) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:126) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:451) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:729) ~[ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2419) [ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2299) [ignite-core-2.6.0.jar:2.6.0]
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) [ignite-core-2.6.0.jar:2.6.0]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
>
> Logs from this step are available in the attached zip file - file names ClientLog-SecondRun-ClientCannotConnect.txt, ServerLog-SecondRun-ClientCannotConnect.txt
>
> From stepping through the server code using a debugger, I can see that the usrFut variable is null on GridCacheDatabaseSharedManager.java:1243.
>
> But I have no idea whether that is the problem or if my setup should not have even gotten into that area of the code.
>
> I had to kill the client in order to stop it, otherwise it will continually wait for the message to come back.
>
> Try to run client one more time - still a problem
>
> The server is still up and running from before, it hasn't been restarted.
> 4. Try running the client again.
>
> Here again, the client hangs. I don't seem to see the NPE like before. But it continually waits for a response from the server, and I have to kill it.
>
> Logs from this step are available in the attached zip file - file names ClientLog-ThirdRun-ClientStillCannotConnect.txt, ServerLog-ThirdRun-ClientStillCannotConnect.txt
>
> The client will not successfully connect to the server unless I restart the server. Then the pattern of events shown above repeats itself - first time the client can connect, but subsequent times it hangs.
>
> *Could someone please help? Is this a bug, or have I messed up something in the configuration?*
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/