I wish I could -- this cluster is running on an isolated network and I can't get the logs or configs or anything down to the Internet.
But I just figured out the problem -- I had set a very large value for failureDetectionTimeout (default is 10s). When I reverted that to the default, everything started working great.

This is interesting, because in 2.7.3, bumping up this setting didn't cause the same problem. I went back and forth between 2.7.3 and 2.8.1 a few times (using the same config w/ the large failureDetectionTimeout) and was able to replicate this -- it worked fine in 2.7.3 and broke in 2.8.1.

Hopefully this helps someone else out there,
Alan

On Thu, Sep 24, 2020 at 12:08 PM Andrei Aleksandrov <[email protected]> wrote:

> Hi,
>
> Highly likely some of the nodes go offline and try to connect again.
> Probably you had some network issues. I think I will see this and other
> information in the logs. Can you provide them?
>
> BR,
> Andrei
>
> 9/24/2020 6:54 PM, Alan Ward wrote:
>
> The only log I see is from one of the server nodes, which is spewing at a
> very high rate:
>
> [grid-nio-worker-tcp-comm-...][TcpCommunicationSpi] Accepted incoming
> communication connection [locAddr=/<ip>:47100, rmtAddr=<ip>:<port>
>
> Note that each time the log is printed, I see a different value for
> <port>.
>
> Also note that I only see these logs when I try to run ignitevisorcmd's
> "cache" command. When I run the java application that calls
> IgniteCache.size(), I don't see any such logs. But in both cases, the
> result is that the operation is just hanging.
>
> The cluster is active and I am able to insert data (albeit at a pretty
> slow rate), so it's not like things are completely non-functional. It's
> really confusing :\
>
> On Thu, Sep 24, 2020 at 11:04 AM aealexsandrov <[email protected]>
> wrote:
>
>> Hi,
>>
>> Can you please provide the full server logs?
>>
>> BR,
>> Andrei
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
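
For anyone landing on this thread later, here is a minimal sketch of where failureDetectionTimeout is set when the node is configured from Java rather than from a config file. The class name and the way the node is started are assumptions for illustration only; the actual config from the cluster above is not available. The 10-second value is the default mentioned earlier in the thread.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical class name; not taken from the original application.
public class FailureDetectionTimeoutExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();

        // The value is in milliseconds. 10_000 ms is the 10s default
        // mentioned in this thread; reverting to it resolved the hang
        // seen on 2.8.1.
        cfg.setFailureDetectionTimeout(10_000L);

        // Start a node with this configuration and stop it on exit.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("failureDetectionTimeout = "
                + ignite.configuration().getFailureDetectionTimeout() + " ms");
        }
    }
}
```

If the timeout was raised on purpose (for example to ride out long GC pauses), it may be worth testing intermediate values on 2.8.1 rather than assuming only the default works; the thread only confirms that a very large value broke and the default did not.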
