[
https://issues.apache.org/jira/browse/IGNITE-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147531#comment-15147531
]
Denis Magda commented on IGNITE-2405:
-------------------------------------
[~maseev],
Your code works fine on my side; I could not reproduce the loop during node
startup.
As I stated earlier, the cause of the issue on your side is the following:
the node tries to reconnect in a loop because you're using
TcpDiscoveryVmIpFinder, which is not shared, and because the local node's
addresses are not found in the list of addresses registered in the IP finder.
I see that you set {{InetAddress.getLocalHost().getHostName()}} for both
{{IgniteConfiguration.setLocalHost(...)}} and IP finder's addresses list.
The issue happens because, for some reason,
{{TcpDiscoverySpi.ipFinderHasLocalAddress()}} reports that the local host
address is not in the IP finder's list.
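To illustrate the kind of comparison involved, here is a minimal sketch in
plain Java. It is not Ignite's implementation: {{LocalAddressCheck}}, the
addresses and the port are made up for illustration, and the only assumption
is that the check boils down to matching socket addresses by IP and port.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not Ignite code: a simplified stand-in for a
// "does the registered list contain one of my local addresses" check.
final class LocalAddressCheck {
    /** Returns true if at least one local address also appears in the registered list. */
    static boolean hasLocalAddress(Collection<InetSocketAddress> registered,
                                   Collection<InetSocketAddress> local) {
        for (InetSocketAddress loc : local) {
            // InetSocketAddress.equals() compares IP and port for resolved
            // addresses; an unresolved address never equals a resolved one.
            if (registered.contains(loc))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Illustrative local address; IP literals are parsed without a DNS lookup.
        List<InetSocketAddress> local =
            Arrays.asList(new InetSocketAddress("10.0.0.1", 47500));

        // Registered under the same IP and port: the node finds itself.
        System.out.println(hasLocalAddress(
            Arrays.asList(new InetSocketAddress("10.0.0.1", 47500)), local));

        // Registered under a host name that was never resolved: no match.
        System.out.println(hasLocalAddress(
            Arrays.asList(InetSocketAddress.createUnresolved("node-host", 47500)), local));
    }
}
```

If the host name in the IP finder resolves to a different IP than the ones
collected for the local node, a comparison of this kind fails even though both
sides were built from the same host name.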
Would you mind helping us debug this?
- Please insert debug traces into {{TcpDiscoverySpi.ipFinderHasLocalAddress()}}
to see which addresses are being compared.
- The {{TcpDiscoveryNode}} constructor initializes the list of local addresses
at the end of its body: {{sockAddrs = U.toSocketAddresses(this, discPort);}}.
Please share with us what this call returns.
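Independently of the Ignite internals, it may also help to see what the local
host name resolves to on your machine. The following is a small sketch that
uses only the JDK; {{HostnameDiagnostics}} and {{describe}} are hypothetical
names, and the output format is arbitrary.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Ignite.
final class HostnameDiagnostics {
    /** One line per address the name resolves to; empty if resolution fails. */
    static List<String> describe(String host) {
        List<String> lines = new ArrayList<>();
        InetAddress[] resolved;
        try {
            resolved = InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            return lines; // the name could not be resolved at all
        }
        for (InetAddress addr : resolved) {
            String nic;
            try {
                // null means no local network interface has this address assigned
                NetworkInterface ni = NetworkInterface.getByInetAddress(addr);
                nic = ni == null ? "none" : ni.getName();
            } catch (SocketException e) {
                nic = "unknown";
            }
            lines.add(addr.getHostAddress()
                + " loopback=" + addr.isLoopbackAddress()
                + " interface=" + nic);
        }
        return lines;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Same expression as the one used in the test configuration.
        String host = InetAddress.getLocalHost().getHostName();
        System.out.println("Local host name: " + host);
        describe(host).forEach(System.out::println);
    }
}
```

A line showing {{loopback=true}} or {{interface=none}} for the host name would
suggest that name resolution on the machine (for example an {{/etc/hosts}}
entry) points somewhere other than the addresses the node binds to. That is
only a guess until we see the actual output.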
BTW, I've just added documentation on how two different clusters can be
configured on the same set of machines. It may be useful for you:
https://apacheignite.readme.io/docs/cluster-config#isolated-ignite-clusters-on-the-same-set-of-machin
> Ignite is blocking the thread in case it can't connect to the node
> ------------------------------------------------------------------
>
> Key: IGNITE-2405
> URL: https://issues.apache.org/jira/browse/IGNITE-2405
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 1.5.0.final
> Reporter: Miron Aseev
> Labels: community, newbie
> Attachments: log.out, log.out, log.out
>
>
> It seems that Apache Ignite runs an infinite loop if it can't connect to the
> resolved hostname.
> For example, if you specify the hostname of the local machine as an address
> for the TCPSpi, then Apache Ignite resolves it
> [here|https://github.com/apache/ignite/blob/b3d347e35a254928fd1c4a0473f1b17d642c72f3/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L915].
>
> But after that, for some reason, it fails to connect to the resolved address
> [here|https://github.com/apache/ignite/blob/b3d347e35a254928fd1c4a0473f1b17d642c72f3/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L925].
>
> As a result, the control flow goes to the catch block where the exception is
> logged (it's an IgniteSpiException. It seems it can't get access to the host
> according to the exception message - "Network operation timed out. Increase
> 'failureDetectionTimeout' configuration property
> [failureDetectionTimeout=10000]").
> In the end, we get an infinite loop.
> On the other hand, if I set an IP address (127.0.0.1) instead of the
> hostname, Ignite gets an empty list
> [here|https://github.com/apache/ignite/blob/b3d347e35a254928fd1c4a0473f1b17d642c72f3/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L915]
> and the control flow breaks the loop
> [here|https://github.com/apache/ignite/blob/b3d347e35a254928fd1c4a0473f1b17d642c72f3/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L918]
> and goes forward.
> Here's the log output that Ignite produces while running.
> {code}
> Jan 19, 2016 7:33:02 PM java.util.logging.LogManager$RootLogger log
> SEVERE: Failed to resolve default logging config file:
> config/java.util.logging.properties
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO:
> >>> __________ ________________
> >>> / _/ ___/ |/ / _/_ __/ __/
> >>> _/ // (7 7 // / / / / _/
> >>> /___/\___/_/|_/___/ /_/ /___/
> >>>
> >>> ver. 1.5.0-final#20151229-sha1:f1f8cda2
> >>> 2015 Copyright(C) Apache Software Foundation
> >>>
> >>> Ignite documentation: http://ignite.apache.org
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Config URL: n/a
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Daemon mode: off
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: OS: Linux 3.13.0-74-generic amd64
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: OS user: maseev
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Language runtime: Java Platform API Specification ver. 1.8
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: VM information: Java(TM) SE Runtime Environment 1.8.0_40-b26 Oracle
> Corporation Java HotSpot(TM) 64-Bit Server VM 25.40-b25
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: VM total memory: 3.5GB
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Remote Management [restart: off, REST: on, JMX (remote: off)]
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: IGNITE_HOME=null
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: VM arguments:
> [-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:51947,suspend=y,server=n,
> -ea, -DIGNITE_QUIET=false, -Didea.junit.sm_runner, -Dfile.encoding=UTF-8]
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache',
> 'ignite-atomics-sys-cache']
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Initial heap size is 250MB (should be no less than 512MB, use
> -Xms512m -Xmx512m).
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Non-loopback local IPs: 172.17.0.1, 192.168.88.252,
> fe80:0:0:0:42:2aff:feff:fe3d%docker0, fe80:0:0:0:f2de:f1ff:feb6:5301%eth0
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Enabled local MACs: 02422AFFFE3D, F0DEF1B65301
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Configured plugins:
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: ^-- None
> Jan 19, 2016 7:33:03 PM org.apache.ignite.logger.java.JavaLogger info
> INFO:
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: IPC shared memory server endpoint started [port=48103,
> tokDir=/tmp/ignite/work/ipc/shmem/62396752-fb2f-4402-bff6-9f5f2d4fc0b3-8658]
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Successfully bound shared memory communication to TCP port [port=48103,
> locHost=0.0.0.0/0.0.0.0]
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Successfully bound to TCP port [port=47103, locHost=0.0.0.0/0.0.0.0]
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Checkpoints are disabled (to enable configure any GridCheckpointSpi
> implementation)
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Collision resolution is disabled (all jobs will be activated upon
> arrival).
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Swap space is disabled. To enable use FileSwapSpaceSpi.
> Jan 19, 2016 7:33:04 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Security status [authentication=off, tls/ssl=off]
> Jan 19, 2016 7:33:05 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Command protocol successfully started [name=TCP binary,
> host=0.0.0.0/0.0.0.0, port=11214]
> Jan 19, 2016 7:33:05 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Successfully bound to TCP port [port=47503, localHost=0.0.0.0/0.0.0.0]
> Jan 19, 2016 7:33:13 PM org.apache.ignite.logger.java.JavaLogger info
> INFO: Your version is up to date.
> {code}
> And here's a simple
> [test|https://github.com/maseev/ignite/blob/master/src/test/java/io/github/maseev/IgniteTest.java#L23]
> which reproduces the problem. You can run it with Maven by executing
> {code}mvn test{code}