I'm new to Ignite and am exploring it as an embedded fabric for a dynamic
distributed system. My thinking probably doesn't fit Ignite patterns exactly,
so I would appreciate your feedback on the following. (I'm doing programmatic
initialization of embedded Ignite, and I'm not sure whether the users or dev
list is more appropriate. Trying here first.)
For reasons(tm), multicast discovery is not an option, so I've done my limited
testing with static IP discovery. I ran two instances of my test class, each
given its peer's IP:port:
...
TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
ipFinder.setAddresses( m_peers );
...
...and it took me a while to realise why they were not getting past the start
phase: they were deadlocked, each receiving a RES_WAIT from the peer in
org.apache.ignite.spi.discovery.tcp.ServerImpl.sendJoinRequestMessage().
If I start one instance without any peers and the other as before, everything
works fine. (Tested on 1.4.0 and 1.5.0_SNAPSHOT.)
In a typical use case, I would expect to run many nodes, started by systemd or
a package manager after software updates. Deciding to start one without initial
peers just to get through joinTopology() and be able to answer something other
than RES_WAIT to subsequent joiners seems a bit fragile. Either I would have to
elect a leader myself before launching (seems sort of redundant, given that
I'm starting a communications cluster), or I would have to be able to start the
instances and prod them to join peers afterward (this doesn't seem to be
supported). There don't seem to be any IgniteConfiguration or TcpDiscoverySpi
options around this, either.
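To make the "elect a leader myself" option concrete, here is a minimal sketch of what I have in mind. It uses plain Java with no Ignite calls; the class name SeedNodeChooser, its methods, and the sample addresses are all hypothetical. The assumption is that every node is given the same full address list, and the node whose own address sorts lowest starts with no peers while the others get the normal peer list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Ignite: picks the node that starts without peers.
public class SeedNodeChooser {

    /** True if no peer address sorts before our own (plain string ordering). */
    public static boolean isSeed(String self, List<String> peers) {
        for (String peer : peers) {
            if (peer.compareTo(self) < 0) {
                return false;
            }
        }
        return true;
    }

    /** Addresses to pass to the IP finder: empty for the seed, all other peers otherwise. */
    public static List<String> initialPeers(String self, List<String> peers) {
        if (isSeed(self, peers)) {
            return Collections.emptyList();
        }
        List<String> result = new ArrayList<>(peers);
        result.remove(self); // drop our own address if the shared list contains it
        return result;
    }

    public static void main(String[] args) {
        // Sample addresses are made up for illustration.
        List<String> all = Arrays.asList("10.0.0.1:47500", "10.0.0.2:47500");
        for (String self : all) {
            System.out.println(self + " starts with peers " + initialPeers(self, all));
        }
    }
}
```

The result of initialPeers() would then be what I pass to ipFinder.setAddresses(). The weakness is the one I mentioned: if the lowest-sorting node is down or starts late, everyone else is back to waiting on each other.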
How would established no-multicast igniters approach this startup sequence?
Speculating about code changes, would it be feasible to make the response to
external join requests independent of a node's own join attempts, allowing a
node to respond with RES_OK even if it hasn't found peers yet? Or would it be
feasible to add cluster peer addresses even after Ignition.start()? (Either may be
trickier than I think, and I haven't had a chance to dig deeper yet.)
I will be happy to provide more details or sample code, if the above is too
vague.
Thanks,
//eb