Bruce Schuchardt created GEODE-2732:
---------------------------------------
             Summary: after auto-reconnect a server is restarted on the default port of 40404
                 Key: GEODE-2732
                 URL: https://issues.apache.org/jira/browse/GEODE-2732
             Project: Geode
          Issue Type: Bug
          Components: membership
            Reporter: Bruce Schuchardt


If you start a server using gfsh with the server defined in a cache.xml and you specify the server's port, Geode will ignore this setting in the event of an auto-reconnect. I observed this in a GemFire deployment, and the code in this area hasn't changed in Apache Geode.

By chance, port 40404 was already in use when the auto-reconnect occurred, and an exception was thrown.

{noformat}
com.gemstone.gemfire.GemFireIOException: While starting bridge server CacheServer on port=40404 client subscription config policy=none client subscription config capacity=1 client subscription config overflow directory=.
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:611)
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:340)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4263)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1178)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.init(GemFireCacheImpl.java:1020)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.create(GemFireCacheImpl.java:684)
    at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.reconnect(InternalDistributedSystem.java:2909)
    at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.tryReconnect(InternalDistributedSystem.java:2655)
    at com.gemstone.gemfire.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1058)
    at com.gemstone.gemfire.distributed.internal.DistributionManager$MyListener.membershipFailure(DistributionManager.java:4822)
    at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager.uncleanShutdown(JGroupMembershipManager.java:2733)
    at com.gemstone.gemfire.distributed.internal.membership.jgroup.JGroupMembershipManager$Puller.channelClosing(JGroupMembershipManager.java:1213)
    at com.gemstone.org.jgroups.JChannel$CloserThread.run(JChannel.java:1617)
Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:463)
    at sun.nio.ch.Net.bind(Net.java:455)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.<init>(AcceptorImpl.java:432)
    at com.gemstone.gemfire.internal.cache.BridgeServerImpl.start(BridgeServerImpl.java:342)
    at com.gemstone.gemfire.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:607)
    ... 12 more
{noformat}

I think the fix is to get rid of the ThreadLocal storage of the port and bind address in CacheServerLauncher. These variables are used by the XML parser to configure a server. Gfsh sets them in its own thread, but they aren't available in the auto-reconnect thread that rebuilds the cache.
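
The suspected mechanism can be reproduced in isolation. The following is a minimal sketch, not the actual CacheServerLauncher code: the class ThreadLocalPortDemo, the field serverPort, the methods resolvePort and resolvePortOnNewThread, and the port value 50505 are all hypothetical names chosen for illustration. It only shows that a value stored in a ThreadLocal by one thread is not visible to a thread started later, which would then fall back to the default.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalPortDemo {
    // Default server port named in the report.
    static final int DEFAULT_PORT = 40404;

    // Hypothetical stand-in for a per-thread port override set by the launcher.
    static final ThreadLocal<Integer> serverPort = new ThreadLocal<>();

    // Hypothetical stand-in for the lookup done while configuring a server:
    // use the current thread's override if present, otherwise the default.
    static int resolvePort() {
        Integer override = serverPort.get();
        return override != null ? override : DEFAULT_PORT;
    }

    // Performs the same lookup on a freshly started thread, the way a
    // separate reconnect thread would, and returns what that thread saw.
    static int resolvePortOnNewThread() {
        AtomicInteger seen = new AtomicInteger();
        Thread reconnect = new Thread(() -> seen.set(resolvePort()), "reconnect-sketch");
        reconnect.start();
        try {
            reconnect.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        serverPort.set(50505); // the launching thread records the requested port
        System.out.println("launcher thread sees port " + resolvePort());             // 50505
        System.out.println("reconnect thread sees port " + resolvePortOnNewThread()); // 40404
    }
}
```

In this sketch the second lookup returns 40404 because each thread has its own copy of a ThreadLocal, and a new thread starts with none. Keeping the configured port and bind address somewhere that any thread can read would avoid that fallback, which is the direction the proposed fix points in.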