[jira] [Comment Edited] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402837#comment-17402837 ]

Pavel Vinokurov edited comment on IGNITE-15343 at 8/22/21, 5:22 PM:
--------------------------------------------------------------------

[~francopo] It would be helpful if you attached the logs from the server nodes. The log messages indicate connection issues, so the server logs could show the cause of these issues.

was (Author: pvinokurov):
[~francopo] It would be helpful if you attached the logs from server nodes.

> NullPointerException occurs when restarting ignite client application
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-15343
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15343
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Franco Po
>            Priority: Critical
>         Attachments: failed_startup-ignite_info.1st.attempt.log, failed_startup-ignite_info.2nd.attempt.log, successful_startup-ignite_info.log
>
>
> I successfully upgraded one of my API backend applications from Apache Ignite 2.6 to
> GridGain Community Edition 8.8.5 in the live environment a couple of months ago.
> The entire setup is 2 instances of this Ignite client application plus a cluster of
> 2 Ignite server instances. Planned maintenance required restarting the Ignite client
> application. However, it couldn't be started again due to the following sequence of
> exceptions (see [^failed_startup-ignite_info.1st.attempt.log] and
> [^failed_startup-ignite_info.2nd.attempt.log] for the full log):
> # java.io.IOException: Failed to get acknowledge for message: TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]]
> # java.net.SocketException: Socket is closed
> # java.lang.NullPointerException: null
> # org.apache.ignite.IgniteCheckedException: Node stopped
>
> I could restart the same Ignite client applications running in the hot standby
> environment, where the Ignite server contains no active data (see
> [^successful_startup-ignite_info.log]).
>
> Is this problem related to GG-17439 and IGNITE-11406? Which GridGain edition
> version is equivalent to Ignite 2.10?
>
> If anyone can provide insight as to how I can resolve this, that would be
> greatly appreciated.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
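When comparing the attached client logs with the server logs requested above, it can help to reduce each log to the ordered list of distinct exception types it contains, in the same way the report lists four exceptions in sequence. The following is only a sketch under stated assumptions: the class and method names are hypothetical, the pattern assumes exceptions are logged with fully qualified class names, and it is not part of Ignite or GridGain.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: lists the distinct exception types in a log, in order of first appearance. */
final class ExceptionSequenceSketch {
    /** Assumption: exceptions appear with fully qualified names such as java.io.IOException. */
    private static final Pattern EXCEPTION =
        Pattern.compile("\\b(?:[a-z_][a-z0-9_]*\\.)+(?:[A-Z]\\w*)?(?:Exception|Error)\\b");

    static List<String> firstOccurrences(List<String> logLines) {
        // LinkedHashSet drops repeats while preserving the order of first appearance.
        Set<String> seen = new LinkedHashSet<>();

        for (String line : logLines) {
            Matcher m = EXCEPTION.matcher(line);

            while (m.find())
                seen.add(m.group());
        }

        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "java.io.IOException: Failed to get acknowledge for message: TcpDiscoveryClientMetricsUpdateMessage",
            "java.net.SocketException: Socket is closed",
            "java.net.SocketException: Socket is closed",
            "java.lang.NullPointerException: null",
            "org.apache.ignite.IgniteCheckedException: Node stopped"
        );

        firstOccurrences(lines).forEach(System.out::println);
    }
}
```

Running the same reduction over a client log and a server log makes it easier to see which side reported a connection problem first.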
[jira] [Commented] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402837#comment-17402837 ]

Pavel Vinokurov commented on IGNITE-15343:
------------------------------------------

[~francopo] It would be helpful if you attached the logs from server nodes.

> NullPointerException occurs when restarting ignite client application
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-15343
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15343
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Franco Po
>            Priority: Critical
>         Attachments: failed_startup-ignite_info.1st.attempt.log, failed_startup-ignite_info.2nd.attempt.log, successful_startup-ignite_info.log
[jira] [Updated] (IGNITE-15358) Client node can't reconnect to cluster with security enabled.
[ https://issues.apache.org/jira/browse/IGNITE-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Izhikov updated IGNITE-15358:
-------------------------------------
    Priority: Blocker  (was: Major)

> Client node can't reconnect to cluster with security enabled.
> -------------------------------------------------------------
>
>                 Key: IGNITE-15358
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15358
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Nikolay Izhikov
>            Priority: Blocker
>
> After IGNITE-15101, a client node can't reconnect to the cluster because the node ID
> changes on disconnect but the security processor continues to use the old node ID.
> {code:java}
> public class ClientReconnectTest extends GridCommonAbstractTest {
>     /** {@inheritDoc} */
>     @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
>         return super.getConfiguration(igniteInstanceName).setPluginProviders(new TestReconnectSecurityPluginProvider() {
>             /** {@inheritDoc} */
>             @Override protected GridSecurityProcessor securityProcessor(GridKernalContext ctx) {
>                 return new TestReconnectProcessor(ctx) {
>                     @Override public SecurityContext securityContext(UUID subjId) {
>                         if (ctx.localNodeId().equals(subjId))
>                             return ctx.security().securityContext();
>
>                         throw new IgniteException(
>                             "Unexpected subjId[subjId=" + subjId + ",localNodeId=" + ctx.localNodeId() + ']'
>                         );
>                     }
>
>                     @Override public SecurityContext authenticateNode(ClusterNode node, SecurityCredentials cred) {
>                         return new TestSecurityContext(new TestSecuritySubject(node.id()));
>                     }
>                 };
>             }
>         });
>     }
>
>     /** {@inheritDoc} */
>     @Override protected void beforeTest() throws Exception {
>         super.beforeTest();
>
>         cleanPersistenceDir();
>     }
>
>     /** */
>     @Test
>     public void testClientNodeReconnected() throws Exception {
>         // Start two server nodes and activate the cluster.
>         IgniteEx ignite = startGrids(2);
>
>         ignite.cluster().state(ClusterState.ACTIVE);
>
>         int clientIdx = 2;
>
>         IgniteEx ex = startClientGrid(clientIdx);
>
>         CountDownLatch latch = new CountDownLatch(1);
>
>         // Count down once the client reports that it has reconnected.
>         ex.events().localListen(evt -> {
>             latch.countDown();
>
>             return true;
>         }, EVT_CLIENT_NODE_RECONNECTED);
>
>         DiscoverySpi discoverySpi = ignite(0).configuration().getDiscoverySpi();
>
>         // Fail the client node from a server node so that it has to reconnect.
>         discoverySpi.failNode(nodeId(clientIdx), null);
>
>         assertTrue(latch.await(getTestTimeout(), TimeUnit.MILLISECONDS));
>     }
> }
> {code}
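The failure mode described in IGNITE-15358 (a component keeps using a node ID captured before the disconnect, while the node now has a new one) can be illustrated without Ignite. The following is a minimal sketch, not Ignite code: `Node`, `SecurityRegistry` and `lookupSucceedsAfterReconnect` are hypothetical names, and the assumption that the old ID is dropped and the node is authenticated again under its new ID is a simplification made for illustration.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical illustration only; none of these types are Ignite classes. */
final class StaleNodeIdSketch {
    /** Stand-in for a client node that receives a new ID on every reconnect. */
    static final class Node {
        private volatile UUID id = UUID.randomUUID();

        UUID id() {
            return id;
        }

        void reconnect() {
            id = UUID.randomUUID();
        }
    }

    /** Stand-in for a security component that stores one context per node ID. */
    static final class SecurityRegistry {
        private final Map<UUID, String> contexts = new ConcurrentHashMap<>();
        private final Supplier<UUID> localNodeId;

        SecurityRegistry(Supplier<UUID> localNodeId) {
            this.localNodeId = localNodeId;
        }

        void authenticate(UUID nodeId) {
            contexts.put(nodeId, "context-for-" + nodeId);
        }

        void forget(UUID nodeId) {
            contexts.remove(nodeId);
        }

        /** True if a context exists under the ID this component believes is local. */
        boolean hasLocalContext() {
            return contexts.containsKey(localNodeId.get());
        }
    }

    /** Simulates one disconnect/reconnect cycle and reports whether the local lookup still works. */
    static boolean lookupSucceedsAfterReconnect(boolean cacheIdAtStartup) {
        Node node = new Node();
        UUID idAtStartup = node.id();

        // Either freeze the ID seen at startup, or ask the node for its current ID on every lookup.
        Supplier<UUID> localNodeId = cacheIdAtStartup ? () -> idAtStartup : node::id;

        SecurityRegistry registry = new SecurityRegistry(localNodeId);
        registry.authenticate(node.id());

        // Disconnect and reconnect: the old ID is dropped and the node is authenticated under a new ID.
        registry.forget(node.id());
        node.reconnect();
        registry.authenticate(node.id());

        return registry.hasLocalContext();
    }

    public static void main(String[] args) {
        System.out.println("cached ID: " + lookupSucceedsAfterReconnect(true));  // prints "cached ID: false"
        System.out.println("live ID:   " + lookupSucceedsAfterReconnect(false)); // prints "live ID:   true"
    }
}
```

In this sketch the lookup only survives the reconnect when the ID is resolved at lookup time; whether the actual fix in Ignite takes this shape is not stated in the issue.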
[jira] [Updated] (IGNITE-15358) Client node can't reconnect to cluster with security enabled.
[ https://issues.apache.org/jira/browse/IGNITE-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Izhikov updated IGNITE-15358:
-------------------------------------
    Issue Type: Bug  (was: Improvement)

> Client node can't reconnect to cluster with security enabled.
> -------------------------------------------------------------
>
>                 Key: IGNITE-15358
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15358
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Nikolay Izhikov
>            Priority: Major
[jira] [Assigned] (IGNITE-15358) Client node can't reconnect to cluster with security enabled.
[ https://issues.apache.org/jira/browse/IGNITE-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikolay Izhikov reassigned IGNITE-15358:
----------------------------------------

    Assignee: Nikolay Izhikov

> Client node can't reconnect to cluster with security enabled.
> -------------------------------------------------------------
>
>                 Key: IGNITE-15358
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15358
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Nikolay Izhikov
>            Assignee: Nikolay Izhikov
>            Priority: Blocker
[jira] [Created] (IGNITE-15358) Client node can't reconnect to cluster with security enabled.
Nikolay Izhikov created IGNITE-15358:
----------------------------------------

             Summary: Client node can't reconnect to cluster with security enabled.
                 Key: IGNITE-15358
                 URL: https://issues.apache.org/jira/browse/IGNITE-15358
             Project: Ignite
          Issue Type: Improvement
            Reporter: Nikolay Izhikov


After IGNITE-15101, a client node can't reconnect to the cluster because the node ID changes on disconnect but the security processor continues to use the old node ID.

{code:java}
public class ClientReconnectTest extends GridCommonAbstractTest {
    /** {@inheritDoc} */
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        return super.getConfiguration(igniteInstanceName).setPluginProviders(new TestReconnectSecurityPluginProvider() {
            /** {@inheritDoc} */
            @Override protected GridSecurityProcessor securityProcessor(GridKernalContext ctx) {
                return new TestReconnectProcessor(ctx) {
                    @Override public SecurityContext securityContext(UUID subjId) {
                        if (ctx.localNodeId().equals(subjId))
                            return ctx.security().securityContext();

                        throw new IgniteException(
                            "Unexpected subjId[subjId=" + subjId + ",localNodeId=" + ctx.localNodeId() + ']'
                        );
                    }

                    @Override public SecurityContext authenticateNode(ClusterNode node, SecurityCredentials cred) {
                        return new TestSecurityContext(new TestSecuritySubject(node.id()));
                    }
                };
            }
        });
    }

    /** {@inheritDoc} */
    @Override protected void beforeTest() throws Exception {
        super.beforeTest();

        cleanPersistenceDir();
    }

    /** */
    @Test
    public void testClientNodeReconnected() throws Exception {
        // Start two server nodes and activate the cluster.
        IgniteEx ignite = startGrids(2);

        ignite.cluster().state(ClusterState.ACTIVE);

        int clientIdx = 2;

        IgniteEx ex = startClientGrid(clientIdx);

        CountDownLatch latch = new CountDownLatch(1);

        // Count down once the client reports that it has reconnected.
        ex.events().localListen(evt -> {
            latch.countDown();

            return true;
        }, EVT_CLIENT_NODE_RECONNECTED);

        DiscoverySpi discoverySpi = ignite(0).configuration().getDiscoverySpi();

        // Fail the client node from a server node so that it has to reconnect.
        discoverySpi.failNode(nodeId(clientIdx), null);

        assertTrue(latch.await(getTestTimeout(), TimeUnit.MILLISECONDS));
    }
}
{code}
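The reproducer waits for an asynchronous event by registering a listener that counts down a `CountDownLatch`, triggering the failure, and then awaiting the latch with a timeout. A minimal sketch of that pattern using only the JDK is shown below; `EventBus` and `awaitEvent` are hypothetical stand-ins for the Ignite event API and test helpers, not part of Ignite.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of the "listen, trigger, await with timeout" pattern. */
final class LatchAwaitSketch {
    /** Minimal event source; stands in for the event API used by the reproducer. */
    static final class EventBus {
        private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

        void listen(Runnable listener) {
            listeners.add(listener);
        }

        void fire() {
            listeners.forEach(Runnable::run);
        }
    }

    /** Returns true if the event fired within the timeout, false otherwise. */
    static boolean awaitEvent(EventBus bus, Runnable trigger, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);

        // Register the listener before triggering, so the event cannot be missed.
        bus.listen(latch::countDown);

        trigger.run();

        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e) {
            // Restore the interrupt flag and report that the event was not observed.
            Thread.currentThread().interrupt();

            return false;
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();

        // The event is fired from another thread, as a reconnect notification would be.
        System.out.println(awaitEvent(bus, () -> new Thread(bus::fire).start(), 2000));

        // Nothing fires the event here, so the wait times out.
        System.out.println(awaitEvent(new EventBus(), () -> { }, 50));
    }
}
```

With the bug described in the issue, the reconnect event never arrives, so the equivalent of the second call is what the test observes: the latch times out and the assertion fails.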