[jira] [Commented] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403882#comment-17403882 ] Pavel Vinokurov commented on IGNITE-15343: -- You could also check that 10.211.80.15:6200 is accessible from server instances > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, > server1-ignite_info.1st.attempt.log, server2-ignite_info.1st.attempt.log, > successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? 
Which is the equivalent > version of Ignite 2.10 in the GridGain edition? > If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
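The reachability check suggested in the comment above can be run from each server host with nothing but the JDK. The following is a minimal sketch, not an Ignite utility: the class name `PortCheck`, the method `isReachable`, and the 2-second timeout are illustrative assumptions, and the default address is the 10.211.80.15:6200 quoted in the comment.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Ignite): checks whether a TCP port accepts connections.
class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        }
        catch (IOException e) {
            // Refused, timed out, or unresolved host: all reported as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults come from the comment; pass another host and port as arguments if needed.
        String host = args.length > 0 ? args[0] : "10.211.80.15";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6200;

        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

A successful connect only shows that the port accepts TCP connections from that host; it does not prove that the Ignite discovery handshake itself succeeds.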
[jira] [Comment Edited] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403877#comment-17403877 ] Pavel Vinokurov edited comment on IGNITE-15343 at 8/24/21, 3:29 PM: {code:java} [2021/08/19 20:25:38.398] INFO [tcp-client-disco-msg-worker-#4-#42] [] - Router node: TcpDiscoveryNode [id=c791881c-983f-44c9-a30b-c9b12e9cb7f6, consistentId=rhdpg03, addrs=ArrayList [10.211.80.17, 127.0.0.1], sockAddrs=HashSet [/127.0.0.1:6200, rhdpg03/10.211.80.17:6200], discPort=6200, order=1, intOrder=1, lastExchangeTime=1629375878183, loc=false, ver=8.8.5#20210519-sha1:067284c6, isClient=false] {code} was (Author: pvinokurov): The difference > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, > server1-ignite_info.1st.attempt.log, server2-ignite_info.1st.attempt.log, > successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. 
However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition? > If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403877#comment-17403877 ] Pavel Vinokurov commented on IGNITE-15343: -- The difference > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, > server1-ignite_info.1st.attempt.log, server2-ignite_info.1st.attempt.log, > successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition?
> If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403228#comment-17403228 ] Pavel Vinokurov edited comment on IGNITE-15343 at 8/24/21, 3:16 PM: {code:java} [2021/08/19 20:25:38.519] WARN [main] [] - Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=b588bb65, rmtId8=7d483a80, rmtAddrs=[rhdpg02/0:0:0:0:0:0:0:1%lo, /10.211.80.16, /127.0.0.1], rmtNode=ClusterNode [id=7d483a80-4ada-4c10-b2e2-3b85a47b2d26, order=24, addr=[0:0:0:0:0:0:0:1%lo, 10.211.80.16, 127.0.0.1], daemon=false]] {code} Please add -Djava.net.preferIPv4Stack=true to all nodes including server and set IgniteConfiguration.setLocalHost() was (Author: pvinokurov): {code:java} [2021/08/19 20:25:38.519] WARN [main] [] - Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=b588bb65, rmtId8=7d483a80, rmtAddrs=[rhdpg02/0:0:0:0:0:0:0:1%lo, /10.211.80.16, /127.0.0.1], rmtNode=ClusterNode [id=7d483a80-4ada-4c10-b2e2-3b85a47b2d26, order=24, addr=[0:0:0:0:0:0:0:1%lo, 10.211.80.16, 127.0.0.1], daemon=false]] {code} Please add -Djava.net.preferIPv4Stack=true to the client node and set IgniteConfiguration.setLocalHost() > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, > server1-ignite_info.1st.attempt.log, server2-ignite_info.1st.attempt.log, > successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 
8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition? > If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
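The advice in the comment above has two parts: every JVM in the topology gets the same `java.net.preferIPv4Stack` value, and each node is bound to an explicit address through `IgniteConfiguration.setLocalHost()`. The sketch below uses only the JDK and shows how the flag can be inspected and how an IPv4 address might be picked from a list of candidates. The class `LocalHostPicker` and its methods are hypothetical names introduced for illustration, and the candidate addresses are the ones printed in `rmtAddrs` in the log. The flag itself is normally passed on the JVM command line (`-Djava.net.preferIPv4Stack=true`); setting it later from application code is assumed here not to be reliable.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not part of Ignite): inspects the IPv4 flag and picks a bind address.
class LocalHostPicker {
    /** True only when the JVM was started with -Djava.net.preferIPv4Stack=true. */
    static boolean prefersIpv4() {
        return Boolean.getBoolean("java.net.preferIPv4Stack");
    }

    /** Returns the first non-loopback IPv4 address among the candidates, if any. */
    static Optional<String> pickIpv4(List<InetAddress> candidates) {
        return candidates.stream()
            .filter(a -> a instanceof Inet4Address && !a.isLoopbackAddress())
            .map(InetAddress::getHostAddress)
            .findFirst();
    }

    public static void main(String[] args) throws Exception {
        // Addresses reported for the remote node in the log (IP literals, so no DNS lookup).
        List<InetAddress> addrs = Arrays.asList(
            InetAddress.getByName("0:0:0:0:0:0:0:1"),
            InetAddress.getByName("10.211.80.16"),
            InetAddress.getByName("127.0.0.1"));

        System.out.println("preferIPv4Stack=" + prefersIpv4());

        // The chosen value is what one would pass to IgniteConfiguration.setLocalHost(...).
        System.out.println("localHost candidate: " + pickIpv4(addrs).orElse("none"));
    }
}
```

Picking the first matching address is only a heuristic; on hosts with several network interfaces it is safer to set the address explicitly in the node configuration.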
[jira] [Commented] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403228#comment-17403228 ] Pavel Vinokurov commented on IGNITE-15343: -- {code:java} [2021/08/19 20:25:38.519] WARN [main] [] - Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=b588bb65, rmtId8=7d483a80, rmtAddrs=[rhdpg02/0:0:0:0:0:0:0:1%lo, /10.211.80.16, /127.0.0.1], rmtNode=ClusterNode [id=7d483a80-4ada-4c10-b2e2-3b85a47b2d26, order=24, addr=[0:0:0:0:0:0:0:1%lo, 10.211.80.16, 127.0.0.1], daemon=false]] {code} Please add -Djava.net.preferIPv4Stack=true to the client node and set IgniteConfiguration.setLocalHost() > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, > server1-ignite_info.1st.attempt.log, server2-ignite_info.1st.attempt.log, > successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. 
However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition? > If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402837#comment-17402837 ] Pavel Vinokurov edited comment on IGNITE-15343 at 8/22/21, 5:22 PM: [~francopo] It would be helpful if you attached the logs from server nodes. The log messages indicate connection issues, so the server logs could show the cause of these issues. was (Author: pvinokurov): [~francopo] It would be helpful if you attached the logs from server nodes. > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application.
However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition? > If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-15343) NullPointerException occurs when restarting ignite client application
[ https://issues.apache.org/jira/browse/IGNITE-15343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402837#comment-17402837 ] Pavel Vinokurov commented on IGNITE-15343: -- [~francopo] It would be helpful if you attached the logs from server nodes. > NullPointerException occurs when restarting ignite client application > - > > Key: IGNITE-15343 > URL: https://issues.apache.org/jira/browse/IGNITE-15343 > Project: Ignite > Issue Type: Bug >Reporter: Franco Po >Priority: Critical > Attachments: failed_startup-ignite_info.1st.attempt.log, > failed_startup-ignite_info.2nd.attempt.log, successful_startup-ignite_info.log > > > I upgraded one of my API backend applications from Apache Ignite 2.6 to > GridGain Community Edition 8.8.5 successfully in live environment a couple of > months ago. The entire setup is 2 instances of this ignite client application > plus a cluster of 2 ignite server instances. A planned maintenance needed to > restart the ignite client application. However, it couldn't be started again > due to a sequence of below exceptions (see > [^failed_startup-ignite_info.1st.attempt.log] and > [^failed_startup-ignite_info.2nd.attempt.log] for full log): > # java.io.IOException: Failed to get acknowledge for message: > TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage > [sndNodeId=null, id=fef7e5e5b71-b588bb65-6fe8-4aff-8f17-4a9e8733369b, > verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=true]] > # java.net.SocketException: Socket is closed > # java.lang.NullPointerException: null > # org.apache.ignite.IgniteCheckedException: Node stopped > I could restart same ignite client applications running in hot standby > environment where the ignite server contains no active data (see > [^successful_startup-ignite_info.log]). > Is this problem related to GG-17439 and IGNITE-11406? Which is the equivalent > version of Ignite 2.10 in the GridGain edition?
> If anyone can provide insight as to how I can resolve this, that would be > greatly appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (IGNITE-14439) NPE when accessing clustername before first exchange finished
[ https://issues.apache.org/jira/browse/IGNITE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402637#comment-17402637 ] Pavel Vinokurov edited comment on IGNITE-14439 at 8/21/21, 4:10 PM: Hi [~francopo], most probably NPE was caused by another issue because before calling GridServiceProcessor.onKernalStart the system cache should be initialised. Is that NPE being repeated all the time? was (Author: pvinokurov): Hi [~francopo], most probably NPE was caused by another issue because before calling GridServiceProcessor.onKernalStart the system cache should be initialised. Is those NPE being repeated all the time? > NPE when accessing clustername before first exchange finished > - > > Key: IGNITE-14439 > URL: https://issues.apache.org/jira/browse/IGNITE-14439 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.9 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > Fix For: 2.11 > > Time Spent: 20m > Remaining Estimate: 0h > > [IGNITE-11406|https://issues.apache.org/jira/browse/IGNITE-11406] has not > been fixed properly for two reasons. The first is that > _GridCacheProcessor.utilityCache_ could be accessed before the first exchange > finished. The second is that it doesn't resolve the original issue, because > _GridServiceProcessor.onKernelStop_ is followed by > _GridCacheProcessor.onKernelStop_, so caches should be already initialized. > Thus that fix should be reverted. > Reverting this fix induces the issue related to accessing the utility cache when > getting the cluster name. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-14439) NPE when accessing clustername before first exchange finished
[ https://issues.apache.org/jira/browse/IGNITE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402637#comment-17402637 ] Pavel Vinokurov commented on IGNITE-14439: -- Hi [~francopo], most probably NPE was caused by another issue because before calling GridServiceProcessor.onKernalStart the system cache should be initialised. Is those NPE being repeated all the time? > NPE when accessing clustername before first exchange finished > - > > Key: IGNITE-14439 > URL: https://issues.apache.org/jira/browse/IGNITE-14439 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.9 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > Fix For: 2.11 > > Time Spent: 20m > Remaining Estimate: 0h > > [IGNITE-11406|https://issues.apache.org/jira/browse/IGNITE-11406] has not > been fixed properly for two reasons. The first is that > _GridCacheProcessor.utilityCache_ could be accessed before the first exchange > finished. The second is that it doesn't resolve the original issue, because > _GridServiceProcessor.onKernelStop_ is followed by > _GridCacheProcessor.onKernelStop_, so caches should be already initialized. > Thus that fix should be reverted. > Reverting this fix induces the issue related to accessing the utility cache when > getting the cluster name. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IGNITE-13000) Connection.prepareStatement(String,int) always throws UnsupportedException ignoring 'autoGeneratedKeys' parameter
[ https://issues.apache.org/jira/browse/IGNITE-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Vinokurov reassigned IGNITE-13000:
Assignee: (was: Pavel Vinokurov)
> Connection.prepareStatement(String,int) always throws UnsupportedException ignoring 'autoGeneratedKeys' parameter
> ---
>
> Key: IGNITE-13000
> URL: https://issues.apache.org/jira/browse/IGNITE-13000
> Project: Ignite
> Issue Type: Bug
> Components: sql
> Affects Versions: 2.8
> Reporter: Pavel Vinokurov
> Priority: Major
>
> The method call below throws an exception:
> {code:java}
> conn.prepareStatement(query, Statement.NO_GENERATED_KEYS)
> {code}
> But the result should be the same as for:
> {code:java}
> conn.prepareStatement(query)
> {code}
> A possible fix:
> {code:java}
> @Override
> public PreparedStatement prepareStatement(String sql, int autoGeneratedKeys) throws SQLException {
>     ensureNotClosed();
>     // Only an explicit request for generated keys is rejected.
>     if (autoGeneratedKeys == Statement.RETURN_GENERATED_KEYS)
>         throw new SQLFeatureNotSupportedException("Auto generated keys are not supported.");
>     return prepareStatement(sql);
> }
> {code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
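The behaviour proposed in the issue description can be tried out in isolation. The sketch below pulls only the flag check out into a standalone class, so it runs without an Ignite JDBC driver. `AutoGeneratedKeysCheck` and `ensureSupported` are hypothetical names used for illustration and are not taken from the Ignite code base.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Statement;

// Hypothetical stand-in for the check inside prepareStatement(String, int).
class AutoGeneratedKeysCheck {
    /** Throws only when the caller explicitly asks for generated keys. */
    static void ensureSupported(int autoGeneratedKeys) throws SQLException {
        if (autoGeneratedKeys == Statement.RETURN_GENERATED_KEYS)
            throw new SQLFeatureNotSupportedException("Auto generated keys are not supported.");
    }

    public static void main(String[] args) throws SQLException {
        // NO_GENERATED_KEYS passes, so the call can fall through to prepareStatement(sql).
        ensureSupported(Statement.NO_GENERATED_KEYS);

        try {
            ensureSupported(Statement.RETURN_GENERATED_KEYS);
        }
        catch (SQLFeatureNotSupportedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this check, `Statement.NO_GENERATED_KEYS` behaves like the single-argument `prepareStatement(query)`, which is the result the issue asks for.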
[jira] [Assigned] (IGNITE-14439) NPE when accessing clustername before first exchange finished
[ https://issues.apache.org/jira/browse/IGNITE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-14439: Assignee: Pavel Vinokurov > NPE when accessing clustername before first exchange finished > - > > Key: IGNITE-14439 > URL: https://issues.apache.org/jira/browse/IGNITE-14439 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.9 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > [IGNITE-11406|https://issues.apache.org/jira/browse/IGNITE-11406] has not > been fixed properly for two reasons. The first is that > _GridCacheProcessor.utilityCache_ could be accessed before the first exchange > finished. The second is that it doesn't resolve the original issue, because > _GridServiceProcessor.onKernelStop_ is followed by > _GridCacheProcessor.onKernelStop_, so caches should be already initialized. > Thus that fix should be reverted. > Reverting this fix induces the issue related to accessing the utility cache when > getting the cluster name. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-14439) NPE when accessing clustername before first exchange finished
[ https://issues.apache.org/jira/browse/IGNITE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312174#comment-17312174 ] Pavel Vinokurov commented on IGNITE-14439: -- [~ilyak] Please review > NPE when accessing clustername before first exchange finished > - > > Key: IGNITE-14439 > URL: https://issues.apache.org/jira/browse/IGNITE-14439 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.9 >Reporter: Pavel Vinokurov >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > [IGNITE-11406|https://issues.apache.org/jira/browse/IGNITE-11406] has not > been fixed properly for two reasons. The first is that > _GridCacheProcessor.utilityCache_ could be accessed before the first exchange > finished. The second is that it doesn't resolve the original issue, because > _GridServiceProcessor.onKernelStop_ is followed by > _GridCacheProcessor.onKernelStop_, so caches should be already initialized. > Thus that fix should be reverted. > Reverting this fix induces the issue related to accessing the utility cache when > getting the cluster name. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-14443) Calcite integration. SqlFirstLastValueAggFunction support
[ https://issues.apache.org/jira/browse/IGNITE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-14443: - Priority: Major (was: Minor) > Calcite integration. SqlFirstLastValueAggFunction support > - > > Key: IGNITE-14443 > URL: https://issues.apache.org/jira/browse/IGNITE-14443 > Project: Ignite > Issue Type: New Feature > Components: sql >Affects Versions: 3.0.0-alpha1 >Reporter: Pavel Vinokurov >Priority: Major > > We need to support aggregation functions, especially > SqlFirstLastValueAggFunction, which allows simplifying and optimizing a wide range > of SQL queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-14443) Calcite integration. SqlFirstLastValueAggFunction support
Pavel Vinokurov created IGNITE-14443: Summary: Calcite integration. SqlFirstLastValueAggFunction support Key: IGNITE-14443 URL: https://issues.apache.org/jira/browse/IGNITE-14443 Project: Ignite Issue Type: New Feature Components: sql Affects Versions: 3.0.0-alpha1 Reporter: Pavel Vinokurov We need to support aggregation functions, especially SqlFirstLastValueAggFunction, which allows simplifying and optimizing a wide range of SQL queries. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-14439) NPE when accessing clustername before first exchange finished
Pavel Vinokurov created IGNITE-14439: Summary: NPE when accessing clustername before first exchange finished Key: IGNITE-14439 URL: https://issues.apache.org/jira/browse/IGNITE-14439 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 2.9 Reporter: Pavel Vinokurov [IGNITE-11406|https://issues.apache.org/jira/browse/IGNITE-11406] has not been fixed properly for two reasons. The first is that _GridCacheProcessor.utilityCache_ could be accessed before the first exchange finished. The second is that it doesn't resolve the original issue, because _GridServiceProcessor.onKernelStop_ is followed by _GridCacheProcessor.onKernelStop_, so caches should be already initialized. Thus that fix should be reverted. Reverting this fix induces the issue related to accessing the utility cache when getting the cluster name. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-14263) Failure handler is triggered by NPE on unstable topology
Pavel Vinokurov created IGNITE-14263:
Summary: Failure handler is triggered by NPE on unstable topology
Key: IGNITE-14263
URL: https://issues.apache.org/jira/browse/IGNITE-14263
Project: Ignite
Issue Type: Bug
Affects Versions: 2.9.1
Reporter: Pavel Vinokurov
Attachments: Reproducer.java
Restarting servers and clients produced the following exception:
{code:java}
SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteCheckedException: null]]
class org.apache.ignite.IgniteCheckedException: null
    at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7759)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:268)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:217)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:168)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3431)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3222)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
    at org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.ownOrphans(GridDhtPartitionTopologyImpl.java:2075)
    at org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.onExchangeDone(GridDhtPartitionTopologyImpl.java:2059)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:2535)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:159)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:475)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture$8.run(GridDhtPartitionsExchangeFuture.java:5119)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.initDone(GridDhtPartitionsExchangeFuture.java:5002)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.clientOnlyExchange(GridDhtPartitionsExchangeFuture.java:1592)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:1052)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3403)
    ... 3 more
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-14256) SQL delete statement ignores skipOnReduce and local flags for replicated caches
[ https://issues.apache.org/jira/browse/IGNITE-14256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Vinokurov updated IGNITE-14256:
-------------------------------------
    Attachment: DeleteTest.java

> SQL delete statement ignores skipOnReduce and local flags for replicated caches
> -------------------------------------------------------------------------------
>
>                 Key: IGNITE-14256
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14256
>             Project: Ignite
>          Issue Type: Bug
>          Components: sql
>    Affects Versions: 2.9.1
>            Reporter: Pavel Vinokurov
>            Priority: Major
>         Attachments: DeleteTest.java
>
>
> The delete statement removes data from all nodes, ignoring the enabled lazy and skipOnReduce flags.
> The reproducer is attached.
> The stack trace is below:
> {code:java}
> "sys-stripe-4-#68%5f5ea90d-6614-448f-9df7-0d770f0b216d%" #111 prio=5 os_prio=0 tid=0x7fbac1d59000 nid=0x7329 runnable [0x7fba8c7ed000]
>    java.lang.Thread.State: RUNNABLE
>     at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
>     at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:317)
>     at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:207)
>     - locked <0x0005cc4475f8> (a java.lang.Object)
>     at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.offer(GridNioServer.java:1988)
>     at org.apache.ignite.internal.util.nio.GridNioServer.send0(GridNioServer.java:652)
>     at org.apache.ignite.internal.util.nio.GridNioServer.send(GridNioServer.java:620)
>     at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionWrite(GridNioServer.java:3704)
>     at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
>     at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onSessionWrite(GridConnectionBytesVerifyFilter.java:80)
>     at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
>     at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionWrite(GridNioCodecFilter.java:90)
>     at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
>     at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionWrite(GridNioFilterChain.java:268)
>     at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionWrite(GridNioFilterChain.java:191)
>     at org.apache.ignite.internal.util.nio.GridNioSessionImpl.sendNoFuture(GridNioSessionImpl.java:129)
>     at org.apache.ignite.internal.util.nio.GridTcpNioCommunicationClient.sendMessage(GridTcpNioCommunicationClient.java:115)
>     at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1182)
>     at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1124)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1809)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1923)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:489)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:445)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1926)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1679)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3190)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:151)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:286)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:281)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
>     at
[jira] [Created] (IGNITE-14256) SQL delete statement ignores skipOnReduce and local flags for replicated caches
Pavel Vinokurov created IGNITE-14256:
-------------------------------------

             Summary: SQL delete statement ignores skipOnReduce and local flags for replicated caches
                 Key: IGNITE-14256
                 URL: https://issues.apache.org/jira/browse/IGNITE-14256
             Project: Ignite
          Issue Type: Bug
          Components: sql
    Affects Versions: 2.9.1
            Reporter: Pavel Vinokurov


The delete statement removes data from all nodes, ignoring the enabled lazy and skipOnReduce flags.
The reproducer is attached.
The stack trace is below:
{code:java}
"sys-stripe-4-#68%5f5ea90d-6614-448f-9df7-0d770f0b216d%" #111 prio=5 os_prio=0 tid=0x7fbac1d59000 nid=0x7329 runnable [0x7fba8c7ed000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.interrupt(Native Method)
    at sun.nio.ch.EPollArrayWrapper.interrupt(EPollArrayWrapper.java:317)
    at sun.nio.ch.EPollSelectorImpl.wakeup(EPollSelectorImpl.java:207)
    - locked <0x0005cc4475f8> (a java.lang.Object)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.offer(GridNioServer.java:1988)
    at org.apache.ignite.internal.util.nio.GridNioServer.send0(GridNioServer.java:652)
    at org.apache.ignite.internal.util.nio.GridNioServer.send(GridNioServer.java:620)
    at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionWrite(GridNioServer.java:3704)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
    at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onSessionWrite(GridConnectionBytesVerifyFilter.java:80)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
    at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionWrite(GridNioCodecFilter.java:90)
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:120)
    at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionWrite(GridNioFilterChain.java:268)
    at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionWrite(GridNioFilterChain.java:191)
    at org.apache.ignite.internal.util.nio.GridNioSessionImpl.sendNoFuture(GridNioSessionImpl.java:129)
    at org.apache.ignite.internal.util.nio.GridTcpNioCommunicationClient.sendMessage(GridTcpNioCommunicationClient.java:115)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1182)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1124)
    at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1809)
    at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1923)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1257)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1296)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.sendDhtRequests(GridDhtAtomicAbstractUpdateFuture.java:489)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicAbstractUpdateFuture.map(GridDhtAtomicAbstractUpdateFuture.java:445)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1926)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1679)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3190)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:151)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:286)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:281)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
    at
[jira] [Created] (IGNITE-13989) Destroy of persisted cache doesn't remove cache folder
Pavel Vinokurov created IGNITE-13989:
-------------------------------------

             Summary: Destroy of persisted cache doesn't remove cache folder
                 Key: IGNITE-13989
                 URL: https://issues.apache.org/jira/browse/IGNITE-13989
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.9.1
            Reporter: Pavel Vinokurov


IgniteCache#destroy doesn't remove the cache folder from the persistent storage.
Repeatedly creating and destroying dynamic caches can therefore clutter the PDS and eventually hit system limits.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (IGNITE-13960) Starvation in mgmt pool caused by MetadataTask execution
[ https://issues.apache.org/jira/browse/IGNITE-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1726#comment-1726 ]

Pavel Vinokurov commented on IGNITE-13960:
------------------------------------------

[~tledkov] Please review

> Starvation in mgmt pool caused by MetadataTask execution
> ---------------------------------------------------------
>
>                 Key: IGNITE-13960
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13960
>             Project: Ignite
>          Issue Type: Bug
>          Components: compute
>    Affects Versions: 2.9.1
>            Reporter: Pavel Vinokurov
>            Assignee: Pavel Vinokurov
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> *Issue:*
> Requesting cache metadata from multiple threads causes starvation in the mgmt pool.
> *Root Cause:*
> From the mgmt pool, GridCacheCommandHandler.MetadataJob calls GridCacheQueryManager#sqlMetadata() and GridClosureProcessor#callAsyncNoFailover().get(), which executes another internal task and waits for it. The job response of that task must also be handled in the mgmt pool, so once every mgmt thread is blocked waiting, no thread is left to process the responses. This causes starvation.
> *Proposed Fix:*
> Make GridCacheQueryManager#sqlMetadata() asynchronous and apply a continuation in GridCacheCommandHandler.MetadataJob, so that the mgmt thread is released while the future returned by sqlMetadata() completes.
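The root cause and the proposed fix can be illustrated outside Ignite with plain java.util.concurrent. The sketch below is not the Ignite code or the actual patch: the class name, runBlocking, runWithContinuation, loadMetadataAsync and the "metadata" payload are all hypothetical stand-ins, and a fixed-size executor plays the role of the mgmt pool. It only assumes that the nested task's result has to be produced on the same bounded pool that the waiting job occupies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class MgmtPoolStarvationSketch {

    /** Blocking variant: every job parks its pool thread until a nested task on the same pool finishes. */
    static boolean runBlocking(int poolSize, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        try {
            List<Future<String>> jobs = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                jobs.add(pool.submit(() -> {
                    arriveAndWait(allBusy); // make sure every pool thread runs an outer job
                    Future<String> nested = pool.submit(() -> "metadata");
                    return nested.get() + " handled"; // blocks; the nested task stays queued
                }));
            }
            return allDone(jobs, timeoutMs);
        } finally {
            pool.shutdownNow(); // interrupts the parked threads so the JVM can exit
        }
    }

    /** Continuation variant: the job returns at once and the result is handled when the nested future completes. */
    static boolean runWithContinuation(int poolSize, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        try {
            List<CompletableFuture<String>> jobs = new ArrayList<>();
            for (int i = 0; i < poolSize; i++) {
                jobs.add(CompletableFuture
                    .runAsync(() -> arriveAndWait(allBusy), pool)
                    .thenCompose(ignored -> loadMetadataAsync(pool)) // no thread waits here
                    .thenApply(meta -> meta + " handled"));
            }
            return allDone(jobs, timeoutMs);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Hypothetical stand-in for an asynchronous metadata request executed on the same pool. */
    private static CompletableFuture<String> loadMetadataAsync(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> "metadata", pool);
    }

    /** Counts down and waits until all outer jobs have started. */
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if every future completes before the shared deadline, false on timeout. */
    private static boolean allDone(List<? extends Future<String>> futures, long timeoutMs) throws Exception {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        for (Future<String> f : futures) {
            try {
                f.get(Math.max(0L, deadline - System.nanoTime()), TimeUnit.NANOSECONDS);
            } catch (TimeoutException e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking variant completes: " + runBlocking(2, 300));
        System.out.println("continuation variant completes: " + runWithContinuation(2, 5000));
    }
}
```

With a pool of two threads, the blocking variant should time out because both threads wait for nested tasks that can never be scheduled, while the continuation variant should finish because each thread is released before the nested task runs. The latch is only there to make the starvation deterministic in a small demo.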
> Thread dump showing the hanging threads:
> {code:java}
> "mgmt-#10633" #14311 prio=5 os_prio=0 tid=0x560c79117000 nid=0x134c6 waiting on condition [0x7f15baa77000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>     at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.sqlMetadata(GridCacheQueryManager.java:1803)
>     at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1123)
>     at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1088)
>     at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
>     at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
>     at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
>     at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:490)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>     at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1270)
>     at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:2088)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
>     at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
>     at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
>     at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> "mgmt-#81" #270 prio=5 os_prio=0 tid=0x562323c3c800 nid=0x592 waiting on condition [0x7fba5f378000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>     at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>     at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor$ClientChangeGlobalStateComputeRequest.run(GridClusterStateProcessor.java:1979)
>     at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4.execute(GridClosureProcessor.java:1943)
>     at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
>     at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
>     at
[jira] [Created] (IGNITE-13960) Starvation in mgmt pool caused by MetadataTask execution
Pavel Vinokurov created IGNITE-13960:
-------------------------------------

             Summary: Starvation in mgmt pool caused by MetadataTask execution
                 Key: IGNITE-13960
                 URL: https://issues.apache.org/jira/browse/IGNITE-13960
             Project: Ignite
          Issue Type: Bug
          Components: compute
    Affects Versions: 2.9.1
            Reporter: Pavel Vinokurov
            Assignee: Pavel Vinokurov


*Issue:*
Requesting cache metadata from multiple threads causes starvation in the mgmt pool.
*Root Cause:*
From the mgmt pool, GridCacheCommandHandler.MetadataJob calls GridCacheQueryManager#sqlMetadata() and GridClosureProcessor#callAsyncNoFailover().get(), which executes another internal task and waits for it. The job response of that task must also be handled in the mgmt pool, so once every mgmt thread is blocked waiting, no thread is left to process the responses. This causes starvation.
*Proposed Fix:*
Make GridCacheQueryManager#sqlMetadata() asynchronous and apply a continuation in GridCacheCommandHandler.MetadataJob, so that the mgmt thread is released while the future returned by sqlMetadata() completes.
Thread dump showing the hanging threads:
{code:java}
"mgmt-#10633" #14311 prio=5 os_prio=0 tid=0x560c79117000 nid=0x134c6 waiting on condition [0x7f15baa77000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.sqlMetadata(GridCacheQueryManager.java:1803)
    at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1123)
    at org.apache.ignite.internal.processors.rest.handlers.cache.GridCacheCommandHandler$MetadataJob.execute(GridCacheCommandHandler.java:1088)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:490)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1270)
    at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:2088)
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1635)
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1255)
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4300(GridIoManager.java:144)
    at org.apache.ignite.internal.managers.communication.GridIoManager$8.execute(GridIoManager.java:1144)
    at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:50)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
"mgmt-#81" #270 prio=5 os_prio=0 tid=0x562323c3c800 nid=0x592 waiting on condition [0x7fba5f378000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor$ClientChangeGlobalStateComputeRequest.run(GridClusterStateProcessor.java:1979)
    at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C4.execute(GridClosureProcessor.java:1943)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:567)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7069)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:561)
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:490)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1270)
    at
[jira] [Updated] (IGNITE-11406) NullPointerException may occur on client start
[ https://issues.apache.org/jira/browse/IGNITE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Vinokurov updated IGNITE-11406:
-------------------------------------
    Summary: NullPointerException may occur on client start  (was: Fix NullPointerException on client start)

> NullPointerException may occur on client start
> ----------------------------------------------
>
>                 Key: IGNITE-11406
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11406
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.9
>            Reporter: Dmitry Sherstobitov
>            Assignee: Pavel Vinokurov
>            Priority: Critical
>             Fix For: 2.10
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Found while testing fixes for https://issues.apache.org/jira/browse/IGNITE-10878
> # Start the cluster, create caches with no persistence and load data into them
> # Restart each node in the cluster in order (coordinator first)
> Do not wait until the topology message occurs
> # Try to run utilities: activate, baseline (to check that the cluster is alive)
> # Run clients and load data into the alive caches
> On the 4th step, one of the clients throws an NPE on start
> {code:java}
> 2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Connection closed, local node received force fail message, will not try to restore connection
> 2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Failed to restore closed connection, will try to reconnect [networkTimeout=5000, joinTimeout=0, failMsg=TcpDiscoveryNodeFailedMessage [failedNodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, order=90, warning=Client node considered as unreachable and will be dropped from cluster, because no metrics update messages received in interval: TcpDiscoverySpi.clientFailureDetectionTimeout() ms. It may be caused by network problems or long GC pause on client node, try to increase this parameter. [nodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, clientFailureDetectionTimeout=3], super=TcpDiscoveryAbstractMessage [sndNodeId=987d4a03-8233-4130-af5b-c06900bdb6d7, id=3642cfa1961-987d4a03-8233-4130-af5b-c06900bdb6d7, verifierNodeId=d9abbff3-4b4d-4a13-9cb1-0ca4d2436164, topVer=167, pendingIdx=0, failedNodes=null, isClient=false]]]
> 2019-02-23T18:36:24,046][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Discovery notification [node=TcpDiscoveryNode [id=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, addrs=[172.25.1.34], sockAddrs=[lab34.gridgain.local/172.25.1.34:0], discPort=0, order=165, intOrder=0, lastExchangeTime=1550936128313, loc=true, ver=2.4.15#20190222-sha1:36b1d676, isClient=true], type=CLIENT_NODE_DISCONNECTED, topVer=166]
> 2019-02-23T18:36:24,049][INFO ][tcp-client-disco-msg-worker-#4][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=165, minorTopVer=0], resVer=null, err=class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Client node disconnected: null]
> [2019-02-23T18:36:24,061][ERROR][Thread-2][IgniteKernal] Got exception while starting (will rollback startup routine).
> java.lang.NullPointerException: null
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.internalCacheEx(GridCacheProcessor.java:3886) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.utilityCache(GridCacheProcessor.java:3858) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.updateUtilityCache(GridServiceProcessor.java:290) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:233) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart(GridServiceProcessor.java:221) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1038) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1973) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1716) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1144) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1062) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:948) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:847)
[jira] [Updated] (IGNITE-11406) Fix NullPointerException on client start
[ https://issues.apache.org/jira/browse/IGNITE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Vinokurov updated IGNITE-11406:
-------------------------------------
    Summary: Fix NullPointerException on client start  (was: NullPointerException may occur on client start)
[jira] [Commented] (IGNITE-11406) NullPointerException may occur on client start
[ https://issues.apache.org/jira/browse/IGNITE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255422#comment-17255422 ]

Pavel Vinokurov commented on IGNITE-11406:
------------------------------------------

[~ilyak] Fixed!

> NullPointerException may occur on client start
> ----------------------------------------------
>
>                 Key: IGNITE-11406
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11406
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Dmitry Sherstobitov
>            Assignee: Pavel Vinokurov
>            Priority: Critical
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Found while testing fixes for https://issues.apache.org/jira/browse/IGNITE-10878
> # Start the cluster, create caches with no persistence and load data into them
> # Restart each node in the cluster in order (coordinator first)
> Do not wait until the topology message occurs
> # Try to run utilities: activate, baseline (to check that the cluster is alive)
> # Run clients and load data into the alive caches
> On the 4th step, one of the clients throws an NPE on start
> {code:java}
> 2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Connection closed, local node received force fail message, will not try to restore connection
> 2019-02-23T18:36:24,045][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Failed to restore closed connection, will try to reconnect [networkTimeout=5000, joinTimeout=0, failMsg=TcpDiscoveryNodeFailedMessage [failedNodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, order=90, warning=Client node considered as unreachable and will be dropped from cluster, because no metrics update messages received in interval: TcpDiscoverySpi.clientFailureDetectionTimeout() ms. It may be caused by network problems or long GC pause on client node, try to increase this parameter. [nodeId=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, clientFailureDetectionTimeout=3], super=TcpDiscoveryAbstractMessage [sndNodeId=987d4a03-8233-4130-af5b-c06900bdb6d7, id=3642cfa1961-987d4a03-8233-4130-af5b-c06900bdb6d7, verifierNodeId=d9abbff3-4b4d-4a13-9cb1-0ca4d2436164, topVer=167, pendingIdx=0, failedNodes=null, isClient=false]]]
> 2019-02-23T18:36:24,046][DEBUG][tcp-client-disco-msg-worker-#4][TcpDiscoverySpi] Discovery notification [node=TcpDiscoveryNode [id=80f8b6ee-6a6d-4235-86e9-1b66ea310eb6, addrs=[172.25.1.34], sockAddrs=[lab34.gridgain.local/172.25.1.34:0], discPort=0, order=165, intOrder=0, lastExchangeTime=1550936128313, loc=true, ver=2.4.15#20190222-sha1:36b1d676, isClient=true], type=CLIENT_NODE_DISCONNECTED, topVer=166]
> 2019-02-23T18:36:24,049][INFO ][tcp-client-disco-msg-worker-#4][GridDhtPartitionsExchangeFuture] Finish exchange future [startVer=AffinityTopologyVersion [topVer=165, minorTopVer=0], resVer=null, err=class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Client node disconnected: null]
> [2019-02-23T18:36:24,061][ERROR][Thread-2][IgniteKernal] Got exception while starting (will rollback startup routine).
> java.lang.NullPointerException: null
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.internalCacheEx(GridCacheProcessor.java:3886) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.cache.GridCacheProcessor.utilityCache(GridCacheProcessor.java:3858) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.updateUtilityCache(GridServiceProcessor.java:290) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart0(GridServiceProcessor.java:233) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor.onKernalStart(GridServiceProcessor.java:221) ~[ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1038) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1973) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1716) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1144) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1062) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:948) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:847) [ignite-core-2.4.15.jar:2.4.15]
>     at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:717) [ignite-core-2.4.15.jar:2.4.15]
[jira] [Commented] (IGNITE-11406) NullPointerException may occur on client start
[ https://issues.apache.org/jira/browse/IGNITE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254839#comment-17254839 ]

Pavel Vinokurov commented on IGNITE-11406:
------------------------------------------

[~ilyak] Please review
[jira] [Assigned] (IGNITE-11406) NullPointerException may occur on client start
[ https://issues.apache.org/jira/browse/IGNITE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-11406: Assignee: Pavel Vinokurov
[jira] [Commented] (IGNITE-13507) NullPointerException on tx recovery
[ https://issues.apache.org/jira/browse/IGNITE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17247090#comment-17247090 ] Pavel Vinokurov commented on IGNITE-13507: -- [~ilyak] Please review > NullPointerException on tx recovery > --- > > Key: IGNITE-13507 > URL: https://issues.apache.org/jira/browse/IGNITE-13507 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.7.5 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Server node failed because of NullPointerException on tx recovery: > {code:java} > [17:15:02,428][SEVERE][sys-#305][] Critical system error detected. Will be > handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler > [tryStop=false, timeout=0, super=AbstractFailureHandler > [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, > SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext > [type=CRITICAL_ERROR, err=class o.a.i.IgniteException: Failed to perform tx > recovery]] > class org.apache.ignite.IgniteException: Failed to perform tx recovery > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.run(IgniteTxManager.java:3288) > at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7186) > at > org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:826) > at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: java.lang.NullPointerException > at > org.apache.ignite.internal.IgniteFeatures.nodeSupports(IgniteFeatures.java:212) > at > 
org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.isMvccRecoveryMessageRequired(IgniteTxManager.java:3304) > at > org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.run(IgniteTxManager.java:3208) > ... 6 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-8719) Index left partially built if a node crashes during index create or rebuild
[ https://issues.apache.org/jira/browse/IGNITE-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243203#comment-17243203 ] Pavel Vinokurov commented on IGNITE-8719: - The issue can still be reproduced in both cases: when an index is being created and when it is being rebuilt. Considering the impact of this issue, I suppose it could be fixed before IEP-28 is implemented. > Index left partially built if a node crashes during index create or rebuild > --- > > Key: IGNITE-8719 > URL: https://issues.apache.org/jira/browse/IGNITE-8719 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Goncharuk >Priority: Critical > Attachments: IndexRebuildAfterNodeCrashTest.java, > IndexRebuildingTest.java > > > Currently, we do not have any state associated with the index tree. Consider > the following scenario: > 1) Start node, put some data > 2) Start CREATE INDEX operation > 3) Wait for a checkpoint and stop node before index creation finishes > 4) Restart node > Since the checkpoint finished, the new index tree will be persisted to the > disk, but not all data will be present in the index. > We should somehow store information about the initializing index tree and mark it > valid only after all data is indexed. The state should be persisted as well.
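The proposal in IGNITE-8719 (persist a state for the index tree and mark it valid only after all data is indexed) could look roughly like the sketch below. This is only an illustration under stated assumptions: `IndexStateMarker`, its `State` enum, and the marker file are hypothetical names and are not part of Ignite, and a plain file stands in for whatever persistent storage the real fix would use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; none of these names come from Ignite. */
final class IndexStateMarker {
    /** ABSENT: no build was started; BUILDING: build in progress; VALID: all data indexed. */
    enum State { ABSENT, BUILDING, VALID }

    private final Path file;

    IndexStateMarker(Path file) {
        this.file = file;
    }

    /** Reads the persisted state; a missing marker file means ABSENT. */
    State read() {
        try {
            if (!Files.exists(file))
                return State.ABSENT;

            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);

            return State.valueOf(text.trim());
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Persists the state. A real implementation would also need an atomic, fsynced write. */
    void write(State state) {
        try {
            Files.write(file, state.name().getBytes(StandardCharsets.UTF_8));
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** After a restart, an index still marked BUILDING is incomplete and must be rebuilt. */
    boolean needsRebuild() {
        return read() == State.BUILDING;
    }
}
```

In this sketch the node would write BUILDING before CREATE INDEX starts and VALID only after the last row is indexed, so a crash between the checkpoint and the end of the build leaves BUILDING on disk and `needsRebuild()` returns true on restart.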
[jira] [Created] (IGNITE-13792) Reconnecting clients trigger failure handler
Pavel Vinokurov created IGNITE-13792: Summary: Reconnecting clients trigger failure handler Key: IGNITE-13792 URL: https://issues.apache.org/jira/browse/IGNITE-13792 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 2.9 Reporter: Pavel Vinokurov Attachments: UnstableClients.java {code:java} Dec 01, 2020 9:38:29 PM java.util.logging.LogManager$RootLogger log SEVERE: JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteCheckedException: Affinity for topology version is not initialized [locNode=b50635ff-0324-431b-bc34-00a6cd36c9e3, grp=ignite-sys-cache, topVer=AffinityTopologyVersion [topVer=570, minorTopVer=0], head=AffinityTopologyVersion [topVer=569, minorTopVer=0], history=[AffinityTopologyVersion [topVer=551, minorTopVer=0], AffinityTopologyVersion [topVer=552, minorTopVer=0], AffinityTopologyVersion [topVer=553, minorTopVer=0], AffinityTopologyVersion [topVer=554, minorTopVer=0], AffinityTopologyVersion [topVer=555, minorTopVer=0], AffinityTopologyVersion [topVer=556, minorTopVer=0], AffinityTopologyVersion [topVer=557, minorTopVer=0], AffinityTopologyVersion [topVer=558, minorTopVer=0], AffinityTopologyVersion [topVer=559, minorTopVer=0], AffinityTopologyVersion [topVer=560, minorTopVer=0], AffinityTopologyVersion [topVer=561, minorTopVer=0], AffinityTopologyVersion [topVer=562, minorTopVer=0], AffinityTopologyVersion [topVer=563, minorTopVer=0], AffinityTopologyVersion [topVer=564, minorTopVer=0], AffinityTopologyVersion [topVer=565, minorTopVer=0], AffinityTopologyVersion [topVer=566, minorTopVer=0], AffinityTopologyVersion [topVer=567, minorTopVer=0], AffinityTopologyVersion [topVer=568, minorTopVer=0], AffinityTopologyVersion [topVer=569, minorTopVer=0] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13791) NullPointerException when topology is unstable
[ https://issues.apache.org/jira/browse/IGNITE-13791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13791: - Attachment: (was: UnstableServerTopology.java) > NullPointerException when topology is unstable > --- > > Key: IGNITE-13791 > URL: https://issues.apache.org/jira/browse/IGNITE-13791 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.9.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: UnstableServerTopology.java > > > Unstable topology with blinking server nodes leads to the critical system > error: > {code:java} > SEVERE: Critical system error detected. Will be handled accordingly to > configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, > timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet > [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], > failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, > err=java.lang.NullPointerException]] > java.lang.NullPointerException > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:5096) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:3236) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2915) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:8064) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:3086) > at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) > at > org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7995) > at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58) > Dec 01, 2020 8:22:55 PM java.util.logging.LogManager$RootLogger log > SEVERE: JVM will be halted immediately due to the failure: > [failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, > err=java.lang.NullPointerException]] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13791) NullPointerException when topology is unstable
[ https://issues.apache.org/jira/browse/IGNITE-13791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13791: - Attachment: UnstableServerTopology.java
[jira] [Created] (IGNITE-13791) NullPointerException when topology is unstable
Pavel Vinokurov created IGNITE-13791: Summary: NullPointerException when topology is unstable Key: IGNITE-13791 URL: https://issues.apache.org/jira/browse/IGNITE-13791 Project: Ignite Issue Type: Bug Components: networking Affects Versions: 2.9.1 Reporter: Pavel Vinokurov Attachments: UnstableServerTopology.java Unstable topology with blinking server nodes leads to the critical system error: {code:java} SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]] java.lang.NullPointerException at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:5096) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:3236) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2915) at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:8064) at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:3086) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7995) at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58) Dec 01, 2020 8:22:55 PM java.util.logging.LogManager$RootLogger log SEVERE: JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13507) NullPointerException on tx recovery
[ https://issues.apache.org/jira/browse/IGNITE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13507: - Summary: NullPointerException on tx recovery (was: NullPointerException error on tx recovery)
[jira] [Updated] (IGNITE-13507) NullPointerException error on tx recovery
[ https://issues.apache.org/jira/browse/IGNITE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13507: - Summary: NullPointerException error on tx recovery (was: Critical error on tx recovery)
[jira] [Assigned] (IGNITE-13507) Critical error on tx recovery
[ https://issues.apache.org/jira/browse/IGNITE-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-13507: Assignee: Pavel Vinokurov
[jira] [Created] (IGNITE-13649) Local cache causes system thread pool overflow
Pavel Vinokurov created IGNITE-13649:

Summary: Local cache causes system thread pool overflow
Key: IGNITE-13649
URL: https://issues.apache.org/jira/browse/IGNITE-13649
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.8.1
Reporter: Pavel Vinokurov
Attachments: LocalCacheAndStoreReproducerClient.java, LocalCacheAndStoreReproducerServer.java

Calling get operations for a LOCAL cache with read-through within a long-running job causes system thread pool overflow.

Scenario:
1. Start 2 server nodes using LocalCacheAndStoreReproducerServer
2. Start 1 client node using LocalCacheAndStoreReproducerClient
3. Forcibly stop the client node.

Result: The system thread pool keeps growing until the node runs out of memory (OOM).
[jira] [Created] (IGNITE-13632) Transaction hangs due to communication failures
Pavel Vinokurov created IGNITE-13632:

Summary: Transaction hangs due to communication failures
Key: IGNITE-13632
URL: https://issues.apache.org/jira/browse/IGNITE-13632
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.8.1
Reporter: Pavel Vinokurov
Attachments: TxReproducer.java

A transaction hangs after communication messages are dropped. A reproducer is attached.
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Description: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. was: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails down with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. > Node fails with "Node with the same ID was found in node IDs history" after > missing TcpDiscoveryNodeAddedMessage > > > Key: IGNITE-13590 > URL: https://issues.apache.org/jira/browse/IGNITE-13590 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: TcpDiscoveryMissingNodeAddedMessageTest.class > > > A new server node sends the join request and doesn't receive > TcpDiscoveryNodeAddedMessage due to network issues. 
> The node retries the join request and fails with: > {code:java} > Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same > ID was found in node IDs history or existing node in topology has the same ID > (fix configuration and restart local node) > {code} > Instead of fail down it could retry joining the cluster after > failureDetectionTimeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Description: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of stopping it could retry joining to the cluster after failureDetectionTimeout. was: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of stopping it could retry joining the cluster after failureDetectionTimeout.
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Attachment: (was: TcpDiscoveryMissingNodeAddedMessageTest.class) > Node fails with "Node with the same ID was found in node IDs history" after > missing TcpDiscoveryNodeAddedMessage > > > Key: IGNITE-13590 > URL: https://issues.apache.org/jira/browse/IGNITE-13590 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: TcpDiscoveryMissingNodeAddedMessageTest.java > > > A new server node sends the join request and doesn't receive > TcpDiscoveryNodeAddedMessage due to network issues. > The node retries the join request and fails with: > {code:java} > Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same > ID was found in node IDs history or existing node in topology has the same ID > (fix configuration and restart local node) > {code} > Instead of stopping it could retry joining to the cluster after > failureDetectionTimeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Attachment: TcpDiscoveryMissingNodeAddedMessageTest.java > Node fails with "Node with the same ID was found in node IDs history" after > missing TcpDiscoveryNodeAddedMessage > > > Key: IGNITE-13590 > URL: https://issues.apache.org/jira/browse/IGNITE-13590 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: TcpDiscoveryMissingNodeAddedMessageTest.java > > > A new server node sends the join request and doesn't receive > TcpDiscoveryNodeAddedMessage due to network issues. > The node retries the join request and fails with: > {code:java} > Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same > ID was found in node IDs history or existing node in topology has the same ID > (fix configuration and restart local node) > {code} > Instead of stopping it could retry joining to the cluster after > failureDetectionTimeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Description: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails down with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. was: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. it retries the join request and fails down with {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. > Node fails with "Node with the same ID was found in node IDs history" after > missing TcpDiscoveryNodeAddedMessage > > > Key: IGNITE-13590 > URL: https://issues.apache.org/jira/browse/IGNITE-13590 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: TcpDiscoveryMissingNodeAddedMessageTest.class > > > A new server node sends the join request and doesn't receive > TcpDiscoveryNodeAddedMessage due to network issues. 
> The node retries the join request and fails down with: > {code:java} > Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same > ID was found in node IDs history or existing node in topology has the same ID > (fix configuration and restart local node) > {code} > Instead of fail down it could retry joining the cluster after > failureDetectionTimeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
[ https://issues.apache.org/jira/browse/IGNITE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13590: - Description: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of stopping it could retry joining the cluster after failureDetectionTimeout. was: A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. The node retries the join request and fails with: {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. > Node fails with "Node with the same ID was found in node IDs history" after > missing TcpDiscoveryNodeAddedMessage > > > Key: IGNITE-13590 > URL: https://issues.apache.org/jira/browse/IGNITE-13590 > Project: Ignite > Issue Type: Bug > Components: networking >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: TcpDiscoveryMissingNodeAddedMessageTest.class > > > A new server node sends the join request and doesn't receive > TcpDiscoveryNodeAddedMessage due to network issues. > The node retries the join request and fails with: > {code:java} > Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same > ID was found in node IDs history or existing node in topology has the same ID > (fix configuration and restart local node) > {code} > Instead of stopping it could retry joining the cluster after > failureDetectionTimeout. 
[jira] [Created] (IGNITE-13590) Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage
Pavel Vinokurov created IGNITE-13590: Summary: Node fails with "Node with the same ID was found in node IDs history" after missing TcpDiscoveryNodeAddedMessage Key: IGNITE-13590 URL: https://issues.apache.org/jira/browse/IGNITE-13590 Project: Ignite Issue Type: Bug Components: networking Affects Versions: 2.8.1 Reporter: Pavel Vinokurov Attachments: TcpDiscoveryMissingNodeAddedMessageTest.class A new server node sends the join request and doesn't receive TcpDiscoveryNodeAddedMessage due to network issues. it retries the join request and fails down with {code:java} Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) {code} Instead of fail down it could retry joining the cluster after failureDetectionTimeout. -- This message was sent by Atlassian Jira (v8.3.4#803005)
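The retry behavior proposed in this issue can be illustrated with a minimal, generic sketch. It is not Ignite code: `JoinRetrySketch`, `retry`, `maxAttempts` and `delayMs` are hypothetical names, the join attempt is simulated by a lambda, and the sleep only stands in for waiting out failureDetectionTimeout.

```java
import java.util.function.Supplier;

public class JoinRetrySketch {
    /**
     * Calls {@code attempt} until it succeeds or {@code maxAttempts} is reached,
     * sleeping {@code delayMs} between failed attempts.
     * All names here are hypothetical and not part of the Ignite API.
     */
    static <T> T retry(Supplier<T> attempt, int maxAttempts, long delayMs) {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        RuntimeException last = null;

        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            }
            catch (RuntimeException e) {
                last = e; // Remember the failure and retry instead of stopping.
            }

            if (i < maxAttempts) {
                try {
                    Thread.sleep(delayMs); // Stand-in for waiting failureDetectionTimeout.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", e);
                }
            }
        }

        throw last; // Every attempt failed: surface the last error.
    }

    public static void main(String[] args) {
        int[] calls = {0};

        // Simulated join: fails twice with a duplicate-ID style error, then succeeds.
        String res = retry(() -> {
            if (++calls[0] < 3)
                throw new IllegalStateException("Node with the same ID was found");

            return "joined";
        }, 5, 10L);

        System.out.println(res + " after " + calls[0] + " attempts");
    }
}
```

Bounding the number of attempts keeps a genuinely misconfigured node (a real duplicate ID) from retrying forever; whether a bound is appropriate for the actual discovery SPI is an open design question, not something the issue states.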
[jira] [Created] (IGNITE-13507) Critical error on tx recovery
Pavel Vinokurov created IGNITE-13507: Summary: Critical error on tx recovery Key: IGNITE-13507 URL: https://issues.apache.org/jira/browse/IGNITE-13507 Project: Ignite Issue Type: Bug Components: cache Affects Versions: 2.7.5 Reporter: Pavel Vinokurov Server node failed because of NullPointerException on tx recovery: {code:java} [17:15:02,428][SEVERE][sys-#305][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteException: Failed to perform tx recovery]] class org.apache.ignite.IgniteException: Failed to perform tx recovery at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.run(IgniteTxManager.java:3288) at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7186) at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:826) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.lang.NullPointerException at org.apache.ignite.internal.IgniteFeatures.nodeSupports(IgniteFeatures.java:212) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.isMvccRecoveryMessageRequired(IgniteTxManager.java:3304) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxManager$TxRecoveryInitRunnable.run(IgniteTxManager.java:3208) ... 6 more {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13439) Printing detailed classpath slowdowns node initialization
[ https://issues.apache.org/jira/browse/IGNITE-13439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17199244#comment-17199244 ] Pavel Vinokurov commented on IGNITE-13439: -- [~ilyak] Please review > Printing detailed classpath slowdowns node initialization > - > > Key: IGNITE-13439 > URL: https://issues.apache.org/jira/browse/IGNITE-13439 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > If IGNITE_LOG_CLASSPATH_CONTENT_ON_STARTUP is enabled, > IgniteKernel#ackClassPathContent parses the classpath and recursively > traverses the file system printing all jars and class files. > Traversing the files system could take much time in case of many class files > or having a root folder in the classpath. > The reasonable behavior is to print only root classpath folders. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13439) Printing detailed classpath slowdowns node initialization
[ https://issues.apache.org/jira/browse/IGNITE-13439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-13439: - Reviewer: Ilya Kasnacheev > Printing detailed classpath slowdowns node initialization > - > > Key: IGNITE-13439 > URL: https://issues.apache.org/jira/browse/IGNITE-13439 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > If IGNITE_LOG_CLASSPATH_CONTENT_ON_STARTUP is enabled, > IgniteKernel#ackClassPathContent parses the classpath and recursively > traverses the file system printing all jars and class files. > Traversing the files system could take much time in case of many class files > or having a root folder in the classpath. > The reasonable behavior is to print only root classpath folders. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IGNITE-13439) Printing detailed classpath slowdowns node initialization
[ https://issues.apache.org/jira/browse/IGNITE-13439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-13439: Assignee: Pavel Vinokurov > Printing detailed classpath slowdowns node initialization > - > > Key: IGNITE-13439 > URL: https://issues.apache.org/jira/browse/IGNITE-13439 > Project: Ignite > Issue Type: Bug > Components: general >Affects Versions: 2.8.1 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > If IGNITE_LOG_CLASSPATH_CONTENT_ON_STARTUP is enabled, > IgniteKernel#ackClassPathContent parses the classpath and recursively > traverses the file system printing all jars and class files. > Traversing the files system could take much time in case of many class files > or having a root folder in the classpath. > The reasonable behavior is to print only root classpath folders. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13439) Printing detailed classpath slowdowns node initialization
Pavel Vinokurov created IGNITE-13439: Summary: Printing detailed classpath slowdowns node initialization Key: IGNITE-13439 URL: https://issues.apache.org/jira/browse/IGNITE-13439 Project: Ignite Issue Type: Bug Components: general Affects Versions: 2.8.1 Reporter: Pavel Vinokurov If IGNITE_LOG_CLASSPATH_CONTENT_ON_STARTUP is enabled, IgniteKernel#ackClassPathContent parses the classpath and recursively traverses the file system, printing all jars and class files. Traversing the file system can take a long time when there are many class files or when a root folder is on the classpath. A more reasonable behavior is to print only the root classpath folders. -- This message was sent by Atlassian Jira (v8.3.4#803005)
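The suggested behavior (print only the root classpath entries, without descending into them) might look roughly like the sketch below. It is an illustration only and not the code of IgniteKernel#ackClassPathContent; the class `ClasspathRoots` and the method `rootEntries` are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathRoots {
    /** Splits a classpath string into its top-level entries without touching the file system. */
    static List<String> rootEntries(String classpath) {
        List<String> roots = new ArrayList<>();

        for (String entry : classpath.split(File.pathSeparator)) {
            if (!entry.isEmpty())
                roots.add(entry); // Keep the entry as-is; no recursive traversal.
        }

        return roots;
    }

    public static void main(String[] args) {
        // Print each root entry of the current JVM classpath on its own line.
        for (String root : rootEntries(System.getProperty("java.class.path", "")))
            System.out.println(root);
    }
}
```

Because no directory is walked, the cost is proportional to the number of classpath entries rather than to the number of files under them.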
[jira] [Updated] (IGNITE-9474) Ignite does not eagerly remove expired cache entries
[ https://issues.apache.org/jira/browse/IGNITE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-9474: Attachment: IgniteExpirationReproducerWithoutPersistance.java > Ignite does not eagerly remove expired cache entries > > > Key: IGNITE-9474 > URL: https://issues.apache.org/jira/browse/IGNITE-9474 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: IgniteExpirationReproducerWithoutPersistance.java > > > cache.size() reports existing rows, but any get operation returns an empty > result. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9474) Ignite does not eagerly remove expired cache entries
[ https://issues.apache.org/jira/browse/IGNITE-9474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-9474: Attachment: (was: IgniteExpirationReproducerWithoutPersistance.java) > Ignite does not eagerly remove expired cache entries > > > Key: IGNITE-9474 > URL: https://issues.apache.org/jira/browse/IGNITE-9474 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Priority: Major > > cache.size() reports existing rows, but any get operation returns an empty > result. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IGNITE-13000) Connection.prepareStatement(String,int) always throws UnsupportedException ignoring 'autoGeneratedKeys' parameter
[ https://issues.apache.org/jira/browse/IGNITE-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-13000: Assignee: Pavel Vinokurov > Connection.prepareStatement(String,int) always throws UnsupportedException > ignoring 'autoGeneratedKeys' parameter > --- > > Key: IGNITE-13000 > URL: https://issues.apache.org/jira/browse/IGNITE-13000 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.8 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > > Below the method call throwing Exception. > {code:java} > conn.prepareStatement(query, Statement.NO_GENERATED_KEYS) > {code} > But there is should be the same result as for: > {code:java} > conn.prepareStatement(query) > {code} > The possible fix: > {code:java} > @Override > public PreparedStatement prepareStatement(String sql, int autoGeneratedKeys) > throws SQLException { > ensureNotClosed(); > if(autoGeneratedKeys == Statement.RETURN_GENERATED_KEYS) > throw new SQLFeatureNotSupportedException("Auto generated keys are > not supported."); > return prepareStatement(sql); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13000) Connection.prepareStatement(String,int) always throws UnsupportedException ignoring 'autoGeneratedKeys' parameter
Pavel Vinokurov created IGNITE-13000: Summary: Connection.prepareStatement(String,int) always throws UnsupportedException ignoring 'autoGeneratedKeys' parameter Key: IGNITE-13000 URL: https://issues.apache.org/jira/browse/IGNITE-13000 Project: Ignite Issue Type: Bug Components: sql Affects Versions: 2.8 Reporter: Pavel Vinokurov The method call below throws an exception: {code:java} conn.prepareStatement(query, Statement.NO_GENERATED_KEYS) {code} But the result should be the same as for: {code:java} conn.prepareStatement(query) {code} A possible fix: {code:java} @Override public PreparedStatement prepareStatement(String sql, int autoGeneratedKeys) throws SQLException { ensureNotClosed(); if(autoGeneratedKeys == Statement.RETURN_GENERATED_KEYS) throw new SQLFeatureNotSupportedException("Auto generated keys are not supported."); return prepareStatement(sql); } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
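The decision made by the possible fix above can be checked in isolation with a small sketch that needs only the JDK. `AutoGeneratedKeysCheck`, `checkAutoGeneratedKeys` and `isRejected` are hypothetical helper names, not part of the Ignite JDBC driver; only the `java.sql.Statement` constants and `SQLFeatureNotSupportedException` are standard.

```java
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Statement;

public class AutoGeneratedKeysCheck {
    /** Rejects only the flag that actually requests generated keys. */
    static void checkAutoGeneratedKeys(int autoGeneratedKeys) throws SQLFeatureNotSupportedException {
        if (autoGeneratedKeys == Statement.RETURN_GENERATED_KEYS)
            throw new SQLFeatureNotSupportedException("Auto generated keys are not supported.");
    }

    /** Convenience wrapper: true if the flag would be rejected by the check above. */
    static boolean isRejected(int autoGeneratedKeys) {
        try {
            checkAutoGeneratedKeys(autoGeneratedKeys);

            return false;
        }
        catch (SQLFeatureNotSupportedException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NO_GENERATED_KEYS rejected: " + isRejected(Statement.NO_GENERATED_KEYS));
        System.out.println("RETURN_GENERATED_KEYS rejected: " + isRejected(Statement.RETURN_GENERATED_KEYS));
    }
}
```

With this check, Statement.NO_GENERATED_KEYS passes through and can be delegated to the single-argument prepareStatement(sql), which is the behavior the issue asks for.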
[jira] [Updated] (IGNITE-11798) Memory leak on unstable topology caused by partition reservation
[ https://issues.apache.org/jira/browse/IGNITE-11798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11798: - Affects Version/s: 2.7 > Memory leak on unstable topology caused by partition reservation > > > Key: IGNITE-11798 > URL: https://issues.apache.org/jira/browse/IGNITE-11798 > Project: Ignite > Issue Type: Bug > Components: cache, sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: PartitionReservationReproducer.java > > > Executing queries on unstable topology leads to OOM caused by leak of the > partition reservation. > The reproducer is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11798) Memory leak on unstable topology caused by partition reservation
[ https://issues.apache.org/jira/browse/IGNITE-11798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11798: - Summary: Memory leak on unstable topology caused by partition reservation (was: Memory leak on unstable topology caused by reservation partitions) > Memory leak on unstable topology caused by partition reservation > > > Key: IGNITE-11798 > URL: https://issues.apache.org/jira/browse/IGNITE-11798 > Project: Ignite > Issue Type: Bug > Components: cache, sql >Reporter: Pavel Vinokurov >Priority: Major > Attachments: PartitionReservationReproducer.java > > > Executing queries on unstable topology leads to OOM caused by leak of the > partition reservation. > The reproducer is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11798) Memory leak on unstable topology caused by reservation partitions
Pavel Vinokurov created IGNITE-11798: Summary: Memory leak on unstable topology caused by reservation partitions Key: IGNITE-11798 URL: https://issues.apache.org/jira/browse/IGNITE-11798 Project: Ignite Issue Type: Bug Components: cache, sql Reporter: Pavel Vinokurov Attachments: PartitionReservationReproducer.java Executing queries on unstable topology leads to OOM caused by leak of the partition reservation. The reproducer is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-11544) Unclear behavior for cache operations using classes different from specified as indexed types
[ https://issues.apache.org/jira/browse/IGNITE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818999#comment-16818999 ] Pavel Vinokurov commented on IGNITE-11544: -- [~zstan] The main issue is that cache2.removeAll() throws CorruptedTreeException. Thus, at the very least, the removeAll() operation cannot be performed. > Unclear behavior for cache operations using classes different from specified > as indexed types > - > > Key: IGNITE-11544 > URL: https://issues.apache.org/jira/browse/IGNITE-11544 > Project: Ignite > Issue Type: Bug > Components: cache, sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Assignee: Igor Belyakov >Priority: Major > Attachments: IndexedTypesReproducer.java > > > There are a few cases presented in the attached reproducer where caches are > populated with objects of classes different from those specified in > CacheConfiguration#setIndexedTypes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11699) Node can't start after forced shutdown if the wal archiver disabled
[ https://issues.apache.org/jira/browse/IGNITE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11699: - Attachment: disabled-wal-archive-reproducer.zip > Node can't start after forced shutdown if the wal archiver disabled > --- > > Key: IGNITE-11699 > URL: https://issues.apache.org/jira/browse/IGNITE-11699 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: disabled-wal-archive-reproducer.zip > > > If a server node killed with the disabled wal archive, it could fail on start > with following exception: > {code:java} > [18:37:53,887][SEVERE][sys-stripe-1-#2][G] Failed to execute runnable. > java.lang.IllegalStateException: Failed to get page IO instance (page content > is corrupted) > at > org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:85) > at > org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:97) > at > org.apache.ignite.internal.pagemem.wal.record.delta.MetaPageUpdatePartitionDataRecord.applyDelta(MetaPageUpdatePartitionDataRecord.java:109) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyPageDelta(GridCacheDatabaseSharedManager.java:2532) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$11(GridCacheDatabaseSharedManager.java:2327) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApplyPage$12(GridCacheDatabaseSharedManager.java:2441) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApply$13(GridCacheDatabaseSharedManager.java:2479) > at > org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:550) > at > 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) > at java.lang.Thread.run(Thread.java:748) > {code} > The reproducer is attached(works only on Linux). > Steps to run the reproducer. > 1. Copy config/server.xml into IGNITE_HOME/config folder; > 2. Set IGNITE_HOME in the CorruptionReproducer class; > 3. Launch CorruptionReproducer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11699) Node can't start after forced shutdown if the wal archiver disabled
Pavel Vinokurov created IGNITE-11699: Summary: Node can't start after forced shutdown if the wal archiver disabled Key: IGNITE-11699 URL: https://issues.apache.org/jira/browse/IGNITE-11699 Project: Ignite Issue Type: Bug Components: persistence Affects Versions: 2.7 Reporter: Pavel Vinokurov
If a server node is killed while the WAL archive is disabled, it can fail on start with the following exception:
{code:java}
[18:37:53,887][SEVERE][sys-stripe-1-#2][G] Failed to execute runnable.
java.lang.IllegalStateException: Failed to get page IO instance (page content is corrupted)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:85)
    at org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:97)
    at org.apache.ignite.internal.pagemem.wal.record.delta.MetaPageUpdatePartitionDataRecord.applyDelta(MetaPageUpdatePartitionDataRecord.java:109)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.applyPageDelta(GridCacheDatabaseSharedManager.java:2532)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$performBinaryMemoryRestore$11(GridCacheDatabaseSharedManager.java:2327)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApplyPage$12(GridCacheDatabaseSharedManager.java:2441)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.lambda$stripedApply$13(GridCacheDatabaseSharedManager.java:2479)
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:550)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
{code}
The reproducer is attached (works only on Linux). Steps to run the reproducer:
1. Copy config/server.xml into the IGNITE_HOME/config folder;
2. Set IGNITE_HOME in the CorruptionReproducer class;
3. Launch CorruptionReproducer.
[jira] [Assigned] (IGNITE-8357) Recreated atomic sequence produces "Sequence was removed from cache"
[ https://issues.apache.org/jira/browse/IGNITE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov reassigned IGNITE-8357: --- Assignee: (was: Pavel Vinokurov) > Recreated atomic sequence produces "Sequence was removed from cache" > > > Key: IGNITE-8357 > URL: https://issues.apache.org/jira/browse/IGNITE-8357 > Project: Ignite > Issue Type: Bug > Components: data structures >Affects Versions: 2.4 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: RecreatingAtomicSequence.java > > > If a cluster has two or more nodes, recreated atomic sequence produces error > on incrementAndGet operation. > The reproducer is attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov resolved IGNITE-11378. -- Resolution: Not A Problem There is the long checkpoint process caused by the small data region size. > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > {code:java} > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > {code} > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > {code:java} > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. 
This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-9626) Applying WAL updates ignores evicition policy
[ https://issues.apache.org/jira/browse/IGNITE-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov resolved IGNITE-9626. - Resolution: Duplicate > Applying WAL updates ignores evicition policy > - > > Key: IGNITE-9626 > URL: https://issues.apache.org/jira/browse/IGNITE-9626 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.6 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > Attachments: IgniteExpirationWitPeristanceReproducer.java > > > Steps to reproduce: > 1. Add a record to a cache obtained by ignite.cache().withExpiryPolicy(). > 2. Stop the node before a checkpoint. > 3. Start the node and get the record from the cache after the specified duration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11585) Update Spring dependency to version 5
[ https://issues.apache.org/jira/browse/IGNITE-11585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11585: - Issue Type: Wish (was: Improvement) > Update Spring dependency to version 5 > - > > Key: IGNITE-11585 > URL: https://issues.apache.org/jira/browse/IGNITE-11585 > Project: Ignite > Issue Type: Wish >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11585) Update Spring dependency to version 5
Pavel Vinokurov created IGNITE-11585: Summary: Update Spring dependency to version 5 Key: IGNITE-11585 URL: https://issues.apache.org/jira/browse/IGNITE-11585 Project: Ignite Issue Type: Improvement Reporter: Pavel Vinokurov Assignee: Pavel Vinokurov -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-7718) Collections.singleton() and Collections.singletonMap() are not properly serialized by binary marshaller
[ https://issues.apache.org/jira/browse/IGNITE-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795146#comment-16795146 ] Pavel Vinokurov commented on IGNITE-7718: - [~amashenkov] Please review > Collections.singleton() and Collections.singletonMap() are not properly > serialized by binary marshaller > --- > > Key: IGNITE-7718 > URL: https://issues.apache.org/jira/browse/IGNITE-7718 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.3 >Reporter: Pavel Vinokurov >Assignee: Pavel Vinokurov >Priority: Major > > After deserialization, collections obtained by Collections.singleton() and > Collections.singletonMap() do not return collections of binary objects, but > rather collections of deserialized objects. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11544) Unclear behavior for cache operations using classes different from specified as indexed types
Pavel Vinokurov created IGNITE-11544: Summary: Unclear behavior for cache operations using classes different from specified as indexed types Key: IGNITE-11544 URL: https://issues.apache.org/jira/browse/IGNITE-11544 Project: Ignite Issue Type: Bug Components: cache, sql Affects Versions: 2.7 Reporter: Pavel Vinokurov Attachments: IndexedTypesReproducer.java The attached reproducer presents a few cases where caches are populated with objects of classes different from those specified in CacheConfiguration#setIndexedTypes.
[jira] [Created] (IGNITE-11524) Memory leak caused by executing an jdbc prepared statement
Pavel Vinokurov created IGNITE-11524: Summary: Memory leak caused by executing an jdbc prepared statement Key: IGNITE-11524 URL: https://issues.apache.org/jira/browse/IGNITE-11524 Project: Ignite Issue Type: Bug Components: sql, thin client Reporter: Pavel Vinokurov Fix For: 2.7 Attachments: PreparedStatementOOMReproducer.java Executing a prepared statement multiple times leads to an OOM. VisualVM indicates that the heap contains a lot of JdbcThinPreparedStatement objects. The reproducer is attached.
[jira] [Comment Edited] (IGNITE-11419) Memory leak after multiple restarts of server node within the same thread
[ https://issues.apache.org/jira/browse/IGNITE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784180#comment-16784180 ] Pavel Vinokurov edited comment on IGNITE-11419 at 3/5/19 8:28 AM: -- ConnectionManager declares a thread-local variable initialized with an instance of an anonymous class. Thus the thread-local variable is linked to the ConnectionManager, which leads to a memory leak. {code:java} /** Connection cache. */ private final ThreadLocal.Reusable> threadConn = new ThreadLocal.Reusable>() { @Override public ThreadLocalObjectPool.Reusable get() { ThreadLocalObjectPool.Reusable reusable = super.get(); {code} was (Author: pvinokurov): ConnectionManager declares the thread local variable initialized by the instance of anonymous class.Thus the thread local variably linked with the ConnectionManager. It leads to memory leak. {code:java} /** Connection cache. */ private final ThreadLocal.Reusable> threadConn = new ThreadLocal.Reusable>() { @Override public ThreadLocalObjectPool.Reusable get() { ThreadLocalObjectPool.Reusable reusable = super.get(); {code} > Memory leak after multiple restarts of server node within the same thread > - > > Key: IGNITE-11419 > URL: https://issues.apache.org/jira/browse/IGNITE-11419 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Multiple restarts of a server node with enabled persistence and 20 caches > lead to OutOfMemory.
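The mechanism described in the IGNITE-11419 comment above can be illustrated with a minimal, JDK-only sketch. It is not Ignite code: Manager, threadConn (reused here only as a field name) and capturesManager are hypothetical stand-ins. The point is that an anonymous ThreadLocal subclass created inside an instance keeps a hidden reference to that instance, so whatever retains the thread-local also retains the manager.

```java
import java.lang.reflect.Field;

public class ThreadLocalCaptureSketch {
    /** Hypothetical stand-in for a manager object; not Ignite's ConnectionManager. */
    static class Manager {
        final String name;

        Manager(String name) {
            this.name = name;
        }

        /** Anonymous subclass: it reads the outer field, so it must hold a reference to this Manager. */
        final ThreadLocal<String> threadConn = new ThreadLocal<String>() {
            @Override protected String initialValue() {
                return name + "-conn";
            }
        };
    }

    /** Returns true if the object's class declares a field whose type is Manager. */
    static boolean capturesManager(Object o) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.getType() == Manager.class)
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Manager m = new Manager("node-1");

        System.out.println(m.threadConn.get());
        // Reports whether the anonymous ThreadLocal holds the enclosing Manager.
        System.out.println(capturesManager(m.threadConn));
    }
}
```

A common way to avoid this kind of capture, offered as a general Java practice and not as the fix adopted for this issue, is to build the thread-local from a static factory or a static nested class and to call remove() when the owner is stopped.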
[jira] [Updated] (IGNITE-11419) Memory leak after multiple restarts of server node within the same thread
[ https://issues.apache.org/jira/browse/IGNITE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11419: - Affects Version/s: 2.7 > Memory leak after multiple restarts of server node within the same thread > - > > Key: IGNITE-11419 > URL: https://issues.apache.org/jira/browse/IGNITE-11419 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Minor > > Multiple restarts of a server node with enabled persistence and 20 caches > lead to OutOfMemory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11419) Memory leak after multiple restarts of server node within the same thread
[ https://issues.apache.org/jira/browse/IGNITE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11419: - Component/s: sql > Memory leak after multiple restarts of server node within the same thread > - > > Key: IGNITE-11419 > URL: https://issues.apache.org/jira/browse/IGNITE-11419 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Minor > > Multiple restarts of a server node with enabled persistence and 20 caches > lead to OutOfMemory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11419) Memory leak after multiple restarts of server node within the same thread
[ https://issues.apache.org/jira/browse/IGNITE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11419: - Ignite Flags: (was: Docs Required) > Memory leak after multiple restarts of server node within the same thread > - > > Key: IGNITE-11419 > URL: https://issues.apache.org/jira/browse/IGNITE-11419 > Project: Ignite > Issue Type: Bug > Components: sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Minor > > Multiple restarts of a server node with enabled persistence and 20 caches > lead to OutOfMemory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-11419) Memory leak after multiple restarts of server node within the same jvm
[ https://issues.apache.org/jira/browse/IGNITE-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784180#comment-16784180 ] Pavel Vinokurov commented on IGNITE-11419: -- ConnectionManager declares a thread-local variable initialized with an instance of an anonymous class. Thus the thread-local variable is linked to the ConnectionManager, which leads to a memory leak. {code:java} /** Connection cache. */ private final ThreadLocal.Reusable> threadConn = new ThreadLocal.Reusable>() { @Override public ThreadLocalObjectPool.Reusable get() { ThreadLocalObjectPool.Reusable reusable = super.get(); {code} > Memory leak after multiple restarts of server node within the same jvm > -- > > Key: IGNITE-11419 > URL: https://issues.apache.org/jira/browse/IGNITE-11419 > Project: Ignite > Issue Type: Bug >Reporter: Pavel Vinokurov >Priority: Minor > > Multiple restarts of a server node with enabled persistence and 20 caches > lead to OutOfMemory.
[jira] [Created] (IGNITE-11419) Memory leak after multiple restarts of server node within the same jvm
Pavel Vinokurov created IGNITE-11419: Summary: Memory leak after multiple restarts of server node within the same jvm Key: IGNITE-11419 URL: https://issues.apache.org/jira/browse/IGNITE-11419 Project: Ignite Issue Type: Bug Reporter: Pavel Vinokurov Multiple restarts of a server node with enabled persistence and 20 caches lead to OutOfMemory.
[jira] [Commented] (IGNITE-11383) Unable to restart node with WALMode.NONE
[ https://issues.apache.org/jira/browse/IGNITE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775018#comment-16775018 ] Pavel Vinokurov commented on IGNITE-11383: -- [~dpavlov] I suppose the server node should clean up PDS or show a correct message. > Unable to restart node with WALMode.NONE > - > > Key: IGNITE-11383 > URL: https://issues.apache.org/jira/browse/IGNITE-11383 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: MemoryRestoreReproducer.java > > > Scenario: > 1. Start single node with persistence without WAL. > 2. Stream data to a cache. > 3. Restart the node. > Result: > Node failed with following exception. > {code:java} > Exception in thread "main" class org.apache.ignite.IgniteException: null > at > org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1059) > at org.apache.ignite.Ignition.start(Ignition.java:324) > at MemoryRestoreReproducer.main(MemoryRestoreReproducer.java:27) > Caused by: class org.apache.ignite.IgniteCheckedException: null > at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1196) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1992) > at > org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1683) > at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1109) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:629) > at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:554) > at org.apache.ignite.Ignition.start(Ignition.java:321) > ... 
1 more > Caused by: java.util.NoSuchElementException > at > org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:39) > at > org.apache.ignite.internal.util.lang.GridIteratorAdapter.next(GridIteratorAdapter.java:35) > at > org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.read(FileWriteAheadLogManager.java:855) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.performBinaryMemoryRestore(GridCacheDatabaseSharedManager.java:2120) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:749) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:4963) > at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1058) > ... 7 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
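The innermost cause in the IGNITE-11383 trace above is a java.util.NoSuchElementException raised from an iterator's next() during the WAL read. One plausible reading, which is an assumption and not something the issue confirms, is that with WALMode.NONE there are no WAL records and next() is called on an empty iterator. The JDK-only sketch below (hypothetical class and method names, no Ignite code) shows that failure mode and a guarded alternative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Optional;

public class EmptyIteratorSketch {
    /** Guarded read: checks hasNext() first and returns an empty Optional when there is nothing to read. */
    static <T> Optional<T> firstOrEmpty(Iterator<T> it) {
        return it.hasNext() ? Optional.of(it.next()) : Optional.empty();
    }

    /** Unguarded read: returns true if calling next() directly throws NoSuchElementException. */
    static boolean unguardedNextThrows(Iterator<?> it) {
        try {
            it.next();
            return false;
        }
        catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Empty source, standing in for a log with no records.
        System.out.println(unguardedNextThrows(Collections.emptyIterator()));
        System.out.println(firstOrEmpty(Collections.<String>emptyIterator()).isPresent());

        // Non-empty source: the guarded read returns the first element.
        System.out.println(firstOrEmpty(Arrays.asList("rec").iterator()).get());
    }
}
```

This matches the suggestion in the comment that the node should either clean up or report a clear message: a guarded read gives the caller a place to produce that message instead of surfacing a bare exception with a null message.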
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Description: The attached reproducer shows the following exception during streaming data to cache: {code:java} [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] class org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905] {code} If the blocked timeout is changed by cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and restarting several nodes the following critical error occurs: {code:java} [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, waitCnt=729] [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] {code} was: The attached reproducer shows the following exception during streaming data to cache: [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] class org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905] If the blocked timeout is changed by cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and restarting several nodes the following critical error occurs: [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, waitCnt=729] [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > {code:java} > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Description: The attached reproducer shows the following exception during streaming data to cache: [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] class org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905] If the blocked timeout is changed by cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and restarting several nodes the following critical error occurs: [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, waitCnt=729] [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] was: The attached reproducer shows the following exception during streaming data to cache: [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] class org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905] If the blocked timeout is changed by cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and restarting several nodes the following critical error occurs: [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, waitCnt=729] [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] Critical system error detected. 
Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler
[jira] [Created] (IGNITE-11383) Unable to restart node with WALMode.NONE
Pavel Vinokurov created IGNITE-11383: Summary: Unable to restart node with WALMode.NONE Key: IGNITE-11383 URL: https://issues.apache.org/jira/browse/IGNITE-11383 Project: Ignite Issue Type: Bug Components: persistence Affects Versions: 2.7 Reporter: Pavel Vinokurov Attachments: MemoryRestoreReproducer.java Scenario: 1. Start single node with persistence without WAL. 2. Stream data to a cache. 3. Restart the node. Result: Node failed with following exception. {code:java} Exception in thread "main" class org.apache.ignite.IgniteException: null at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1059) at org.apache.ignite.Ignition.start(Ignition.java:324) at MemoryRestoreReproducer.main(MemoryRestoreReproducer.java:27) Caused by: class org.apache.ignite.IgniteCheckedException: null at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1196) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1992) at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1683) at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1109) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:629) at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:554) at org.apache.ignite.Ignition.start(Ignition.java:321) ... 
1 more Caused by: java.util.NoSuchElementException at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.nextX(GridCloseableIteratorAdapter.java:39) at org.apache.ignite.internal.util.lang.GridIteratorAdapter.next(GridIteratorAdapter.java:35) at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.read(FileWriteAheadLogManager.java:855) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.performBinaryMemoryRestore(GridCacheDatabaseSharedManager.java:2120) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore(GridCacheDatabaseSharedManager.java:749) at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead(GridCacheDatabaseSharedManager.java:4963) at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1058) ... 7 more {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: CheckpointLockReproducer.java > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: (was: CheckpointLockReproducer.java) > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774351#comment-16774351 ] Pavel Vinokurov edited comment on IGNITE-11378 at 2/21/19 5:49 PM: --- The reproducer has been updated was (Author: pvinokurov): The reproducer was updated > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. 
This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774351#comment-16774351 ] Pavel Vinokurov commented on IGNITE-11378: -- The reproducer was updated > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: (was: CheckpointLockReproducer.java) > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: CheckpointLockReproducer.java > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Affects Version/s: 2.7 > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: CheckpointLockReproducer.java > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: (was: CheckpointLockReproducer.java) > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Reporter: Pavel Vinokurov >Priority: Major > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: CheckpointLockReproducer.java > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11378) Critical system errors on cluster with enabled peristance
[ https://issues.apache.org/jira/browse/IGNITE-11378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-11378: - Attachment: (was: CheckpointLockReproducer.java) > Critical system errors on cluster with enabled peristance > - > > Key: IGNITE-11378 > URL: https://issues.apache.org/jira/browse/IGNITE-11378 > Project: Ignite > Issue Type: Bug > Components: persistence >Reporter: Pavel Vinokurov >Priority: Major > Attachments: CheckpointLockReproducer.java > > > The attached reproducer shows the following exception during streaming data > to cache: > [2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 > 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will > be handled accordingly to configured handler > [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, > igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]] > class org.apache.ignite.IgniteException: GridWorker > [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, > heartbeatTs=1550754912905] > If the blocked timeout is changed by > cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and > restarting several nodes the following critical error occurs: > [2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked > system-critical thread has been detected. This can lead to cluster-wide > undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s] > [2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread > [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, > waitCnt=729] > [2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] > Critical system error detected. 
Will be handled accordingly to configured > handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, > super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], > failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class > o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]] > class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, > igniteInstanceName=client, finished=false, heartbeatTs=1550755410331] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11378) Critical system errors on cluster with enabled peristance
Pavel Vinokurov created IGNITE-11378:

    Summary: Critical system errors on cluster with enabled peristance
    Key: IGNITE-11378
    URL: https://issues.apache.org/jira/browse/IGNITE-11378
    Project: Ignite
    Issue Type: Bug
    Components: persistence
    Reporter: Pavel Vinokurov
    Attachments: CheckpointLockReproducer.java

The attached reproducer shows the following exception during streaming data to cache:

[2019-02-21 16:15:23,202][ERROR][tcp-disco-msg-worker-[100a6976 0:0:0:0:0:0:0:1%lo:47500]-#19%3%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]]]
class org.apache.ignite.IgniteException: GridWorker [name=data-streamer-stripe-5, igniteInstanceName=3, finished=false, heartbeatTs=1550754912905]

If the blocked timeout is changed by cfg.setSystemWorkerBlockedTimeout(30_000), after streaming data and restarting several nodes the following critical error occurs:

[2019-02-21 16:24:07,164][ERROR][grid-timeout-worker-#214%client%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-comm-worker, blockedFor=36s]
[2019-02-21 16:24:07,166][WARN ][grid-timeout-worker-#214%client%][G] Thread [name="tcp-comm-worker-#25%client%", id=482, state=TIMED_WAITING, blockCnt=0, waitCnt=729]
[2019-02-21 16:24:07,168][ERROR][grid-timeout-worker-#214%client%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet []]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]]]
class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker, igniteInstanceName=client, finished=false, heartbeatTs=1550755410331]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
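For readers checking the numbers in the log excerpts above: the heartbeatTs values are epoch milliseconds, so the gap between a worker's last heartbeat and the moment the error was logged can be recomputed by hand. The sketch below does that arithmetic and compares the gap with the 30_000 ms timeout from the description. It is only an illustration, not Ignite's own detection code. The class and method names (BlockedWorkerCheck, blockedForMillis, exceedsTimeout, toEpochMillis) are hypothetical, and the UTC+3 offset for the log timestamps is an assumption that the report does not state; it was chosen because it reproduces the reported blockedFor=36s.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class BlockedWorkerCheck {
    /** Converts a local log timestamp to epoch milliseconds using an ASSUMED zone offset. */
    public static long toEpochMillis(LocalDateTime logTime, ZoneOffset assumedOffset) {
        return logTime.toInstant(assumedOffset).toEpochMilli();
    }

    /** Milliseconds elapsed between the worker's last heartbeat and the check. */
    public static long blockedForMillis(long heartbeatTs, long checkTs) {
        return checkTs - heartbeatTs;
    }

    /** True when the gap since the last heartbeat is larger than the configured timeout. */
    public static boolean exceedsTimeout(long heartbeatTs, long checkTs, long timeoutMs) {
        return blockedForMillis(heartbeatTs, checkTs) > timeoutMs;
    }

    public static void main(String[] args) {
        // Assumption: the log timestamps are local time at UTC+3 (not stated in the report).
        ZoneOffset assumedOffset = ZoneOffset.ofHours(3);

        // Second excerpt: tcp-comm-worker, logged at 2019-02-21 16:24:07,164.
        long heartbeatTs = 1550755410331L;
        long checkTs = toEpochMillis(
            LocalDateTime.of(2019, 2, 21, 16, 24, 7, 164_000_000), assumedOffset);

        // Timeout from the description: cfg.setSystemWorkerBlockedTimeout(30_000).
        long timeoutMs = 30_000L;

        long blockedMs = blockedForMillis(heartbeatTs, checkTs);
        System.out.println("blockedFor=" + (blockedMs / 1000) + "s"
            + " exceedsTimeout=" + exceedsTimeout(heartbeatTs, checkTs, timeoutMs));
    }
}
```

Under the same assumption, the first excerpt (heartbeatTs=1550754912905, logged at 16:15:23,202) gives a gap of 10297 ms for data-streamer-stripe-5.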
[jira] [Updated] (IGNITE-10873) CorruptedTreeException during simultaneous cache put operations
[ https://issues.apache.org/jira/browse/IGNITE-10873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Vinokurov updated IGNITE-10873: - Component/s: sql > CorruptedTreeException during simultaneous cache put operations > --- > > Key: IGNITE-10873 > URL: https://issues.apache.org/jira/browse/IGNITE-10873 > Project: Ignite > Issue Type: Bug > Components: cache, persistence, sql >Affects Versions: 2.7 >Reporter: Pavel Vinokurov >Priority: Critical > > [2019-01-09 20:47:04,376][ERROR][pool-9-thread-9][GridDhtAtomicCache] > Unexpected exception during cache update > org.h2.message.DbException: General error: "class > org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: > Runtime failure on row: Row@780acfb4[ key: .. ][ GTEST, null, 254, null, > null, null, null, 0, null, null, null, null, null, null, null, 0, 0, 0, null, > 0, 0, 0, 0, 0, 0, 0, null, 0, 0, null, 0, null, 0, null, 0, null, null, null, > 0, 0, 0, 0, 0, 0, null, null, null, null, null, null, null, 0.0, 0, 0.0, 0, > 0.0, 0, null, 0, 0, 0, 0, null, null, null, null, null, null, null, null, > null, null, null, null, null, null, null, null ]" [5-197] > at org.h2.message.DbException.get(DbException.java:168) > at org.h2.message.DbException.convert(DbException.java:307) > at > org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:302) > at > org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.addToIndex(GridH2Table.java:546) > at > org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:479) > at > org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:768) > at > org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1905) > at > org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:404) > at > 
org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:2633) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1646) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1621) > at > org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1935) > at > org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:428) > at > org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2295) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2494) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1951) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1780) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443) > at > 
org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1153) > at > org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:611) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449) > at > org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2426) > at > org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1105) > at >
[jira] [Updated] (IGNITE-10873) CorruptedTreeException during simultaneous cache put operations
[ https://issues.apache.org/jira/browse/IGNITE-10873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Vinokurov updated IGNITE-10873:
    Description:
[2019-01-09 20:47:04,376][ERROR][pool-9-thread-9][GridDhtAtomicCache] Unexpected exception during cache update
org.h2.message.DbException: General error: "class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on row: Row@780acfb4[ key: .. ][ GTEST, null, 254, null, null, null, null, 0, null, null, null, null, null, null, null, 0, 0, 0, null, 0, 0, 0, 0, 0, 0, 0, null, 0, 0, null, 0, null, 0, null, 0, null, null, null, 0, 0, 0, 0, 0, 0, null, null, null, null, null, null, null, 0.0, 0, 0.0, 0, 0.0, 0, null, 0, 0, 0, 0, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null ]" [5-197]
    at org.h2.message.DbException.get(DbException.java:168)
    at org.h2.message.DbException.convert(DbException.java:307)
    at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:302)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.addToIndex(GridH2Table.java:546)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:479)
    at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:768)
    at org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1905)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:404)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:2633)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1646)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1621)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1935)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:428)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2295)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2494)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1951)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1780)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1153)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:611)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2449)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2426)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1105)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:820)
    at IndexCorruptionReproducer$1.run(IndexCorruptionReproducer.java:43)
    at java.util.concurrent.Executors$RunnableAdapter.call$$$capture(Executors.java:511)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java)
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
    at java.util.concurrent.FutureTask.run(FutureTask.java)
    at
[jira] [Created] (IGNITE-10873) CorruptedTreeException during simultaneous cache put operations
Pavel Vinokurov created IGNITE-10873:

             Summary: CorruptedTreeException during simultaneous cache put operations
                 Key: IGNITE-10873
                 URL: https://issues.apache.org/jira/browse/IGNITE-10873
             Project: Ignite
          Issue Type: Bug
          Components: cache, persistence
    Affects Versions: 2.7
            Reporter: Pavel Vinokurov

[2019-01-09 20:47:04,376][ERROR][pool-9-thread-9][GridDhtAtomicCache] Unexpected exception during cache update
org.h2.message.DbException: General error: "class org.apache.ignite.internal.processors.cache.persistence.tree.CorruptedTreeException: Runtime failure on row: Row@780acfb4[ key: model.TbclsrInputKey [idHash=1823856383, hash=275143246, clsbInputRef=GTEST, firstInputFlag=254], val: model.TbclsrInput [idHash=708235920, hash=-19147671, clsbMatchRef=null, origBic=null, desStlmtMbrBic=null, cpBic=null, cpDesSmBic=null, desSmManuAuth=0, origRef=null, relatedRef=null, commonRef=null, clsbTransRef=null, lastAmdSendRef=null, branchId=null, inputType=null, formatType=0, sourceType=0, sourceId=0, operType=null, fwdBookFlag=0, possDupFlag=0, sameDayFlag=0, pendingFlag=0, rescOrigSmFlag=0, rescCpCpsmFlag=0, stlmtEligFlag=0, authTms=null, ntfId=0, inputStatus=0, lastActionTms=null, ofacStatus=0, ofacTms=null, prevInputStatus=0, prevTms=null, cpOfacStatus=0, sentDt=null, valueDt=null, tradeDt=null, origSuspFlag=0, origSmSuspFlag=0, cpSuspFlag=0, cpSmSuspFlag=0, currSuspFlag=0, tpIndicatorFlag=0, tpBic=null, tpReference=null, tpFreeText=null, tpFurtherRef=null, tpCustIntRef=null, tpMbrField1=null, tpMbrField2=null, exchRate=0.0, currIdBuy=0, volBuy=0.0, currIdSell=0, volSell=0.0, inputVersionId=0, versionId=null, grpQueueOrderNo=0, queueOrderNo=0, originalGroupId=0, groupStatus=0, usi=null, prevUsi=null, origLei=null, cpLei=null, fundLei=null, reportJuris=null, execVenue=null, execTms=null, execTmsUtcoff=null, mappingRule=null, reportJuris2=null, usi2=null, prevUsi2=null, reportJuris3=null, usi3=null, prevUsi3=null], ver: GridCacheVersion [topVer=158536014, order=1547056011256, nodeOrder=1] ][ GTEST, null, 254, null, null, null, null, 0, null, null, null, null, null, null, null, 0, 0, 0, null, 0, 0, 0, 0, 0, 0, 0, null, 0, 0, null, 0, null, 0, null, 0, null, null, null, 0, 0, 0, 0, 0, 0, null, null, null, null, null, null, null, 0.0, 0, 0.0, 0, 0.0, 0, null, 0, 0, 0, 0, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null ]" [5-197]
    at org.h2.message.DbException.get(DbException.java:168)
    at org.h2.message.DbException.convert(DbException.java:307)
    at org.apache.ignite.internal.processors.query.h2.database.H2TreeIndex.putx(H2TreeIndex.java:302)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.addToIndex(GridH2Table.java:546)
    at org.apache.ignite.internal.processors.query.h2.opt.GridH2Table.update(GridH2Table.java:479)
    at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.store(IgniteH2Indexing.java:768)
    at org.apache.ignite.internal.processors.query.GridQueryProcessor.store(GridQueryProcessor.java:1905)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager.store(GridCacheQueryManager.java:404)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.finishUpdate(IgniteCacheOffheapManagerImpl.java:2633)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1646)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1621)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.invoke(GridCacheOffheapManager.java:1935)
    at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:428)
    at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2295)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2494)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:1951)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1780)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
[jira] [Created] (IGNITE-10524) IgniteCache.iterator() from a client node leads to OOM
Pavel Vinokurov created IGNITE-10524:

             Summary: IgniteCache.iterator() from a client node leads to OOM
                 Key: IGNITE-10524
                 URL: https://issues.apache.org/jira/browse/IGNITE-10524
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 2.4
            Reporter: Pavel Vinokurov

It looks like the "iterator()" method performs a scan query and loads all cache rows into the heap.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
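To make the suspected behaviour concrete, here is a minimal sketch in plain Java that contrasts an iterator that materializes every row up front with one that holds a single page at a time. It does not use or reproduce Ignite's implementation: `PagedScanSketch`, `fetchPage`, `eagerIterator`, and `PagedIterator` are hypothetical names, and the "server" is simulated with a range of integers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only; none of these names are Ignite APIs. */
final class PagedScanSketch {
    /** Simulates fetching rows [offset, offset + pageSize) from a remote node. */
    static List<Integer> fetchPage(int totalRows, int offset, int pageSize) {
        List<Integer> page = new ArrayList<>();
        int end = Math.min(totalRows, offset + pageSize);
        for (int i = offset; i < end; i++) {
            page.add(i);
        }
        return page;
    }

    /** Eager variant: all rows sit on the client heap before iteration starts. */
    static Iterator<Integer> eagerIterator(int totalRows) {
        return fetchPage(totalRows, 0, totalRows).iterator();
    }

    /** Lazy variant: only the current page is referenced at any time. */
    static final class PagedIterator implements Iterator<Integer> {
        private final int totalRows;
        private final int pageSize;
        private int nextOffset;
        private Iterator<Integer> current = Collections.emptyIterator();
        /** Size of the largest page fetched so far. */
        int maxRowsHeld;

        PagedIterator(int totalRows, int pageSize) {
            if (pageSize <= 0) {
                throw new IllegalArgumentException("pageSize must be positive");
            }
            this.totalRows = totalRows;
            this.pageSize = pageSize;
        }

        @Override
        public boolean hasNext() {
            // Fetch the next page only when the current one is exhausted.
            if (!current.hasNext() && nextOffset < totalRows) {
                List<Integer> page = fetchPage(totalRows, nextOffset, pageSize);
                nextOffset += page.size();
                maxRowsHeld = Math.max(maxRowsHeld, page.size());
                current = page.iterator();
            }
            return current.hasNext();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    /** Consumes an iterator and returns the sum of its elements. */
    static long sum(Iterator<Integer> it) {
        long total = 0;
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        PagedIterator paged = new PagedIterator(1000, 100);
        System.out.println("paged sum = " + sum(paged) + ", max rows held = " + paged.maxRowsHeld);
        System.out.println("eager sum = " + sum(eagerIterator(1000)));
    }
}
```

Both variants visit the same rows; the difference is only in how many are referenced at once, which is the property the report suggests is violated on the client node.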
[jira] [Created] (IGNITE-10291) Unable to find row by index created on partial baseline topology
Pavel Vinokurov created IGNITE-10291:

             Summary: Unable to find row by index created on partial baseline topology
                 Key: IGNITE-10291
                 URL: https://issues.apache.org/jira/browse/IGNITE-10291
             Project: Ignite
          Issue Type: Bug
          Components: cache, sql
    Affects Versions: 2.6, 2.5, 2.4
            Reporter: Pavel Vinokurov
         Attachments: Reproducer.java

Steps to reproduce:
1. Start a two-node cluster with persistence enabled.
2. Create the table:
CREATE TABLE PERSON (
FIRST_NAME VARCHAR,
LAST_NAME VARCHAR,
ADDRESS VARCHAR,
LANG VARCHAR,
BIRTH_DATE TIMESTAMP,
CONSTRAINT PK_PESON PRIMARY KEY (FIRST_NAME,LAST_NAME,ADDRESS,LANG)
) WITH "key_type=PersonKeyType, CACHE_NAME=PersonCache, value_type=PersonValueType, AFFINITY_KEY=FIRST_NAME,template=PARTITIONED,backups=1"
Insert 1000 rows.
3. Stop the second node.
4. Create the index:
create index PERSON_FIRST_NAME_IDX on PERSON(FIRST_NAME)
5. Start the second node.
6. Perform a select query for each row:
select * from PERSON use index(PERSON_FIRST_NAME_IDX) where FIRST_NAME=? and LAST_NAME=? and ADDRESS=? and LANG = ?

Result:
The select returns no row in about half of the cases.

The reproducer is attached.
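The check in step 6 could be sketched with plain JDBC roughly as follows. This is not the attached Reproducer.java, whose contents are not shown here; the class `IndexLookupCheck` and the method `countMissing` are hypothetical, and how the `Connection` to the cluster is obtained is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical helper for step 6; not taken from the attached reproducer. */
final class IndexLookupCheck {
    /** The lookup from step 6, forcing the index created in step 4. */
    static final String LOOKUP_SQL =
        "select * from PERSON use index(PERSON_FIRST_NAME_IDX) "
            + "where FIRST_NAME=? and LAST_NAME=? and ADDRESS=? and LANG = ?";

    /**
     * Runs the indexed lookup once per key and counts the keys for which
     * no row comes back. Each key is {FIRST_NAME, LAST_NAME, ADDRESS, LANG}.
     */
    static int countMissing(Connection conn, List<String[]> keys) throws SQLException {
        int missing = 0;
        try (PreparedStatement ps = conn.prepareStatement(LOOKUP_SQL)) {
            for (String[] key : keys) {
                // JDBC parameters are 1-based.
                for (int i = 0; i < 4; i++) {
                    ps.setString(i + 1, key[i]);
                }
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        missing++;
                    }
                }
            }
        }
        return missing;
    }
}
```

With the 1000 rows inserted in step 2 passed as `keys`, a non-zero return value would correspond to the failure described under "Result".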
[jira] [Updated] (IGNITE-10110) SQL query with DISTINCT and JOIN in suquery produces "Column not found"
[ https://issues.apache.org/jira/browse/IGNITE-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Vinokurov updated IGNITE-10110:
    Description:
Initial script:
CREATE TABLE Person(
person_id INTEGER PRIMARY KEY,
company_id INTEGER,
last_name VARCHAR(100)
);
CREATE TABLE Company(
company_id INTEGER PRIMARY KEY,
location_id INTEGER
);
CREATE TABLE Department(
department_id INTEGER PRIMARY KEY,
person_id INTEGER
);
CREATE TABLE Organization(
organization_id INTEGER PRIMARY KEY,
company_id INTEGER
);

Query:
{code:java}
SELECT
last_name
FROM
( SELECT
last_name,
person_id,
company_id
FROM
( SELECT
last_name,
person_id,
p.company_id as company_id
FROM
Person p
INNER JOIN
(
SELECT
DISTINCT location_id,
company_id
FROM
Company
WHERE
location_id = 1
) cpy
ON (
p.company_id = cpy.company_id
)
) a
) src
INNER JOIN
department dep
ON src.person_id = dep.person_id
LEFT JOIN
organization og
ON src.company_id = og.company_id
{code}

Result:
Caused by: org.h2.jdbc.JdbcSQLException: Column "SRC__Z4.COMPANY_ID" not found; SQL statement:
SELECT
DEP__Z5.PERSON_ID __C2_0
FROM PUBLIC.DEPARTMENT DEP__Z5
LEFT OUTER JOIN PUBLIC.ORGANIZATION OG__Z6
ON SRC__Z4.COMPANY_ID = OG__Z6.COMPANY_ID

  was:
Initial script:
CREATE TABLE Person(
person_id INTEGER PRIMARY KEY,
company_id INTEGER,
last_name VARCHAR(100)
);
CREATE TABLE Company(
company_id INTEGER PRIMARY KEY,
location_id INTEGER
);
CREATE TABLE Department(
department_id INTEGER PRIMARY KEY,
person_id INTEGER
);
CREATE TABLE Organization(
organization_id INTEGER PRIMARY KEY,
company_id INTEGER
);

Query:
SELECT
last_name
FROM
( SELECT
last_name,
person_id,
company_id
FROM
( SELECT
last_name,
person_id,
p.company_id as company_id
FROM
Person p
INNER JOIN
(
SELECT
DISTINCT location_id,
company_id
FROM
Company
WHERE
location_id = 1
) cpy
ON (
p.company_id = cpy.company_id
)
) a
) src
INNER JOIN
department dep
ON src.person_id = dep.person_id
LEFT JOIN
organization og
ON src.company_id = og.company_id

Result:
Caused by: org.h2.jdbc.JdbcSQLException: Column "SRC__Z4.COMPANY_ID" not found; SQL statement:
SELECT
DEP__Z5.PERSON_ID __C2_0
FROM PUBLIC.DEPARTMENT DEP__Z5
LEFT OUTER JOIN PUBLIC.ORGANIZATION OG__Z6
ON SRC__Z4.COMPANY_ID = OG__Z6.COMPANY_ID

> SQL query with DISTINCT and JOIN in suquery produces "Column not found"
>
>                 Key: IGNITE-10110
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10110
>             Project: Ignite
>          Issue Type: Bug
>          Components: sql
>    Affects Versions: 2.4
>            Reporter: Pavel Vinokurov
>            Priority: Major
>              Labels: sql
>
> Initial script:
> CREATE TABLE Person(
> person_id INTEGER PRIMARY KEY,
> company_id INTEGER,
> last_name VARCHAR(100)
> );
> CREATE TABLE Company(
> company_id INTEGER PRIMARY KEY,
> location_id INTEGER
> );
> CREATE TABLE Department(
> department_id INTEGER PRIMARY KEY,
> person_id INTEGER
> );
> CREATE TABLE Organization(
> organization_id INTEGER PRIMARY KEY,
> company_id INTEGER
> );
> Query:
> {code:java}
> SELECT
> last_name
> FROM
> ( SELECT
> last_name,
> person_id,
> company_id
> FROM
> ( SELECT
> last_name,
> person_id,
> p.company_id as company_id
> FROM
> Person p
> INNER JOIN
> (
> SELECT
> DISTINCT location_id,
> company_id
> FROM
> Company
> WHERE
> location_id = 1
> ) cpy
> ON (
> p.company_id = cpy.company_id
> )
> ) a
> ) src
> INNER JOIN
> department dep
> ON src.person_id = dep.person_id
> LEFT JOIN
> organization og
> ON src.company_id = og.company_id
> {code}
> Result:
> Caused by: org.h2.jdbc.JdbcSQLException: Column "SRC__Z4.COMPANY_ID" not
> found; SQL statement:
> SELECT
> DEP__Z5.PERSON_ID __C2_0
> FROM PUBLIC.DEPARTMENT DEP__Z5
> LEFT OUTER JOIN PUBLIC.ORGANIZATION OG__Z6
> ON SRC__Z4.COMPANY_ID = OG__Z6.COMPANY_ID
[jira] [Comment Edited] (IGNITE-10110) SQL query with DISTINCT and JOIN in suquery produces "Column not found"
[ https://issues.apache.org/jira/browse/IGNITE-10110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16671457#comment-16671457 ]

Pavel Vinokurov edited comment on IGNITE-10110 at 11/1/18 10:48 AM:

{code:java}
SELECT last_name
FROM ( SELECT DISTINCT last_name, person_id, company_id FROM Person ) src
INNER JOIN department dep ON src.person_id = dep.person_id
LEFT JOIN organization og ON src.company_id = og.company_id
{code}

was (Author: pvinokurov):
Simplified query:
SELECT last_name
FROM ( SELECT DISTINCT last_name, person_id, company_id FROM Person ) src
INNER JOIN department dep ON src.person_id = dep.person_id
LEFT JOIN organization og ON src.company_id = og.company_id

> SQL query with DISTINCT and JOIN in suquery produces "Column not found"
>
>                 Key: IGNITE-10110
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10110
>             Project: Ignite
>          Issue Type: Bug
>          Components: sql
>    Affects Versions: 2.4
>            Reporter: Pavel Vinokurov
>            Priority: Major
>              Labels: sql
>
> Initial script:
> CREATE TABLE Person(
> person_id INTEGER PRIMARY KEY,
> company_id INTEGER,
> last_name VARCHAR(100)
> );
> CREATE TABLE Company(
> company_id INTEGER PRIMARY KEY,
> location_id INTEGER
> );
> CREATE TABLE Department(
> department_id INTEGER PRIMARY KEY,
> person_id INTEGER
> );
> CREATE TABLE Organization(
> organization_id INTEGER PRIMARY KEY,
> company_id INTEGER
> );
> Query:
> SELECT
> last_name
> FROM
> ( SELECT
> last_name,
> person_id,
> company_id
> FROM
> ( SELECT
> last_name,
> person_id,
> p.company_id as company_id
> FROM
> Person p
> INNER JOIN
> (
> SELECT
> DISTINCT location_id,
> company_id
> FROM
> Company
> WHERE
> location_id = 1
> ) cpy
> ON (
> p.company_id = cpy.company_id
> )
> ) a
> ) src
> INNER JOIN
> department dep
> ON src.person_id = dep.person_id
> LEFT JOIN
> organization og
> ON src.company_id = og.company_id
> Result:
> Caused by: org.h2.jdbc.JdbcSQLException: Column "SRC__Z4.COMPANY_ID" not
> found; SQL statement:
> SELECT
> DEP__Z5.PERSON_ID __C2_0
> FROM PUBLIC.DEPARTMENT DEP__Z5
> LEFT OUTER JOIN PUBLIC.ORGANIZATION OG__Z6
> ON SRC__Z4.COMPANY_ID = OG__Z6.COMPANY_ID