[jira] [Commented] (IGNITE-8827) Disable WAL during apply updates on recovery
[ https://issues.apache.org/jira/browse/IGNITE-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536500#comment-16536500 ]

kcheng.mvp commented on IGNITE-8827:

[~DmitriyGovorukhin] [~agura] [~ilantukh] It seems the check-in caused a Javadoc error. I have created a ticket, https://issues.apache.org/jira/browse/IGNITE-8956, for this and raised a PR with the fix. Please help me with a code review.

> Disable WAL during apply updates on recovery
>
> Key: IGNITE-8827
> URL: https://issues.apache.org/jira/browse/IGNITE-8827
> Project: Ignite
> Issue Type: Bug
> Reporter: Dmitriy Govorukhin
> Assignee: Dmitriy Govorukhin
> Priority: Major
> Fix For: 2.7

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-8376) Add cluster (de)activation events
[ https://issues.apache.org/jira/browse/IGNITE-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536501#comment-16536501 ]

kcheng.mvp commented on IGNITE-8376:

Thank you all very much for your comments. I am working on it now; once it is ready, I will raise a PR.

> Add cluster (de)activation events
>
> Key: IGNITE-8376
> URL: https://issues.apache.org/jira/browse/IGNITE-8376
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexey Goncharuk
> Assignee: kcheng.mvp
> Priority: Major
> Labels: newbie
> Fix For: 2.7
>
> Currently, we do not have any way to detect that a cluster got activated,
> which results in busy-loops polling {{cluster().active()}}.
> I suggest adding new events, {{EVT_CLUSTER_ACTIVATED}},
> {{EVT_CLUSTER_DEACTIVATED}}, and {{EVT_CLUSTER_ACTIVATION_FAILED}}, which will
> be fired when the corresponding steps are completed. The event should contain,
> if possible, information about the activation source (public API or
> auto-activation) and the topology version on which activation was performed.
> The fail event should contain information about the cause of the failure. If
> needed, a new class for this event should be introduced.
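To illustrate the difference between polling and the event-driven approach the ticket asks for, here is a minimal JDK-only sketch. Every type in it (`ClusterEventType`, `ActivationSource`, `ClusterStateEvent`, `ClusterStateNotifier`, `ActivationWaiter`) is a hypothetical stand-in invented for this example, not Ignite API; the ticket itself only proposes the event names and the payload (source, topology version, failure cause).

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

/** Hypothetical event kinds mirroring the three events proposed in the ticket. */
enum ClusterEventType { ACTIVATED, DEACTIVATED, ACTIVATION_FAILED }

/** Hypothetical activation source: explicit public API call or auto-activation. */
enum ActivationSource { PUBLIC_API, AUTO_ACTIVATION }

/** Hypothetical event payload; field names are illustrative only. */
final class ClusterStateEvent {
    final ClusterEventType type;
    final ActivationSource source;
    final long topologyVersion;
    final Throwable failureCause; // Expected to be non-null only for ACTIVATION_FAILED.

    ClusterStateEvent(ClusterEventType type, ActivationSource source, long topologyVersion, Throwable failureCause) {
        this.type = type;
        this.source = source;
        this.topologyVersion = topologyVersion;
        this.failureCause = failureCause;
    }
}

/** Stand-in for whatever component would fire the events once a step completes. */
final class ClusterStateNotifier {
    private final List<Consumer<ClusterStateEvent>> listeners = new CopyOnWriteArrayList<>();

    void listen(Consumer<ClusterStateEvent> listener) {
        listeners.add(listener);
    }

    void fire(ClusterStateEvent event) {
        for (Consumer<ClusterStateEvent> listener : listeners)
            listener.accept(event);
    }
}

final class ActivationWaiter {
    /** Registers a listener and returns a latch that opens on the first ACTIVATED event. */
    static CountDownLatch activationLatch(ClusterStateNotifier notifier) {
        CountDownLatch latch = new CountDownLatch(1);

        notifier.listen(event -> {
            if (event.type == ClusterEventType.ACTIVATED)
                latch.countDown();
        });

        return latch;
    }
}
```

A caller would block on `latch.await(timeout, unit)` instead of spinning on an "is active" check, so the waiting thread consumes no CPU until the event arrives.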
[jira] [Commented] (IGNITE-8956) Javadoc build failure: Warnings:core/src/main/java/org/apache/ignite/internal/processors/cache/WalStateManager.java:1271: warning - @inheritDoc used but check() does n
[ https://issues.apache.org/jira/browse/IGNITE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536488#comment-16536488 ]

kcheng.mvp commented on IGNITE-8956:

[~ilantukh] Please help with the code review; here is the PR: https://github.com/apache/ignite/pull/4328

> Javadoc build failure:
> Warnings:core/src/main/java/org/apache/ignite/internal/processors/cache/WalStateManager.java:1271:
> warning - @inheritDoc used but check() does not override or implement any
> method.
>
> Key: IGNITE-8956
> URL: https://issues.apache.org/jira/browse/IGNITE-8956
> Project: Ignite
> Issue Type: Bug
> Reporter: kcheng.mvp
> Assignee: kcheng.mvp
> Priority: Minor
>
> I checked the history; it seems to be caused by the check-in for
> https://issues.apache.org/jira/browse/IGNITE-8827
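For readers unfamiliar with this warning, the sketch below shows when `{@inheritDoc}` is valid and when it is not. The `Checker` and `NonNegativeChecker` types are hypothetical and unrelated to `WalStateManager`; the actual change in the PR may differ. The usual remedy, assumed here, is to give a method that overrides nothing its own description.

```java
/** Hypothetical contract whose documentation implementers may inherit. */
interface Checker {
    /**
     * Checks whether the given value is acceptable.
     *
     * @param val Value to check.
     * @return {@code true} if the value is acceptable.
     */
    boolean check(int val);
}

/** Hypothetical implementation used only to illustrate the Javadoc rule. */
final class NonNegativeChecker implements Checker {
    /** {@inheritDoc} */
    @Override public boolean check(int val) {
        // Valid use: this method implements Checker.check, so there is documentation to inherit.
        return val >= 0;
    }

    /**
     * Checks every value with a fresh checker.
     *
     * @param vals Values to check.
     * @return {@code true} if all values are non-negative.
     */
    static boolean checkAll(int... vals) {
        // This method overrides nothing, so it documents itself; using the
        // inheritDoc tag here would produce the warning quoted in the ticket.
        NonNegativeChecker checker = new NonNegativeChecker();

        for (int val : vals) {
            if (!checker.check(val))
                return false;
        }

        return true;
    }
}
```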
[jira] [Commented] (IGNITE-8776) Eviction policy MBeans are never registered if evictionPolicyFactory is used
[ https://issues.apache.org/jira/browse/IGNITE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536486#comment-16536486 ]

kcheng.mvp commented on IGNITE-8776:

[~slukyanov] I made the necessary changes; please review it again. Thanks.

> Eviction policy MBeans are never registered if evictionPolicyFactory is used
>
> Key: IGNITE-8776
> URL: https://issues.apache.org/jira/browse/IGNITE-8776
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.5
> Reporter: Stanislav Lukyanov
> Assignee: kcheng.mvp
> Priority: Minor
> Labels: newbie
> Fix For: 2.6
>
> Eviction policy MBeans, such as LruEvictionPolicyMBean, are never registered
> if evictionPolicyFactory is set instead of evictionPolicy (the latter is
> deprecated).
> This happens because GridCacheProcessor::registerMbean attempts to find
> either an *MBean interface or the IgniteMBeanAware interface on the passed
> object. It works for LruEvictionPolicy but not for LruEvictionPolicyFactory
> (which doesn't implement these interfaces).
> The code needs to be adjusted to handle factories correctly.
> New tests are needed to make sure that all standard beans are registered
> (IgniteKernalMbeansTest does that for kernal mbeans; the same is needed for
> cache beans).
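The lookup failure described above can be reproduced with a small JDK-only sketch. All names here (`LruPolicyMBean`, `LruPolicy`, `LruPolicyFactory`, `MBeanLookup`) are hypothetical stand-ins, and `java.util.function.Supplier` stands in for the factory type; this is not the code of GridCacheProcessor::registerMbean, only an illustration of why an interface search on the factory object finds nothing and one possible way to handle it.

```java
import java.util.function.Supplier;

/** Hypothetical management interface, named with the conventional "MBean" suffix. */
interface LruPolicyMBean {
    int getMaxSize();
}

/** Hypothetical policy that exposes itself through the MBean interface. */
final class LruPolicy implements LruPolicyMBean {
    private final int maxSize;

    LruPolicy(int maxSize) {
        this.maxSize = maxSize;
    }

    @Override public int getMaxSize() {
        return maxSize;
    }
}

/** Hypothetical factory: it creates policies but implements no MBean interface itself. */
final class LruPolicyFactory implements Supplier<LruPolicy> {
    private final int maxSize;

    LruPolicyFactory(int maxSize) {
        this.maxSize = maxSize;
    }

    @Override public LruPolicy get() {
        return new LruPolicy(maxSize);
    }
}

final class MBeanLookup {
    /** Returns the first implemented interface whose name ends with "MBean", or null if none. */
    static Class<?> findMBeanInterface(Object obj) {
        for (Class<?> cls = obj.getClass(); cls != null; cls = cls.getSuperclass()) {
            for (Class<?> itf : cls.getInterfaces()) {
                if (itf.getSimpleName().endsWith("MBean"))
                    return itf;
            }
        }

        return null;
    }

    /** Same lookup, but a factory is unwrapped first so the created object is inspected. */
    static Class<?> findMBeanInterfaceUnwrapping(Object objOrFactory) {
        Object target = objOrFactory instanceof Supplier
            ? ((Supplier<?>)objOrFactory).get()
            : objOrFactory;

        return findMBeanInterface(target);
    }
}
```

In a real fix the registered bean would presumably need to be the same policy instance the cache actually uses, rather than a fresh one created only for the lookup as in this sketch.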
[jira] [Commented] (IGNITE-8828) Detecting and stopping unresponsive nodes during Partition Map Exchange
[ https://issues.apache.org/jira/browse/IGNITE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536387#comment-16536387 ]

ASF GitHub Bot commented on IGNITE-8828:

GitHub user ilantukh opened a pull request:

    https://github.com/apache/ignite/pull/4330

    IGNITE-8828

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-8828

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/4330.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4330

commit bcc5d2ce5c9272ccc1e1dfca8554dba5cef79055
Author: Ilya Lantukh
Date: 2018-06-27T14:38:22Z

    IGNITE-8828 : Soft/Hard timeout - draft.

commit 7e42e990e3ebd9dde23e9ca296351b8b38a7fa7b
Author: Ilya Lantukh
Date: 2018-06-29T13:01:10Z

    IGNITE-8828 : Node failure - draft.

commit edf1d410b57dc9583c417369b07e4a5923032d43
Author: Ilya Lantukh
Date: 2018-07-02T12:58:29Z

    IGNITE-8828 : Tests.

commit 5dd15dc8277f375ef6bd29e3638215e300222675
Author: Ilya Lantukh
Date: 2018-07-03T13:54:10Z

    IGNITE-8828 : Exchange state check messages.

commit 4e464b8442ce1a942ecb99fce4f2dbad1741f319
Author: Ilya Lantukh
Date: 2018-07-06T19:09:33Z

    IGNITE-8828 : Finalization.

commit 2ca90377fe729f10c1b08427801f048d42b4020b
Author: Ilya Lantukh
Date: 2018-07-08T18:52:00Z

    IGNITE-8828 : Reverted soft-hard timeout implementation.

commit 7510867e883852738737b4348204711c41f67a9c
Author: Ilya Lantukh
Date: 2018-07-08T19:04:42Z

    IGNITE-8828 : Cosmetic changes.

> Detecting and stopping unresponsive nodes during Partition Map Exchange
>
> Key: IGNITE-8828
> URL: https://issues.apache.org/jira/browse/IGNITE-8828
> Project: Ignite
> Issue Type: Improvement
> Components: general
> Reporter: Sergey Chugunov
> Assignee: Ilya Lantukh
> Priority: Major
> Labels: iep-25
> Original Estimate: 264h
> Remaining Estimate: 264h
>
> During the PME process, the coordinator (1) gathers local partition maps from
> all nodes and (2) sends the calculated full partition map back to all nodes in
> the topology.
> However, if one or more nodes fail to send their local information at step 1
> for any reason, the PME process hangs, blocking all operations. Currently the
> only solution is to manually identify and stop the nodes that failed to send
> info to the coordinator.
> This should be done by the coordinator itself: if it does not receive local
> partition maps from some nodes in time, it should check that stopping these
> nodes won't lead to data loss and then stop them forcibly.
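The decision the coordinator is asked to make can be sketched as a pure function. This is a simplified illustration under stated assumptions, not the logic of the pull request: the class `UnresponsiveNodeDetector`, the use of string node IDs, the owner map, and the all-or-nothing policy (stop every silent node, or none if any partition would lose all its owners) are all invented for this example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class UnresponsiveNodeDetector {
    /**
     * Decides which nodes a coordinator could stop after the wait for local partition maps timed out.
     *
     * @param allNodes IDs of all nodes in the topology.
     * @param responded IDs of nodes whose local partition maps arrived in time.
     * @param owners Hypothetical map from partition ID to the IDs of nodes owning a copy of it.
     * @return The silent nodes, or an empty set if stopping them would lose a partition.
     */
    static Set<String> nodesToStop(Set<String> allNodes, Set<String> responded, Map<Integer, Set<String>> owners) {
        // Nodes that did not send their local partition map in time.
        Set<String> silent = new TreeSet<>(allNodes);
        silent.removeAll(responded);

        for (Set<String> partitionOwners : owners.values()) {
            // If every owner of a partition is silent, stopping them would lose that partition.
            if (!partitionOwners.isEmpty() && silent.containsAll(partitionOwners))
                return Collections.emptySet();
        }

        return silent;
    }
}
```

With nodes A, B and C, where only A and B responded, C is returned as long as every partition C owns also has a copy on A or B; if some partition lives only on C, the method returns an empty set and the situation is left for manual resolution.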
[jira] [Updated] (IGNITE-8942) In some cases grid cannot be deactivated because of hanging CQ internal cleanup.
[ https://issues.apache.org/jira/browse/IGNITE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexei Scherbakov updated IGNITE-8942:

Description:
See the attachment for thread dump.

Most probably caused by blocking of message worker while waiting for cluster state change:

{noformat}
"tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%" #380 daemon prio=10 os_prio=0 tid=0x7fe084c4c000 nid=0x39aa waiting on condition [0x7fdcd76f5000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
        at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:193)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
        at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForOperation(CacheMetricsImpl.java:715)
        at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForReading(CacheMetricsImpl.java:724)
        at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:334)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3255)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1098)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMetricsUpdateMessage(ServerImpl.java:5141)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2794)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2570)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:6903)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2657)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:6847)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{noformat}

Another problem: org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor#onDeActivate is called during exchange before transactions have completed, so CQ updates for current transactions may be lost.

was:
See the attachment for thread dump.

Most probably caused by blocking of message worker while waiting for cluster state change:

{noformat}
"tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%" #380 daemon prio=10 os_prio=0 tid=0x7fe084c4c000 nid=0x39aa waiting on condition [0x7fdcd76f5000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
        at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:193)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
        at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForOperation(CacheMetricsImpl.java:715)
        at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForReading(CacheMetricsImpl.java:724)
        at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:334)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3255)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1098)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMetricsUpdateMessage(ServerImpl.java:5141)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2794)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2570)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:6903)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2657)
        at