[jira] [Commented] (IGNITE-8827) Disable WAL during apply updates on recovery

2018-07-08 Thread kcheng.mvp (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536500#comment-16536500
 ] 

kcheng.mvp commented on IGNITE-8827:


[~DmitriyGovorukhin][~agura][~ilantukh]

It seems the check-in caused a Javadoc error. I have created a ticket for it:
https://issues.apache.org/jira/browse/IGNITE-8956

I have raised a PR with the fix. Please help with the code review.

> Disable WAL during apply updates on recovery
> --------------------------------------------
>
>                 Key: IGNITE-8827
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8827
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Dmitriy Govorukhin
>            Assignee: Dmitriy Govorukhin
>            Priority: Major
>             Fix For: 2.7
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8376) Add cluster (de)activation events

2018-07-08 Thread kcheng.mvp (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536501#comment-16536501
 ] 

kcheng.mvp commented on IGNITE-8376:


Thank you all very much for your comments. I am working on it now; once 
it is ready I will raise a PR.

> Add cluster (de)activation events
> ---------------------------------
>
>                 Key: IGNITE-8376
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8376
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: kcheng.mvp
>            Priority: Major
>              Labels: newbie
>             Fix For: 2.7
>
>
> Currently, we do not have any way to detect that a cluster got activated, 
> which results in busy-loops polling {{cluster().active()}}.
> I suggest adding new events, {{EVT_CLUSTER_ACTIVATED}}, 
> {{EVT_CLUSTER_DEACTIVATED}}, and {{EVT_CLUSTER_ACTIVATION_FAILED}}, which will 
> be fired when the corresponding steps are completed. The event should contain, 
> if possible, information about the activation source (public API or 
> auto-activation) and the topology version on which activation was performed. 
> The failure event should contain information about the cause of the failure. 
> If needed, a new class for this event should be introduced.
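
The difference between polling and event-driven waiting described above can be sketched in plain Java. This is only an illustration, not Ignite code: `ToyCluster`, `ClusterStateEvent`, `ClusterEventType`, and `ActivationWaiter` are hypothetical stand-ins, and only the event names are taken from the ticket text.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical event kinds, named after the events proposed in the ticket. */
enum ClusterEventType { EVT_CLUSTER_ACTIVATED, EVT_CLUSTER_DEACTIVATED, EVT_CLUSTER_ACTIVATION_FAILED }

/** Hypothetical event payload: type, activation source, topology version, failure cause. */
final class ClusterStateEvent {
    final ClusterEventType type;
    final String source;   // e.g. "public API" or "auto-activation"
    final long topVer;     // topology version on which the change was performed
    final Throwable cause; // non-null only for a failed activation

    ClusterStateEvent(ClusterEventType type, String source, long topVer, Throwable cause) {
        this.type = type;
        this.source = source;
        this.topVer = topVer;
        this.cause = cause;
    }
}

/** Toy stand-in for a cluster that notifies listeners about state changes. */
final class ToyCluster {
    private final List<Consumer<ClusterStateEvent>> lsnrs = new CopyOnWriteArrayList<>();
    private volatile boolean active;

    void listen(Consumer<ClusterStateEvent> lsnr) { lsnrs.add(lsnr); }

    boolean active() { return active; }

    void activate(String source, long topVer) {
        active = true; // Publish the state before notifying listeners.

        ClusterStateEvent evt =
            new ClusterStateEvent(ClusterEventType.EVT_CLUSTER_ACTIVATED, source, topVer, null);

        for (Consumer<ClusterStateEvent> lsnr : lsnrs)
            lsnr.accept(evt);
    }
}

final class ActivationWaiter {
    /** Waits for activation by parking on a latch instead of polling active() in a loop. */
    static boolean awaitActivation(ToyCluster cluster, long timeoutMs) {
        CountDownLatch latch = new CountDownLatch(1);

        cluster.listen(evt -> {
            if (evt.type == ClusterEventType.EVT_CLUSTER_ACTIVATED)
                latch.countDown();
        });

        // Covers activation that happened before the listener was registered.
        if (cluster.active())
            return true;

        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            return false;
        }
    }
}
```

The waiting thread is parked on the latch until the event arrives or the timeout expires, so no CPU is spent re-checking the state.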





[jira] [Commented] (IGNITE-8956) Javadoc build failure: Warnings:core/src/main/java/org/apache/ignite/internal/processors/cache/WalStateManager.java:1271: warning - @inheritDoc used but check() does n

2018-07-08 Thread kcheng.mvp (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536488#comment-16536488
 ] 

kcheng.mvp commented on IGNITE-8956:


[~ilantukh]

Please help with the code review. Here is the PR:

https://github.com/apache/ignite/pull/4328

> Javadoc build failure: 
> Warnings:core/src/main/java/org/apache/ignite/internal/processors/cache/WalStateManager.java:1271:
>  warning - @inheritDoc used but check() does not override or implement any 
> method.
> ---
>
>                 Key: IGNITE-8956
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8956
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: kcheng.mvp
>            Assignee: kcheng.mvp
>            Priority: Minor
>
> I checked the history; this seems to be caused by the check-in for 
> https://issues.apache.org/jira/browse/IGNITE-8827
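
For readers unfamiliar with this warning, here is a minimal sketch of the situation it describes. The class and interface names are hypothetical and are not taken from WalStateManager or from the PR; the sketch only shows where an inheritDoc tag is valid and where the method needs its own description.

```java
/** Hypothetical interface whose method carries the actual documentation. */
interface Validator {
    /**
     * Checks the current state.
     *
     * @return {@code true} if the state is valid.
     */
    boolean check();
}

/** Here the tag is fine: check() implements Validator.check(), so there is text to inherit. */
final class InheritingValidator implements Validator {
    /** {@inheritDoc} */
    @Override public boolean check() {
        return true;
    }
}

/** This class overrides nothing, so its method must be documented directly. */
final class StandaloneChecker {
    /**
     * Checks the current state. An inheritDoc tag here would have nothing to inherit
     * from, and Javadoc would report the warning quoted in the issue title.
     *
     * @return {@code true} if the state is valid.
     */
    boolean check() {
        return true;
    }
}
```

The usual remedies are either to write an explicit description, as in `StandaloneChecker`, or to make the method actually override a documented one.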





[jira] [Commented] (IGNITE-8776) Eviction policy MBeans are never registered if evictionPolicyFactory is used

2018-07-08 Thread kcheng.mvp (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536486#comment-16536486
 ] 

kcheng.mvp commented on IGNITE-8776:


[~slukyanov] I made the necessary changes; please review it again. Thanks.

> Eviction policy MBeans are never registered if evictionPolicyFactory is used
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-8776
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8776
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.5
>            Reporter: Stanislav Lukyanov
>            Assignee: kcheng.mvp
>            Priority: Minor
>              Labels: newbie
>             Fix For: 2.6
>
>
> Eviction policy MBeans, such as LruEvictionPolicyMBean, are never registered 
> if evictionPolicyFactory is set instead of evictionPolicy (the latter is 
> deprecated).
> This happens because GridCacheProcessor::registerMbean attempts to find 
> either an *MBean interface or IgniteMBeanAware interface on the passed 
> object. It works for LruEvictionPolicy but not for LruEvictionPolicyFactory 
> (which doesn't implement these interfaces).
> The code needs to be adjusted to handle factories correctly.
> New tests are needed to make sure that all standard beans are registered 
> (IgniteKernalMbeansTest does that for kernal mbeans - need the same for cache 
> beans).
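
The lookup problem described above can be reproduced with a small reflection sketch. It is an assumption-laden illustration, not the GridCacheProcessor code: `ToyLruPolicy`, `ToyLruPolicyMBean`, and `MBeanLookup` are hypothetical, and `java.util.function.Supplier` stands in for the cache factory type.

```java
import java.util.function.Supplier;

/** Hypothetical management interface, named according to the *MBean convention. */
interface ToyLruPolicyMBean {
    int getMaxSize();
}

/** Hypothetical policy that exposes itself through the management interface. */
final class ToyLruPolicy implements ToyLruPolicyMBean {
    private final int maxSize;

    ToyLruPolicy(int maxSize) { this.maxSize = maxSize; }

    @Override public int getMaxSize() { return maxSize; }
}

final class MBeanLookup {
    /**
     * Returns the first interface whose simple name ends with "MBean", looking at the
     * interfaces declared directly by the object's class and its superclasses, or
     * null if there is none.
     */
    static Class<?> findMBeanInterface(Object obj) {
        for (Class<?> cls = obj.getClass(); cls != null; cls = cls.getSuperclass()) {
            for (Class<?> itf : cls.getInterfaces()) {
                if (itf.getSimpleName().endsWith("MBean"))
                    return itf;
            }
        }

        return null;
    }

    /** Same lookup, but a factory (modelled here as a Supplier) is unwrapped first. */
    static Class<?> resolve(Object obj) {
        Object target = obj instanceof Supplier ? ((Supplier<?>)obj).get() : obj;

        return findMBeanInterface(target);
    }
}
```

Looking at the factory object itself finds nothing, because only the object it creates implements the management interface; `resolve` handles both cases by inspecting the created instance.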





[jira] [Commented] (IGNITE-8828) Detecting and stopping unresponsive nodes during Partition Map Exchange

2018-07-08 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16536387#comment-16536387
 ] 

ASF GitHub Bot commented on IGNITE-8828:


GitHub user ilantukh opened a pull request:

https://github.com/apache/ignite/pull/4330

IGNITE-8828



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-8828

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4330.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4330


commit bcc5d2ce5c9272ccc1e1dfca8554dba5cef79055
Author: Ilya Lantukh 
Date:   2018-06-27T14:38:22Z

IGNITE-8828 : Soft/Hard timeout - draft.

commit 7e42e990e3ebd9dde23e9ca296351b8b38a7fa7b
Author: Ilya Lantukh 
Date:   2018-06-29T13:01:10Z

IGNITE-8828 : Node failure - draft.

commit edf1d410b57dc9583c417369b07e4a5923032d43
Author: Ilya Lantukh 
Date:   2018-07-02T12:58:29Z

IGNITE-8828 : Tests.

commit 5dd15dc8277f375ef6bd29e3638215e300222675
Author: Ilya Lantukh 
Date:   2018-07-03T13:54:10Z

IGNITE-8828 : Exchange state check messages.

commit 4e464b8442ce1a942ecb99fce4f2dbad1741f319
Author: Ilya Lantukh 
Date:   2018-07-06T19:09:33Z

IGNITE-8828 : Finalization.

commit 2ca90377fe729f10c1b08427801f048d42b4020b
Author: Ilya Lantukh 
Date:   2018-07-08T18:52:00Z

IGNITE-8828 : Reverted soft-hard timeout implementation.

commit 7510867e883852738737b4348204711c41f67a9c
Author: Ilya Lantukh 
Date:   2018-07-08T19:04:42Z

IGNITE-8828 : Cosmetic changes.




> Detecting and stopping unresponsive nodes during Partition Map Exchange
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-8828
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8828
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>            Reporter: Sergey Chugunov
>            Assignee: Ilya Lantukh
>            Priority: Major
>              Labels: iep-25
>   Original Estimate: 264h
>  Remaining Estimate: 264h
>
> During the PME process, the coordinator (1) gathers local partition maps from 
> all nodes and (2) sends the calculated full partition map back to all nodes in 
> the topology.
> However, if one or more nodes fail to send local information in step 1 for any 
> reason, the PME process hangs, blocking all operations. The only solution is 
> to manually identify and stop the nodes that failed to send info to the 
> coordinator.
> This should be done by the coordinator itself: if it does not receive local 
> partition maps from some nodes in time, it should check that stopping these 
> nodes won't lead to data loss and then stop them forcibly.
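
The decision the coordinator would have to make can be sketched as follows. This is a toy model under stated assumptions, not the implementation from the pull request: `PmeWatchdog` and its methods are hypothetical, nodes are plain strings, `reported` is assumed to hold the nodes whose local partition maps arrived before the deadline, and "no data loss" is simplified to "every partition keeps at least one owner".

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PmeWatchdog {
    /** Nodes that did not send their local partition map before the deadline. */
    static Set<String> unresponsive(Set<String> allNodes, Set<String> reported) {
        Set<String> res = new TreeSet<>(allNodes);

        res.removeAll(reported);

        return res;
    }

    /** Returns true if every partition keeps at least one owner outside the candidates. */
    static boolean safeToStop(Map<Integer, Set<String>> owners, Set<String> candidates) {
        for (Set<String> partOwners : owners.values()) {
            // All owners of this partition would be stopped: its data would be lost.
            if (candidates.containsAll(partOwners))
                return false;
        }

        return true;
    }

    /** Nodes to stop forcibly, or an empty set if stopping them would lose data. */
    static Set<String> nodesToStop(Set<String> allNodes, Set<String> reported,
        Map<Integer, Set<String>> owners) {
        Set<String> silent = unresponsive(allNodes, reported);

        return safeToStop(owners, silent) ? silent : Collections.<String>emptySet();
    }
}
```

In this model the coordinator stops either all silent nodes or none of them; a real implementation might prefer to stop a safe subset, which the ticket does not specify.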





[jira] [Updated] (IGNITE-8942) In some cases grid cannot be deactivated because of hanging CQ internal cleanup.

2018-07-08 Thread Alexei Scherbakov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexei Scherbakov updated IGNITE-8942:
--
Description: 
See the attachment for thread dump.

Most probably caused by blocking of message worker while waiting for cluster 
state change:

{noformat}
"tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%" #380 daemon prio=10 os_prio=0 tid=0x7fe084c4c000 nid=0x39aa waiting on condition [0x7fdcd76f5000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:193)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
    at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForOperation(CacheMetricsImpl.java:715)
    at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForReading(CacheMetricsImpl.java:724)
    at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:334)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3255)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1098)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMetricsUpdateMessage(ServerImpl.java:5141)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2794)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2570)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:6903)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2657)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:6847)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
{noformat}

Another problem:

org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor#onDeActivate
is called during exchange before transactions have completed, so CQ updates 
for current transactions can be lost.

  was:
See the attachment for thread dump.

Most probably caused by blocking of message worker while waiting for cluster 
state change:

{noformat}
"tcp-disco-msg-worker-#2%DPL_GRID%DplGridNodeName%" #380 daemon prio=10 os_prio=0 tid=0x7fe084c4c000 nid=0x39aa waiting on condition [0x7fdcd76f5000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
    at org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:193)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
    at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForOperation(CacheMetricsImpl.java:715)
    at org.apache.ignite.internal.processors.cache.CacheMetricsImpl.isValidForReading(CacheMetricsImpl.java:724)
    at org.apache.ignite.internal.processors.cache.CacheMetricsSnapshot.<init>(CacheMetricsSnapshot.java:334)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.localMetrics(GridCacheAdapter.java:3255)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$7.cacheMetrics(GridDiscoveryManager.java:1098)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMetricsUpdateMessage(ServerImpl.java:5141)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2794)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2570)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:6903)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2657)
at