[jira] [Commented] (IGNITE-4424) REPLICATED cache isn't synced across nodes

2016-12-20 Thread Anton Vinogradov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764520#comment-15764520
 ] 

Anton Vinogradov commented on IGNITE-4424:
--

Problem found.
cctx.mvcc().addAtomicFuture(...) is not called under topology.readlock (the 
lock has already been released by then), so the exchange does not wait for 
the putAll operation to finish.

A dummy hotfix (moving this code into the readlock section) resolved the issue.
Now working on a proper fix.
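
To illustrate the ordering problem described above, here is a minimal sketch in plain Java. It is not Ignite code: the class TopologyLockSketch, its methods, and the use of ReentrantReadWriteLock as a stand-in for the topology lock are all assumptions made for illustration. An "update" registers a future that an "exchange" is supposed to wait for; the exchange takes the write lock and snapshots the registered futures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the race; names do not come from Ignite.
class TopologyLockSketch {
    private final ReentrantReadWriteLock topologyLock = new ReentrantReadWriteLock();
    private final Set<CompletableFuture<Void>> pendingUpdates = ConcurrentHashMap.newKeySet();

    // Problematic order: the future is registered after the read lock is released.
    CompletableFuture<Void> startUpdateBuggy(Runnable windowHook) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        topologyLock.readLock().lock();
        try {
            // Mapping the update to the current topology would happen here.
        } finally {
            topologyLock.readLock().unlock();
        }
        windowHook.run();        // an exchange can slip in at this point
        pendingUpdates.add(fut); // too late: the exchange may already have taken its snapshot
        return fut;
    }

    // Hotfix order: the future is registered while the read lock is still held.
    CompletableFuture<Void> startUpdateFixed(Runnable windowHook) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        topologyLock.readLock().lock();
        try {
            windowHook.run();    // an exchange attempted here cannot take the write lock
            pendingUpdates.add(fut);
        } finally {
            topologyLock.readLock().unlock();
        }
        return fut;
    }

    // Returns the futures the exchange must wait for, or null if the write lock is unavailable.
    List<CompletableFuture<Void>> tryExchange() {
        if (!topologyLock.writeLock().tryLock())
            return null;
        try {
            return new ArrayList<>(pendingUpdates);
        } finally {
            topologyLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        TopologyLockSketch buggy = new TopologyLockSketch();
        AtomicReference<List<CompletableFuture<Void>>> seen = new AtomicReference<>();
        buggy.startUpdateBuggy(() -> seen.set(buggy.tryExchange()));
        System.out.println("buggy order, futures seen by exchange: " + seen.get().size());

        TopologyLockSketch fixed = new TopologyLockSketch();
        seen.set(null);
        fixed.startUpdateFixed(() -> seen.set(fixed.tryExchange()));
        System.out.println("fixed order, exchange blocked during update: " + (seen.get() == null));
        System.out.println("fixed order, futures seen afterwards: " + fixed.tryExchange().size());
    }
}
```

In this model the exchange that runs in the window after the unlock finds nothing to wait for, which corresponds to the exchange not waiting for putAll. With registration moved inside the read-lock section, the exchange cannot acquire the write lock until the future is already visible.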

> REPLICATED cache isn't synced across nodes
> --
>
> Key: IGNITE-4424
> URL: https://issues.apache.org/jira/browse/IGNITE-4424
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 1.8
> Reporter: Andrew Mashenkov
> Assignee: Anton Vinogradov
> Priority: Blocker
> Fix For: 2.0
>
> Attachments: ReplicatedCacheRebalanceFails.java, 
> ReplicatedCacheRebalanceFails.java, ignite-d8e433e4.log
>
>
> A replicated cache sometimes does not sync across nodes properly.
> A reproducer is attached.
> All nodes are started at the same time on different machines:
> * Ignition.start() // Blocks until node is up
> * Only one of the nodes then performs getOrCreateCache() followed by putAll()
> * All the other nodes block on this before proceeding.
> * All of the nodes then perform:
> ** getOrCreateCache() // Again
> ** cache.localSize(CachePeekMode.ALL)
> All nodes should see a filled cache, but sometimes some nodes see an empty 
> cache. The localSize call can be replaced by iterating over the cache, but 
> the result is the same.
> Much more rarely, cluster degradation is possible: one part of the cluster 
> sees an empty cache while another sees a filled cache. The logs contain no 
> errors at all. It takes about two hours of running the test in an infinite 
> loop to catch this rare error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (IGNITE-4424) REPLICATED cache isn't synced across nodes

2016-12-20 Thread Anton Vinogradov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15763635#comment-15763635
 ] 

Anton Vinogradov commented on IGNITE-4424:
--

Initially reproduced. Reproducer added to issue.



