[jira] [Created] (IGNITE-9768) Network partition leads to failures in Ignite's atomic data types.

2018-10-02 Thread Mo (JIRA)
Mo created IGNITE-9768:
--

 Summary: Network partition leads to failures in Ignite's atomic 
data types.
 Key: IGNITE-9768
 URL: https://issues.apache.org/jira/browse/IGNITE-9768
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

This leads to faulty atomic types in Ignite.

Affected data types:
 * *Atomic Sequence*: An IncrementAndGet operation on *s3* will not affect the 
sequence on *s1* and *s2* (even after the partition is healed).

 * *AtomicLong* and *AtomicRef*: Operations such as IncrementAndGet and 
CompareAndSet on *s3* will not be reflected on *s1* and *s2* even after the 
partition heals, which leads to faulty results for clients connected to these 
servers.

 * *CountDownLatch*: A CountDown operation on the latch on *s3* will not be 
reflected on the other servers.
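
The divergence described for the atomic types can be pictured with a small, 
self-contained Java sketch. It is only a toy model: it does not call Ignite, 
and the names SplitBrainCounterModel, Side and replay are made up for 
illustration. Each side of the partition keeps its own copy of the counter, 
and nothing reconciles the copies after the heal.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of the reported divergence; it does not use or emulate Ignite internals. */
final class SplitBrainCounterModel {
    /** One side of the partition, holding its own copy of the counter. */
    static final class Side {
        final AtomicLong counter;

        Side(long initial) {
            counter = new AtomicLong(initial);
        }
    }

    /**
     * Replays the scenario: both sides start from the same value, the
     * minority side (s3) increments during the partition, and nothing
     * merges the copies afterwards.
     *
     * @return {value read via s1/s2, value read via s3}
     */
    static long[] replay(long initial, int incrementsOnS3) {
        Side majority = new Side(initial); // s1, s2, c1
        Side minority = new Side(initial); // s3, c2

        for (int i = 0; i < incrementsOnS3; i++) {
            minority.counter.incrementAndGet();
        }

        // "Heal": the network is back, but the two clusters never rejoin,
        // so each side keeps answering from its own copy.
        return new long[] { majority.counter.get(), minority.counter.get() };
    }

    public static void main(String[] args) {
        long[] seen = replay(0L, 3);
        System.out.println("c1 reads " + seen[0] + ", c2 reads " + seen[1]);
    }
}
```

With three increments on the s3 side, main prints "c1 reads 0, c2 reads 3": 
the two clients disagree about the same counter.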



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (IGNITE-9767) Network partition leads to failures in Ignite's semaphore

2018-10-02 Thread Mo (JIRA)
Mo created IGNITE-9767:
--

 Summary: Network partition leads to failures in Ignite's semaphore
 Key: IGNITE-9767
 URL: https://issues.apache.org/jira/browse/IGNITE-9767
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

This leads to a faulty semaphore on both sides of the partition. For example, 
if a semaphore with one permit is created in the cluster, then after a network 
partition is created and healed, both *c1* and *c2* can acquire that one 
permit.

System config:

"Release acquired permits if node, that owned them, left topology" is set to 
true.
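
A minimal Java sketch of why one permit can end up with two holders. This is a 
toy model built on java.util.concurrent.Semaphore, not Ignite code, and the 
names SplitBrainSemaphoreModel, holdersWithSplitState and 
holdersWithSharedState are hypothetical. It assumes that after the split each 
side tracks the permit count on its own.

```java
import java.util.concurrent.Semaphore;

/** Toy model of the double acquisition; it does not use or emulate Ignite internals. */
final class SplitBrainSemaphoreModel {
    /** Each side of the partition keeps its own copy of the one-permit semaphore. */
    static int holdersWithSplitState() {
        Semaphore majorityCopy = new Semaphore(1); // state kept by s1, s2
        Semaphore minorityCopy = new Semaphore(1); // state kept by s3

        int holders = 0;
        if (majorityCopy.tryAcquire()) { // c1 asks its side
            holders++;
        }
        if (minorityCopy.tryAcquire()) { // c2 asks its side
            holders++;
        }
        return holders;
    }

    /** Reference case: one shared state, so the second acquire is refused. */
    static int holdersWithSharedState() {
        Semaphore shared = new Semaphore(1);

        int holders = 0;
        if (shared.tryAcquire()) { // c1
            holders++;
        }
        if (shared.tryAcquire()) { // c2
            holders++;
        }
        return holders;
    }

    public static void main(String[] args) {
        System.out.println("split state:  " + holdersWithSplitState() + " holders");
        System.out.println("shared state: " + holdersWithSharedState() + " holders");
    }
}
```

The split case reports 2 holders and the shared case 1, which matches the 
symptom that both c1 and c2 can acquire the single permit.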





[jira] [Created] (IGNITE-9766) Network partition leads to failures in Ignite's set

2018-10-02 Thread Mo (JIRA)
Mo created IGNITE-9766:
--

 Summary: Network partition leads to failures in Ignite's set
 Key: IGNITE-9766
 URL: https://issues.apache.org/jira/browse/IGNITE-9766
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

This leads to a faulty set on both sides of the partition. For example, adding 
an element to the set on *s3* will not add that element to *s1* and *s2*, even 
after the partition is healed. This leads to data unavailability.

 





[jira] [Created] (IGNITE-9765) Network partition leads to failures in Ignite's queue

2018-10-02 Thread Mo (JIRA)
Mo created IGNITE-9765:
--

 Summary: Network partition leads to failures in Ignite's queue
 Key: IGNITE-9765
 URL: https://issues.apache.org/jira/browse/IGNITE-9765
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

Affected operations:
 * *Queue add*: An element inserted into *s3*'s queue will not be propagated 
to *s1* and *s2*, even after the partition is healed. This leads to data 
unavailability.

 * *Queue remove*: Removing an element from the queue on *s3* will not be 
executed on the other servers. This leads to the reappearance of deleted data.
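
Both queue symptoms can be replayed with a short Java sketch. It is a toy 
model using java.util.ArrayDeque, not Ignite's queue, and the class name 
SplitBrainQueueModel, the method replay and the sample elements "a", "b", "c" 
are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the queue divergence; it does not use or emulate Ignite internals. */
final class SplitBrainQueueModel {
    /**
     * Both sides start with the same queue. During the partition the s3 side
     * adds one element and removes the head. Nothing is merged afterwards.
     *
     * @return two snapshots: index 0 is the s1/s2 view, index 1 is the s3 view
     */
    static List<List<String>> replay() {
        Deque<String> majority = new ArrayDeque<>(Arrays.asList("a", "b")); // s1, s2
        Deque<String> minority = new ArrayDeque<>(majority);                // s3

        minority.add("c"); // queue add on s3
        minority.poll();   // queue remove on s3, takes "a" off the head

        List<String> majorityView = new ArrayList<>(majority);
        List<String> minorityView = new ArrayList<>(minority);

        List<List<String>> views = new ArrayList<>();
        views.add(majorityView);
        views.add(minorityView);
        return views;
    }

    public static void main(String[] args) {
        List<List<String>> views = replay();
        System.out.println("view via s1/s2: " + views.get(0));
        System.out.println("view via s3:    " + views.get(1));
    }
}
```

The s1/s2 view stays [a, b] while the s3 view becomes [b, c]: "c" is 
unavailable to clients of s1 and s2, and the removed "a" is still visible 
there.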





[jira] [Created] (IGNITE-9762) Network partition leads to failures in Ignite's cache

2018-10-02 Thread Mo (JIRA)
Mo created IGNITE-9762:
--

 Summary: Network partition leads to failures in Ignite's cache
 Key: IGNITE-9762
 URL: https://issues.apache.org/jira/browse/IGNITE-9762
 Project: Ignite
  Issue Type: Bug
  Components: cache
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

This leads to a faulty cache on both sides of the partition:

 * *Stale reads*: An update to a cache on one side of the partition will not be 
propagated to the other side; hence, later reads of the updated key on the 
other side return stale values.

 * *Data unavailability*: A new element inserted into the cache on one side of 
the partition will not be added to the other side, even after the partition is 
healed. This results in data unavailability for clients connected to the 
servers on the other side of the partition.

 

These are the settings used for the replicated cache:

 
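// cfg is assumed to be an org.apache.ignite.configuration.CacheConfiguration.
cfg.setCacheMode(CacheMode.REPLICATED); // full copy of the data on every server
cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC); // wait for all copies
cfg.setReadFromBackup(false); // reads go to the primary copy only
cfg.setPartitionLossPolicy(PartitionLossPolicy.READ_ONLY_SAFE);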





[jira] [Created] (IGNITE-8883) Semaphore fails on network partitioning 2

2018-06-26 Thread Mo (JIRA)
Mo created IGNITE-8883:
--

 Summary: Semaphore fails on network partitioning 2
 Key: IGNITE-8883
 URL: https://issues.apache.org/jira/browse/IGNITE-8883
 Project: Ignite
  Issue Type: Bug
  Components: data structures
Reporter: Mo


Scenario: Three servers (s1,s2,s3) and two clients (c1,c2).

A semaphore with one permit is created.

Config:
 # {{Release acquired permits if the node that owned them left topology: set to 
true}}

Steps:
 # c2 acquires the permit.
 # A network failure isolates c2 from the rest of the nodes for a period of 
time.
 # The network heals.
 # c2 releases the permit.
 # c2 acquires the permit.
 # c1 tries to acquire the permit but fails (an exception is thrown).





[jira] [Created] (IGNITE-8882) Semaphore fails on network partitioning 1

2018-06-26 Thread Mo (JIRA)
Mo created IGNITE-8882:
--

 Summary: Semaphore fails on network partitioning 1
 Key: IGNITE-8882
 URL: https://issues.apache.org/jira/browse/IGNITE-8882
 Project: Ignite
  Issue Type: Bug
  Components: data structures
Reporter: Mo


Scenario: Three servers (s1,s2,s3) and four clients (c1,c2,c3,c4).

A semaphore with one permit is created.

Config:
 # {{Release acquired permits if the node that owned them left topology: set to 
false}}
 # TCP discovery mode: on

Steps:
 # c2 acquires the permit.
 # A network failure isolates s1, s2, c1, and c3 from s3, c2, and c4 
(i.e., {(s1,s2,c1,c3),(s3,c2,c4)}).
 # c2 releases the permit.
 # c1 and c3 try to acquire the permit, but fail (an exception is thrown).





[jira] [Created] (IGNITE-8881) Semaphore hangs on network partitioning

2018-06-26 Thread Mo (JIRA)
Mo created IGNITE-8881:
--

 Summary: Semaphore hangs on network partitioning
 Key: IGNITE-8881
 URL: https://issues.apache.org/jira/browse/IGNITE-8881
 Project: Ignite
  Issue Type: Bug
  Components: data structures
Affects Versions: 2.4
Reporter: Mo


Scenario: Three servers (s1,s2,s3) and two clients (c1,c2).

A semaphore with one permit is created.

Config:
 # {{Release acquired permits if the node that owned them left topology: set to 
false}}
 # TCP discovery mode: on

Steps:
 # c2 acquires the permit.
 # A network failure isolates c2 from the rest of the nodes for a period of 
time, i.e., the partition is {(s1,s2,s3,c1),(c2)}.
 # The network heals.
 # c2 tries to release the permit but hangs.

 





[jira] [Created] (IGNITE-8593) The semaphore's isBroken function doesn't work properly.

2018-05-24 Thread Mo (JIRA)
Mo created IGNITE-8593:
--

 Summary: The semaphore's isBroken function doesn't work properly.
 Key: IGNITE-8593
 URL: https://issues.apache.org/jira/browse/IGNITE-8593
 Project: Ignite
  Issue Type: Bug
  Components: data structures
Affects Versions: 2.4
Reporter: Mo


Scenario: Three servers (s1,s2,s3) and two clients (c1,c2).

A semaphore with one permit is created.

Config: {{Release acquired permits if the node that owned them left topology: 
set to false}}

 
 # c2 acquires the permit.
 # Network failure happens, isolating c2 from the rest of nodes for a period of 
time.
 # Network heals.
 # c2 releases the permit.
 # c2 acquires the permit.
 # Calling semaphore.isBroken() returns false on both c1 and c2.
 # c1 tries to acquire the permit but fails.
 # Now calling isBroken() returns true on both c1 and c2.

 

I think isBroken() should return true before a client tries to acquire a 
permit and fails (i.e., in step 6), rather than only after an acquire attempt 
has failed. In the latter case, what purpose does the isBroken() function 
serve?
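
The point can be illustrated from the caller's side with a short Java sketch. 
Everything in it is hypothetical: SemaphoreView is a minimal stand-in 
interface, not Ignite's semaphore API, and the two implementations only replay 
the return values described in the steps above (a failed acquire is modelled 
as returning false rather than throwing).

```java
/** Toy model of the isBroken() timing; it does not use or emulate Ignite internals. */
final class IsBrokenGuardModel {
    /** Hypothetical, minimal view of a semaphore. */
    interface SemaphoreView {
        boolean isBroken();

        boolean tryAcquire();
    }

    /** Replays the reported behaviour: the flag flips only after a failed acquire. */
    static final class ReportedBehaviour implements SemaphoreView {
        private boolean broken;

        @Override
        public boolean isBroken() {
            return broken;
        }

        @Override
        public boolean tryAcquire() {
            broken = true; // the failure is what marks the semaphore as broken
            return false;
        }
    }

    /** The behaviour the report asks for: the flag is set before any acquire. */
    static final class RequestedBehaviour implements SemaphoreView {
        @Override
        public boolean isBroken() {
            return true;
        }

        @Override
        public boolean tryAcquire() {
            return false;
        }
    }

    /** Caller-side guard: attempt the acquire only if the semaphore claims to be healthy. */
    static String guardedAcquire(SemaphoreView sem) {
        if (sem.isBroken()) {
            return "skipped: semaphore reported broken";
        }
        return sem.tryAcquire() ? "acquired" : "acquire failed despite isBroken() == false";
    }

    public static void main(String[] args) {
        System.out.println(guardedAcquire(new ReportedBehaviour()));
        System.out.println(guardedAcquire(new RequestedBehaviour()));
    }
}
```

With the reported behaviour the guard is useless, because the caller only 
learns about the problem through the failed acquire. With the requested 
behaviour the caller can skip the attempt.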





[jira] [Created] (IGNITE-8592) Network partitions lead to two independent clusters

2018-05-24 Thread Mo (JIRA)
Mo created IGNITE-8592:
--

 Summary: Network partitions lead to two independent clusters
 Key: IGNITE-8592
 URL: https://issues.apache.org/jira/browse/IGNITE-8592
 Project: Ignite
  Issue Type: Bug
Affects Versions: 2.4
Reporter: Mo


A network partition in a replicated Ignite cluster splits it into two 
independent clusters, which keep operating independently of each other even 
after the network partition is healed.

Setup: 3 servers (s1,s2,s3) and two clients (c1,c2).

A partition is created: {(s1,s2,c1),(s3,c2)}.

--> At this point two independent clusters form: one containing s1 and s2, 
the other containing s3. The two never rejoin, even after the partition is 
healed.

 

This creates different kinds of problems for the different data structures 
Ignite provides, such as the cache (stale reads and data unavailability), 
atomic types (AtomicRef and AtomicLong), etc.

 

These are the settings used for the replicated cache:

 
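// cfg is assumed to be an org.apache.ignite.configuration.CacheConfiguration.
cfg.setCacheMode(CacheMode.REPLICATED); // full copy of the data on every server
cfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC); // wait for all copies
cfg.setReadFromBackup(false); // reads go to the primary copy only
cfg.setPartitionLossPolicy(PartitionLossPolicy.READ_ONLY_SAFE);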


