[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails

2018-03-06 Thread Eugene Nedzvetsky (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389202#comment-16389202
 ] 

Eugene Nedzvetsky commented on GEODE-4748:
--

Dan Smith, I couldn't reproduce this issue when emulating dropped packets. 
I'll file a separate bug for the dropped-packets case (the cluster hangs in 
TXCommitMessage$CommitReplyProcessor.waitForCommitCompletion because the 
system did not kick out the recipient member). Thanks.

 

 

> Geode put may result in inconsistent cache if network problem occurs or 
> serialization of key or value class fails
> ---------------------------------------------------------------------------
>
>                 Key: GEODE-4748
>                 URL: https://issues.apache.org/jira/browse/GEODE-4748
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, regions
>    Affects Versions: 1.0.0-incubating, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0
>            Reporter: Vadim Lotarev
>            Priority: Critical
>         Attachments: clumsy.jpg, geode-4748.log
>
>
> The Geode cache becomes inconsistent if networking or serialization 
> problems occur at commit time. How to reproduce:
> # create any simple _replicated_ region
> # run two nodes
> # put some value in the region (within a transaction or not)
> # execute a query on both nodes to check that the same value is returned (I 
> used JMX for that)
> # emulate a temporary networking or serialization error (throw 
> IOException from toData() or use [clumsy|https://jagt.github.io/clumsy/] to 
> emulate a network interruption)
> # repeat step 3; an exception should occur
> # repeat step 4; you should see different values on different nodes
> It looks like errors that occur after {{TXState.applyChanges}} produce the 
> inconsistency: the local changes that were already applied cannot be rolled 
> back, which leads to a state where the local cache contains the changed data 
> but the other node(s) still hold the old data (from before the changes made 
> in the transaction).
> To me, consistency is a key property for systems like Geode, so I would 
> consider this bug critical.
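
The failure mode described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions: `CommitOrderSketch`, `Node`, `serialize`, and `applyThenSend` are hypothetical names invented for illustration and are not Geode classes or APIs. The model assumes, as the report suggests, that the local change is applied before the value is serialized and distributed, and that nothing rolls the local change back when that later step fails.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy model only: every name here is hypothetical and is not a Geode API. */
public class CommitOrderSketch {

    /** One member holding its own copy of a replicated region. */
    public static final class Node {
        public final Map<String, String> region = new HashMap<>();
    }

    /** Stand-in for value serialization (toData) that can be made to fail. */
    public static byte[] serialize(String value, boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated toData() failure");
        }
        return value.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Applies the change locally first, then serializes and "sends" it.
     * There is deliberately no rollback of the local change on failure.
     */
    public static void applyThenSend(Node local, Node remote, String key,
                                     String value, boolean fail) throws IOException {
        local.region.put(key, value);            // local change is already visible
        byte[] wire = serialize(value, fail);    // may throw after the local apply
        remote.region.put(key, new String(wire, StandardCharsets.UTF_8));
    }

    /** True when both members hold the same entries. */
    public static boolean consistent(Node a, Node b) {
        return a.region.equals(b.region);
    }

    public static void main(String[] args) throws IOException {
        Node a = new Node();
        Node b = new Node();

        // Steps 3 and 4 of the report: a normal put, then compare both members.
        applyThenSend(a, b, "k", "v1", false);
        System.out.println("consistent after first put: " + consistent(a, b));

        // Steps 5 to 7: the same put with a simulated serialization error.
        try {
            applyThenSend(a, b, "k", "v2", true);
        } catch (IOException e) {
            System.out.println("put failed: " + e.getMessage());
        }
        System.out.println("consistent after failed put: " + consistent(a, b));
        System.out.println("local=" + a.region.get("k") + ", remote=" + b.region.get("k"));
    }
}
```

In this model the second put leaves the local member with the new value and the remote member with the old one, which mirrors the divergence the reporter observed through JMX. Whether Geode's actual commit path matches this ordering is exactly what the issue asks the maintainers to confirm.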



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails

2018-03-02 Thread Dan Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384289#comment-16384289
 ] 

Dan Smith commented on GEODE-4748:
--

Can you describe the symptoms you see when you *drop* packets, as opposed to 
*tampering* with them? I would expect that if you are dropping packets, the 
commit message may be delayed, or eventually a member will be kicked out of 
the system, but you won't see the serialization errors that are in the 
attached [^geode-4748.log].

 

Do you see similar errors with SSL enabled if you are tampering with packets?



[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails

2018-03-02 Thread Vadim Lotarev (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384030#comment-16384030
 ] 

Vadim Lotarev commented on GEODE-4748:
--

I think that it is not really important what exactly is done to the packets. 
We can reproduce this issue just by dropping packets between nodes (emulating 
a temporary network interruption). The really important thing is that if this 
interruption occurs at transaction commit time, then the cache will be 
inconsistent.



[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails

2018-03-02 Thread Dan Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383889#comment-16383889
 ] 

Dan Smith commented on GEODE-4748:
--

Does the "Tamper" checkbox do what I think it does, that is, modify the data 
that is being transmitted? Geode relies on the error correction done at the 
lower levels of the networking stack. If you want to protect against 
_malicious_ operators on your network corrupting packets, I would suggest 
enabling SSL between peers; that would detect tampering with packets being 
sent. See 
https://geode.apache.org/docs/guide/latest/managing/security/implementing_ssl.html.
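
As a rough illustration of that suggestion, the sketch below builds a set of properties that enable SSL for peer-to-peer (cluster) traffic. It is an assumption-laden example, not a verified configuration: the class and method names are hypothetical, the keystore and truststore paths and the password are placeholders, and the property names (`ssl-enabled-components`, `ssl-keystore`, `ssl-keystore-password`, `ssl-truststore`, `ssl-truststore-password`) should be checked against the linked guide for the Geode version in use.

```java
import java.util.Properties;

/** Hypothetical helper; only the property names are meant to correspond to Geode settings. */
public class PeerSslPropertiesSketch {

    /** Builds properties that switch on SSL for the "cluster" (peer-to-peer) component. */
    public static Properties peerSslProperties(String keystorePath,
                                               String truststorePath,
                                               String password) {
        Properties props = new Properties();
        props.setProperty("ssl-enabled-components", "cluster");
        props.setProperty("ssl-keystore", keystorePath);
        props.setProperty("ssl-keystore-password", password);
        props.setProperty("ssl-truststore", truststorePath);
        props.setProperty("ssl-truststore-password", password);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder paths and password, for illustration only.
        Properties props = peerSslProperties(
                "/path/to/keystore.jks", "/path/to/truststore.jks", "changeit");
        System.out.println("ssl-enabled-components=" + props.getProperty("ssl-enabled-components"));
        System.out.println("ssl-keystore=" + props.getProperty("ssl-keystore"));
    }
}
```

The same keys could instead be placed in a `gemfire.properties` file, or the resulting `Properties` object could be handed to the cache when it is created. Every member has to use compatible settings for peers to connect.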

> Geode put may result in inconsistent cache if network problem occurs or 
> serialization of key or value class fails
> ---------------------------------------------------------------------------
>
>                 Key: GEODE-4748
>                 URL: https://issues.apache.org/jira/browse/GEODE-4748
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, regions
>    Affects Versions: 1.0.0-incubating, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0
>            Reporter: Vadim Lotarev
>            Assignee: Kirk Lund
>            Priority: Critical
>         Attachments: clumsy.jpg, geode-4748.log