[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
[ https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389202#comment-16389202 ]

Eugene Nedzvetsky commented on GEODE-4748:
------------------------------------------

Dan Smith, I couldn't reproduce this issue with dropped-packet emulation. I'll create a separate bug for the dropped-packet case (the cluster hangs in TXCommitMessage$CommitReplyProcessor.waitForCommitCompletion because the system did not kick out the recipient member). Thanks.

> Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-4748
>                 URL: https://issues.apache.org/jira/browse/GEODE-4748
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, regions
>    Affects Versions: 1.0.0-incubating, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0
>            Reporter: Vadim Lotarev
>            Priority: Critical
>         Attachments: clumsy.jpg, geode-4748.log
>
>
> The Geode cache becomes inconsistent if networking or serialization problems occur at commit time. How to reproduce:
> # create any simple _replicated_ region
> # run two nodes
> # put some value in the region (within a transaction or not)
> # execute a query on both nodes to check that the same value is returned (I used JMX for that)
> # emulate a temporary networking or serialization error (throw IOException from toData() or use [clumsy|https://jagt.github.io/clumsy/] to emulate a network interruption)
> # repeat [#3]; an exception should occur
> # repeat [#4] - you should see different values on different nodes
> It looks like errors that occur after {{TXState.applyChanges}} produce the inconsistency: the locally applied changes cannot be rolled back, which leads to a state where the local cache contains the changed data but the other node(s) still hold the old data (from before the changes made in the transaction).
> To me, consistency is a key property for systems like Geode, so I would consider this bug critical.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
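The failure mode described in the issue (local changes applied, replication fails, nothing is rolled back) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyNode, ToyCommit, and ToyDemo are hypothetical names, the classes use plain Java maps, and nothing here is Geode's actual commit code or API. The second method shows one possible way to undo the local write in a two-node setup, not a proposed fix for Geode.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these classes are Geode APIs.
class ToyNode {
    final Map<String, String> store = new HashMap<>();
    boolean failOnReceive = false; // stands in for a serialization or network error

    void receive(String key, String value) throws IOException {
        if (failOnReceive) {
            throw new IOException("simulated failure while replicating " + key);
        }
        store.put(key, value);
    }
}

class ToyCommit {
    // Apply locally, then replicate; a failure leaves the local write in place.
    static void applyThenSend(ToyNode local, ToyNode peer, String key, String value)
            throws IOException {
        local.store.put(key, value);
        peer.receive(key, value);
    }

    // Same order, but restore the previous local state if replication fails.
    static void applyThenSendWithUndo(ToyNode local, ToyNode peer, String key, String value)
            throws IOException {
        boolean hadKey = local.store.containsKey(key);
        String previous = local.store.get(key);
        local.store.put(key, value);
        try {
            peer.receive(key, value);
        } catch (IOException e) {
            if (hadKey) {
                local.store.put(key, previous);
            } else {
                local.store.remove(key);
            }
            throw e;
        }
    }
}

class ToyDemo {
    // Returns {value on local node, value on peer} after one good and one failed put.
    static String[] run(boolean undoOnFailure) {
        ToyNode local = new ToyNode();
        ToyNode peer = new ToyNode();
        try {
            ToyCommit.applyThenSend(local, peer, "k", "v1"); // both nodes agree
            peer.failOnReceive = true;                        // inject the error
            if (undoOnFailure) {
                ToyCommit.applyThenSendWithUndo(local, peer, "k", "v2");
            } else {
                ToyCommit.applyThenSend(local, peer, "k", "v2");
            }
        } catch (IOException expected) {
            // The second put failed; fall through and compare the two nodes.
        }
        return new String[] { local.store.get("k"), peer.store.get("k") };
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", run(false)));
        System.out.println(String.join(",", run(true)));
    }
}
```

Calling ToyDemo.run(false) mirrors the reproduction steps: the first put succeeds on both nodes, the second fails after the local write, and the two stores disagree. ToyDemo.run(true) leaves both stores on the old value.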
[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
[ https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384289#comment-16384289 ]

Dan Smith commented on GEODE-4748:
----------------------------------

Can you describe the symptoms you are seeing when you *drop* packets, as opposed to *tamper* with them? What I would expect is that if you are dropping packets, the commit message may be delayed, or eventually a member will be kicked out of the system, but you won't see the serialization errors that are in the attached [^geode-4748.log].

Do you see similar errors with SSL enabled if you are tampering with packets?
[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
[ https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384030#comment-16384030 ]

Vadim Lotarev commented on GEODE-4748:
--------------------------------------

I think it is not really important what is done to the packets. We can reproduce this issue just by dropping packets between nodes (emulating a temporary network interruption). The really important thing is that if this interruption occurs at transaction commit time, then the cache will be inconsistent.
[jira] [Commented] (GEODE-4748) Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
[ https://issues.apache.org/jira/browse/GEODE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383889#comment-16383889 ]

Dan Smith commented on GEODE-4748:
----------------------------------

Does the "Tamper" checkbox do what I think it does - modify the data that is being transmitted? Geode relies on the error correction done at the lower levels of the networking stack. If you want to protect against _malicious_ operators on your network corrupting packets, I would suggest enabling SSL between peers - that would detect tampering with packets being sent. See https://geode.apache.org/docs/guide/latest/managing/security/implementing_ssl.html.

> Geode put may result in inconsistent cache if network problem occurs or serialization of key or value class fails
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-4748
>                 URL: https://issues.apache.org/jira/browse/GEODE-4748
>             Project: Geode
>          Issue Type: Bug
>          Components: membership, regions
>    Affects Versions: 1.0.0-incubating, 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0
>            Reporter: Vadim Lotarev
>            Assignee: Kirk Lund
>            Priority: Critical
>         Attachments: clumsy.jpg, geode-4748.log
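As a rough illustration of the suggestion to enable SSL between peers, the sketch below builds a java.util.Properties object with the SSL settings described in the linked Geode guide. The class name PeerSslProperties, the keystore and truststore paths, and the password are placeholders introduced here, not values from this issue; the property names should be checked against the guide for the Geode version in use.

```java
import java.util.Properties;

// Hypothetical helper: collects SSL settings for peer-to-peer (cluster) traffic.
class PeerSslProperties {
    static Properties build(String keystorePath, String truststorePath, String password) {
        Properties props = new Properties();
        // "cluster" limits SSL to communication among members of the cluster.
        props.setProperty("ssl-enabled-components", "cluster");
        props.setProperty("ssl-keystore", keystorePath);
        props.setProperty("ssl-keystore-password", password);
        props.setProperty("ssl-truststore", truststorePath);
        props.setProperty("ssl-truststore-password", password);
        // Require each peer to present a certificate.
        props.setProperty("ssl-require-authentication", "true");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder paths and password, for illustration only.
        Properties props = build("/path/to/keystore.jks", "/path/to/truststore.jks", "changeit");
        props.stringPropertyNames().forEach(System.out::println);
    }
}
```

The resulting properties would be supplied to each member at startup, for example through a gemfire.properties file or by passing them to CacheFactory, and every member needs the same settings.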