Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-03 Thread Dan Smith


> On July 3, 2017, 4:02 p.m., Dan Smith wrote:
> > geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
> > Lines 129 (patched)
> > 
> >
> > I don't see this property ever get used?
> 
> Bruce Schuchardt wrote:
> This is used in ProcessManager.  The setting is already in place.

Ok - it's using the same string? It would be better for everything to use the 
same java constant. Or better yet, our tests should be doing what users do and 
use an old version locator.


> On July 3, 2017, 4:02 p.m., Dan Smith wrote:
> > geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
> > Lines 307 (patched)
> > 
> >
> > Is this not just targetVersion.equals(GFE_90) ?
> 
> Bruce Schuchardt wrote:
> Anything between 1.0.0-incubating and GEODE-110 could have this problem

Got it.


- Dan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179513
---


On June 30, 2017, 11:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 11:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-03 Thread Bruce Schuchardt


> On July 3, 2017, 8:29 a.m., Alexander Murmann wrote:
> > geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
> > Lines 131 (patched)
> > 
> >
> > Nit pick: Do our tests commonly print debugging information? On other 
> > projects I've been on, everyone aimed for noise free tests.

We commonly have debugging information in test output.


- Bruce


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179512
---


On June 30, 2017, 4:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 4:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-03 Thread Bruce Schuchardt


> On July 3, 2017, 9:02 a.m., Dan Smith wrote:
> > It looks like this code should work. I'm still a bit worried that we are 
> > relying on code that tries to deserialize the member id with the wrong 
> > version and then ignores errors that occur. I'd like to see us revisit 
> > propegating the version with the membership id bytes.

Yes, we need to have the version info in bytes that we retain in serialized 
form like this.


> On July 3, 2017, 9:02 a.m., Dan Smith wrote:
> > geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
> > Lines 129 (patched)
> > 
> >
> > I don't see this property ever get used?

This is used in ProcessManager.  The setting is already in place.


> On July 3, 2017, 9:02 a.m., Dan Smith wrote:
> > geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
> > Lines 307 (patched)
> > 
> >
> > Is this not just targetVersion.equals(GFE_90) ?

Anything between 1.0.0-incubating and GEODE-110 could have this problem


- Bruce


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179513
---


On June 30, 2017, 4:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 4:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-03 Thread Dan Smith

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179513
---


Fix it, then Ship it!




It looks like this code should work. I'm still a bit worried that we are 
relying on code that tries to deserialize the member id with the wrong version 
and then ignores errors that occur. I'd like to see us revisit propegating the 
version with the membership id bytes.


geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
Lines 129 (patched)


I don't see this property ever get used?



geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
Lines 307 (patched)


Is this not just targetVersion.equals(GFE_90) ?


- Dan Smith


On June 30, 2017, 11:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 11:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-03 Thread Alexander Murmann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179512
---


Fix it, then Ship it!




Ship It!


geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
Lines 131 (patched)


Nit pick: Do our tests commonly print debugging information? On other 
projects I've been on, everyone aimed for noise free tests.


- Alexander Murmann


On June 30, 2017, 11:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 11:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-07-01 Thread Bruce Schuchardt

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179474
---



This has now passed precheckin, so if I can get a couple of shipIts I can check 
this in and we can move forward with the 1.2.0 release

- Bruce Schuchardt


On June 30, 2017, 4:17 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 4:17 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/3/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Bruce Schuchardt

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/
---

(Updated June 30, 2017, 4:17 p.m.)


Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
Hitesh Khamesra, and Brian Rowe.


Changes
---

adding a second current-version server to the test, as suggested by Galen and 
Brian


Bugs: GEODE-3153
https://issues.apache.org/jira/browse/GEODE-3153


Repository: geode


Description
---

Another problem was found in backward-compatibility testing.  If a 1.0.0 client 
was receiving subscription events generated by a 1.0.0 peer "feeder" member and 
the events were routed through a 1.0.0 server the client might see duplicate 
events when the server is stopped and the client fails over to a 1.2.0 server 
holding its redundant subscription queue.  This is especially possible if a 
large "ack" period is established in the client.

The problem stems from the EventID deserialization/reserialization of the 
memberID bytes when sending to a 1.0 client.  It was deserializing using 
Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
serialized the identifier using the client's version, which includes the UUID 
bytes but which are zero due to the version used in deserialization.


Diffs (updated)
-

  
geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
 bc3d708da2ae9a8e386accb8d15e2ed49123241e 
  geode-core/src/main/java/org/apache/geode/internal/Version.java 
557697159da644915e4ffe2405cdddc9ef37c5ac 
  geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
55c89f1f2e0800371dd4a30c4312c44f942a45ea 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
 bc48d976096fafe2545e707da68dab5120ddca51 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
 bfe4646b9abdf6075e8d30fab3d79924faade2aa 
  
geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
 b69e004d63d74eccd5cd562ea269363ba3f2782e 


Diff: https://reviews.apache.org/r/60570/diff/3/

Changes: https://reviews.apache.org/r/60570/diff/2-3/


Testing
---

new unit tests, precheckin


Thanks,

Bruce Schuchardt



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Bruce Schuchardt


> On June 30, 2017, 2:56 p.m., Galen O'Sullivan wrote:
> > I've been looking at this with @WireBaron, and we're wondering whether a 
> > client membership ID can still get sent with a zeroed UUID if it's passed 
> > between two Gemfire 9.1 servers as a result of client queue replication 
> > failover. We tried to write a test and failed.
> > 
> > The basic idea is something like this:
> > * start two servers, an interested client and an event-creating client.
> >   One server is running 9.0 and the other 9.1 .
> > * put a couple of events in the system via the 9.0 server (should be fine 
> > with either).
> > * kill the 9.0 server and add a new 9.1 server to the system.
> >   At this point, if we're understanding client queue replication correctly, 
> > the new server should receive a copy of the queue from the other 9.1 server.
> > * Check the same events on the new server to see if they've lost the UUID.
> > 
> > Does that sound reasonable to you?
> > 
> > I'm not sure how to trigger failures in the right order to verify this 
> > isn't an issue, but I think it's reasonably plausible that during rolling 
> > upgrades someone could encounter the issue. It would require an old client 
> > to get a queue from a new version server that has been passed that queue by 
> > another new version server.
> > 
> > If that's not possible to trigger, or you can't test it and are confident 
> > that the scenario we described won't happen, then go ahead and ship it.

Please use Geode version numbers.

I modified my new test to do as you described and it passed, as I expected it 
to.  During image transfer the first 1.2.0 server would just transmit the 
membershipID bytes in its toData method since the target is a 1.2.0 server.  
The second server would preserve the UUID bytes when sending the eventID to the 
1.0.0 client.


- Bruce


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179384
---


On June 30, 2017, 3:02 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 3:02 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/2/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Bruce Schuchardt


> On June 30, 2017, 2:56 p.m., Galen O'Sullivan wrote:
> > geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
> > Lines 301 (patched)
> > 
> >
> > Maybe declare `dis` and then set it based on the version number?
> > 
> > If you're feeling charitable, you could rename `dis` to 
> > `dataInputStream`.

A versioned stream needs to wrap a real DataInputStream, I think.


> On June 30, 2017, 2:56 p.m., Galen O'Sullivan wrote:
> > geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
> > Lines 105 (patched)
> > 
> >
> > I'm not all that familiar with backwards compatibility tests, but do we 
> > want to be testing agains current version and testing version or current 
> > and 9.0? Will backwards compatibility set `testVersion` to 9.0 among other 
> > versions?

the framework will cycle through 1.0.0-incubating, 1.1.0 and 1.1.1.  Pivotal's 
9.0 releases were based on 1.0.0-incubating


- Bruce


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179384
---


On June 30, 2017, 3:02 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 3:02 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 
> 557697159da644915e4ffe2405cdddc9ef37c5ac 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/2/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Bruce Schuchardt

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/
---

(Updated June 30, 2017, 3:02 p.m.)


Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
Hitesh Khamesra, and Brian Rowe.


Changes
---

addressing review issues and failure with old GFE 8.2.x client that Barry 
encountered


Bugs: GEODE-3153
https://issues.apache.org/jira/browse/GEODE-3153


Repository: geode


Description
---

Another problem was found in backward-compatibility testing.  If a 1.0.0 client 
was receiving subscription events generated by a 1.0.0 peer "feeder" member and 
the events were routed through a 1.0.0 server the client might see duplicate 
events when the server is stopped and the client fails over to a 1.2.0 server 
holding its redundant subscription queue.  This is especially possible if a 
large "ack" period is established in the client.

The problem stems from the EventID deserialization/reserialization of the 
memberID bytes when sending to a 1.0 client.  It was deserializing using 
Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
serialized the identifier using the client's version, which includes the UUID 
bytes but which are zero due to the version used in deserialization.


Diffs (updated)
-

  
geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
 bc3d708da2ae9a8e386accb8d15e2ed49123241e 
  geode-core/src/main/java/org/apache/geode/internal/Version.java 
557697159da644915e4ffe2405cdddc9ef37c5ac 
  geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
55c89f1f2e0800371dd4a30c4312c44f942a45ea 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
 bc48d976096fafe2545e707da68dab5120ddca51 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
 bfe4646b9abdf6075e8d30fab3d79924faade2aa 
  
geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
 b69e004d63d74eccd5cd562ea269363ba3f2782e 


Diff: https://reviews.apache.org/r/60570/diff/2/

Changes: https://reviews.apache.org/r/60570/diff/1-2/


Testing
---

new unit tests, precheckin


Thanks,

Bruce Schuchardt



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Galen O'Sullivan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179384
---



I've been looking at this with @WireBaron, and we're wondering whether a client 
membership ID can still get sent with a zeroed UUID if it's passed between two 
Gemfire 9.1 servers as a result of client queue replication failover. We tried 
to write a test and failed.

The basic idea is something like this:
* start two servers, an interested client and an event-creating client.
  One server is running 9.0 and the other 9.1 .
* put a couple of events in the system via the 9.0 server (should be fine with 
either).
* kill the 9.0 server and add a new 9.1 server to the system.
  At this point, if we're understanding client queue replication correctly, the 
new server should receive a copy of the queue from the other 9.1 server.
* Check the same events on the new server to see if they've lost the UUID.

Does that sound reasonable to you?

I'm not sure how to trigger failures in the right order to verify this isn't an 
issue, but I think it's reasonably plausible that during rolling upgrades 
someone could encounter the issue. It would require an old client to get a 
queue from a new version server that has been passed that queue by another new 
version server.

If that's not possible to trigger, or you can't test it and are confident that 
the scenario we described won't happen, then go ahead and ship it.


geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
Lines 301 (patched)


Maybe declare `dis` and then set it based on the version number?

If you're feeling charitable, you could rename `dis` to `dataInputStream`.



geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
Lines 105 (patched)


I'm not all that familiar with backwards compatibility tests, but do we 
want to be testing agains current version and testing version or current and 
9.0? Will backwards compatibility set `testVersion` to 9.0 among other versions?


- Galen O'Sullivan


On June 30, 2017, 4:33 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 4:33 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/1/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Dan Smith

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179390
---




geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java
Lines 301 (patched)


I think this needs at least some comments that explain what's going on 
here. Normally we should always be deserializing something using the version of 
the SENDER.

In this case, we have no idea what version the membershipID bytes were 
serialized with. We're just deserializing them with the version of the TARGET 
and hoping that this will preserve information if the membershipID was 
serialized with 1.0, and not fail if the membershipID was serialized with 1.2.


- Dan Smith


On June 30, 2017, 4:33 p.m., Bruce Schuchardt wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> ---
> 
> (Updated June 30, 2017, 4:33 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
> Hitesh Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
> https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> ---
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 
> client was receiving subscription events generated by a 1.0.0 peer "feeder" 
> member and the events were routed through a 1.0.0 server the client might see 
> duplicate events when the server is stopped and the client fails over to a 
> 1.2.0 server holding its redundant subscription queue.  This is especially 
> possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the 
> memberID bytes when sending to a 1.0 client.  It was deserializing using 
> Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
> serialized the identifier using the client's version, which includes the UUID 
> bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -
> 
>   
> geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
>  bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
> 55c89f1f2e0800371dd4a30c4312c44f942a45ea 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
>  bc48d976096fafe2545e707da68dab5120ddca51 
>   
> geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
>  bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   
> geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
>  b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/1/
> 
> 
> Testing
> ---
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>



Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade

2017-06-30 Thread Bruce Schuchardt

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/
---

(Updated June 30, 2017, 9:33 a.m.)


Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, 
Hitesh Khamesra, and Brian Rowe.


Summary (updated)
-

GEODE-3153 Client receives duplicate events during rolling upgrade


Bugs: GEODE-3153
https://issues.apache.org/jira/browse/GEODE-3153


Repository: geode


Description
---

Another problem was found in backward-compatibility testing.  If a 1.0.0 client 
was receiving subscription events generated by a 1.0.0 peer "feeder" member and 
the events were routed through a 1.0.0 server the client might see duplicate 
events when the server is stopped and the client fails over to a 1.2.0 server 
holding its redundant subscription queue.  This is especially possible if a 
large "ack" period is established in the client.

The problem stems from the EventID deserialization/reserialization of the 
memberID bytes when sending to a 1.0 client.  It was deserializing using 
Version.CURRENT, which ignores the UUID bytes in the serialized ID.  Then it 
serialized the identifier using the client's version, which includes the UUID 
bytes but which are zero due to the version used in deserialization.


Diffs
-

  
geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
 bc3d708da2ae9a8e386accb8d15e2ed49123241e 
  geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 
55c89f1f2e0800371dd4a30c4312c44f942a45ea 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
 bc48d976096fafe2545e707da68dab5120ddca51 
  
geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
 bfe4646b9abdf6075e8d30fab3d79924faade2aa 
  
geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
 b69e004d63d74eccd5cd562ea269363ba3f2782e 


Diff: https://reviews.apache.org/r/60570/diff/1/


Testing
---

new unit tests, precheckin


Thanks,

Bruce Schuchardt