[ 
https://issues.apache.org/jira/browse/GEODE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710787#comment-16710787
 ] 

John Blum edited comment on GEODE-6032 at 12/6/18 12:27 AM:
------------------------------------------------------------

All ([~amb], [~agingade], + others) -

I want to shed more light on this subject.

First, and most importantly (IMO), from an application developer/development 
perspective, the current definition of the 
[{{org.apache.geode.Delta}}|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html]
 interface, 
[{{hasDelta()}}|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html#hasDelta--]
 method states:

{quote}
Returns true if this object has pending changes it can write out.
{quote}

This is both logical and intuitive, i.e. it makes sense to me.  If there are 
"_pending changes_", "_write them out_".  And, while it is less clear/obvious 
what happens when there are NO "_pending changes_", I would assume, as an 
application developer, that Apache Geode would do NOTHING in this case.

Why would I assume this, and why is this behavior important?

3 reasons.

1) *CONTRACTUAL RESPONSIBILITY*
 
First and foremost, the 
[{{org.apache.geode.Delta}}|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html]
 _interface_ is a contract implemented by our users' own application domain 
objects, specifically to tell Geode exactly what should happen and how (i.e. 
how to behave).  It is a _callback interface_, no different than a 
{{CacheLoader}}, {{CacheWriter}}, {{TransactionListener}}, etc, used to advise 
Geode, and in this case for serialization.  

As such, the {{o.a.g.Delta}} _interface_ should not be implemented by anything 
other than application domain objects to affect the behavior of Geode's 
serialization logic/mechanics, appropriately, where the application's domain 
objects are concerned.  If used internally by Geode objects themselves (e.g. 
like {{EntryEvent}}, not saying this class does implement {{Delta}} in fact it 
does not, but if it did) then it should NOT be apparent to users.

Geode will actually do the right thing when told, except for when 
{{Delta.hasDelta()}} is *false*, and hence this bug, enhancement, whatever you 
want to call it.

So, from both an Application AND Geode perspective, let's explore this point 
and exactly what this means and why having this "application contract" is 
important!

For example purposes, we will use the concept of an HTTP Session.

When a user logs into a website, a new HTTP Session is created and by its very 
definition of being new, it is also, by default, DIRTY, i.e. it has a DELTA (it 
went from nothing to something).  In this case, yes {{Delta.hasDelta()}} for 
the (HTTP) Session implementation should return *true*.  So, then what happens?

Well, Geode will query {{Session.hasDelta()}} on a 
{{Region.put(session.getId(), session)}} cache op and see that {{hasDelta()}} 
returns *true*.  The Web, {{ClientCache}} application will then attempt to send 
a "delta" to the servers in the cluster since Geode will subsequently call 
{{Session.toDelta(..)}}.

Upon receiving the "delta bytes", the server will say back to the client, "I 
don't know anything about the Session object for the the delta bytes you just 
sent", and will then request the client to send the entire Session object back 
over, which the client will and does (using either a registered, custom 
{{DataSerializer}} for Session or by calling {{Session.toData(..)}} if Session 
implement {{DataSerializer}}). Great!  The Session object is now stored on the 
server.

The client need never send anything other than delta bytes from this point 
forward, unless the Session object is *removed*, *invalidated* or *expired*, in 
which case, the same Session (identified by ID, or it's equals method) will be 
treated exactly like a new object again (assuming the client retained a 
reference to this Session, which probably indicates an application problem at 
this point, but lets continue with this train of thought...) and the client 
will do the right thing, as described above.

Now, when the Session needs to be updated again, the application domain object 
class implementing the {{Delta}} interface is "responsible" for keeping track 
of telling Geode "when" there is a change (hence {{hasDelta()}}) and "how" to 
handle, i.e. send/apply the changes ({{toDelta}} / {{fromDelta}}).  As long as 
{{Session.hasDelta()}} returns *true*, and legitimately has a "delta" (i.e. 
changes) then Geode continues to do the the right thing.

But, what happens when a user attempts to save the Session when {{hasDelta()}} 
would return *false*.  

Well, Geode as described, will send the entire Session object again, which is 
both naive and reckless as it leads to lost updates.  I will describe this a 
bit more below, but first...

Why does Geode need to send anything at all???

The application just informed Geode there was NO delta, no changes to this 
object, at all.  If the application is not correct, then that is not Geode's 
responsibility.  The fact the application developer implemented the {{Delta}} 
_interface_ on his/her Session class, and potentially any object contained in 
the Session, SOLEY PLACES THE RESPONSIBILITY ON THE USER, PERIOD.  That is the 
contract, and that is not any different than a {{PartitionResolver}}, or a 
{{CacheWriter}} that vetos a cache op (e.g. write).  If they lose updates 
because they did not correctly implement {{hasDelta()}} (including properly 
clearing the delta when the object is saved, aka {{Region.put(..)}}), then NOT 
OUR PROBLEM, SERIOUSLY!

There is NO concern for CONSISTENCY or needing to invoke a {{CacheListener}} on 
the server or down stream on other clients via notification/subscriptions, 
because there was simply NO CHANGES / NO DELTA on the Session object in the 
first place.  What do you want to send?  You don't need to send the entire 
object, my application just told you it wasn't dirty and if "I" am mistaken, 
TOUGH LUCK for me.

But, what if we continue to send the entire Session object over when 
{{Delta.hasDelta()}} returns *false*?  What then?

Well, it is very likely that the Session will lose data.  WHY!?

2) *LOST UPDATES* due to *RACE CONDITIONS*!

It is very common in a Web application architecture to run multiple instances 
of the Web application in production, with all (HTTP) requests managed by a 
load balancer.  As such, it is not uncommon for more than 1 instance of the Web 
application, running in a separate JVM process (no-less) to have a handle to 
the same "logical" Session (by ID).  For that matter, it is not even that 
uncommon to have multiple references to the same HTTP Session in the same Web 
server (i.e. same JVM process) since each HTTP request is managed by a separate 
Thread!  It can happen and is likely to happen the more the redundancy you have 
in the system architecture in the Web tier.  This has been common practice for 
years!

Now, imagine if each Web application instance/process (even HTTP request 
processing Thread) had a slightly different view of the HTTP Session.  That 
Session could vary by even a single Session attribute.  If we send the entire 
object again, rather than just a "delta", simply because the client has not 
received notification and complete view of the Session object yet, when we can 
overwrite data (it is akin to the last 1 wins situation, but in this case we 
cause it by sending a non-dirty Session) to the server.  For instance, some 
HTTP request might only view the state of the HTTP Session as it was when the 
Session was read, like let's look at the shopping cart contents.  Perhaps the 
HTTP request also aggregates data from other (remote) REST/Web Services and 
therefore will take more than a few seconds to process.

EXAMPLE: Think of how long it takes to get the current set of flights on Concur 
for booking purposes.  It is probably aggregating all the results from 
different airlines Web portals/services to collect (all possible) available 
flights, then has to order them based your your preferences (costs, time, etc), 
before it returns a response.

I know how things like this are built because I built things just like it.

3) *NETWORK SENSITIVITY*

This brings up another interesting point.  Well, 2 things really.  When an HTTP 
request only reads the HTTP Session and all the garbage that users will stick 
in the (HTTP) Session.

First, Session attributes store objects, and not just any plain, simple 
(scalar/primitive) objects, but complex objects, and objects of objects, i.e 
object hierarchies.  A --> B --> C --> D ..., e.g. Customer -> Account(s) --> 
Orders --> Line Items --> Products, or StateHealtCareRecipient with Household, 
Address, Parents, Employment History, Income, Assets, having an 
EligibilityDecision, an Enrollment Record, Renewals, this, that.

One of the many benefits of Spring Session for Apache Geode's implementation is 
that the Session implementation's attribute "values" (i.e. the user objects) 
are routed through Geode's serialization logic/mechanics like everything else.

This has the added benefit that users can now take advantage of Geode' 
serialization functionality, implement {{DataSerializable}} / 
{{PdxSerializable}}, or alternatively implement and register custom 
{{DataSerializers}} /  {{PdxSerializers} for their own application domain 
object types, even implement {{Delta}}.

Imagine, if an HTTP Session is read, has no changes (i.e. {{Delta.hasDelta()}}) 
returns *false* and the Session is loaded up with 100s of Objects, containing 
Objects, containing still other Objects, and so on and the user's app make a 
call to {{Region.put(..)}} somewhere, either via SSDG or not.  It is possible 
for users, in their applications to acquire a handle to the SSDG 
{{SessionRepository}} to save Sessions independently (perhaps to implement 
"Attached" Sessions) or even the Region in which Session state is stored and 
call {{Region.put(..)}} directly.  What then?

I have taken steps in SSDG to 
[guard|https://github.com/jxblum/spring-session-data-geode/blob/master/spring-session-data-geode/src/main/java/org/springframework/session/data/gemfire/GemFireOperationsSessionRepository.java#L157-L159]
 against saving non-dirty Sessions.  But the argument of why do a cache op if 
the Session is not dirty in the first place is not a very good one!  The 
framework can only do so much to protect users from themselves.

Additionally, the "contract" is spelled out, above, in #1.  If the object type 
(e.g. Session, or other) implements {{Delta.hasDelta()}} and it returns 
*false*, there is NOTHING to send, the object is not dirty, not different, 
nothing.

Why send the entire object again, especially an object like an HTTP Session, 
which can contain MBs of data over a network.  

I agree that putting that much in an HTTP Session is wrong to begin with, that 
HTTP Session objects are mostly read-only and should be limited to what is 
stored in the Session in the first place, but it can happen, so it will happen 
and it seems to me that Geode could be smarter in this case when send such 
(potentially) complex objects over the wire.

I hope this makes sense.  I have both code references and examples/tests that 
can demonstrate each of the 3 cases noted above.

Regards,
[~jblum]

P.S. Sorry for the typos; I have not proof read this thoroughly yet.














was (Author: jblum):
All ([~amb], [~agingade], + others) -

I want to shed more light on this subject.

First, and most importantly (IMO), from an application developer/development 
perspective, the current definition of the 
[{{org.apache.geode.Delta}}|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html]
 interface, 
[hasDelta()|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html#hasDelta--]
 method states:

{quote}
Returns true if this object has pending changes it can write out.
{quote}

This is both logical and intuitive, i.e. it makes sense to me.  If there are 
"_pending changes_", "_write them out_".  And, while it is less clear/obvious 
what happens when there are NO "_pending changes_", I would assume, as an 
application developer, that Apache Geode would do NOTHING in this case.

Why would I assume this, and why is this behavior important?

3 reasons.

1) *CONTRACTUAL RESPONSIBILITY*
 
First and foremost, the 
[{{org.apache.geode.Delta}}|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html]
 _interface_ is a contract implemented by our users' own application domain 
object, specifically to tell Geode exactly what should happen and how (i.e. how 
to behave).  It is a _callback interface_, no different than a {{CacheLoader}}, 
{{CacheWriter}}, {{TransactionListener}}, etc, used to advise Geode, and in 
this case for serialization.  

As such, the {{o.a.g.Delta}} _interface_ should not be implemented by anything 
other than application domain objects to affect the behavior of Geode's 
serialization logic/mechanics, appropriately, where the application's domain 
objects are concerned.  If used internally by Geode objects themselves (e.g. 
like {{EntryEvent}}, not saying this class does implement {{Delta}} in fact it 
does not, but if it did) then it should NOT be apparent to users.

Geode will actually do the right thing when told, except for when 
{{Delta.hasDelta()}} is *false*, and hence this bug, enhancement, whatever you 
want to call it.

So, from both an Application AND Geode perspective, let's explore this point 
and exactly what this means and why having this "application contract" is 
important!

For example purposes, we will use the concept of an HTTP Session.

When a user logs into a website, a new HTTP Session is created and by its very 
definition of being new, it is also, by default, DIRTY, i.e. it has a DELTA (it 
went from nothing to something).  In this case, yes {{Delta.hasDelta()}} for 
the (HTTP) Session implementation should return *true*.  So, then what happens?

Well, Geode will query {{Session.hasDelta()}} on a 
{{Region.put(session.getId(), session)}} cache op and see that {{hasDelta()}} 
returns *true*.  The Web, {{ClientCache}} application will then attempt to send 
a "delta" to the servers in the cluster since Geode will subsequently call 
{{Session.toDelta(..)}}.

Upon receiving the "delta bytes", the server will say back to the client, "I 
don't know anything about the Session object for the the delta bytes you just 
sent", and will then request the client to send the entire Session object back 
over, which the client will and does (using either a registered, custom 
{{DataSerializer}} for Session or by calling {{Session.toData(..)}} if Session 
implement {{DataSerializer}}). Great!  The Session object is now stored on the 
server.

The client need never send anything other than delta bytes from this point 
forward, unless the Session object is *removed*, *invalidated* or *expired*, in 
which case, the same Session (identified by ID, or it's equals method) will be 
treated exactly like a new object again (assuming the client retained a 
reference to this Session, which probably indicates an application problem at 
this point, but lets continue with this train of thought...) and the client 
will do the right thing, as described above.

Now, when the Session needs to be updated again, the application domain object 
class implementing the {{Delta}} interface is "responsible" for keeping track 
of telling Geode "when" there is a change (hence {{hasDelta()}}) and "how" to 
handle, i.e. send/apply the changes ({{toDelta}} / {{fromDelta}}).  As long as 
{{Session.hasDelta()}} returns *true*, and legitimately has a "delta" (i.e. 
changes) then Geode continues to do the the right thing.

But, what happens when a user attempts to save the Session when {{hasDelta()}} 
would return *false*.  

Well, Geode as described, will send the entire Session object again, which is 
both naive and reckless as it leads to lost updates.  I will describe this a 
bit more below, but first...

Why does Geode need to send anything at all???

The application just informed Geode there was NO delta, no changes to this 
object, at all.  If the application is not correct, then that is not Geode's 
responsibility.  The fact the application developer implemented the {{Delta}} 
_interface_ on his/her Session class, and potentially any object contained in 
the Session, SOLEY PLACES THE RESPONSIBILITY ON THE USER, PERIOD.  That is the 
contract, and that is not any different than a {{PartitionResolver}}, or a 
{{CacheWriter}} that vetos a cache op (e.g. write).  If they lose updates 
because they did not correctly implement {{hasDelta()}} (including properly 
clearing the delta when the object is saved, aka {{Region.put(..)}}), then NOT 
OUR PROBLEM, SERIOUSLY!

There is NO concern for CONSISTENCY or needing to invoke a {{CacheListener}} on 
the server or down stream on other clients via notification/subscriptions, 
because there was simply NO CHANGES / NO DELTA on the Session object in the 
first place.  What do you want to send?  You don't need to send the entire 
object, my application just told you it wasn't dirty and if "I" am mistaken, 
TOUGH LUCK for me.

But, what if we continue to send the entire Session object over when 
{{Delta.hasDelta()}} returns *false*?  What then?

Well, it is very likely that the Session will lose data.  WHY!?

2) *LOST UPDATES* due to *RACE CONDITIONS*!

It is very common in a Web application architecture to run multiple instances 
of the Web application in production, with all (HTTP) requests managed by a 
load balancer.  As such, it is not uncommon for more than 1 instance of the Web 
application, running in a separate JVM process (no-less) to have a handle to 
the same "logical" Session (by ID).  For that matter, it is not even that 
uncommon to have multiple references to the same HTTP Session in the same Web 
server (i.e. same JVM process) since each HTTP request is managed by a separate 
Thread!  It can happen and is likely to happen the more the redundancy you have 
in the system architecture in the Web tier.  This has been common practice for 
years!

Now, imagine if each Web application instance/process (even HTTP request 
processing Thread) had a slightly different view of the HTTP Session.  That 
Session could vary by even a single Session attribute.  If we send the entire 
object again, rather than just a "delta", simply because the client has not 
received notification and complete view of the Session object yet, when we can 
overwrite data (it is akin to the last 1 wins situation, but in this case we 
cause it by sending a non-dirty Session) to the server.  For instance, some 
HTTP request might only view the state of the HTTP Session as it was when the 
Session was read, like let's look at the shopping cart contents.  Perhaps the 
HTTP request also aggregates data from other (remote) REST/Web Services and 
therefore will take more than a few seconds to process.

EXAMPLE: Think of how long it takes to get the current set of flights on Concur 
for booking purposes.  It is probably aggregating all the results from 
different airlines Web portals/services to collect (all possible) available 
flights, then has to order them based your your preferences (costs, time, etc), 
before it returns a response.

I know how things like this are built because I built things just like it.

3) *NETWORK SENSITIVITY*

This brings up another interesting point.  Well, 2 things really.  When an HTTP 
request only reads the HTTP Session and all the garbage that users will stick 
in the (HTTP) Session.

First, Session attributes store objects, and not just any plain, simple 
(scalar/primitive) objects, but complex objects, and objects of objects, i.e 
object hierarchies.  A --> B --> C --> D ..., e.g. Customer -> Account(s) --> 
Orders --> Line Items --> Products, or StateHealtCareRecipient with Household, 
Address, Parents, Employment History, Income, Assets, having an 
EligibilityDecision, an Enrollment Record, Renewals, this, that.

One of the many benefits of Spring Session for Apache Geode's implementation is 
that the Session implementation's attribute "values" (i.e. the user objects) 
are routed through Geode's serialization logic/mechanics like everything else.

This has the added benefit that users can now take advantage of Geode' 
serialization functionality, implement {{DataSerializable}} / 
{{PdxSerializable}}, or alternatively implement and register custom 
{{DataSerializers}} /  {{PdxSerializers} for their own application domain 
object types, even implement {{Delta}}.

Imagine, if an HTTP Session is read, has no changes (i.e. {{Delta.hasDelta()}}) 
returns *false* and the Session is loaded up with 100s of Objects, containing 
Objects, containing still other Objects, and so on and the user's app make a 
call to {{Region.put(..)}} somewhere, either via SSDG or not.  It is possible 
for users, in their applications to acquire a handle to the SSDG 
{{SessionRepository}} to save Sessions independently (perhaps to implement 
"Attached" Sessions) or even the Region in which Session state is stored and 
call {{Region.put(..)}} directly.  What then?

I have taken steps in SSDG to 
[guard|https://github.com/jxblum/spring-session-data-geode/blob/master/spring-session-data-geode/src/main/java/org/springframework/session/data/gemfire/GemFireOperationsSessionRepository.java#L157-L159]
 against saving non-dirty Sessions.  But the argument of why do a cache op if 
the Session is not dirty in the first place is not a very good one!  The 
framework can only do so much to protect users from themselves.

Additionally, the "contract" is spelled out, above, in #1.  If the object type 
(e.g. Session, or other) implements {{Delta.hasDelta()}} and it returns 
*false*, there is NOTHING to send, the object is not dirty, not different, 
nothing.

Why send the entire object again, especially an object like an HTTP Session, 
which can contain MBs of data over a network.  

I agree that putting that much in an HTTP Session is wrong to begin with, that 
HTTP Session objects are mostly read-only and should be limited to what is 
stored in the Session in the first place, but it can happen, so it will happen 
and it seems to me that Geode could be smarter in this case when send such 
(potentially) complex objects over the wire.

I hope this makes sense.  I have both code references and examples/tests that 
can demonstrate each of the 3 cases noted above.

Regards,
[~jblum]

P.S. Sorry for the typos; I have not proof read this thoroughly yet.













> Entire object is serialized again (and again) when Delta.hasDelta() returns 
> false and client is using PROXY Region
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-6032
>                 URL: https://issues.apache.org/jira/browse/GEODE-6032
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>    Affects Versions: 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0, 1.5.0, 1.6.0, 
> 1.7.0
>            Reporter: John Blum
>            Assignee: Anilkumar Gingade
>            Priority: Critical
>              Labels: SmallFeature, pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, if a cache client application is...
> 1) configured with a client {{PROXY}} Region, and...
> 2) the application is also using {{DataSerialization}} to de/serialize 
> objects to/from the servers, and...
> 3) the application domain objects implement the {{org.apache.geode.Delta}} 
> interface then Apache Geode will incorrectly send the entire object (again) 
> when {{Delta.hasDelta}} returns *false*.
> It is understandable that the application domain object needs to be 
> serialized in its entirety the first time the object is sent to the server(s) 
> (or if the object is later, subsequently removed or has expired/been evicted, 
> and then needs to be re-added for whatever reason).
> However, once the server(s) know about the object, then only ever a "delta" 
> should be sent, and only when {{Delta.hasDelta()}} returns *true*.  
> Otherwise, if {{Delta.hasDelta()}} returns *false*, then most certainly, the 
> entire object should not be sent again, otherwise it is possible for the 
> application to enter a "_race condition_" scenario where the object gets 
> "overwritten", and as a result, the application can lose data (aka "_lost 
> updates_").
> If users were to change their client Region data management policy from 
> {{PROXY}} to {{CACHING_PROXY}} then this works as expected.  Apache Geode 
> will only send updates for an object it already knows about if there is 
> actually a "delta", otherwise Geode does nothing (that is, does not send the 
> entire object or any delta to the server(s) since there is technically 
> nothing to send).
> Obviously, in the {{CACHING_PROXY}} case, there is "local" state to compare 
> against, and therefore, Geode knows about the object already, in that it 
> "exists".  It can therefore assess the object to determine if it is the 
> same/unchanged, and not do anything in the case where {{Delta.hasDelta()}} 
> returns *false*, thus the "application" informing Geode there is nothing to 
> send.
> Clearly, in the {{PROXY}} case, there is no "local" state, and therefore, 
> Geode does not know whether the object (already) exists on the servers or 
> not.  So, if {{Delta.hasDelta()}} returns *false*, it is unsure whether the 
> objects exists or not and so decides just to send the entire object again, a 
> "_premature optimization_" to be sure, which now has sacrificed 
> "_correctness_", and has amplified the possible "_race conditions_" on the 
> application side.
> However, this is no different than if {{Delta.hasDelta()}} returns *true* and 
> the object is *not yet* known by the servers.  When the client sends just the 
> delta in this case, the server will send back to the client, "I don't know 
> anything about this object for which the delta needs to be applied", and 
> therefore, the client must turn around and send it the entire object anyway.
> So, in the {{PROXY}} case, it would be better if the client made a 
> determination about whether the object truly exists on the server side or 
> not, first, before arbitrarily and falsely assuming the entire object should 
> be sent again if the {{Delta.hasDelta()}} returns *false*.  The client simply 
> does not know and should "verify" before sending the object.
> Obviously, this affect will performance, but is a small price to pay (and the 
> "correct" thing to do) compared with "_lost updates_" and amplifying "_race 
> conditions_" on the client-side.
> There is also a situation where {{CACHING_PROXY}} client Regions can even 
> *fail*, and that is when {{copy-on-read}} is set to *true*.  Hopefully, this 
> is obvious why.
> To make matters worse, even the 
> [_Javadoc_|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html#hasDelta--]
>  explains and implies that only "_pending changes_" are written if they 
> exist...
> > "Returns true if this object has pending changes it can write out."
> Of course, this doc is less than clear and very ambiguous about what exactly 
> happens when ((hasDelta()) return *false*, which is "pertinent" information.  
> But, to be sure, it is certainly not consistent in behavior when different 
> data management policies are in effect, and most definitely not correct!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to