[ 
https://issues.apache.org/jira/browse/GEODE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710574#comment-16710574
 ] 

John Blum edited comment on GEODE-6032 at 12/5/18 8:05 PM:
-----------------------------------------------------------

[~agingade] - I stand corrected on the difference between {{PROXY}} and 
{{CACHING_PROXY}}.  It is *true* that, even for a {{CACHING_PROXY}}, the 
{{DataSerializer.toData(..)}} method is called anytime the entity's 
{{hasDelta()}} method returns *false*, as verified from my test.

However, please do NOT close this ticket.  I see this as an "*enhancement*" 
rather than to simply ustify Apache Geode's unintuitive, broken behavior, 
especially since this scenario and related UCs (e.g. Session state caching) can 
and will lead to *lost updates* (i.e. lost data due to *race conditions* in an 
application).

I fail to see why Apache Geode needs to send anything at all if the application 
entities communicate to Geode, via the {{org.apache.geode.Delta}} interface, 
which is a interface implemented by application developers for application 
entity classes, that nothing has changed using the `hasDelta()` method.

I see Geode's behavior to send the entire object again as a reckless and 
premature performance optimization.



was (Author: jblum):
[~agingade] - I stand corrected on the difference between {{PROXY}} and 
{{CACHING_PROXY}}.  It is *true* that, even for a {{CACHING_PROXY}}, the 
{{DataSerializer.toData(..)}} method is called anytime the entity's 
{{hasDelta()}} method returns *false*, as verified from my test.

However, please do NOT close this ticket.  I see this as an "*enhancement*" 
rather than to simply ustify Apache Geode's unintuitive, broken behavior, 
especially since this scenario and related UCs (e.g. Session state caching) can 
and will lead to *lost updates* (i.e. lost data due to *race conditions* in an 
application).

I fail to see why Apache Geode needs to send anything at all if the application 
entities communicate to Geode, via the application `o.a.g.Delta` interface that 
nothing has changed using the `hasDelta()` method.

I see Geode's behavior to send the entire object again as a reckless and 
premature performance optimization.


> Entire object is serialized again (and again) when Delta.hasDelta() returns 
> false and client is using PROXY Region
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-6032
>                 URL: https://issues.apache.org/jira/browse/GEODE-6032
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>    Affects Versions: 1.1.0, 1.1.1, 1.2.0, 1.3.0, 1.2.1, 1.4.0, 1.5.0, 1.6.0, 
> 1.7.0
>            Reporter: John Blum
>            Assignee: Anilkumar Gingade
>            Priority: Critical
>              Labels: SmallFeature, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if a cache client application is...
> 1) configured with a client {{PROXY}} Region, and...
> 2) the application is also using {{DataSerialization}} to de/serialize 
> objects to/from the servers, and...
> 3) the application domain objects implement the {{org.apache.geode.Delta}} 
> interface then Apache Geode will incorrectly send the entire object (again) 
> when {{Delta.hasDelta}} returns *false*.
> It is understandable that the application domain object needs to be 
> serialized in its entirety the first time the object is sent to the server(s) 
> (or if the object is later, subsequently removed or has expired/been evicted, 
> and then needs to be re-added for whatever reason).
> However, once the server(s) know about the object, then only ever a "delta" 
> should be sent, and only when {{Delta.hasDelta()}} returns *true*.  
> Otherwise, if {{Delta.hasDelta()}} returns *false*, then most certainly, the 
> entire object should not be sent again, otherwise it is possible for the 
> application to enter a "_race condition_" scenario where the object gets 
> "overwritten", and as a result, the application can lose data (aka "_lost 
> updates_").
> If users were to change their client Region data management policy from 
> {{PROXY}} to {{CACHING_PROXY}} then this works as expected.  Apache Geode 
> will only send updates for an object it already knows about if there is 
> actually a "delta", otherwise Geode does nothing (that is, does not send the 
> entire object or any delta to the server(s) since there is technically 
> nothing to send).
> Obviously, in the {{CACHING_PROXY}} case, there is "local" state to compare 
> against, and therefore, Geode knows about the object already, in that it 
> "exists".  It can therefore assess the object to determine if it is the 
> same/unchanged, and not do anything in the case where {{Delta.hasDelta()}} 
> returns *false*, thus the "application" informing Geode there is nothing to 
> send.
> Clearly, in the {{PROXY}} case, there is no "local" state, and therefore, 
> Geode does not know whether the object (already) exists on the servers or 
> not.  So, if {{Delta.hasDelta()}} returns *false*, it is unsure whether the 
> objects exists or not and so decides just to send the entire object again, a 
> "_premature optimization_" to be sure, which now has sacrificed 
> "_correctness_", and has amplified the possible "_race conditions_" on the 
> application side.
> However, this is no different than if {{Delta.hasDelta()}} returns *true* and 
> the object is *not yet* known by the servers.  When the client sends just the 
> delta in this case, the server will send back to the client, "I don't know 
> anything about this object for which the delta needs to be applied", and 
> therefore, the client must turn around and send it the entire object anyway.
> So, in the {{PROXY}} case, it would be better if the client made a 
> determination about whether the object truly exists on the server side or 
> not, first, before arbitrarily and falsely assuming the entire object should 
> be sent again if the {{Delta.hasDelta()}} returns *false*.  The client simply 
> does not know and should "verify" before sending the object.
> Obviously, this affect will performance, but is a small price to pay (and the 
> "correct" thing to do) compared with "_lost updates_" and amplifying "_race 
> conditions_" on the client-side.
> There is also a situation where {{CACHING_PROXY}} client Regions can even 
> *fail*, and that is when {{copy-on-read}} is set to *true*.  Hopefully, this 
> is obvious why.
> To make matters worse, even the 
> [_Javadoc_|http://geode.apache.org/releases/latest/javadoc/org/apache/geode/Delta.html#hasDelta--]
>  explains and implies that only "_pending changes_" are written if they 
> exist...
> > "Returns true if this object has pending changes it can write out."
> Of course, this doc is less than clear and very ambiguous about what exactly 
> happens when ((hasDelta()) return *false*, which is "pertinent" information.  
> But, to be sure, it is certainly not consistent in behavior when different 
> data management policies are in effect, and most definitely not correct!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to