On Nov 22, 2013, at 4:52 PM, William Burns <[email protected]> wrote:
> You are correct that in the current way PFER works, a lock is obtained
> for only a brief moment while it sends the replication async, and the
> call returns right away. That is what my changes would change.
>
> Thinking about this more, I was thinking we could even remove all of
> the extra PFER logic in all of our interceptors and commands.
>
> Instead, PFER would do the following:
>
> 1. Suspend the current tx if applicable (already does this)
> 2. Send the putIfAbsent async using the async executor, just like the
> putIfAbsentAsync method (this would be different from
> FORCE_ASYNCHRONOUS, as that only affects remote invocations). Note
> this means the put would be done synchronously and would obtain all
> locks as normal, which would guarantee consistency between primary and
> backups.
> 3. Only pass along the following flags: IGNORE_PREVIOUS_VALUE,
> FAIL_SILENTLY and ZERO_LOCK_ACQUISITION_TIMEOUT. Note we wouldn't
> need the PUT_FOR_EXTERNAL_READ flag (not quite sure if we can get away
> with this in tx or not - would have to test it out)
> 4. Optionally return a Future object that the user could block on to
> confirm the PFER completed. Note there is no way to confirm whether it
> wrote a value or not, though, due to FAIL_SILENTLY.
>
> As Dan pointed out to me, this could slow down concurrency in REPL
> and DIST, since the replication is now synchronous and it will hold
> onto the lock longer. This seems fair to me, since you can at least
> still guarantee that your cache is consistent.
>
> Also, with this approach an end user can still do LOCAL-only PFER by
> setting the CACHE_MODE_LOCAL flag if needed.
>
> I hope this cleared it up a bit more. WDYT, or is there something I am
> overlooking?

From a 2LC use case perspective, to keep performance high, I'd continue
with the same strategy we have right now: fire and forget the PFER. So,
if there's an error in the PFER, it still assumes it's been "cached"
from a stats perspective, though the contents really tell the truth for
the application.

With the new API suggested above, I'd essentially forget about waiting
for the future, since that would have a massive impact on performance,
particularly for non-local modes.

Since these could compete with normal put operations (IOW, entity
updates), I'd go for LOCAL only. The performance might go down in
situations where there are a lot of PFER and put calls competing, but
that would at least mean that the cache is consistent. That's probably
the price to pay here.

@William, if you implement this, I can run it through a stress test I
have for Hibernate putFromLoad [2], which gives us some indication of
the performance impact.

Related to this, a while back I considered implementing [1]. In some
cases it might make sense to do something with the return value. From a
2LC perspective, I'd not wait to see what the boolean returns. I've got
a workaround in the code for this, so I don't need it immediately. So,
if we're gonna return a NotifiableFuture from now on, better to return
a NotifiableFuture<Boolean> for those that want to know whether the
PFER did actually put anything or not.
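To make sure we're all picturing the same thing, here's roughly how I
read the proposed flow. Just a sketch, not actual Infinispan internals:
the tm (a javax.transaction.TransactionManager) and asyncExecutor (a
java.util.concurrent.ExecutorService) wiring below is assumed, and the
flags are simply the ones named in step 3 above.

   public Future<Void> putForExternalRead(final K key, final V value)
         throws Exception {
      Transaction tx = tm == null ? null : tm.suspend(); // 1. suspend tx
      try {
         return asyncExecutor.submit(new Callable<Void>() { // 2. async
            public Void call() {
               // 3. a plain synchronous putIfAbsent on the executor
               // thread, holding locks as normal, with only these flags
               cache.getAdvancedCache()
                    .withFlags(Flag.IGNORE_PREVIOUS_VALUE,
                               Flag.FAIL_SILENTLY,
                               Flag.ZERO_LOCK_ACQUISITION_TIMEOUT)
                    .putIfAbsent(key, value);
               return null; // FAIL_SILENTLY: can't tell whether it wrote
            }
         }); // 4. the caller may block on this future, or ignore it
      } finally {
         if (tx != null)
            tm.resume(tx);
      }
   }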
Cheers,

[1] https://issues.jboss.org/browse/ISPN-1986
[2] https://github.com/hibernate/hibernate-orm/blob/master/hibernate-infinispan/src/test/java/org/hibernate/test/cache/infinispan/stress/PutFromLoadStressTestCase.java

>
> - Will
>
> On Fri, Nov 22, 2013 at 10:39 AM, Dan Berindei <[email protected]> wrote:
>> I think I need to clarify my earlier email a bit: the problem I'm
>> worried about is that we could have a thread execute a
>> putForExternal*Read*(k, v1), then a put(k, v2), then a remove(k), and
>> in the end leave k = v1 in the cache. (Without the remove(k) we'd be
>> ok, because PFER uses putIfAbsent() under the hood.)
>
> Fixed the little typo. I personally think that if you are using PFER
> you are accepting that these kinds of things can happen. With the
> Future returned they could also block on that and know for sure :)
>
>> This is quite different from the problem that Pedro raised, that of
>> different owners ending up with different values for the same key.
>> Will's suggestion to implement PFER as a regular put from a background
>> thread does fix that problem.
>>
>> Writing the value only locally, like Galder suggested, would also
>> address my concern, but at the expense of extra misses from the cache
>> - especially in DIST mode. Hence my proposal to not support PFER in
>> DIST mode at all.
>
> You could still get into weird issues with REPL this way, though.
> Since you could have a cluster with different values on each node,
> you would never know which node has which value. A new node could
> join, and some nodes would have the newer value, some the older, etc.
>
> I would assume (I am probably wrong, since people use things however
> they want) that when end users use PFER, they do so in a way that they
> only ever call PFER on that cache and then use size-based eviction to
> remove elements and free memory. In that case you don't have the
> consistency problems that Pedro mentioned :)
>
>> On Fri, Nov 22, 2013 at 3:45 PM, Dan Berindei <[email protected]>
>> wrote:
>>>
>>> That doesn't sound right, we don't keep any lock for the duration of
>>> the replication. In non-tx mode, we have to do an RPC to the primary
>>> owner before we acquire any lock. So there's nothing stopping the
>>> PFER from writing its value after a regular (sync) put when the put
>>> was initiated after the PFER.
>>>
>>> On Fri, Nov 22, 2013 at 2:49 PM, William Burns <[email protected]>
>>> wrote:
>>>>
>>>> I wonder if we are over-analyzing this. It seems the main issue is
>>>> that the replication is done asynchronously. Infinispan has many
>>>> ways to make something asynchronous. In my opinion we just chose the
>>>> wrong way. Wouldn't it be easier to change PFER so that, instead of
>>>> passing along the FORCE_ASYNCHRONOUS flag, we just have the
>>>> operation performed asynchronously using putIfAbsentAsync? This way
>>>> the lock is held for the duration of the replication and should be
>>>> consistent with other operations. Also, the user regains control
>>>> faster, as it doesn't even have to process the local interceptor
>>>> chain. We could also change the putForExternalRead method
>>>> declaration to return a NotifiableFuture<Void> or something, so they
>>>> know when the operation is completed (if they want).
>>>>
>>>> - Will
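For concreteness, the declaration change mentioned above would be
something like the sketch below, on the Cache<K, V> interface. The
thread spells it "NotifiableFuture"; presumably the NotifyingFuture
type that putAsync and friends already return is what's meant.

   // current declaration:
   void putForExternalRead(K key, V value);

   // proposed: return a future so callers can wait for completion
   NotifyingFuture<Void> putForExternalRead(K key, V value);

   // or, per the Boolean idea above - though with FAIL_SILENTLY the
   // boolean could not reliably say whether anything was written:
   NotifyingFuture<Boolean> putForExternalRead(K key, V value);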
>>>>
>>>> On Thu, Nov 21, 2013 at 9:54 AM, Dan Berindei <[email protected]>
>>>> wrote:
>>>>>
>>>>> On Thu, Nov 21, 2013 at 12:35 PM, Galder Zamarreño <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> On Nov 18, 2013, at 12:42 PM, Dan Berindei <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Mon, Nov 18, 2013 at 9:43 AM, Galder Zamarreño
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> On Nov 14, 2013, at 1:20 PM, Pedro Ruivo <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Simple question: shouldn't PFER ensure some consistency?
>>>>>>>>
>>>>>>>> I know that PFER is asynchronous, but (IMO) it can create
>>>>>>>> inconsistencies in the data. If the primary owner replicates the
>>>>>>>> PFER followed by a PUT for the same key (the PFER is sent async,
>>>>>>>> so the lock is released immediately), we have no way to be sure
>>>>>>>> whether the PFER is delivered before or after the PUT on all the
>>>>>>>> backup owners.
>>>>>>>>
>>>>>>>> comments?
>>>>>>>
>>>>>>> Assuming that PFER and PUT happen in the same thread, we normally
>>>>>>> rely on the JGroups ordering of events: send the first put without
>>>>>>> waiting for a response, then send the second. That should
>>>>>>> guarantee the order in which the puts are received on the other
>>>>>>> nodes, but beyond that, yeah, there's a risk that it could happen.
>>>>>>> PFER and PUT for a given key normally happen in the same thread in
>>>>>>> cache-heavy use cases such as Hibernate 2LC, but there's no
>>>>>>> guarantee.
>>>>>>>
>>>>>>> I don't think that's correct. If the cache is synchronous, the PUT
>>>>>>> will be sent as an OOB message, and as such it can be delivered on
>>>>>>> the target before the previous PFER command. That's regardless of
>>>>>>> whether the PFER command was sent as a regular or as an OOB
>>>>>>> message.
>>>>>>
>>>>>> ^ Hmmmm, that's definitely risky. I think we should make PFER local
>>>>>> only.
>>>>>>
>>>>>> The fact that PFER is asynchronous is a nice-to-have. IOW, if you
>>>>>> read a value from a database and you want to store it in the cache
>>>>>> for later reads, the fact that it's replicated asynchronously is
>>>>>> just so that other nodes can take advantage of the value being in
>>>>>> the cache. Since it's asynchronous, some nodes could fail to apply
>>>>>> it, but that's fine, since you can go to the database and
>>>>>> re-retrieve it from there. So, making PFER local-only would be the
>>>>>> degenerate case, where all nodes fail to apply it except the local
>>>>>> node, which is fine. This is better than having the reordering
>>>>>> above.
>>>>>>
>>>>>> In a chat I had with Dan, he pointed out that having PFER local-only
>>>>>> would be problematic for DIST mode w/ L1 enabled, since the local
>>>>>> write would not invalidate other nodes, but this is fine because
>>>>>> PFER only really makes sense for situations where Infinispan is
>>>>>> used as a cache. So, if the data is in the DB, you might as well go
>>>>>> there (1 network trip), as opposed to asking the other nodes for
>>>>>> the data and then the database in the worst case (2 network trips).
>>>>>>
>>>>>> PFER is really designed for replication or invalidation use cases,
>>>>>> which are precisely the ones configured for Hibernate 2LC.
>>>>>>
>>>>>> Thoughts?
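As an aside, callers can already force this local-only behaviour per
invocation today. A minimal sketch using the standard flags API, where
cache, key and value are placeholders:

   // CACHE_MODE_LOCAL keeps the write on this node only, so there is
   // no replication, and hence no reordering to race with
   cache.getAdvancedCache()
        .withFlags(Flag.CACHE_MODE_LOCAL)
        .putForExternalRead(key, value);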
>>>>>
>>>>> +1 to make PFER local-only in replicated caches, but I now think we
>>>>> should go all the way and disallow PFER completely in dist caches.
>>>>>
>>>>> I still think having L1 enabled would be a problem, because a
>>>>> regular put() won't invalidate the entry on all the nodes that did a
>>>>> PFER for that key (there are no requestors, and even if we assume
>>>>> that we do a remote get before the PFER, we'd still have race
>>>>> conditions).
>>>>>
>>>>> With L1 disabled, we have the problem that you mentioned: we're
>>>>> trying to read the value from the proper owners, but we never write
>>>>> it to the proper owners, so the hit ratio will be pretty bad. Using
>>>>> the SKIP_REMOTE_LOOKUP flag on reads, we'll avoid the extra RPC in
>>>>> Infinispan, but that will make the hit ratio even worse. E.g. in a
>>>>> 4-node cluster with numOwners=2, the hit ratio will never go above
>>>>> 50%.
>>>>>
>>>>> I don't think anyone would use a cache knowing that its hit ratio
>>>>> can never get above 50%, so we should just save ourselves some
>>>>> effort and stop supporting PFER in DIST mode.
>>>>>
>>>>> Cheers
>>>>> Dan

--
Galder Zamarreño
[email protected]
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
