Re: Entity Caching

Scott Gray Fri, 20 Mar 2015 00:48:13 -0700

Yes, ehcache supports transactions and would ideally be what we use for
caching.  I started work on it and there's a branch in svn for it, but I
haven't had time to continue since December.  Unfortunately there were a
few incompatibilities between the existing OFBiz cache API and the ehcache
API which need to be reconciled before it would be possible to run the two
against the same API and compare them.


I'll restate my opinion: I don't think the OFBiz cache's lack of
transactional awareness is an "edge case".  I think if you try to cache
everything you'll soon encounter strange behavior that will be very
difficult to reproduce and debug.  My preference is to cache only data
that is read often and updated rarely.
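
To make the concern concrete, here is a minimal sketch of the
update / re-read / rollback sequence discussed further down the thread.
It is a toy model, not OFBiz or ehcache code: the class
EntityCacheSketch, its Transaction type, the deferCachePuts flag and the
order id "10000" are all made up for illustration.  With the flag off,
a read inside the transaction publishes the uncommitted row to the
shared cache.  With it on, cache puts are buffered per transaction and
only published on commit, which is roughly what I'd expect from a
transaction-aware cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a shared entity cache in front of a database (all names hypothetical). */
class EntityCacheSketch {
    final Map<String, String> database = new HashMap<>();    // committed rows only
    final Map<String, String> sharedCache = new HashMap<>(); // visible to every transaction
    final boolean deferCachePuts;                            // true = transaction-aware variant

    EntityCacheSketch(boolean deferCachePuts) {
        this.deferCachePuts = deferCachePuts;
    }

    Transaction begin() {
        return new Transaction();
    }

    class Transaction {
        private final Map<String, String> pendingWrites = new HashMap<>();
        private final Map<String, String> pendingCachePuts = new HashMap<>();

        /** Writes a row inside this transaction and clears the related cache entries. */
        void update(String key, String value) {
            pendingWrites.put(key, value);
            pendingCachePuts.remove(key);
            sharedCache.remove(key);
        }

        /** Cache-first read; a miss falls through to the "database". */
        String find(String key) {
            if (deferCachePuts && pendingWrites.containsKey(key)) {
                return pendingWrites.get(key); // own uncommitted write, never shared
            }
            String cached = pendingCachePuts.containsKey(key)
                    ? pendingCachePuts.get(key) : sharedCache.get(key);
            if (cached != null) {
                return cached;
            }
            // The database shows a transaction its own uncommitted writes.
            String value = pendingWrites.containsKey(key)
                    ? pendingWrites.get(key) : database.get(key);
            if (value != null) {
                if (deferCachePuts) {
                    pendingCachePuts.put(key, value); // published on commit only
                } else {
                    sharedCache.put(key, value);      // immediately visible to everyone
                }
            }
            return value;
        }

        void commit() {
            database.putAll(pendingWrites);
            pendingWrites.keySet().forEach(sharedCache::remove);
            sharedCache.putAll(pendingCachePuts);
            rollback(); // only clears the per-transaction buffers
        }

        void rollback() {
            pendingWrites.clear();
            pendingCachePuts.clear();
        }
    }

    /** Update, re-read, roll back, then return what a second transaction reads. */
    static String readAfterRollback(boolean deferCachePuts) {
        EntityCacheSketch store = new EntityCacheSketch(deferCachePuts);
        store.database.put("10000", "ORDER_CREATED");
        Transaction tx1 = store.begin();
        tx1.update("10000", "ORDER_APPROVED");
        tx1.find("10000"); // e.g. a SECA re-reading the OrderHeader by orderId
        tx1.rollback();
        return store.begin().find("10000");
    }

    public static void main(String[] args) {
        System.out.println("naive cache:    " + readAfterRollback(false));
        System.out.println("deferred cache: " + readAfterRollback(true));
    }
}
```

In this model the naive variant hands the second transaction
ORDER_APPROVED even though that change was rolled back, while the
deferred variant returns ORDER_CREATED.  A real implementation would
also have to deal with concurrency, eviction and distributed cache
clearing, none of which is modelled here.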

On 20 March 2015 at 17:44, Ron Wheeler <[email protected]>
wrote:

>
> Isn't this the kind of issue that something like ehcache handles?
> It seems to know the difference between a committed transaction and a
> transaction which is in progress and might be rolled back.
>
> Certainly a relational database with transaction support is not going to
> allow a process to access data from other processes unless the transaction
> is completed.
> The cache needs to know the difference between private data (incomplete
> transactions) and public data (data previously committed and not in the
> process of being changed) and prevent others from using private data from
> the cache.
>
> On the bright side, an SOA does make this much more of an edge case at the
> expense of moving transaction rollback higher up the application logic.
>
> Ron
>
>
>
> On 19/03/2015 4:55 PM, Adrian Crum wrote:
>
>> I understand. Yes, that could occur.
>>
>> But I still believe it is an edge case. ;)
>>
>> Adrian Crum
>> Sandglass Software
>> www.sandglass-software.com
>>
>> On 3/19/2015 8:37 PM, Scott Gray wrote:
>>
>>> You're missing a step that actually causes the issue: prior to the
>>> rollback in 5b, some code within the same transaction retrieves the
>>> modified row from the database again, which puts the modified row in
>>> the cache and makes the change visible to other transactions even
>>> though it hasn't yet been committed.
>>>
>>> Because of our service oriented architecture this scenario isn't
>>> uncommon. An example is updating an OrderHeader's statusId, which can
>>> trigger a number of SECAs which in turn are likely to retrieve the
>>> OrderHeader row after being passed only the orderId. If a rollback
>>> occurred in one of those services, the modified row would remain in
>>> the cache even though the changes were never committed.
>>>
>>> On 20 Mar 2015 00:06, "Adrian Crum" <[email protected]>
>>> wrote:
>>>
>>>> Okay, let's assume processes cannot "see" changes made by another
>>>> transaction until that transaction is committed. Here is how the current
>>>> entity cache works:
>>>>
>>>> 1. A Delegator find method is invoked. The Delegator checks the cache,
>>>> and the SQL SELECT result does not exist in the cache.
>>>> 2. The Delegator executes the SQL SELECT and puts the results in the
>>>> entity cache.
>>>> 3. The SQL SELECT results are returned to the calling process.
>>>> 4. The calling process modifies one of the values (rows) in the SQL
>>>> SELECT result (after cloning the immutable entity value).
>>>> 5a. Something goes wrong and the calling process rolls back the
>>>> transaction before the cloned value is persisted.
>>>> 5b. Something goes wrong and the calling process rolls back the
>>>> transaction after the cloned value is persisted and all related caches
>>>> have been cleared.
>>>> 6. Another process performs the same query as #1.
>>>> 7. The second process gets the results from the cache. The values from
>>>> the cache have not changed because the cloned & modified value (in #4)
>>>> was not put in the cache, nor was it written to the data source.
>>>>
>>>> From my perspective, the scenario you described can only happen if
>>>> another process can see changes that are made in the data source before
>>>> the transaction is committed.
>>>>
>>>> From your perspective, the entity cache is somehow inserting invalid
>>>> values when a transaction is rolled back.
>>>>
>>>> Adrian Crum
>>>> Sandglass Software
>>>> www.sandglass-software.com
>>>>
>>>> On 3/19/2015 10:41 AM, Scott Gray wrote:
>>>>
>>>>> I'm sorry but I'm not following what you're proposing.  Currently row
>>>>> changes caused within a transaction are available only to queries
>>>>> issued within that same transaction (i.e. read committed), except that
>>>>> the cache breaks this isolation by making them immediately available
>>>>> to any transaction querying that entity.  I don't see how this scenario
>>>>> exists outside of the cache unless the logic within the transaction
>>>>> explicitly passes a row off to another transaction, and I'm not aware
>>>>> of any cases like that.
>>>>>
>>>>> On Thu, Mar 19, 2015 at 3:17 AM, Adrian Crum <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> I call it an edge case because it is easily fixed by changing the
>>>>>> transaction isolation level.
>>>>>>
>>>>>> The behavior you describe is not caused by the entity cache, but by
>>>>>> the transaction isolation level. The same scenario would exist without
>>>>>> the entity cache - where two processes hold a reference to the updated
>>>>>> row, and one process performs a rollback.
>>>>>>
>>>>>> Adrian Crum
>>>>>> Sandglass Software
>>>>>> www.sandglass-software.com
>>>>>>
>>>>>> On 3/19/2015 7:28 AM, Scott Gray wrote:
>>>>>>
>>>>>>> Ah, it's quite a large edge case IMO
>>>>>>>
>>>>>>> On Thu, Mar 19, 2015 at 12:20 AM, Adrian Crum <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> That is the edge case I mentioned.
>>>>>>>>
>>>>>>>> Adrian Crum
>>>>>>>> Sandglass Software
>>>>>>>> www.sandglass-software.com
>>>>>>>>
>>>>>>>> On 3/19/2015 6:54 AM, Scott Gray wrote:
>>>>>>>>
>>>>>>>>> I tend to disagree with the "cache everything" approach because the
>>>>>>>>> cache isn't transaction aware.
>>>>>>>>> If you:
>>>>>>>>> 1. update a record
>>>>>>>>> 2. select that same record
>>>>>>>>> 3. encounter a transaction rollback
>>>>>>>>>
>>>>>>>>> Then the cache will still contain the changes that were rolled back.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Scott
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 18, 2015 at 5:16 AM, Adrian Crum <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I would like to share some insights into the entity cache feature,
>>>>>>>>>> some best practices I like to follow, and some related information.
>>>>>>>>>>
>>>>>>>>>> Some OFBiz experts may disagree with some of my views, and that is
>>>>>>>>>> okay. Different experiences with OFBiz will lead to different
>>>>>>>>>> viewpoints.
>>>>>>>>>>
>>>>>>>>>> The OFBiz entity caching feature is intended to improve performance
>>>>>>>>>> by keeping GenericValue instances in memory - decreasing the number
>>>>>>>>>> of calls to the database.
>>>>>>>>>>
>>>>>>>>>> Background
>>>>>>>>>> ----------
>>>>>>>>>>
>>>>>>>>>> Initially, the entity cache was very unreliable due to a number of
>>>>>>>>>> flaws in its design and in the code that calls it (it was guaranteed
>>>>>>>>>> to produce stale data). As a result, I personally avoided using the
>>>>>>>>>> entity cache feature.
>>>>>>>>>>
>>>>>>>>>> Some time ago, Adam Heath did a lot of work on the entity cache.
>>>>>>>>>> After that, Jacopo and I did a lot of work fixing stale data issues
>>>>>>>>>> in the entity cache. Today, the entity cache is much improved and
>>>>>>>>>> unit tests ensure it produces the correct data (except for one edge
>>>>>>>>>> case that Jacopo has identified).
>>>>>>>>>>
>>>>>>>>>> I mention all of this because the previous quirky behavior led to
>>>>>>>>>> some "best practices" that didn't make much sense. A search through
>>>>>>>>>> the OFBiz mail archives will produce a mountain of conflicting and
>>>>>>>>>> confusing information.
>>>>>>>>>>
>>>>>>>>>> Today
>>>>>>>>>> -----
>>>>>>>>>>
>>>>>>>>>> Since the current entity cache is reliable, there is no reason NOT
>>>>>>>>>> to use it. My preference is to make ALL Delegator calls use the
>>>>>>>>>> cache. If all code uses the cache, then individual entities can have
>>>>>>>>>> their caching characteristics configured outside of code. This
>>>>>>>>>> enables sysadmins to fine-tune entity caches for best performance.
>>>>>>>>>>
>>>>>>>>>> [Some experts might disagree with this approach because the entity
>>>>>>>>>> cache will consume all available memory. But the idea is to
>>>>>>>>>> configure the cache so that doesn't happen.]
>>>>>>>>>>
>>>>>>>>>> If you code Delegator calls to avoid the cache, then there is no way
>>>>>>>>>> for a sysadmin to configure the caching behavior - that bit of code
>>>>>>>>>> will ALWAYS make a database call.
>>>>>>>>>>
>>>>>>>>>> If you make all Delegator calls use the cache, then there is an
>>>>>>>>>> additional complication that will add a bit more code: the
>>>>>>>>>> GenericValue instances retrieved from the cache are immutable - if
>>>>>>>>>> you want to modify them, then you will have to clone them. So, this
>>>>>>>>>> approach can produce an additional line of code.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Adrian Crum
>>>>>>>>>> Sandglass Software
>>>>>>>>>> www.sandglass-software.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>
> --
> Ron Wheeler
> President
> Artifact Software Inc
> email: [email protected]
> skype: ronaldmwheeler
> phone: 866-970-2435, ext 102
>
>
