Patrick - What you said was music to my ears. I am very happy to see that you are interested in cleaning this up - it gives me great hope for future versions of OpenJPA.
It is not easy to discuss this in emails - maybe we will need a way to have a conversation about this some day.

In a nutshell - after having looked quite a bit into the relevant code - it seems to me that a good start would be to simplify the interface between OpenJPA and the L2 cache. This would basically mean extracting that new interface from the current code and putting it into a DataCachePlugIn interface. The current DataCache interface is filled with methods that should not be there for what is just a plug-in. Also, the data cache manager should be what is defined from the OpenJPA perspective - i.e. lifecycle control, etc. The data cache and the query cache should have no lifecycle API at all; let the DataCacheManager manage its caches. I think it will be easier if OpenJPA only talks to the cache manager to get cache references and then calls a very small set of methods on those caches.

There are also a few minor changes to make as well. Just to give a small example I came across recently: we need a public method to get the timeout value from the DataCachePCData. Right now the value is converted to an expiration value. The issue there is that timeout is a cache concept - therefore it should be managed by the cache. Instead, as the code is currently written, this concept is managed by OpenJPA in certain methods via isTimedOut() - and if the object is timed out then the result is ignored/removed. The problem is that this design gives you only half of the benefit of a timeout in a cache - that is, getting rid of old values. It does not take care of the hardware resource issue - that is to say, the object should actually be gone and not use any more resources. This could easily be changed by separating the two different concepts, O/R mapping and caching.

I think you are right about having to write some duplicate code for different plug-ins. But personally I would not worry too much about that - here is why I am saying this:
- First, there are not that many valuable cache solutions out there, so there would not be that many plug-ins. Coherence is a good one - you need one for the OpenJPA cache as well, of course - and I guess only a few others.
- Second, sometimes trying to fit different caches into one code base can generate substandard (performance and quality) code. By that I mean we would be forced down to the lowest common denominator between the various products.
- Third, at this point I really believe that the interface between OpenJPA and a cache can be very simple. If I look at what the commit method really needs, it is not much.

Frederic

PS: About having references to a third-party vendor's API inside a plug-in - I have no idea how this is handled for other open source projects. I guess I am leaving this question for others to answer.

"Patrick Linskey" <[EMAIL PROTECTED]>
To: dev@openjpa.apache.org
04/04/2008 05:29 PM
Subject: Re: [jira] Created: (OPENJPA-556) Data cache infrastructure should take advantage of batching

Yeah, it'd definitely be nice to formalize the cache pluggability a bunch. We haven't done it largely because it's really, really hard to get pluggability APIs right for just one use case, so in the absence of doing a bunch of plugins, you're unlikely to get it right.

To what extent is the code that you've written something that could be incorporated into OpenJPA as a more-pluggable infrastructure for caching?
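(A minimal sketch, in Java, of the commit-only plug-in facade Frederic describes above. The names below - DataCachePlugin, CacheEntry, DataCachePluginManager - and their method signatures are illustrative assumptions only, not the existing OpenJPA DataCache or DataCacheManager API.)

    import java.util.Collection;
    import java.util.Map;

    /** Data the O/R mapper hands to the cache at commit; expiry policy stays inside the cache. */
    interface CacheEntry {
        Object getOid();            // object id / cache key
        Object getVersion();        // version used for optimistic checks
        long getTimeoutMillis();    // a hint only; the cache owns expiration and eviction
    }

    /** The narrow facade the O/R mapper would call - essentially commit plus batched reads. */
    interface DataCachePlugin {
        /** Apply the outcome of a committed transaction in a single call. */
        void commit(Collection<CacheEntry> additions,
                    Collection<CacheEntry> updates,
                    Collection<Object> deletedOids);

        /** Batched read, so clustered caches can avoid one network hop per key. */
        Map<Object, CacheEntry> getAll(Collection<Object> oids);
    }

    /** Lifecycle lives on the manager, not on the individual caches. */
    interface DataCachePluginManager {
        DataCachePlugin getDataCache(String name);
        void start();
        void close();
    }

Under a design along these lines, OpenJPA would only ask the manager for a cache reference and call a handful of methods on it, and timeout/expiry would be enforced entirely inside the cache rather than through isTimedOut() checks on the mapper side.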
As I mentioned in a different thread, the lack of license-unencumbered Coherence jars to link against is an issue for pulling in all the code you've worked on, but it sounds like what you've done might be well-abstracted anyhow.

The other thing that's always bothered me about OpenJPA's cache pluggability story is that you've gotta duplicate work for the query cache and the data cache -- this seems unnecessary.

-Patrick

On Fri, Apr 4, 2008 at 5:25 PM, <[EMAIL PROTECTED]> wrote:
> Alleluia.
>
> I am running Coherence clusters. OpenJPA needs to do the right thing when
> it comes to interacting with a cache. Otherwise the whole performance aspect
> goes out the window - which is our reason to use a cache in the first place.
>
> One small suggestion about OpenJPA and caching in general. I think it would
> be easier if OpenJPA the O/R mapping tool and the cache implementation code
> were separated. Create a new OpenCache open source project and put the code
> in there.
> The reason I say this is because OpenJPA is quite coupled with its own
> cache implementation at the moment. I would be quite happy to go into more
> detail on that one if people are interested. Separating the two projects
> might force the devs to create simpler interfaces.
>
> Separate the DataCache interface into two - one with the commit method
> (required for OpenJPA to access the cache) - and create another interface
> with whatever methods you want in there for your cache product.
> In fact OpenJPA should only need the commit method on a cache facade
> (interface) to do its work properly.
> All the other methods in DataCache should not be needed by OpenJPA the O/R
> mapping tool.
>
> Just a thought...
>
> Frederic
>
> "Daniel Lee" <[EMAIL PROTECTED]>
> To: dev@openjpa.apache.org
> 04/04/2008 03:08 PM
> Subject: Re: [jira] Created: (OPENJPA-556) Data cache infrastructure should take advantage of batching
>
> It is still a good practice because third-party data cache plugins
> that support a special getAll() API to batch the get calls may benefit a
> lot from the implementation, especially in a cluster environment.
>
> On Fri, Apr 4, 2008 at 2:18 PM, Patrick Linskey <[EMAIL PROTECTED]> wrote:
> > The latest snapshot should have this behavior now.
> >
> > > It won't gain much for the OpenJPA "native"
> > > data cache since it is delegated to AbstractDataCache.getAll() (ln. 449),
> > > which loops through the list to get the objects from the cache. However, it
> > > saves the traffic, if any.
> >
> > Also, the built-in data cache is local-only, with remote invalidations
> > handled via the RemoteCommitProvider. So the difference between a
> > series of get() calls vs. batching is negligible anyways.
> >
> > -Patrick
> >
> > On Fri, Apr 4, 2008 at 1:01 PM, Daniel Lee <[EMAIL PROTECTED]> wrote:
> > > That's right - the code here (transformToVersionSafePCDatas) should call
> > > cache.containsAll() or cache.getAll(). That way, it will save a lot for any
> > > data cache that provides getAll(). It won't gain much for the OpenJPA "native"
> > > data cache since it is delegated to AbstractDataCache.getAll() (ln. 449),
> > > which loops through the list to get the objects from the cache. However, it
> > > saves the traffic, if any.
> > > On Thu, Apr 3, 2008 at 11:51 PM, Patrick Linskey (JIRA) <[EMAIL PROTECTED]> wrote:
> > > >
> > > > Data cache infrastructure should take advantage of batching
> > > > -----------------------------------------------------------
> > > >
> > > >                 Key: OPENJPA-556
> > > >                 URL: https://issues.apache.org/jira/browse/OPENJPA-556
> > > >             Project: OpenJPA
> > > >          Issue Type: Improvement
> > > >            Reporter: Patrick Linskey
> > > >
> > > > From the newsgroup:
> > > >
> > > > "The DataCacheStoreManager.transformToVersionSafePCDatas() line 261. This
> > > > method should call either cache.containsAll() or cache.getAll(). The
> > > > current implementation makes one call to the cache for each element in the
> > > > collection."
> > > >
> > > > --
> > > > This message is automatically generated by JIRA.
> > > > -
> > > > You can reply to this email to add a comment to the issue online.
> >
> > --
> > Patrick Linskey
> > 202 669 5907

--
Patrick Linskey
202 669 5907
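(For reference, a rough Java sketch of the batching change discussed in OPENJPA-556. The Cache interface and lookUpAll() method below are illustrative stand-ins, not the actual DataCacheStoreManager.transformToVersionSafePCDatas() code or signature.)

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    class BatchedCacheLookupSketch {

        /** Stand-in for a data cache that offers a bulk read. */
        interface Cache {
            Map<Object, Object> getAll(List<Object> oids);  // one round trip instead of one per key
        }

        /**
         * Instead of calling cache.get(oid) once per element (one network hop each
         * in a clustered cache), issue a single getAll() and walk the result locally.
         */
        List<Object> lookUpAll(Cache cache, List<Object> oids) {
            Map<Object, Object> hits = cache.getAll(oids);      // single batched call
            List<Object> found = new ArrayList<Object>(oids.size());
            for (Object oid : oids) {
                Object data = hits.get(oid);
                if (data != null)
                    found.add(data);                             // cache hit; misses are skipped
            }
            return found;
        }
    }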