Re: CMP2 on G2 - Delayed Database Flush

Rick McGuire Mon, 14 Apr 2008 06:08:55 -0700

Dain Sundstrom wrote:

On Apr 7, 2008, at 7:06 AM, Rick McGuire wrote:
Dain Sundstrom wrote:
I've been sucked into another project and haven't been paying much attention to the lists...
The problem is we flush before returning the created object to the caller. We do this because database-generated fields are not filled in until the flush statement, which means the primary key is not guaranteed to be available until flush. The current code requires the primary key to create the cmp proxy we return to the caller. The code will have to be changed to allow for late primary key resolution, either when the code calls getPrimaryKey or at the end of the transaction.
I don't have the time to look at this, but I can help you if you want to work on it.
I've started poking around in the code trying to understand what needs to change. Is the JpaCmpEngine.createBean() method where the flushing takes place? It appears at that point in time that the primaryKey is used for 1) creating the ThreadContext instance, 2) for storing the bean in the transaction cache, and 3) for creating the ProxyInfo instance. Am I looking in the correct location for this?
Yes.
The ThreadContext primary key bit looks easily changed to a lazy resolution, and probably the ProxyInfo as well, but the transaction cache does not appear to be as easily changed, since the primary key is the main lookup method for the transaction cache. I guess the transaction cache step could be bypassed until the primary key is actually generated, but I'm concerned that this could result in some resolution failures where an object would be expected to be located in the cache.
The transaction cache was introduced as a workaround for the new-delete-new bug in OpenJPA (see JpaTestObject.newDeleteNew()). If you create, remove and recreate a bean with the same pk, OpenJPA internally leaves the pk as "deleted", so calls to find(Class,Object) result in a null. We work around this by using a private cache to track the objects created during the transaction.
To implement delayed flush, you will have to add another way to track the JPA instance object (since we won't have the pk to "find" the object in the entity manager). When the pk is not available, you use the new, alternate method to find the object, and when the pk is finally resolved, you would add it to the transaction cache.

I've not come up with any clever way of implementing the cache so far, other than just keeping a list of objects whose primary keys have not been calculated, and then, if all other lookups fail, start resolving the primary keys looking for the given target. Not elegant, but I think this will work.

I do wonder if another approach might work better. If I understand the reasoning behind the flush, it is necessary because it's possible that some of the information needed to calculate the primary key only becomes available after the JPA flush()/merge() sequence. I suspect for many objects, this is not needed because a simple primary key is used. Would it be feasible to detect the situation where a flush is needed to "crystallize" the object to calculate the primary key? This way, simple object instances where the primary key is provided in the create() operation would not experience the performance hit.
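
To make the "list of unresolved objects" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the class name PendingKeyCache, its methods, and the keyResolver hook (which stands in for whatever would actually flush and read the primary key) are assumptions for illustration, not existing OpenEJB code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a transaction-scoped cache with a fallback list of
// beans whose primary keys have not been resolved yet.
class PendingKeyCache<B> {
    private final Map<Object, B> byKey = new HashMap<>();
    private final List<B> pending = new ArrayList<>();
    // Assumed hook: returns the pk of a bean (in real code this may force a flush).
    private final Function<B, Object> keyResolver;

    PendingKeyCache(Function<B, Object> keyResolver) {
        this.keyResolver = keyResolver;
    }

    // Used at create time when the pk is already known.
    void add(Object key, B bean) {
        byKey.put(key, bean);
    }

    // Used at create time when the pk is not yet available.
    void addPending(B bean) {
        pending.add(bean);
    }

    B find(Object key) {
        B hit = byKey.get(key);
        if (hit != null) {
            return hit;
        }
        // All other lookups failed: resolve pending keys one at a time,
        // moving each bean into the keyed map, until the target turns up.
        Iterator<B> it = pending.iterator();
        while (it.hasNext()) {
            B bean = it.next();
            Object resolved = keyResolver.apply(bean);
            it.remove();
            byKey.put(resolved, bean);
            if (resolved.equals(key)) {
                return bean;
            }
        }
        return null;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

A lookup that hits the keyed map never touches the resolver, so beans created with an explicit primary key would not pay for a flush in this sketch.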


Rick

Off the top of my head, it may be possible to use a stand-in pk object which wraps the JPA object itself (using identity based hashcode and equals) until the real pk is resolved. This pk object would then be the alternate tx cache.
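
A stand-in key of that kind could look roughly like the following. This is only a sketch: StandInKey is a made-up name, and how it would plug into the existing transaction cache is not shown.

```java
// Hypothetical stand-in primary key: wraps the JPA instance and compares by
// object identity until the real pk is known.
final class StandInKey {
    private final Object entity;

    StandInKey(Object entity) {
        this.entity = entity;
    }

    Object entity() {
        return entity;
    }

    @Override
    public boolean equals(Object other) {
        // Two stand-in keys match only if they wrap the very same instance.
        return other instanceof StandInKey && ((StandInKey) other).entity == entity;
    }

    @Override
    public int hashCode() {
        // Identity hash, so the wrapped object's own hashCode() is never called.
        return System.identityHashCode(entity);
    }
}
```

Because equality is by identity, two distinct instances that happen to have equal field values still get distinct keys, and the same map can hold both stand-in keys and real primary keys without collisions between the two kinds.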
Any pointers on where the end of transaction processing would need to be performed?
CmpContainer.ejbLoad(EntityBean) uses TransactionSynchronizationRegistry.registerInterposedSynchronization to store entities at the end of the transaction. You'll want to expand that logic to handle pk resolution in addition to ejbStore callbacks. The registerInterposedSynchronization doesn't really handle ordering well, so I suggest you use a single Synchronization object to handle processing of the pks and the ejb store callbacks.
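
The single-Synchronization suggestion might be sketched as below. The Synchronization interface here is a local stand-in that mirrors the two methods of javax.transaction.Synchronization so the example compiles without a JTA jar; EndOfTransactionSync and its methods are hypothetical names. The relative order of the two phases is an arbitrary choice for illustration, since the thread does not say which should run first.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in mirroring javax.transaction.Synchronization.
interface Synchronization {
    void beforeCompletion();
    void afterCompletion(int status);
}

// Hypothetical: one synchronization per transaction that runs the store
// callbacks and the pending pk resolutions in a fixed, explicit order.
class EndOfTransactionSync implements Synchronization {
    private final List<Runnable> storeCallbacks = new ArrayList<>();
    private final List<Runnable> pkResolutions = new ArrayList<>();

    void addStoreCallback(Runnable callback) {
        storeCallbacks.add(callback);
    }

    void addPkResolution(Runnable resolution) {
        pkResolutions.add(resolution);
    }

    @Override
    public void beforeCompletion() {
        // Assumed order: store callbacks first, then pk resolution.
        for (Runnable callback : storeCallbacks) {
            callback.run();
        }
        for (Runnable resolution : pkResolutions) {
            resolution.run();
        }
    }

    @Override
    public void afterCompletion(int status) {
        // The transaction is over; drop everything that was registered.
        storeCallbacks.clear();
        pkResolutions.clear();
    }
}
```

Registering one such object per transaction keeps the ordering decision inside beforeCompletion() rather than depending on the order in which separate synchronizations happen to be invoked.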
One other thing to keep in mind is that before a CMP is passed to a remote vm, you'll need to make sure the pk has been resolved.
-dain
