Dain Sundstrom wrote:
On Apr 7, 2008, at 7:06 AM, Rick McGuire wrote:
Dain Sundstrom wrote:
I've been sucked into another project and haven't been paying much attention to the lists...

The problem is that we flush before returning the created object to the caller. We do this because database-generated fields are not filled in until the flush, which means the primary key is not guaranteed to be available until then. The current code requires the primary key to create the CMP proxy we return to the caller. The code will have to be changed to allow for late primary key resolution, either when the code calls getPrimaryKey or at the end of the transaction.

I don't have the time to look at this, but I can help you if you want to work on it.
I've started poking around in the code trying to understand what needs to change. Is the JpaCmpEngine.createBean() method where the flushing takes place? It appears that at that point the primaryKey is used for 1) creating the ThreadContext instance, 2) storing the bean in the transaction cache, and 3) creating the ProxyInfo instance. Am I looking in the correct location for this?

Yes.

The ThreadContext primary key bit looks easily changed to a lazy resolution, and probably the ProxyInfo as well, but the transaction cache does not appear to be as easily changed, since the primary key is the main lookup method for the transaction cache. I guess the transaction cache step could be bypassed until the primary key is actually generated, but I'm concerned that this could result in some resolution failures where an object would be expected to be located in the cache.

The transaction cache was introduced as a workaround for the new-delete-new bug in OpenJPA (see JpaTestObject.newDeleteNew()). If you create, remove and recreate a bean with the same pk, OpenJPA internally leaves the pk marked as "deleted", so calls to find(Class,Object) return null. We work around this by using a private cache to track the objects created during the transaction.

To implement delayed flush, you will have to add another way to track the JPA instance object (since we won't have the pk to "find" the object in the entity manager). When the pk is not available, you use the new, alternate, method to find the object, and when the pk is finally resolved, you would add it to the transaction cache.
I've not come up with any clever way of implementing the cache so far, other than just keeping a list of objects whose primary keys have not been calculated, and then, if all other lookups fail, start resolving the primary keys looking for the given target. Not elegant, but I think this will work. I do wonder if another approach might work better. If I understand the reasoning behind the flush, it is necessary because it's possible that some of the information needed to calculate the primary key only becomes available after the JPA flush()/merge() sequence. I suspect for many objects, this is not needed because a simple primary key is used. Would it be feasible to detect the situation where a flush is needed to "crystallize" the object to calculate the primary key? This way, simple object instances where the primary key is provided in the create() operation would not experience the performance hit.
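
To make the "list of unresolved objects" idea concrete, here is a rough sketch in plain Java. Everything in it is hypothetical: LazyPkCache is not an existing OpenEJB class, and the pkResolver function is only a placeholder for whatever flush-and-read-the-id step the real code would perform. Lookups hit the pk-keyed map first and only walk the pending list when that misses.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache: entities with a known pk live in byPk, the rest wait in pending.
final class LazyPkCache {
    private final Map<Object, Object> byPk = new HashMap<>();
    private final List<Object> pending = new ArrayList<>();
    // Assumption: stands in for "flush if needed, then read the generated pk".
    private final Function<Object, Object> pkResolver;

    LazyPkCache(Function<Object, Object> pkResolver) {
        this.pkResolver = pkResolver;
    }

    void put(Object pk, Object entity) { byPk.put(pk, entity); }

    void addPending(Object entity) { pending.add(entity); }

    int pendingCount() { return pending.size(); }

    Object find(Object pk) {
        Object hit = byPk.get(pk);
        if (hit != null) {
            return hit;
        }
        // Fallback: resolve pending entities one by one until the target turns up.
        Iterator<Object> it = pending.iterator();
        while (it.hasNext()) {
            Object entity = it.next();
            Object resolved = pkResolver.apply(entity);
            if (resolved == null) {
                continue; // pk still unknown, leave the entity pending
            }
            it.remove();
            byPk.put(resolved, entity);
            if (resolved.equals(pk)) {
                return entity;
            }
        }
        return null;
    }
}

class LazyPkCacheDemo {
    public static void main(String[] args) {
        // Demo entities are Map.Entry objects whose key plays the role of the generated pk.
        LazyPkCache cache = new LazyPkCache(e -> ((Map.Entry<?, ?>) e).getKey());
        cache.addPending(new AbstractMap.SimpleEntry<Object, Object>(1, "first"));
        cache.addPending(new AbstractMap.SimpleEntry<Object, Object>(2, "second"));
        System.out.println("found: " + cache.find(2));
        System.out.println("still pending: " + cache.pendingCount());
    }
}
```

The cost is as inelegant as described: a miss forces resolution of every pending entity, so this only pays off if misses against pending objects are rare.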

Rick

Off the top of my head, it may be possible to use a stand-in pk object which wraps the JPA object itself (using identity based hashcode and equals) until the real pk is resolved. This pk object would then be the alternate tx cache.
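
Something along these lines, as a minimal sketch only. PendingPk and its methods are made-up names, not existing OpenEJB code. The key wraps the JPA instance and uses reference equality plus System.identityHashCode, so its hash stays stable when the real pk shows up later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in key: identifies an entity by object identity until its real pk exists.
final class PendingPk {
    private final Object entity;   // the JPA instance whose pk has not been generated yet
    private Object resolvedPk;     // filled in once the flush has produced the real pk

    PendingPk(Object entity) {
        if (entity == null) {
            throw new IllegalArgumentException("entity is required");
        }
        this.entity = entity;
    }

    Object getEntity() { return entity; }

    boolean isResolved() { return resolvedPk != null; }

    Object getResolvedPk() { return resolvedPk; }

    void resolve(Object pk) { this.resolvedPk = pk; }

    // Identity semantics: two keys match only if they wrap the very same instance.
    @Override
    public boolean equals(Object other) {
        return other instanceof PendingPk && ((PendingPk) other).entity == entity;
    }

    // Does not depend on resolvedPk, so the key can stay in a HashMap across resolution.
    @Override
    public int hashCode() {
        return System.identityHashCode(entity);
    }
}

class PendingPkDemo {
    public static void main(String[] args) {
        Object bean = new Object();
        Map<Object, Object> txCache = new HashMap<>();
        txCache.put(new PendingPk(bean), bean);
        // A fresh wrapper around the same instance finds the cached entry.
        System.out.println(txCache.get(new PendingPk(bean)) == bean);
    }
}
```

Because the hash never changes, the same map can hold both real pks and stand-ins. When the real pk is resolved, the entry would be re-added under that pk as well.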

Any pointers on where the end of transaction processing would need to be performed?

CmpContainer.ejbLoad(EntityBean) uses TransactionSynchronizationRegistry.registerInterposedSynchronization to store entities at the end of the transaction. You'll want to expand that logic to handle pk resolution in addition to the ejbStore callbacks. registerInterposedSynchronization doesn't really handle ordering well, so I suggest you use a single Synchronization object to handle processing of both the pks and the ejbStore callbacks.
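
Roughly, the single-Synchronization shape could look like the sketch below. To keep it compilable without the JTA jar, TxSynchronization is a local stand-in with the same two methods as javax.transaction.Synchronization. CmpEndOfTxSynchronization and its add methods are invented names. The order shown (ejbStore callbacks first, then pk resolution) is an assumption for illustration; the point is only that one object owns the ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in mirroring javax.transaction.Synchronization, so no JTA dependency is needed here.
interface TxSynchronization {
    void beforeCompletion();
    void afterCompletion(int status);
}

// Hypothetical: one synchronization per transaction that runs both kinds of end-of-tx work.
final class CmpEndOfTxSynchronization implements TxSynchronization {
    private final List<Runnable> storeCallbacks = new ArrayList<>();
    private final List<Runnable> pkResolutions = new ArrayList<>();

    void addStoreCallback(Runnable callback) { storeCallbacks.add(callback); }

    void addPkResolution(Runnable resolution) { pkResolutions.add(resolution); }

    @Override
    public void beforeCompletion() {
        // Assumed order: let beans write their final state, then resolve outstanding pks.
        for (Runnable callback : storeCallbacks) {
            callback.run();
        }
        for (Runnable resolution : pkResolutions) {
            resolution.run();
        }
    }

    @Override
    public void afterCompletion(int status) {
        // The transaction is over either way, so drop the per-transaction work lists.
        storeCallbacks.clear();
        pkResolutions.clear();
    }
}

class CmpEndOfTxSynchronizationDemo {
    public static void main(String[] args) {
        CmpEndOfTxSynchronization sync = new CmpEndOfTxSynchronization();
        // Registered in the "wrong" order on purpose; the class decides what runs first.
        sync.addPkResolution(() -> System.out.println("resolve pk"));
        sync.addStoreCallback(() -> System.out.println("ejbStore"));
        sync.beforeCompletion();
        sync.afterCompletion(0);
    }
}
```

In the container, an object like this would be registered once per transaction instead of once per bean, which sidesteps the ordering problem between separate registrations.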

One other thing to keep in mind is that before a CMP bean is passed to a remote VM, you'll need to make sure the pk has been resolved.

-dain


