(I'm resending this, because it didn't seem to show up on jboss-dev.  If you
want the actually source code I changed, please email me.)


All,
I really need another set of eyes to check out some code changes I've made.
I'll check this in a day or so if I don't get any feedback. We've been
having some problems lately with findBy methods that return references to
the wrong rows in the database. (Please see the dev-threads "Serious Bug?
EntityBean out of synch with database" and "race condition between
EntityInstanceInterceptor"). In examining EntityInstanceInterceptor and
EntitySynchronizationInterceptor.InstanceSynchronization I found a race
condition and one other problem when an EntityInstanceCache.remove() is
called(within a rollback or an exception). Also there are potential problems
with re-entrant beans not getting a transactional "lock" because
EntityEnterpriseContext.setTransaction is not done in a mutex lock block.
race condition:
- When a rollback happens within a transaction,
InstanceSynchronization.afterCompletion would call ctx.setTransaction(null)
and would not protect the EntityEnterpriseContext with a mutex lock. There
it was quite possible for another thread waiting for the entity bean to
leave it's transaction to wakeup, do a cache.get(id) before afterCompletion
did a cache.remove and an InstancePool.free.(I actually wrote a little test
that did a sleep after ctx.setTransaction(null) to verify this). Really bad
things can happen like the context being put back into the instance pool
while another thread thinks the context is still valid.
invalid mutexes:
- When an EntityInstanceCache.remove is called, the Entity's mutex is
disassociated with the id of the Entity. Thus, all threads waiting on the
same bean will be locking on an invalid mutex if EntityInstanceCache.remove
is called.
ctx.setTransaction not synchronized
- setTransaction is called within EntityContainer.ContainerInterceptor. The
mutex of the Entity is NOT acquire before this is done, therefore, with
re-entrant beans, you can possibly have 2 transaction working on the same
entity instance.
Too many LOCKING-WAITING messages:
- You server.log file can grow to 100 megabytes or more very quickly when
beans are in contention.
FIXES!!
Files changed:
AbstractInstanceCache.java
EntityInstanceCache.java
EntityInstanceInterceptor.java
EntitySynchronizationInterceptor.java
Files added:
BeanSemaphore.java
- Wherever there is an EntityInstanceCache.remove(), I make sure that I
acquire the mutex of the entity before doing this.
- I added the ability to invalidate a mutex. When
AbstractInstanceCache.removeLock is called, the mutex is flagged as invalid.
EntityInstanceInterceptor checks for this and does a re-lookup of the mutex.
- I do ctx.setTransaction within the EntityInstanceInterceptor to ensure
that a transactional lock happens. THis will fix the re-entrant bean problem
I discussed earlier.
- I put in some code in EntityInstanceInterceptor so that LOCKING-WAITING
messages don't happen continuously.
- Put in a Thread.yield() in the infamous EntityInstanceInterceptor
do..while loop.
- Added an extension of org.jboss.util.Semaphore, BeanSemaphore, for
invalidation of mutexes.
- Added an id check in EntityInstanceCache.get to make sure that the key
you're looking up on is equal the the EntityEnterpriseContext.getId()
Thanks in advance for any help with this,
Bill
----------------------------------------------------------------------------
----
DIFFS
----------------------------------------------------------------------------
----
cvs diff -w EntitySynchronizationInterceptor.java
EntityInstanceInterceptor.java EntityInstanceCache.java
AbstractInstanceCache.java (in directory
D:\main\jboss2.3_src\jboss\src\main\org\jboss\ejb\plugins)
Index: EntitySynchronizationInterceptor.java
===================================================================
RCS file:
/cvsroot/jboss/jboss/src/main/org/jboss/ejb/plugins/EntitySynchronizationInt
erceptor.java,v
retrieving revision 1.34
diff -w -r1.34 EntitySynchronizationInterceptor.java
39a40
> import org.jboss.util.Sync;
414c415,421
<
---
> EntityContainer container = (EntityContainer)ctx.getContainer();
> AbstractInstanceCache cache =
(AbstractInstanceCache)container.getInstanceCache();
> Object id = ctx.getCacheKey();
> // Hopefully this transaction has an exclusive lock
> // on this id so that cache.getLock returns a mutex that is being
> // used by every other thread on this id.
> Sync mutex = (Sync)cache.getLock(id);
416c423,425
<
---
> // We really should be acquiring a mutex before
> // mucking with the InstanceCache or InstancePool
> mutex.acquire();
420a430,431
> // removing from the cache should also invalidate the mutex thus waking up
> // other threads.
423a435,438
> // REVISIT: FIXME:
> // We really should only let passivation free an instance because it is
> // quite possible that another thread is working with
> // the same context, but let's do it anyways.
429c444,447
<
---
> finally
> {
> mutex.release();
> }
Index: EntityInstanceInterceptor.java
===================================================================
RCS file:
/cvsroot/jboss/jboss/src/main/org/jboss/ejb/plugins/EntityInstanceIntercepto
r.java,v
retrieving revision 1.28
diff -w -r1.28 EntityInstanceInterceptor.java
109c109
< Sync mutex = (Sync)cache.getLock(key);
---
> BeanSemaphore mutex = (BeanSemaphore)cache.getLock(key);
111a112
> boolean exceptionThrown = false;
114a116,117
> boolean waitingOnTransaction = false; // So we don't output
LOCKING-WAITING all the time
> boolean waitingOnContext = false; // So we don't output LOCKING-WAITING
all the time
124a128,146
> // This loop guarantees a mutex lock for the specified object id
> // When a cache.remove happens, it disassociates the mutex from the bean's
id,
> // Thus, the mutex in this code could be invalid. We avoid this problem
with the following loop.
> //
> // Also we should have a while loop so we don't mess up the finally clause
that does
> // mutex.release()
> //
> while (!mutex.isValid())
> {
> BeanSemaphore newmutex = (BeanSemaphore)cache.getLock(key);
> mutex.release();
> mutex = newmutex;
>
> // Avoid infinite loop.
> if (mi.getTransaction() != null && mi.getTransaction().getStatus() ==
Status.STATUS_MARKED_ROLLBACK)
> throw new RuntimeException("Transaction marked for rollback, possibly a
timeout");
>
> mutex.acquire();
> }
136c158,165
< Logger.debug("LOCKING-WAITING (TRANSACTION) for id "+ctx.getId()+"
ctx.hash "+ctx.hashCode()+" tx:"+((tx == null) ? "null" : tx.toString()));
---
> if (!waitingOnTransaction)
> {
> Logger.debug("LOCKING-WAITING (TRANSACTION) in Thread "
> + Thread.currentThread().getName()
> + " for id "+ctx.getId()+" ctx.hash "+ctx.hashCode()
> +" tx:"+((tx == null) ? "null" : tx.toString()));
> waitingOnTransaction = true;
> }
139a169
> Thread.yield(); // Give the OS some help.
142c172,193
< else
---
> if (waitingOnTransaction)
> {
> Logger.debug("FINISHED-LOCKING-WAITING (TRANSACTION) in Thread "
> + Thread.currentThread().getName()
> + " for id "+ctx.getId()
> +" ctx.hash "+ctx.hashCode()
> +" tx:"+((tx == null) ? "null" : tx.toString()));
> waitingOnTransaction = false;
> }
> // OK so we now know that for this PrimaryKey, no other
> // thread has a transactional lock on the bean.
> // So, let's setTransaction of the ctx here instead of in later code.
> // I really don't understand why it wasn't done here anyways before.
> // Later on, in the finally clause, I'll check to see if the ctx was
> // invoked upon and if not, dissassociate the transaction with the ctx.
> // If there is no transactional assocation here, there is a race condition
> // for re-entrant entity beans, so don't remove the code below.
> //
> // If this "waitingOnTransaction" loop is ever removed in favor of
> // something like using db locks instead, this transactional assocation
> // must be removed.
> if (mi.getTransaction() != null && tx != null &&
!tx.equals(mi.getTransaction()))
143a195,197
> // Do transactional "lock" on ctx right now!
> ctx.setTransaction(mi.getTransaction());
> }
152,153c206,207
< if (!isCallAllowed(mi)) {
<
---
> if (!isCallAllowed(mi))
> {
158c212,219
< Logger.debug("LOCKING-WAITING (CTX) for id "+ctx.getId()+" ctx.hash
"+ctx.hashCode());
---
> if (!waitingOnContext)
> {
> Logger.debug("LOCKING-WAITING (CTX) in Thread "
> + Thread.currentThread().getName()
> + " for id "+ctx.getId()
> +" ctx.hash "+ctx.hashCode());
> waitingOnContext = true;
> }
161a223
> Thread.yield(); // Help out the OS.
163,164d224
< // Not allowed reentrant call
< //throw new RemoteException("Reentrant call");
171a232,239
> if (waitingOnContext)
> {
> Logger.debug("FINISHED-LOCKING-WAITING (CTX) in Thread "
> + Thread.currentThread().getName()
> + " for id "
> + ctx.getId()
> + " ctx.hash " + ctx.hashCode());
> waitingOnContext = false;
179d246
<
191,194c258
< // Discard instance
< // EJB 1.1 spec 12.3.1
< cache.remove(key);
<
---
> exceptionThrown = true;
198,201c262
< // Discard instance
< // EJB 1.1 spec 12.3.1
< cache.remove(key);
<
---
> exceptionThrown = true;
205,208c266
< // Discard instance
< // EJB 1.1 spec 12.3.1
< cache.remove(key);
<
---
> exceptionThrown = true;
210c268,269
< } finally
---
> }
> finally
211a271,273
> try
> {
> mutex.acquire();
213d274
<
218,221d278
< try
< {
< mutex.acquire();
<
225,226c282,294
< if (ctx.getId() == null)
< {
---
> Transaction tx = ctx.getTransaction();
> if (tx != null
> && mi.getTransaction() != null
> && tx.equals(mi.getTransaction())
> && !((EntityEnterpriseContext)ctx).isInvoked())
> {
> // The context has been associated with this method's transaction
> // but the entity has not been invoked upon yet, so let's
> // disassociate the transaction from the ctx.
> // I'm doing this because I'm assuming that the bean hasn't been
registered with
> // the TxManager.
> ctx.setTransaction(null);
> }
227a296,304
> if (exceptionThrown)
> {
> // Discard instance
> // EJB 1.1 spec 12.3.1
> //
> cache.remove(key);
> }
> else if (ctx.getId() == null)
> {
238a316,320
> // REVISIT: FIXME:
> // We really should only let passivation free an instance because it is
> // quite possible that another thread is working with
> // the same context, but let's do it anyways.
> //
251a334
> }
254c337,347
< // Yeah, do nothing
---
> /*
> * This used to happen in the old code.
> * Why? If the ctx is null, why should we remove it? Another thread could
> * be screwed up by this.
> if (exceptionThrown)
> {
> // Discard instance
> // EJB 1.1 spec 12.3.1
> cache.remove(key);
> }
> */
257d349
< catch (InterruptedException ignored) {}
263d354
< }
Index: EntityInstanceCache.java
===================================================================
RCS file:
/cvsroot/jboss/jboss/src/main/org/jboss/ejb/plugins/EntityInstanceCache.java
,v
retrieving revision 1.3
diff -w -r1.3 EntityInstanceCache.java
58c58,71
< return super.get(id);
---
> EnterpriseContext rtn = null;
> rtn = super.get(id);
> //
> // FIXME: (Bill Burke) We were running into problems where CMP EntityBeans
> // were out of sync with the database
> // The problem went away after a few minutes of inactivity leading us to
> // believe that the bean in cache was passivated and the CacheKey was out
> // of sync with the Context's key. So I put this defensive check in to
> // flag the problem.
> if (rtn != null && !rtn.getId().equals(((CacheKey)id).id))
> {
> throw new IllegalStateException("somehow the cache is out of synch with
the context ctx.id != lookup.id");
> }
> return rtn;
Index: AbstractInstanceCache.java
===================================================================
RCS file:
/cvsroot/jboss/jboss/src/main/org/jboss/ejb/plugins/AbstractInstanceCache.ja
va,v
retrieving revision 1.6
diff -w -r1.6 AbstractInstanceCache.java
266c266
< mutex = new Semaphore(1);
---
> mutex = new BeanSemaphore(1);
279a280,283
> if (mutex instanceof BeanSemaphore)
> {
> ((BeanSemaphore)mutex).invalidate();
> }
*****CVS exited normally with code 1*****



_______________________________________________
Jboss-development mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/jboss-development

Reply via email to