On 03/20/2013 07:53 AM, Dan Berindei wrote:
>
> On Tue, Mar 19, 2013 at 11:55 PM, Pedro Ruivo <[email protected]> wrote:
> >
> > On 03/19/2013 08:41 PM, Dan Berindei wrote:
> > >
> > > On Mon, Mar 18, 2013 at 6:09 PM, Pedro Ruivo <[email protected]> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > To solve ISPN-2808 (avoid blocking JGroups threads so that request
> > > > responses can still be delivered), I've created another thread pool
> > > > and moved the potentially blocking commands to it (i.e. the commands
> > > > that may block until some state is reached).
> > > >
> > > > Problem description:
> > > >
> > > > With this solution, the new thread pool has to be large in order to
> > > > handle the remote commands without deadlocks. The problem is that
> > > > all of its threads can end up blocked processing commands, while
> > > > the command that would unblock them is still sitting in the queue.
> > > >
> > > > Example: a bunch of commands are blocked waiting for a new topology
> > > > ID, and the command that will increment the topology ID is stuck in
> > > > the thread pool queue behind them.
> > > >
> > > > Solution:
> > > >
> > > > Use a smarter command dispatcher, i.e. keep a command in the queue
> > > > until we are sure it will not have to wait for other commands. I've
> > > > already implemented an executor service along these lines
> > > > (ConditionalExecutorService, in the ISPN-2635 and ISPN-2636
> > > > branches, Total Order stuff) that only puts the Runnable (more
> > > > precisely, a new interface called ConditionalRunnable) in the
> > > > thread pool when it is ready to be processed. Creative guys, it may
> > > > need a better name :)
> > > >
> > > > The ConditionalRunnable has a new method (boolean isReady()) that
> > > > should return true when the runnable will not block.
> > > >
> > > > Example of how to apply this to ISPN-2808:
> > > >
> > > > Most of the commands wait for a particular topology ID and/or for
> > > > lock acquisition.
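[As an aside, the dispatcher idea described above can be sketched roughly
like this. This is only an illustration: apart from ConditionalRunnable
and isReady(), the names, pool size, and details are assumptions, not the
actual code in the ISPN-2635/ISPN-2636 branches.]

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** A Runnable that can report whether it would block if run right now. */
interface ConditionalRunnable extends Runnable {
    boolean isReady();
}

/**
 * Holds submitted tasks in a pending queue and only hands a task to the
 * real thread pool once its isReady() returns true.
 */
class ConditionalExecutorService {
    private final Queue<ConditionalRunnable> pending = new ConcurrentLinkedQueue<>();
    private final ExecutorService realPool = Executors.newFixedThreadPool(4);

    public void execute(ConditionalRunnable task) {
        pending.add(task);
        checkPending();
    }

    /**
     * Re-check the pending tasks; meant to be called on submit and
     * whenever relevant state changes (a new topology id is installed,
     * a lock is released, ...).
     */
    public void checkPending() {
        for (ConditionalRunnable task : pending) {
            // remove() guards against two threads picking up the same task
            if (task.isReady() && pending.remove(task)) {
                realPool.execute(task);
            }
        }
    }

    public void shutdown() {
        realPool.shutdown();
    }
}
```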
> > >
> > > Well, the original problem description was about the
> > > DistributionInterceptor forwarding the command from the primary
> > > owner to the backup owners and waiting for a response from them :)
> > > The forwarding done by StateTransferInterceptor is also synchronous
> > > and can block.
> > >
> > > It's true that you can't check how long a remote call will take
> > > beforehand...
> > >
> > > > In this way, the isReady() implementation can be something like:
> > > >
> > > > isReady()
> > > >   return commandTopologyId <= currentTopologyId &&
> > > >          (for all keys; do if !lock(key).tryLock(); return false; done)
> > >
> > > Shouldn't you release the locks you managed to lock already if one
> > > of the lock acquisitions failed?
> >
> > You are right. I have to release the locks in pessimistic mode. In
> > optimistic mode, the locks are only released by the rollback command.
> >
> > > Actually you may want to release the locks even if lock acquisition
> > > did succeed... In non-tx mode, the locks are owned by the current
> > > thread, so you can't lock a key on one thread and unlock it on
> > > another (though you could skip the check completely in non-tx mode).
> > > And in transactional mode, you could have a deadlock because 2 txs
> > > lock the same keys in a different order.
> >
> > Non-tx caches cannot use this optimization. I saw that problem
> > earlier today when I started debugging it.
> >
> > > Which leads me to a different point: how would you handle deadlocks?
> > > With pessimistic mode, if tx1 holds lock k1 and wants to acquire k2,
> > > but tx2 holds k2 and wants to acquire k1, will the LockCommands
> > > tx1:k2 and tx2:k1 ever be scheduled? In general, can we make the
> > > time that a Lock/PrepareCommand spends in the
> > > ConditionalExecutorService queue count against lockAcquisitionTimeout?
> >
> > If I get a DeadlockDetectedException, I will send it back immediately
> > and release all the locks.
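[The release-on-failure that Dan asks about could look something like
the sketch below. It is written against plain java.util.concurrent.locks
rather than the real LockManager API, and LockHelper/tryLockAll are
made-up names used only for illustration.]

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.locks.Lock;

final class LockHelper {
    /**
     * Try to acquire every lock without blocking. On the first failure,
     * release the locks already acquired (in reverse order) so that a
     * "not ready" answer leaves no partial state behind.
     */
    static boolean tryLockAll(Collection<? extends Lock> locks) {
        List<Lock> acquired = new ArrayList<>();
        for (Lock lock : locks) {
            if (lock.tryLock()) {
                acquired.add(lock);
            } else {
                // back out everything we grabbed and report "not ready"
                for (int i = acquired.size() - 1; i >= 0; i--) {
                    acquired.get(i).unlock();
                }
                return false;
            }
        }
        return true;
    }
}
```

Because a failed check leaves no locks held, a command waiting in the
queue can never be part of a lock cycle; only running commands hold locks.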
>
> Hmmm, the default LockManager doesn't throw DeadlockDetectedExceptions;
> you have to enable deadlockDetection explicitly in the configuration.
> I'm guessing that's for performance reasons...
>
> It's true that you would need a new LockManager.tryLock method anyway,
> so you could implement deadlock detection in LockManagerImpl.tryLock.
> But probably the same performance considerations would apply.
Oh... I thought you were talking about the DeadlockDetectingLockManager.
With the normal LockManager I am not able to detect whether we have a
deadlock or not.

Actually, I think I can avoid the deadlocks if I use what you suggest
(and here I'm talking about tx caches): when isReady() fails to acquire
a lock, I release the locks already acquired. This way, the only reason
isReady() can fail to acquire a lock is that the lock is held by a
running command (i.e. a command that is already in the thread pool). Do
you think this may work?

> > The lockAcquisitionTimeout is a problem that I haven't solved yet.
> >
> > > > With this, I believe we can keep the number of threads low and
> > > > avoid the thread deadlocks.
> > > >
> > > > Now, I have two possible implementations:
> > > >
> > > > 1) put a reference to the StateTransferManager and/or LockManager
> > > > in the commands, and invoke the methods directly (a little dirty)
> > > >
> > > > 2) add a new method to CommandInterceptor like: boolean
> > > > preProcess<command>(Command, InvocationContext). Each interceptor
> > > > checks whether the command would block on it (returning false) or
> > > > not (invoking the next interceptor). For example, the
> > > > StateTransferInterceptor immediately returns false if the
> > > > commandTopologyId is higher than the currentTopologyId, and the
> > > > *LockingInterceptor returns false if it cannot acquire some lock.
> > > >
> > > > Any other suggestions? If I was not clear let me know.
> > >
> > > TBH I would only check for the topology id before scheduling the
> > > commands to the Infinispan thread pool, because checking the locks
> > > is very complicated with all the configuration options that we have.
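[A topology-id-only check like the one suggested above might be sketched
as follows. Again, this is purely illustrative: TopologyAwareDispatcher,
the pool size, and the draining logic are assumptions for the sketch,
not actual Infinispan code.]

```java
import java.util.PriorityQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Commands that arrive with a topology id newer than the installed one
 * wait in a queue; installing a new topology drains the commands that
 * have become runnable into the real thread pool.
 */
class TopologyAwareDispatcher {
    static final class PendingCommand {
        final int topologyId;
        final Runnable command;
        PendingCommand(int topologyId, Runnable command) {
            this.topologyId = topologyId;
            this.command = command;
        }
    }

    private final PriorityQueue<PendingCommand> pending =
            new PriorityQueue<>((x, y) -> Integer.compare(x.topologyId, y.topologyId));
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private int currentTopologyId;

    /** Called from the OOB thread when a remote command arrives. */
    public synchronized void dispatch(int commandTopologyId, Runnable command) {
        if (commandTopologyId <= currentTopologyId) {
            pool.execute(command);   // ready: run it on the ISPN pool
        } else {
            pending.add(new PendingCommand(commandTopologyId, command));
        }
    }

    /** Called when a new topology is installed. */
    public synchronized void topologyInstalled(int topologyId) {
        currentTopologyId = topologyId;
        while (!pending.isEmpty() && pending.peek().topologyId <= currentTopologyId) {
            pool.execute(pending.poll().command);
        }
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```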
> > >
> > > This is how I was imagining solving the lock problem: the OOB
> > > threads would execute the commands that do not acquire locks
> > > (CommitCommand, TxCompletionNotificationCommand) directly and submit
> > > the others to the ISPN thread pool; if a command's topology id was
> > > higher than the current topology id, the OOB thread would just stick
> > > it in a queue. A separate thread would read only the commands with
> > > the current topology id from the queue, and process them just like
> > > an OOB thread.
> > >
> > > However... the CommitCommands may very well have a *lower* topology
> > > id by the time they are executed, so the OOB/executor thread may
> > > well block while waiting for the results of the forwarding RPC.
> > > Maybe we could work around it by having StateTransferInterceptor
> > > submit a task with only the command forwarding to the thread pool,
> > > but again it gets quite complicated. So for the first phase I
> > > recommend handling only the topology id problem.
> >
> > I have a similar implementation, but in your example the CommitCommand
> > will be scheduled to the thread pool only if the command topology ID
> > is lower than or equal to the current topology ID (like all
> > topology-affected commands).
> >
> > Currently, I only have 3 tests failing in
> > org.infinispan.replication.FlagsReplicationTest (2 assertions and 1
> > timeout) that I'm checking now...
>
> I guess it's time to look at your code :)

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
