[
https://issues.apache.org/jira/browse/IGNITE-835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539749#comment-14539749
]
Semen Boikov commented on IGNITE-835:
-------------------------------------
1. Without near cache, unlock goes through 'GridDhtColocatedCache.unlockAll()', which
calls mvcc.removeExplicitLock, and that removes the candidate which is not a reentry.
2. Scenario in which lock with near cache hangs:
- start two nodes
- try to lock a key for which node1 is not primary but still owns the partition
- node1 creates a near entry, adds a candidate and sends GridNearLockRequest to
node2
- since node1 still owns the partition, node2 sends GridDhtLockRequest to node1
- node1 receives GridDhtLockRequest; in
GridDhtTransactionalCacheAdapter.startRemoteTx it fails to find the key's partition
and obsoletes the near cache entry (line 177)
- node1 receives GridNearLockResponse, but the near cache entry was obsoleted
while handling GridDhtLockRequest and the entry's mvcc info was lost; as a result
the lock callback is not called and GridNearLockFuture does not finish
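
The sequence in scenario 2 can be sketched as a small toy model. This is only an illustration under stated assumptions: NearLockHangSketch, NearEntry and all method names below are hypothetical stand-ins, not Ignite classes, and a plain CompletableFuture plays the role of GridNearLockFuture.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Toy model of the hang on node1; none of these types are Ignite classes. */
public class NearLockHangSketch {
    /** Hypothetical stand-in for a near cache entry that carries one lock candidate. */
    static final class NearEntry {
        final boolean hasCandidate = true; // candidate is added before the near lock request is sent
    }

    private final Map<Integer, NearEntry> nearCache = new HashMap<>();

    /** Plays the role of the near lock future: it finishes only if the lock callback runs. */
    private final CompletableFuture<Boolean> lockFuture = new CompletableFuture<>();

    /** node1 creates the near entry with its candidate (the near lock request goes out next). */
    void startLock(int key) {
        nearCache.put(key, new NearEntry());
    }

    /** node1 handles the DHT lock request; a failed partition lookup obsoletes the near entry. */
    void onDhtLockRequest(int key, boolean partitionFound) {
        if (!partitionFound)
            nearCache.remove(key); // entry obsoleted, candidate (mvcc) info is lost
    }

    /** node1 handles the near lock response; the callback needs the surviving candidate. */
    void onNearLockResponse(int key) {
        NearEntry entry = nearCache.get(key);

        if (entry != null && entry.hasCandidate)
            lockFuture.complete(true);
        // Otherwise nothing completes the future: this is the hang.
    }

    /** Replays the message sequence and reports whether the lock future finished. */
    static boolean lockFutureCompletes(boolean partitionFound) {
        NearLockHangSketch node1 = new NearLockHangSketch();

        node1.startLock(42);
        node1.onDhtLockRequest(42, partitionFound);
        node1.onNearLockResponse(42);

        return node1.lockFuture.isDone();
    }

    public static void main(String[] args) {
        System.out.println("partition found, future done: " + lockFutureCompletes(true));
        System.out.println("partition not found, future done: " + lockFutureCompletes(false));
    }
}
```

In this model the future is left incomplete only when the DHT request handler drops the entry before the near response arrives, which matches the ordering described above.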
> IgniteCache.lock is broken for PARTITIONED cache without near cache.
> --------------------------------------------------------------------
>
> Key: IGNITE-835
> URL: https://issues.apache.org/jira/browse/IGNITE-835
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: sprint-2
> Reporter: Vladimir Ozerov
> Assignee: Semen Boikov
> Priority: Critical
> Fix For: sprint-5
>
>
> Steps to reproduce:
> 1) Go to GridCacheLockAbstractTest
> 2) Add the test source below.
> 3) Make sure to disable near cache
> (GridCacheLockAbstractTest.cacheConfiguration() ->
> setNearConfiguration(null)).
> 4) Run GridCachePartitionedLockSelfTest.testLockReentrancy() and observe
> assertion failure.
> 5) Re-enable near cache and re-run the test. Observe that it now passes.
> {code}
> public void testLockReentrancy() throws Throwable {
>     for (int i = 10; i < 100; i++) {
>         System.out.println("Key: " + i);
>
>         final int i0 = i;
>
>         final Lock lock = cache1.lock(i);
>
>         lock.lockInterruptibly();
>
>         try {
>             final AtomicReference<Throwable> err = new AtomicReference<>();
>
>             // Another thread must not be able to acquire the held lock.
>             Thread t = new Thread(new Runnable() {
>                 @Override public void run() {
>                     try {
>                         assert !lock.tryLock();
>                         assert !lock.tryLock(100, TimeUnit.MILLISECONDS);
>                         assert !cache1.lock(i0).tryLock();
>                         assert !cache1.lock(i0).tryLock(100, TimeUnit.MILLISECONDS);
>                     }
>                     catch (Throwable e) {
>                         err.set(e);
>                     }
>                 }
>             });
>
>             t.start();
>             t.join();
>
>             if (err.get() != null)
>                 throw err.get();
>
>             // Reentrant acquire followed by a single unlock: the lock must stay held.
>             lock.lock();
>             lock.unlock();
>
>             t = new Thread(new Runnable() {
>                 @Override public void run() {
>                     try {
>                         assert !lock.tryLock();
>                         assert !lock.tryLock(100, TimeUnit.MILLISECONDS);
>                         assert !cache1.lock(i0).tryLock();
>                         assert !cache1.lock(i0).tryLock(100, TimeUnit.MILLISECONDS);
>                     }
>                     catch (Throwable e) {
>                         err.set(e);
>                     }
>                 }
>             });
>
>             t.start();
>             t.join();
>
>             if (err.get() != null)
>                 throw err.get();
>         }
>         finally {
>             lock.unlock();
>         }
>     }
> }
> {code}
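
For comparison, the reentrancy behaviour the test above expects can be shown with a plain java.util.concurrent.locks.ReentrantLock. This is a minimal sketch that assumes the cache lock is meant to follow the same hold-count semantics as a standard reentrant Lock; ReentrancySketch and its method name are hypothetical and no Ignite API is used.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Shows that lock, lock, unlock leaves a reentrant lock held by its owner. */
public class ReentrancySketch {
    /** Returns true if a second thread acquired the lock after the owner's partial unlock. */
    static boolean otherThreadAcquiredAfterPartialUnlock() {
        final ReentrantLock lock = new ReentrantLock();
        final AtomicBoolean acquired = new AtomicBoolean();

        lock.lock(); // hold count 1

        try {
            lock.lock();   // reentry, hold count 2
            lock.unlock(); // hold count back to 1, lock is still held

            Thread t = new Thread(() -> {
                try {
                    if (lock.tryLock(100, TimeUnit.MILLISECONDS)) {
                        acquired.set(true);

                        lock.unlock();
                    }
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            t.start();

            try {
                t.join();
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();

                throw new IllegalStateException("Interrupted while waiting for the probe thread", e);
            }
        }
        finally {
            lock.unlock(); // final release by the owner
        }

        return acquired.get();
    }

    public static void main(String[] args) {
        System.out.println("other thread acquired: " + otherThreadAcquiredAfterPartialUnlock());
    }
}
```

The second thread's tryLock times out because one unlock only undoes the reentry, which is the property the failing assertions in the test check for.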
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)