Hi Denis,

Thanks a lot, it looks like my test setup was wrong. I have added semaphore tests to the GridCacheAbstractDataStructuresFailoverSelfTest suite.

Now I have the following problem: when I run the tests with TOP_CHANGE_THREAD_CNT = 3, they fail when they reach stop() with the following exception:

class org.apache.ignite.internal.IgniteInterruptedCheckedException: Node is stopping: 09c5e8b8-8998-468e-960d-223220354fd3
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager.onKernalStop0(GridCachePartitionExchangeManager.java:382)
    at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.onKernalStop(GridCacheSharedManagerAdapter.java:113)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onKernalStop(GridCacheProcessor.java:946)
    at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:1823)
    at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:1769)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2133)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2096)
    at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:314)
    at org.apache.ignite.Ignition.stop(Ignition.java:223)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:802)
    at org.apache.ignite.testframework.junits.GridAbstractTest.stopGrid(GridAbstractTest.java:784)
    at org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest.access$500(GridCacheAbstractDataStructuresFailoverSelfTest.java:54)
    at org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest$5.apply(GridCacheAbstractDataStructuresFailoverSelfTest.java:459)

When I run the tests with TOP_CHANGE_THREAD_CNT = 1, everything runs OK.

@Yakov I made a new commit to my IGNITE-638 branch, can you please take a look?

Best regards,
Vladisav

> On Wed, Nov 11, 2015 at 3:48 PM, Denis Magda <dma...@gridgain.com> wrote:
>
> > Hi Vladislav,
> >
> > Please see below..
> >
> >
> > On 11/11/2015 12:33 PM, Vladisav Jelisavcic wrote:
> >
> >> Yakov,
> >>
> >> sorry for running a bit late.
> >>
> >>> Vladislav, do you have any updates for
> >>> https://issues.apache.org/jira/browse/IGNITE-638? Or any questions?
> >>>
> >>> --Yakov
> >>>
> >> I have problems with some fail-over scenarios.
> >> It seems that if two nodes are in the middle of acquiring or releasing
> >> the semaphore and one of them fails, all nodes get:
> >>
> >> [09:36:38,509][ERROR][ignite-#13%pub-null%][GridCacheSemaphoreImpl]
> >> <ignite-atomics-sys-cache> Failed to compare and set:
> >> o.a.i.i.processors.datastructures.GridCacheSemaphoreImpl$Sync$1@5528b728
> >> class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException:
> >> Failed to acquire lock for keys (primary node left grid, retry transaction
> >> if possible) [keys=[UserKeyCacheObjectImpl [val=GridCacheInternalKeyImpl
> >> [name=ac83b8cb-3052-49a6-9301-81b20b0ecf3a], hasValBytes=true]],
> >> node=c321fcc4-5db5-4b03-9811-6a5587f2c253]
> >> ...
> >> Caused by: class
> >> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed
> >> to acquire lock for keys (primary node left grid, retry transaction if
> >> possible) [keys=[UserKeyCacheObjectImpl [val=GridCacheInternalKeyImpl
> >> [name=ac83b8cb-3052-49a6-9301-81b20b0ecf3a], hasValBytes=true]],
> >> node=c321fcc4-5db5-4b03-9811-6a5587f2c253]
> >>     at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.newTopologyException(GridDhtColocatedLockFuture.java:1199)
> >>     ... 10 more
> >>
> > You have to process this exception manually at your implementation layer
> > since your data structure uses a transactional cache.
> > Below is a kind of template I used when it was required to process this
> > and some other exceptions. You can use it as-is.
> >
> > int retries = GridCacheAdapter.MAX_RETRIES;
> >
> > IgniteCheckedException err = null;
> >
> > for (int i = 0; i < retries; i++) {
> >     try {
> >         // Your transactional code that may fail; return (or break) on success.
> >     }
> >     catch (IgniteCheckedException e) {
> >         // Out of attempts: rethrow on the last iteration.
> >         if (i == retries - 1)
> >             throw e;
> >
> >         if (X.hasCause(e, ClusterTopologyCheckedException.class)) {
> >             ClusterTopologyCheckedException topErr =
> >                 e.getCause(ClusterTopologyCheckedException.class);
> >
> >             // Wait until it is safe to retry after the topology change.
> >             topErr.retryReadyFuture().get();
> >         }
> >         else if (X.hasCause(e, IgniteTxRollbackCheckedException.class))
> >             U.sleep(1);
> >         else
> >             throw e;
> >     }
> > }
> >
> >
> >> I'm still trying to find out how to exactly reproduce this behavior,
> >> I'll send you more details once I try a few more things.
> >>
> > There is a test suite called
> > GridCacheAbstractDataStructuresFailoverSelfTest that checks Ignite atomics
> > and data structures in fail-over scenarios.
> > The suite will let you reproduce ClusterTopologyCheckedException easily.
> > Just add your tests there, using the tests of the other data structures
> > as a reference.
> >
> > Presently I'm improving this test suite as part of my work on IGNITE-801 and
> > IGNITE-803. If you finish your task earlier, I'll adapt your tests to the
> > new test approach.
> >
> >
> >> I am still using partitioned cache, does it make sense to use replicated
> >> cache instead?
> >>
> > Yeah, you should support this as well. The cache mode for the data structures
> > is changed using CollectionConfiguration, while for atomics it is changed
> > using AtomicConfiguration.
> >
> > --
> > Denis
> >
> >
> >> Other than that, I'm done with everything else.
> >>
> >> Thanks,
> >> Vladisav
> >>
> >>
> >>
> >> On Tue, Nov 10, 2015 at 7:19 PM, Raul Kripalani <ra...@apache.org> wrote:
> >>
> >>> Sorry I haven't made an appearance in this thread yet.
> >>>
> >>>> 6. MQTT streamer
> >>>> https://issues.apache.org/jira/browse/IGNITE-535
> >>>>
> >>> Yes, it was merged to master before the ignite-1.5 branch was created.
> >>>
> >>> I'd like to add:
> >>>
> >>> Camel Streamer => https://issues.apache.org/jira/browse/IGNITE-1790
> >>> -- I'll merge this as soon as I finished with the OSGi tickets with demand.
> >>>
> >>> OSGi Manifests, Karaf features and possible ClassLoaderCodec SPI (or
> >>> whatever agreement we arrive to in mailing lists and Wiki)
> >>> -- https://issues.apache.org/jira/browse/IGNITE-1527
> >>> -- https://issues.apache.org/jira/browse/IGNITE-1877
> >>> -- I'm working actively on these two features.
> >>>
> >>> *Raúl Kripalani*
> >>> PMC & Committer @ Apache Ignite, Apache Camel | Integration, Big Data and
> >>> Messaging Engineer
> >>> http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
> >>> http://blog.raulkr.net | twitter: @raulvk
> >>>
> >>> On Mon, Nov 2, 2015 at 1:35 PM, Yakov Zhdanov <yzhda...@apache.org> wrote:
> >>>
> >>>> Guys,
> >>>>
> >>>> I think we can start preparation to Ignite-1.5 release which will include
> >>>> many interesting features:
> >>>>
> >>>> 1. Portable object API
> >>>> https://issues.apache.org/jira/browse/IGNITE-1486
> >>>>
> >>>> 2. Ignite.NET and Ignite C++
> >>>> https://issues.apache.org/jira/browse/IGNITE-1282
> >>>>
> >>>> 3. Optimistic serializable transactions
> >>>> https://issues.apache.org/jira/browse/IGNITE-1607
> >>>>
> >>>> 4. Distributed SQL joins - we will be able to query non-collocated data as
> >>>> well
> >>>> https://issues.apache.org/jira/browse/IGNITE-1232
> >>>>
> >>>> 5. Enhanced Oracle and IBM JDK interoperability
> >>>> https://issues.apache.org/jira/browse/IGNITE-1526
> >>>>
> >>>> 6. MQTT streamer
> >>>> https://issues.apache.org/jira/browse/IGNITE-535
> >>>>
> >>>> 7. Continuous query failover
> >>>> https://issues.apache.org/jira/browse/IGNITE-426
> >>>>
> >>>> 8. Significant transactional cache performance optimizations - I will merge
> >>>> these changes from 'ignite-1.4-slow-server-debug' today or tomorrow.
> >>>>
> >>>> 9. Many stability and fault-tolerance fixes.
> >>>>
> >>>> 10. I would also like to include distributed Semaphore. Vladislav, any
> >>>> chance you can finish with it this week?
> >>>> https://issues.apache.org/jira/browse/IGNITE-638
> >>>>
> >>>> Thanks to everyone involved! Guys, esp. assignees of mentioned issues,
> >>>> please respond to this email and let us know when can we expect your
> >>>> changes being merged to master and release branch?
> >>>>
> >>>> Can someone create ignite-1.5 release branch?
> >>>>
> >>>> --Yakov