I meant "release branch", not "master".

On Tue, Nov 17, 2015 at 9:25 PM, Vladimir Ozerov <voze...@gridgain.com> wrote:
> Folks,
>
> I continue working on IGNITE-1917 - marshalling micro-optimizations. I did
> a bunch of things today that increased performance a bit, so that
> unmarshalling with PortableMarshaller is now slightly faster than
> OptimizedMarshaller when an object has several fields, but still slower when
> there are lots of small fields.
>
> I'm going to apply the last easy optimization shortly and then will focus
> on merging all pending tickets to master. Once all the important things are
> merged, I'll spend some more effort on performance again.
>
> On Tue, Nov 17, 2015 at 8:30 PM, Vladisav Jelisavcic <vladis...@gmail.com> wrote:
>
>> Hi Yakov,
>>
>> 1. Yes.
>>
>> 2. If you mean that nodeMap is accessed in the onNodeRemoved(UUID nodeID)
>> method of the GridCacheSemaphoreImpl class, it shouldn't be a problem,
>> but it can easily be changed not to do so.
>>
>> 3.
>> org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest#testSemaphoreConstantTopologyChangeFailoverSafe()
>> org.apache.ignite.internal.processors.cache.datastructures.GridCacheAbstractDataStructuresFailoverSelfTest#testSemaphoreConstantMultipleTopologyChangeFailoverSafe()
>>
>> I think the problem is with the atomicity of the simulated grid failure:
>> once stopGrid() is called for a node, other threads on that same node start
>> throwing interrupted exceptions, which in turn are not handled properly in
>> GridCacheAbstractDataStructuresFailoverSelfTest.
>> Those exceptions shouldn't be dealt with inside GridCacheSemaphoreImpl itself.
>> In a real-world node failure scenario, all those threads would fail at the
>> same time (none of them would influence the rest of the grid anymore).
>>
>> I think fixing the issues Denis is working on (IGNITE-801 and IGNITE-803)
>> can fix this. Am I right? Does it make sense?
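[Editor's note: one way to make the nodeMap access pattern discussed above safe from both synchronized and unsynchronized contexts is to make the map itself thread-safe. The sketch below is illustrative only - NodePermitTracker is a hypothetical stand-in, not the actual GridCacheSemaphoreImpl.Sync code.]

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the per-node permit bookkeeping discussed in
// the thread. Names (nodeMap, onNodeRemoved) mirror the ones mentioned for
// GridCacheSemaphoreImpl.Sync, but this class is a sketch, not Ignite code.
class NodePermitTracker {
    // A ConcurrentHashMap lets onNodeRemoved() run safely from an
    // unsynchronized (e.g. discovery-event) context while the acquire and
    // release paths update counts from the synchronized context.
    private final Map<UUID, Integer> nodeMap = new ConcurrentHashMap<>();

    /** Records that a node acquired the given number of permits. */
    void acquired(UUID nodeId, int permits) {
        nodeMap.merge(nodeId, permits, Integer::sum);
    }

    /** Returns the permits to hand back when a node leaves the grid. */
    int onNodeRemoved(UUID nodeId) {
        Integer held = nodeMap.remove(nodeId);
        return held == null ? 0 : held;
    }

    /** Permits currently attributed to the given node. */
    int held(UUID nodeId) {
        return nodeMap.getOrDefault(nodeId, 0);
    }
}
```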
>>
>> Best regards,
>> Vladisav
>>
>> On Tue, Nov 17, 2015 at 5:40 PM, Yakov Zhdanov <yzhda...@apache.org> wrote:
>>
>> > Vladisav,
>> >
>> > I started to review the latest changes and have a couple of questions:
>> >
>> > 1. The latest changes are here - https://github.com/apache/ignite/pull/120?
>> > Is that correct?
>> > 2. org.apache.ignite.internal.processors.datastructures.GridCacheSemaphoreImpl.Sync#nodeMap
>> > is accessed in both synchronized and unsynchronized contexts. Are you
>> > sure this is fine?
>> > 3. As for the failing test - can you please isolate it into a separate
>> > JUnit test or point out an existing one?
>> >
>> > --Yakov
>> >
>> > 2015-11-11 12:33 GMT+03:00 Vladisav Jelisavcic <vladis...@gmail.com>:
>> >
>> > > Yakov,
>> > >
>> > > Sorry for running a bit late.
>> > >
>> > > > Vladisav, do you have any updates for
>> > > > https://issues.apache.org/jira/browse/IGNITE-638? Or any questions?
>> > > >
>> > > > --Yakov
>> > >
>> > > I have problems with some failover scenarios.
>> > > It seems that if two nodes are in the middle of acquiring or releasing
>> > > the semaphore and one of them fails, all nodes get:
>> > >
>> > > [09:36:38,509][ERROR][ignite-#13%pub-null%][GridCacheSemaphoreImpl]
>> > > <ignite-atomics-sys-cache> Failed to compare and set:
>> > > o.a.i.i.processors.datastructures.GridCacheSemaphoreImpl$Sync$1@5528b728
>> > > class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException:
>> > > Failed to acquire lock for keys (primary node left grid, retry transaction
>> > > if possible) [keys=[UserKeyCacheObjectImpl [val=GridCacheInternalKeyImpl
>> > > [name=ac83b8cb-3052-49a6-9301-81b20b0ecf3a], hasValBytes=true]],
>> > > node=c321fcc4-5db5-4b03-9811-6a5587f2c253]
>> > > ...
>> > > Caused by: class
>> > > org.apache.ignite.internal.cluster.ClusterTopologyCheckedException:
>> > > Failed to acquire lock for keys (primary node left grid, retry transaction
>> > > if possible) [keys=[UserKeyCacheObjectImpl [val=GridCacheInternalKeyImpl
>> > > [name=ac83b8cb-3052-49a6-9301-81b20b0ecf3a], hasValBytes=true]],
>> > > node=c321fcc4-5db5-4b03-9811-6a5587f2c253]
>> > > at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.newTopologyException(GridDhtColocatedLockFuture.java:1199)
>> > > ... 10 more
>> > >
>> > > I'm still trying to find out how to reproduce this behavior exactly;
>> > > I'll send you more details once I try a few more things.
>> > >
>> > > I am still using a partitioned cache - does it make sense to use a
>> > > replicated cache instead?
>> > >
>> > > Other than that, I'm done with everything else.
>> > >
>> > > Thanks,
>> > > Vladisav
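[Editor's note: the exception text itself suggests the recovery path - "retry transaction if possible". A minimal, self-contained sketch of such a retry loop is below; TopologyException is a hypothetical stand-in for ClusterTopologyCheckedException, and the attempt limit is an assumption, not what GridCacheSemaphoreImpl actually does.]

```java
import java.util.concurrent.Callable;

// Illustrative sketch only: not Ignite code. TopologyException stands in
// for org.apache.ignite.internal.cluster.ClusterTopologyCheckedException.
class TopologyRetry {
    static class TopologyException extends Exception {
        TopologyException(String msg) { super(msg); }
    }

    /**
     * Re-runs the closure when the primary node left the grid mid-operation,
     * as the exception message advises, up to a fixed number of attempts.
     */
    static <T> T withRetry(Callable<T> tx, int maxAttempts) throws Exception {
        if (maxAttempts <= 0)
            throw new IllegalArgumentException("maxAttempts must be positive");

        TopologyException last = null;

        for (int i = 0; i < maxAttempts; i++) {
            try {
                return tx.call();
            }
            catch (TopologyException e) {
                last = e; // Topology changed; retry on the new topology.
            }
        }

        throw last; // All attempts failed; surface the last topology error.
    }
}
```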