[jira] [Assigned] (IGNITE-10995) GridDhtPartitionSupplier::handleDemandMessage suppresses errors
[ https://issues.apache.org/jira/browse/IGNITE-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-10995: - Assignee: Stepachev Maksim > GridDhtPartitionSupplier::handleDemandMessage suppresses errors > - > > Key: IGNITE-10995 > URL: https://issues.apache.org/jira/browse/IGNITE-10995 > Project: Ignite > Issue Type: Bug >Reporter: Dmitry Sherstobitov >Assignee: Stepachev Maksim >Priority: Major > Attachments: Screenshot 2019-01-20 at 23.19.08.png > > > Scenario: > # Cluster with data > # Triggered historical rebalance > In this case, if an OOM occurs on the supplier, no failure handler is triggered and the > cluster stays alive with inconsistent data (the target node has MOVING partitions while the > supplier does nothing) > Target rebalance node log: > {code:java} > [15:00:31,418][WARNING][sys-#86][GridDhtPartitionDemander] Rebalancing from > node cancelled [grp=cache_group_4, topVer=AffinityTopologyVersion [topVer=17, > minorTopVer=0], supplier=4cbc66d3-9d2c-4396-8366-2839a8d0cdb6, topic=5]]. > Supplier has failed with error: java.lang.OutOfMemoryError: Java heap > space{code} > Supplier stack trace: > !Screenshot 2019-01-20 at 23.19.08.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
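The fix direction the ticket implies (not suppressing fatal errors inside the demand-message handler) can be sketched minimally as follows. The class, interface, and method names here are hypothetical stand-ins, not Ignite's actual GridDhtPartitionSupplier API:

```java
// Hypothetical sketch: instead of suppressing a fatal error such as
// OutOfMemoryError inside the demand-message handler, propagate it to a
// failure handler so the node can be stopped rather than left half-alive
// with MOVING partitions on the demander side.
public class SupplierErrorHandling {
    interface FailureHandler { void onFailure(Throwable t); }

    /** Returns true if the supply task completed, false if a fatal error was reported. */
    static boolean handleDemand(Runnable supplyTask, FailureHandler hnd) {
        try {
            supplyTask.run();
            return true;
        }
        catch (OutOfMemoryError e) {
            hnd.onFailure(e); // report instead of swallowing
            return false;
        }
    }
}
```

The key point is that `OutOfMemoryError` is an `Error`, not an `Exception`, so a generic `catch (Exception e)` silently lets it escape or, worse, a broad `catch (Throwable t)` can swallow it; the handler must route such errors to the node's failure handling explicitly.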
[jira] [Assigned] (IGNITE-11598) Add possibility to have different rebalance thread pool size for nodes in the cluster
[ https://issues.apache.org/jira/browse/IGNITE-11598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11598: - Assignee: Stepachev Maksim > Add possibility to have different rebalance thread pool size for nodes in the > cluster > - > > Key: IGNITE-11598 > URL: https://issues.apache.org/jira/browse/IGNITE-11598 > Project: Ignite > Issue Type: Improvement >Reporter: Evgenii Zhuravlev >Assignee: Stepachev Maksim >Priority: Major > > It can be used for changing this property without downtime when rebalance is > slow -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11632) Node can't start if WAL is corrupted and the WAL archiver is disabled.
Stepachev Maksim created IGNITE-11632: - Summary: Node can't start if WAL is corrupted and the WAL archiver is disabled. Key: IGNITE-11632 URL: https://issues.apache.org/jira/browse/IGNITE-11632 Project: Ignite Issue Type: Bug Affects Versions: 2.7, 2.6, 2.5 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.7, 2.6, 2.5 If you start a node with the WAL archiver disabled and the last segment page has a wrong CRC, the node stops with an exception. {code:java} Caused by: class org.apache.ignite.IgniteCheckedException: Failed to read WAL record at position: 234728337 size: 268435456 at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:394) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer.readRecord(RecordV2Serializer.java:235) at org.apache.ignite.internal.processors.cache.persistence.wal.AbstractWalRecordsIterator.advanceRecord(AbstractWalRecordsIterator.java:243) ... 23 more Caused by: class org.apache.ignite.internal.processors.cache.persistence.wal.crc.IgniteDataIntegrityViolationException: val: -202263192 writtenCrc: 0 at org.apache.ignite.internal.processors.cache.persistence.wal.io.FileInput$Crc32CheckingFileInput.close(FileInput.java:106) at org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV1Serializer.readWithCrc(RecordV1Serializer.java:380) ... 25 more {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
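The IgniteDataIntegrityViolationException above ("val: ... writtenCrc: 0") comes from comparing a checksum stored with the record against one recomputed over the record bytes. A minimal, illustrative sketch of that kind of check (not Ignite's actual RecordV1Serializer logic) using the standard java.util.zip.CRC32:

```java
import java.util.zip.CRC32;

// Illustrative sketch of a data-integrity check: recompute CRC32 over the
// record payload and compare it with the checksum stored alongside it.
// A mismatch means the record (e.g. the tail of a WAL segment after a crash)
// is corrupted or was never fully written.
public class CrcCheck {
    static boolean crcMatches(byte[] payload, long storedCrc) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue() == storedCrc;
    }
}
```

The bug report's point is about policy, not the check itself: a CRC mismatch in the *last* segment is expected after a crash (a half-written record), so the node should truncate and continue rather than refuse to start.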
[jira] [Commented] (IGNITE-11050) Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock
[ https://issues.apache.org/jira/browse/IGNITE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777853#comment-16777853 ] Stepachev Maksim commented on IGNITE-11050: --- It's true. I removed it. > Potential deadlock caused by DhtColocatedLockFuture#map being called inside > topology read lock > -- > > Key: IGNITE-11050 > URL: https://issues.apache.org/jira/browse/IGNITE-11050 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Critical > Labels: MakeTeamcityGreenAgain > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > I observed the following stacktrace on TC during tests analysis: > {code} > Thread > [name="exchange-worker-#18471%near.GridCachePartitionedNodeRestartTest0%", > id=23715, state=WAITING, blockCnt=860, waitCnt=775] > Lock > [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2bfb6b49, > ownerName=null, ownerId=-1] > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:173) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:142) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:925) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:826) > 
at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:70) > at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:89) > at > o.a.i.i.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:1019) > at > o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:544) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.txUnlock(IgniteTxManager.java:1764) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.unlockMultiple(IgniteTxManager.java:1775) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.rollbackTx(IgniteTxManager.java:1347) > at > o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userRollback(IgniteTxLocalAdapter.java:1075) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3602) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:440) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:390) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3833) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3784) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4409) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4399) > at o.a.i.i.util.lang.IgniteClosureX.apply(IgniteClosureX.java:38) > at > o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) > at > 
o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) > at > o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490) > at >
[jira] [Assigned] (IGNITE-10900) Print a warning if native persistence is used without an explicit consistent ID
[ https://issues.apache.org/jira/browse/IGNITE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-10900: - Assignee: Stepachev Maksim > Print a warning if native persistence is used without an explicit consistent > ID > --- > > Key: IGNITE-10900 > URL: https://issues.apache.org/jira/browse/IGNITE-10900 > Project: Ignite > Issue Type: Bug >Reporter: Stanislav Lukyanov >Assignee: Stepachev Maksim >Priority: Major > > Experience shows that when Native Persistence is enabled, it is better to > explicitly set ConsistentIDs than use the autogenerated ones. > First, it simplifies managing the baseline topology. It is much easier to > manage it via control.sh when the nodes have stable and meaningful names. > Second, it helps to avoid certain shoot-yourself-in-the-foot issues. E.g. if > one loses all the data of a baseline node, when that node is restarted it > doesn't have its old autogenerated consistent ID - so it is not a part of the > baseline anymore. This may be unexpected and confusing. > Finally, having explicit consistent IDs improves the general stability of the > setup - one knows what the set of nodes is, where they run and what they're > called. > All in all, it seems beneficial to urge users to explicitly configure > consistent IDs. We can do this by introducing a warning that is printed every > time a new consistent ID is automatically generated. It should also be > printed when a node doesn't have an explicit consistent ID and picks up one > from an existing persistence folder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
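The proposed check could look roughly like the sketch below. The helper class and the warning wording are illustrative assumptions, not the text actually committed for this ticket; only `IgniteConfiguration.setConsistentId()` is a real Ignite API:

```java
// Hypothetical sketch of the proposed warning: when native persistence is
// enabled and no consistent ID is configured, warn that an auto-generated
// (or persistence-folder-derived) ID will be used.
public class ConsistentIdWarning {
    /** Returns the warning text, or null if there is nothing to warn about. */
    static String warningFor(boolean persistenceEnabled, Object consistentId) {
        if (persistenceEnabled && consistentId == null)
            return "Consistent ID is not set, an auto-generated one will be used. " +
                "Set IgniteConfiguration.setConsistentId() explicitly to simplify " +
                "baseline topology management.";

        return null; // explicit ID configured, or persistence disabled
    }
}
```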
[jira] [Created] (IGNITE-11352) Compatibility with 2.7 with mode: statistics enabled
Stepachev Maksim created IGNITE-11352: - Summary: Compatibility with 2.7 with mode: statistics enabled Key: IGNITE-11352 URL: https://issues.apache.org/jira/browse/IGNITE-11352 Project: Ignite Issue Type: Bug Affects Versions: 2.7 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.7 The problem was found when we performed a rolling upgrade from 2.4 to 2.7. The root of the problem is CacheMetricsSnapshot: it doesn't work with the previous version of the protocol. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11352) Compatibility with 2.7 with mode: statistics enabled
[ https://issues.apache.org/jira/browse/IGNITE-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11352: -- Description: The problem was found: there are incompatible changes in serialization between the 2.4 and 2.7 versions. The root of the problem is CacheMetricsSnapshot. was: The problem was found when we performed a rolling upgrade from 2.4 to 2.7. The root of the problem is CacheMetricsSnapshot: it doesn't work with the previous version of the protocol. > Compatibility with 2.7 with mode: statistics enabled > - > > Key: IGNITE-11352 > URL: https://issues.apache.org/jira/browse/IGNITE-11352 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.7 > > Time Spent: 10m > Remaining Estimate: 0h > > The problem was found: there are incompatible changes in serialization > between the 2.4 and 2.7 versions. The root of the problem is CacheMetricsSnapshot. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
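A common way to keep such serialized snapshots readable across releases is to prefix the stream with a protocol version, so a reader can detect an incompatible layout instead of silently misreading it. The sketch below is illustrative only and is not Ignite's actual CacheMetricsSnapshot wire format; the field set (two longs) is a made-up stand-in:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative sketch: a version byte written before the payload lets a
// reader reject (or adapt to) a layout it does not understand.
public class VersionedSnapshot {
    static byte[] write(int protoVer, long hits, long misses) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(protoVer); // version prefix comes first
            out.writeLong(hits);
            out.writeLong(misses);
            return bos.toByteArray();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for in-memory streams
        }
    }

    static long[] read(byte[] data, int expectedVer) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int ver = in.readByte();
            if (ver != expectedVer)
                throw new IllegalStateException("Unsupported protocol version: " + ver);
            return new long[] { in.readLong(), in.readLong() };
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without such a prefix, adding or reordering fields between 2.4 and 2.7 makes an old reader interpret the new byte layout as garbage, which is exactly the class of breakage this ticket describes.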
[jira] [Assigned] (IGNITE-10900) Print a warning if native persistence is used without an explicit consistent ID
[ https://issues.apache.org/jira/browse/IGNITE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-10900: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) > Print a warning if native persistence is used without an explicit consistent > ID > --- > > Key: IGNITE-10900 > URL: https://issues.apache.org/jira/browse/IGNITE-10900 > Project: Ignite > Issue Type: Bug >Reporter: Stanislav Lukyanov >Assignee: Alexey Goncharuk >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Experience shows that when Native Persistence is enabled, it is better to > explicitly set ConsistentIDs than use the autogenerated ones. > First, it simplifies managing the baseline topology. It is much easier to > manage it via control.sh when the nodes have stable and meaningful names. > Second, it helps to avoid certain shoot-yourself-in-the-foot issues. E.g. if > one loses all the data of a baseline node, when that node is restarted it > doesn't have its old autogenerated consistent ID - so it is not a part of the > baseline anymore. This may be unexpected and confusing. > Finally, having explicit consistent IDs improves the general stability of the > setup - one knows what the set of nodes is, where they run and what they're > called. > All in all, it seems beneficial to urge users to explicitly configure > consistent IDs. We can do this by introducing a warning that is printed every > time a new consistent ID is automatically generated. It should also be > printed when a node doesn't have an explicit consistent ID and picks up one > from an existing persistence folder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11050) Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock
[ https://issues.apache.org/jira/browse/IGNITE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11050: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) > Potential deadlock caused by DhtColocatedLockFuture#map being called inside > topology read lock > -- > > Key: IGNITE-11050 > URL: https://issues.apache.org/jira/browse/IGNITE-11050 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Critical > Labels: MakeTeamcityGreenAgain > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > I observed the following stacktrace on TC during tests analysis: > {code} > Thread > [name="exchange-worker-#18471%near.GridCachePartitionedNodeRestartTest0%", > id=23715, state=WAITING, blockCnt=860, waitCnt=775] > Lock > [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2bfb6b49, > ownerName=null, ownerId=-1] > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:173) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:142) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:925) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:826) > at > 
o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:70) > at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:89) > at > o.a.i.i.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:1019) > at > o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:544) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.txUnlock(IgniteTxManager.java:1764) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.unlockMultiple(IgniteTxManager.java:1775) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.rollbackTx(IgniteTxManager.java:1347) > at > o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userRollback(IgniteTxLocalAdapter.java:1075) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3602) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:440) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:390) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3833) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3784) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4409) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4399) > at o.a.i.i.util.lang.IgniteClosureX.apply(IgniteClosureX.java:38) > at > o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) > at > 
o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) > at > o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490) > at >
[jira] [Closed] (IGNITE-11352) CacheMetricsSnapshot deserialization may be broken in some cases
[ https://issues.apache.org/jira/browse/IGNITE-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim closed IGNITE-11352. - > CacheMetricsSnapshot deserialization may be broken in some cases > > > Key: IGNITE-11352 > URL: https://issues.apache.org/jira/browse/IGNITE-11352 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7 >Reporter: Stepachev Maksim >Assignee: Alexey Goncharuk >Priority: Major > Fix For: 2.7 > > Time Spent: 20m > Remaining Estimate: 0h > > There are incompatible changes in CacheMetricsSnapshot serialization > between the 2.4 and 2.7 versions. This may affect users if the event is being > stored to some external storage. The fix should be fairly simple as there is > already some code that checks the stream contents. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10900) Print a warning if native persistence is used without an explicit consistent ID
[ https://issues.apache.org/jira/browse/IGNITE-10900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773901#comment-16773901 ] Stepachev Maksim commented on IGNITE-10900: --- [~slukyanov], [~agoncharuk] Should I add more details to the new message, or is this text fine (from the PR)? > Print a warning if native persistence is used without an explicit consistent > ID > --- > > Key: IGNITE-10900 > URL: https://issues.apache.org/jira/browse/IGNITE-10900 > Project: Ignite > Issue Type: Bug >Reporter: Stanislav Lukyanov >Assignee: Alexey Goncharuk >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Experience shows that when Native Persistence is enabled, it is better to > explicitly set ConsistentIDs than use the autogenerated ones. > First, it simplifies managing the baseline topology. It is much easier to > manage it via control.sh when the nodes have stable and meaningful names. > Second, it helps to avoid certain shoot-yourself-in-the-foot issues. E.g. if > one loses all the data of a baseline node, when that node is restarted it > doesn't have its old autogenerated consistent ID - so it is not a part of the > baseline anymore. This may be unexpected and confusing. > Finally, having explicit consistent IDs improves the general stability of the > setup - one knows what the set of nodes is, where they run and what they're > called. > All in all, it seems beneficial to urge users to explicitly configure > consistent IDs. We can do this by introducing a warning that is printed every > time a new consistent ID is automatically generated. It should also be > printed when a node doesn't have an explicit consistent ID and picks up one > from an existing persistence folder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11359) Improvement of tests. Add additional state check after each test.
Stepachev Maksim created IGNITE-11359: - Summary: Improvement of tests. Add additional state check after each test. Key: IGNITE-11359 URL: https://issues.apache.org/jira/browse/IGNITE-11359 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim Assignee: Stepachev Maksim Sometimes flaky tests are interrupted by an OOM. There are many reasons for this, but the main one is a memory leak in transactions. A good way to detect the problem quickly is to additionally check the maps of transaction futures after each test: they must be empty. Add this logic to GridCommonAbstractTest. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
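The suggested after-test check could be sketched like this. The map parameter and method name are hypothetical stand-ins for the transaction-futures collections that GridCommonAbstractTest would inspect; this is not the committed implementation:

```java
import java.util.Map;

// Hypothetical sketch of an after-test state check: any transaction future
// still registered when the test ends indicates a leak, so fail the test
// immediately instead of letting leaks accumulate into an OOM later.
public class LeakCheck {
    static void assertNoLeakedFutures(Map<Long, ?> txFuts) {
        if (!txFuts.isEmpty())
            throw new AssertionError("Leaked transaction futures: " + txFuts.keySet());
    }
}
```

Failing at the end of the leaking test pinpoints the culprit, whereas the OOM it would otherwise cause typically crashes some unrelated later test in the suite.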
[jira] [Closed] (IGNITE-11205) Cache (Restarts) 1 flaky tests
[ https://issues.apache.org/jira/browse/IGNITE-11205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim closed IGNITE-11205. - > Cache (Restarts) 1 flaky tests > -- > > Key: IGNITE-11205 > URL: https://issues.apache.org/jira/browse/IGNITE-11205 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11050) Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock
[ https://issues.apache.org/jira/browse/IGNITE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11050: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) I fixed it. Could you do a review? > Potential deadlock caused by DhtColocatedLockFuture#map being called inside > topology read lock > -- > > Key: IGNITE-11050 > URL: https://issues.apache.org/jira/browse/IGNITE-11050 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Critical > Labels: MakeTeamcityGreenAgain > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > I observed the following stacktrace on TC during tests analysis: > {code} > Thread > [name="exchange-worker-#18471%near.GridCachePartitionedNodeRestartTest0%", > id=23715, state=WAITING, blockCnt=860, waitCnt=775] > Lock > [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2bfb6b49, > ownerName=null, ownerId=-1] > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:173) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:142) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:925) > at > 
o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:826) > at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:70) > at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:89) > at > o.a.i.i.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:1019) > at > o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:544) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.txUnlock(IgniteTxManager.java:1764) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.unlockMultiple(IgniteTxManager.java:1775) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.rollbackTx(IgniteTxManager.java:1347) > at > o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userRollback(IgniteTxLocalAdapter.java:1075) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3602) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:440) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:390) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3833) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3784) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4409) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4399) > at o.a.i.i.util.lang.IgniteClosureX.apply(IgniteClosureX.java:38) > at > o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) > at > 
o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) > at > o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) > at > o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) > at >
[jira] [Assigned] (IGNITE-11276) Cache (Restarts) 1 flaky tests NPE at GridDhtPartitionsExchangeFuture.topologyVersion
[ https://issues.apache.org/jira/browse/IGNITE-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11276: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) > Cache (Restarts) 1 flaky tests NPE at > GridDhtPartitionsExchangeFuture.topologyVersion > - > > Key: IGNITE-11276 > URL: https://issues.apache.org/jira/browse/IGNITE-11276 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Alexey Goncharuk >Priority: Critical > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Sometimes the Cache (Restarts) 1 suite finishes with a failure. The reason is an NPE at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.topologyVersion(GridDhtPartitionsExchangeFuture.java:515) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11276) Cache (Restarts) 1 flaky tests NPE at GridDhtPartitionsExchangeFuture.topologyVersion
[ https://issues.apache.org/jira/browse/IGNITE-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11276: - Assignee: Stepachev Maksim (was: Alexey Goncharuk) > Cache (Restarts) 1 flaky tests NPE at > GridDhtPartitionsExchangeFuture.topologyVersion > - > > Key: IGNITE-11276 > URL: https://issues.apache.org/jira/browse/IGNITE-11276 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Critical > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Sometimes the Cache (Restarts) 1 suite finishes with a failure. The reason is an NPE at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.topologyVersion(GridDhtPartitionsExchangeFuture.java:515) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11276) Cache (Restarts) 1 flaky tests NPE at GridDhtPartitionsExchangeFuture.topologyVersion
[ https://issues.apache.org/jira/browse/IGNITE-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11276: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) > Cache (Restarts) 1 flaky tests NPE at > GridDhtPartitionsExchangeFuture.topologyVersion > - > > Key: IGNITE-11276 > URL: https://issues.apache.org/jira/browse/IGNITE-11276 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Alexey Goncharuk >Priority: Critical > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Sometimes the Cache (Restarts) 1 suite finishes with a failure. The reason is an NPE at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.topologyVersion(GridDhtPartitionsExchangeFuture.java:515) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-11136) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover waits an hour if something goes wrong
[ https://issues.apache.org/jira/browse/IGNITE-11136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim resolved IGNITE-11136. --- Resolution: Fixed > CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover > waits an hour if something goes wrong > -- > > Key: IGNITE-11136 > URL: https://issues.apache.org/jira/browse/IGNITE-11136 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > > For example, if the test blocks at a getAndPut, it runs for an hour. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-11281) Cache (Restarts) 1 flaky tests SEGMENTATION problem
[ https://issues.apache.org/jira/browse/IGNITE-11281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim resolved IGNITE-11281. --- Resolution: Duplicate > Cache (Restarts) 1 flaky tests SEGMENTATION problem > --- > > Key: IGNITE-11281 > URL: https://issues.apache.org/jira/browse/IGNITE-11281 > Project: Ignite > Issue Type: Test >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > > The cache restarts suite was interrupted with a failure. One of the reasons is: > {code:java} > Stopping local node on Ignite failure: [failureCtx=FailureContext > [type=SEGMENTATION, err=null]]{code} > It happens because the suite sets the property: > System.setProperty(IgniteSystemProperties.IGNITE_ENABLE_FORCIBLE_NODE_KILL, > "true"); > There is no reason for it in this test. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11281) Cache (Restarts) 1 flaky tests SEGMENTATION problem
Stepachev Maksim created IGNITE-11281: - Summary: Cache (Restarts) 1 flaky tests SEGMENTATION problem Key: IGNITE-11281 URL: https://issues.apache.org/jira/browse/IGNITE-11281 Project: Ignite Issue Type: Test Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 The cache restarts suite was interrupted with a failure. One of the reasons is: {code:java} Stopping local node on Ignite failure: [failureCtx=FailureContext [type=SEGMENTATION, err=null]]{code} It happens because the suite sets the property: System.setProperty(IgniteSystemProperties.IGNITE_ENABLE_FORCIBLE_NODE_KILL, "true"); There is no reason for it in this test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11359) Improvement of tests. Add additional state check after each test.
[ https://issues.apache.org/jira/browse/IGNITE-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11359: - Assignee: (was: Stepachev Maksim) > Improvement of tests. Add additional state check after each test. > - > > Key: IGNITE-11359 > URL: https://issues.apache.org/jira/browse/IGNITE-11359 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Priority: Major > > Sometimes flaky tests are interrupted with an OOM. There are many reasons > for it, but the main one is a memory leak in transactions. A good way to > detect the problem quickly is to additionally check the maps with transaction > futures after each test; they must be empty. Add this logic into GridCommonAbstractTest. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
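The IGNITE-11359 proposal can be sketched as a small post-test assertion. Everything here (the method name, the map shape, the error text) is illustrative and is not the actual GridCommonAbstractTest API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed check: after each test, the maps that
// hold transaction futures must be empty; otherwise fail fast with a
// descriptive error instead of a later OOM. Names are illustrative.
public class AfterTestStateCheck {
    /** Fails if the given futures map still holds entries after a test. */
    static void checkTxFuturesEmpty(Map<Long, ?> txFuts) {
        if (!txFuts.isEmpty())
            throw new AssertionError("Leaked transaction futures after test: " + txFuts.size());
    }

    public static void main(String[] args) {
        Map<Long, Object> futs = new ConcurrentHashMap<>();

        checkTxFuturesEmpty(futs); // Passes: the map is empty.

        futs.put(1L, new Object()); // Simulate a leaked future.

        try {
            checkTxFuturesEmpty(futs);

            throw new IllegalStateException("check should have failed");
        }
        catch (AssertionError expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```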
[jira] [Assigned] (IGNITE-11352) CacheMetricsSnapshot deserialization may be broken in some cases
[ https://issues.apache.org/jira/browse/IGNITE-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11352: - Assignee: Alexey Goncharuk (was: Stepachev Maksim) > CacheMetricsSnapshot deserialization may be broken in some cases > > > Key: IGNITE-11352 > URL: https://issues.apache.org/jira/browse/IGNITE-11352 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7 >Reporter: Stepachev Maksim >Assignee: Alexey Goncharuk >Priority: Major > Fix For: 2.7 > > Time Spent: 10m > Remaining Estimate: 0h > > There is an incompatible changes in CacheMetricsSnapshot serialization > between 2.4 and 2.7 versions. This may affect users if the event is being > stored to some external storage. The fix should be fairly simple as there is > already some code that checks the stream contents. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11276) Cache (Restarts) 1 flaky tests NPE at GridDhtPartitionsExchangeFuture.topologyVersion
Stepachev Maksim created IGNITE-11276: - Summary: Cache (Restarts) 1 flaky tests NPE at GridDhtPartitionsExchangeFuture.topologyVersion Key: IGNITE-11276 URL: https://issues.apache.org/jira/browse/IGNITE-11276 Project: Ignite Issue Type: Bug Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 Sometimes the Cache (Restarts) 1 suite finishes with a failure. The reason is an NPE at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.topologyVersion(GridDhtPartitionsExchangeFuture.java:515) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11136) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover wait an hour if something wrong happens
Stepachev Maksim created IGNITE-11136: - Summary: CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover wait an hour if something wrong happens Key: IGNITE-11136 URL: https://issues.apache.org/jira/browse/IGNITE-11136 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 For example, if the test blocks at a getAndPut call, it keeps running for an hour. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11129) Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE
[ https://issues.apache.org/jira/browse/IGNITE-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11129: - Assignee: Stepachev Maksim > Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE > > > Key: IGNITE-11129 > URL: https://issues.apache.org/jira/browse/IGNITE-11129 > Project: Ignite > Issue Type: Bug >Reporter: Andrey Gura >Assignee: Stepachev Maksim >Priority: Major > > Size of {{SWITCH_SEGMENT_RECORD}} will be invalid in case of encryption > switched on. Size for this record type must be constant. > See > {{org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordDataV1Serializer#size}}: > {code:java} > @Override public int size(WALRecord record) throws IgniteCheckedException > { > int clSz = plainSize(record); > if (needEncryption(record)) > return encSpi.encryptedSize(clSz) + 4 /* groupId */ + 4 /* data > size */ + REC_TYPE_SIZE; > return clSz; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11148) PartitionCountersNeighborcastFuture blocks partition map exchange
[ https://issues.apache.org/jira/browse/IGNITE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11148: -- Description: We researched a problem with "execution timeout" in Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*. The investigation result showed that we got MVCC problem, as result the test blocks at *getAndPut*, because in some moment wrong behavior happened: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager] Finishing prepared transaction [commit=false, tx=GridDhtTxRemote [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=191ms, onePhaseCommit=false{code} and after that: {code:java} [16:02:56] : 
[Step 4/5] [2019-01-30 13:02:56,931][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery] Starting delivery partition countres to remote nodes [txId=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code} _!IMPORTANT - we work with PartitionCountersNeighborcastFuture which *doesn't provide status information* (monitoring)._ One of possible position of the problem: PartitionCountersNeighborcastFuture.onNodeLeft As result we have the transaction in *state=PREPARED* and *completionTime=0* which never complete : {code:java} [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic] Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f6490] LocalTxReleaseFuture [ topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], futures=[ TxFinishFuture [ tx=GridDhtTxRemote [ nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [ xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, 
order=1548853376061, nodeOrder=3], finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048] {code} was: We researched a problem with "execution timeout" in Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*. The investigation result showed that we got MVCC problem, as result the test blocks at getAndPut, because in some moment wrong behavior happened: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO
[jira] [Updated] (IGNITE-11148) PartitionCountersNeighborcastFuture blocks partition map exchange
[ https://issues.apache.org/jira/browse/IGNITE-11148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11148: -- Description: We researched a problem with "execution timeout" in Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*. The investigation result showed that we got MVCC problem, as result the test blocks at getAndPut, because in some moment wrong behavior happened: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager] Finishing prepared transaction [commit=false, tx=GridDhtTxRemote [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=191ms, onePhaseCommit=false{code} and after that: {code:java} [16:02:56] : 
[Step 4/5] [2019-01-30 13:02:56,931][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery] Starting delivery partition countres to remote nodes [txId=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code} _!IMPORTANT - we work with PartitionCountersNeighborcastFuture which doesn't provide status information (monitoring)._ One of possible position of the problem: PartitionCountersNeighborcastFuture.onNodeLeft As result we have the transaction in state=PREPARED and completionTime=0 which never complete : {code:java} [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic] Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f6490] LocalTxReleaseFuture [ topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], futures=[ TxFinishFuture [ tx=GridDhtTxRemote [ nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [ xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, 
order=1548853376061, nodeOrder=3], finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048] {code} was: We researched a problem with "execution timeout" in the Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*. The investigation result showed that we got MVCC problem, as result the test blocks at getAndPut, because in some moment wrong behavior happened: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO
[jira] [Created] (IGNITE-11152) IgniteTxManager.idMap possible memory leak
Stepachev Maksim created IGNITE-11152: - Summary: IgniteTxManager.idMap possible memory leak Key: IGNITE-11152 URL: https://issues.apache.org/jira/browse/IGNITE-11152 Project: Ignite Issue Type: Bug Components: mvcc Reporter: Stepachev Maksim Fix For: 2.8 CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover sometimes finishes with an OOM. Heap dump analysis showed that the leak happened in IgniteTxManager.idMap: this map contained *2_097_152* instances of GridNearTxLocal in *ACTIVE state* and *without* finishFut *and prepFut.* {code:java} while (!updated) { try { prevVal = (Integer)qryClnCache.getAndPut(key, val); updated = true; } catch (CacheException e) { assertSame(atomicityMode(), CacheAtomicityMode.TRANSACTIONAL_SNAPSHOT); } } {code} Possibly the CacheException catch is too broad and may hide other failure cases. Change it to a specific exception (IGNITE-10976). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
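The suggested narrowing of the catch clause could look like the following simplified sketch. The exception hierarchy, the Op interface, and the retry helper are all invented for illustration; they are not Ignite's real classes:

```java
// Hypothetical sketch: retry only on the one expected exception instead of
// the broad CacheException, so unrelated failures surface immediately.
public class NarrowRetrySketch {
    /** Stand-in for the broad javax.cache.CacheException. */
    static class CacheException extends RuntimeException {
        CacheException(String msg) { super(msg); }
    }

    /** The single case the retry loop is meant to absorb. */
    static class TxSerializationException extends CacheException {
        TxSerializationException(String msg) { super(msg); }
    }

    interface Op { int run(); }

    /** Retries op, but only while it throws the expected narrow exception. */
    static int retry(Op op) {
        while (true) {
            try {
                return op.run();
            }
            catch (TxSerializationException e) {
                // Expected under TRANSACTIONAL_SNAPSHOT: retry.
                // Any other CacheException now propagates to the caller.
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};

        int res = retry(() -> {
            if (attempts[0]++ < 2)
                throw new TxSerializationException("retry me");

            return 42;
        });

        System.out.println("result=" + res + " attempts=" + attempts[0]);
    }
}
```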
[jira] [Created] (IGNITE-11148) PartitionCountersNeighborcastFuture blocks partition map exchange
Stepachev Maksim created IGNITE-11148: - Summary: PartitionCountersNeighborcastFuture blocks partition map exchange Key: IGNITE-11148 URL: https://issues.apache.org/jira/browse/IGNITE-11148 Project: Ignite Issue Type: Bug Components: mvcc Reporter: Stepachev Maksim We researched a problem with "execution timeout" in the Continuous Query 2 for *CacheContinuousQueryAsyncFailoverMvccTxSelfTest.testMultiThreadedFailover*. The investigation result showed that we got MVCC problem, as result the test blocks at getAndPut, because in some moment wrong behavior happened: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,923][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][IgniteTxManager] Finishing prepared transaction [commit=false, tx=GridDhtTxRemote [nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], finalizing=NONE, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, 
parentTx=null, duration=191ms, onePhaseCommit=false{code} and after that: {code:java} [16:02:56] : [Step 4/5] [2019-01-30 13:02:56,931][INFO ][sys-stripe-6-#9%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][recovery] Starting delivery partition countres to remote nodes [txId=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], futId=82cfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4{code} _!IMPORTANT - we work with PartitionCountersNeighborcastFuture which doesn't provide status information (monitoring)._ One of possible position of the problem: PartitionCountersNeighborcastFuture.onNodeLeft As result we have the transaction in state=PREPARED and completionTime=0 which never complete : {code:java} [16:03:16]W: [org.apache.ignite:ignite-indexing] [2019-01-30 13:03:16,776][WARN ][exchange-worker-#40%continuous.CacheContinuousQueryAsyncFailoverMvccTxSelfTest0%][diagnostic] Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], node=18519119-475a-448f-8c02-ff1f6490] LocalTxReleaseFuture [ topVer=AffinityTopologyVersion [topVer=8, minorTopVer=0], futures=[ TxFinishFuture [ tx=GridDhtTxRemote [ nearNodeId=6a8546ab-f09d-4b0c-91c1-5fcf5b94, rmtFutId=95bfade9861-4f5107b4-70e5-44ef-96d4-1b18cd6b57e4, nearXidVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=5], storeWriteThrough=false, super=GridDistributedTxRemoteAdapter [explicitVers=null, started=true, commitAllowed=0, txState=IgniteTxRemoteStateImpl [readMap=EmptyMap {}, writeMap=ConcurrentLinkedHashMap {}], txLbl=null, super=IgniteTxAdapter [ xidVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], writeVer=GridCacheVersion [topVer=16078, order=1548853376062, nodeOrder=3], implicit=false, loc=false, threadId=21, startTime=1548853376731, nodeId=3e6881c0-1e96-42a9-8bd1-55d344c2, startVer=GridCacheVersion [topVer=16078, order=1548853376060, nodeOrder=1], endVer=null, isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, 
timeout=0, sysInvalidate=false, sys=false, plc=2, commitVer=GridCacheVersion [topVer=16078, order=1548853376061, nodeOrder=3], finalizing=RECOVERY_FINISH, invalidParts=null, state=PREPARED, timedOut=false, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], mvccSnapshot=MvccSnapshotWithoutTxs [crdVer=1548853371043, cntr=207, cleanupVer=204, opCntr=0], skipCompletedVers=false, parentTx=null, duration=20048ms, onePhaseCommit=false]]], completionTime=0, duration=20048] {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-10905) org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException happens during rolling restart of a cluster
[ https://issues.apache.org/jira/browse/IGNITE-10905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim resolved IGNITE-10905. --- Resolution: Duplicate It's a duplicate of https://issues.apache.org/jira/browse/IGNITE-9803 . > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException > happens during rolling restart of a cluster > - > > Key: IGNITE-10905 > URL: https://issues.apache.org/jira/browse/IGNITE-10905 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.7 >Reporter: Maxim Pudov >Assignee: Stepachev Maksim >Priority: Critical > Fix For: 2.8 > > > JVM is halted after this error during rolling restart of a cluster: > {code} > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException: > Adding entry to partition that is concurrently evicted [grp=cacheGroup_7, > part=518, shouldBeMoving=, belongs=true, topVer=AffinityTopologyVersion > [topVer=42, minorTopVer=0], curTopVer=AffinityTopologyVersion [topVer=43, > minorTopVer=0]] > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:950) > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:825) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.handleSupplyMessage(GridDhtPartitionDemander.java:744) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleSupplyMessage(GridDhtPreloader.java:387) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:418) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:408) > at > 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:101) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1613) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:127) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2768) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1529) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:127) > at > org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1498) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11050) Potential deadlock caused by DhtColocatedLockFuture#map being called inside topology read lock
[ https://issues.apache.org/jira/browse/IGNITE-11050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11050: - Assignee: Stepachev Maksim > Potential deadlock caused by DhtColocatedLockFuture#map being called inside > topology read lock > -- > > Key: IGNITE-11050 > URL: https://issues.apache.org/jira/browse/IGNITE-11050 > Project: Ignite > Issue Type: Bug >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Critical > Labels: MakeTeamcityGreenAgain > Fix For: 2.8 > > > I observed the following stacktrace on TC during tests analysis: > {code} > Thread > [name="exchange-worker-#18471%near.GridCachePartitionedNodeRestartTest0%", > id=23715, state=WAITING, blockCnt=860, waitCnt=775] > Lock > [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2bfb6b49, > ownerName=null, ownerId=-1] > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199) > at > java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock0(StripedCompositeReadWriteLock.java:173) > at > o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.lock(StripedCompositeReadWriteLock.java:142) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:925) > at > o.a.i.i.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:826) > at > 
o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.localPartition(GridCachePartitionedConcurrentMap.java:70) > at > o.a.i.i.processors.cache.distributed.dht.GridCachePartitionedConcurrentMap.putEntryIfObsoleteOrAbsent(GridCachePartitionedConcurrentMap.java:89) > at > o.a.i.i.processors.cache.GridCacheAdapter.entryEx(GridCacheAdapter.java:1019) > at > o.a.i.i.processors.cache.distributed.dht.GridDhtCacheAdapter.entryEx(GridDhtCacheAdapter.java:544) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.txUnlock(IgniteTxManager.java:1764) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.unlockMultiple(IgniteTxManager.java:1775) > at > o.a.i.i.processors.cache.transactions.IgniteTxManager.rollbackTx(IgniteTxManager.java:1347) > at > o.a.i.i.processors.cache.transactions.IgniteTxLocalAdapter.userRollback(IgniteTxLocalAdapter.java:1075) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3602) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:440) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxFinishFuture.finish(GridNearTxFinishFuture.java:390) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3833) > at > o.a.i.i.processors.cache.distributed.near.GridNearTxLocal.rollbackNearTxLocalAsync(GridNearTxLocal.java:3784) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4409) > at > o.a.i.i.processors.cache.GridCacheAdapter$53.applyx(GridCacheAdapter.java:4399) > at o.a.i.i.util.lang.IgniteClosureX.apply(IgniteClosureX.java:38) > at > o.a.i.i.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70) > at > o.a.i.i.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30) > at > 
o.a.i.i.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:399) > at > o.a.i.i.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:347) > at > o.a.i.i.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:335) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:511) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:490) > at > o.a.i.i.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:478) > at >
[jira] [Assigned] (IGNITE-10905) org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException happens during rolling restart of a cluster
[ https://issues.apache.org/jira/browse/IGNITE-10905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-10905: - Assignee: Stepachev Maksim > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException > happens during rolling restart of a cluster > - > > Key: IGNITE-10905 > URL: https://issues.apache.org/jira/browse/IGNITE-10905 > Project: Ignite > Issue Type: Bug > Components: cache >Affects Versions: 2.7 >Reporter: Maxim Pudov >Assignee: Stepachev Maksim >Priority: Critical > Fix For: 2.8 > > > JVM is halted after this error during rolling restart of a cluster: > {code} > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtInvalidPartitionException: > Adding entry to partition that is concurrently evicted [grp=cacheGroup_7, > part=518, shouldBeMoving=, belongs=true, topVer=AffinityTopologyVersion > [topVer=42, minorTopVer=0], curTopVer=AffinityTopologyVersion [topVer=43, > minorTopVer=0]] > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition0(GridDhtPartitionTopologyImpl.java:950) > at > org.apache.ignite.internal.processors.cache.distributed.dht.topology.GridDhtPartitionTopologyImpl.localPartition(GridDhtPartitionTopologyImpl.java:825) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander.handleSupplyMessage(GridDhtPartitionDemander.java:744) > at > org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader.handleSupplyMessage(GridDhtPreloader.java:387) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:418) > at > org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$5.apply(GridCachePartitionExchangeManager.java:408) > at > 
org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$700(GridCacheIoManager.java:101) > at > org.apache.ignite.internal.processors.cache.GridCacheIoManager$OrderedMessageListener.onMessage(GridCacheIoManager.java:1613) > at > org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4100(GridIoManager.java:127) > at > org.apache.ignite.internal.managers.communication.GridIoManager$GridCommunicationMessageSet.unwind(GridIoManager.java:2768) > at > org.apache.ignite.internal.managers.communication.GridIoManager.unwindMessageSet(GridIoManager.java:1529) > at > org.apache.ignite.internal.managers.communication.GridIoManager.access$4400(GridIoManager.java:127) > at > org.apache.ignite.internal.managers.communication.GridIoManager$10.run(GridIoManager.java:1498) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-11129) Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE
[ https://issues.apache.org/jira/browse/IGNITE-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16757283#comment-16757283 ] Stepachev Maksim commented on IGNITE-11129: --- This case is wrong. The method "needEncryption" contains a condition, which encrypts only records with data. > Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE > > > Key: IGNITE-11129 > URL: https://issues.apache.org/jira/browse/IGNITE-11129 > Project: Ignite > Issue Type: Bug >Reporter: Andrey Gura >Assignee: Stepachev Maksim >Priority: Major > > Size of {{SWITCH_SEGMENT_RECORD}} will be invalid in case of encryption > switched on. Size for this record type must be constant. > See > {{org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordDataV1Serializer#size}}: > {code:java} > @Override public int size(WALRecord record) throws IgniteCheckedException > { > int clSz = plainSize(record); > if (needEncryption(record)) > return encSpi.encryptedSize(clSz) + 4 /* groupId */ + 4 /* data > size */ + REC_TYPE_SIZE; > return clSz; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
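To see why the constant-size requirement can still hold, here is a simplified arithmetic sketch of the size() logic quoted above. The overhead constants and the encryptedSize formula are illustrative stand-ins, not Ignite's actual values:

```java
// Simplified sketch of RecordDataV1Serializer#size from the snippet above.
// Numbers and the encryptedSize formula are illustrative, not Ignite's real
// values; the point is the branch structure, not the constants.
public class RecordSizeSketch {
    static final int REC_TYPE_SIZE = 1;

    /** Stand-in for encSpi.encryptedSize: plain size plus fixed cipher overhead. */
    static int encryptedSize(int plainSz) {
        return plainSz + 16;
    }

    /**
     * Mirrors the quoted size() method: only records that need encryption get
     * the extra overhead, so a marker record such as SWITCH_SEGMENT_RECORD
     * (for which needEncryption is false) keeps a constant size.
     */
    static int size(int plainSz, boolean needEncryption) {
        if (needEncryption)
            return encryptedSize(plainSz) + 4 /* groupId */ + 4 /* data size */ + REC_TYPE_SIZE;

        return plainSz;
    }

    public static void main(String[] args) {
        // A marker record without data bypasses the encryption branch...
        System.out.println("marker record: " + size(0, false));

        // ...while a data record grows by the encryption overhead.
        System.out.println("data record: " + size(100, true));
    }
}
```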
[jira] [Resolved] (IGNITE-11129) Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE
[ https://issues.apache.org/jira/browse/IGNITE-11129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim resolved IGNITE-11129. --- Resolution: Invalid > Incorrect size calculation for SWITCH_SEGMENT_RECORD for TDE > > > Key: IGNITE-11129 > URL: https://issues.apache.org/jira/browse/IGNITE-11129 > Project: Ignite > Issue Type: Bug >Reporter: Andrey Gura >Assignee: Stepachev Maksim >Priority: Major > > Size of {{SWITCH_SEGMENT_RECORD}} will be invalid in case of encryption > switched on. Size for this record type must be constant. > See > {{org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordDataV1Serializer#size}}: > {code:java} > @Override public int size(WALRecord record) throws IgniteCheckedException > { > int clSz = plainSize(record); > if (needEncryption(record)) > return encSpi.encryptedSize(clSz) + 4 /* groupId */ + 4 /* data > size */ + REC_TYPE_SIZE; > return clSz; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
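The "Invalid" resolution above rests on the needEncryption guard: records that carry no cache data are never encrypted, so their serialized size stays constant. A minimal sketch of that logic follows — names mirror the snippet in the ticket, but this is not the actual RecordDataV1Serializer implementation, and the constants are assumptions.

```java
// Sketch of the guard that makes IGNITE-11129 invalid: service records such
// as SWITCH_SEGMENT_RECORD never pass needEncryption, so their size stays
// constant. NOT the actual Apache Ignite code; names are illustrative.
public class WalRecordSizeSketch {
    static final int REC_TYPE_SIZE = 1; // assumed 1-byte record-type marker

    // Only records that carry cache data are encrypted.
    static boolean needEncryption(String recType) {
        return !"SWITCH_SEGMENT_RECORD".equals(recType);
    }

    static int size(String recType, int plainSize, int encOverhead) {
        if (needEncryption(recType))
            return plainSize + encOverhead + 4 /* groupId */ + 4 /* data size */ + REC_TYPE_SIZE;

        return plainSize; // constant for SWITCH_SEGMENT_RECORD
    }

    public static void main(String[] args) {
        System.out.println(size("SWITCH_SEGMENT_RECORD", 8, 16)); // prints 8
        System.out.println(size("DATA_RECORD", 8, 16));           // prints 33
    }
}
```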
[jira] [Created] (IGNITE-11205) Cache (Restarts) 1 flaky tests
Stepachev Maksim created IGNITE-11205: - Summary: Cache (Restarts) 1 flaky tests Key: IGNITE-11205 URL: https://issues.apache.org/jira/browse/IGNITE-11205 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim Assignee: Stepachev Maksim -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10995) GridDhtPartitionSupplier::handleDemandMessage suppress errors
[ https://issues.apache.org/jira/browse/IGNITE-10995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754803#comment-16754803 ] Stepachev Maksim commented on IGNITE-10995: --- 1. Fixed. 2. Fixed. 3. Yes, the main reason is the behavior of configuration with a FileIOFactory. It can be overridden only in this mode. > GridDhtPartitionSupplier::handleDemandMessage suppress errors > - > > Key: IGNITE-10995 > URL: https://issues.apache.org/jira/browse/IGNITE-10995 > Project: Ignite > Issue Type: Bug >Reporter: Dmitry Sherstobitov >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Attachments: Screenshot 2019-01-20 at 23.19.08.png > > Time Spent: 10m > Remaining Estimate: 0h > > Scenario: > # Cluster with data > # Triggered historical rebalance > In this case if OOM occurs on supplier there is no failHandler triggered and > cluster is alive with inconsistent data (target node have MOVING partitions, > supplier do nothing) > Target rebalance node log: > {code:java} > [15:00:31,418][WARNING][sys-#86][GridDhtPartitionDemander] Rebalancing from > node cancelled [grp=cache_group_4, topVer=AffinityTopologyVersion [topVer=17, > minorTopVer=0], supplier=4cbc66d3-9d2c-4396-8366-2839a8d0cdb6, topic=5]]. > Supplier has failed with error: java.lang.OutOfMemoryError: Java heap > space{code} > Supplier stack trace: > !Screenshot 2019-01-20 at 23.19.08.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
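The fix for IGNITE-10995 boils down to not swallowing Errors in the supply-message handler: an OutOfMemoryError must reach a failure handler instead of being suppressed, otherwise the node stays alive with MOVING partitions. A hedged sketch of that pattern — handleDemandMessage and FailureHandler here are illustrative stand-ins, not the real Ignite signatures:

```java
// Sketch only: shows the error-propagation pattern, not actual Ignite code.
public class SupplierErrorPropagation {
    interface FailureHandler {
        void onCriticalFailure(Throwable t);
    }

    static void handleDemandMessage(Runnable body, FailureHandler failHnd) {
        try {
            body.run();
        }
        catch (Exception e) {
            // Recoverable: cancel this rebalance iteration only.
            System.out.println("Rebalance cancelled: " + e.getMessage());
        }
        catch (Error e) {
            // Critical (e.g. OutOfMemoryError): report, then rethrow so the
            // node does not silently keep running in an inconsistent state.
            failHnd.onCriticalFailure(e);
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            handleDemandMessage(
                () -> { throw new OutOfMemoryError("Java heap space"); },
                t -> System.out.println("Failure handler invoked: " + t.getMessage()));
        }
        catch (Error ignored) {
            // Rethrown after the failure handler ran.
        }
    }
}
```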
[jira] [Created] (IGNITE-11124) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom
Stepachev Maksim created IGNITE-11124: - Summary: CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom Key: IGNITE-11124 URL: https://issues.apache.org/jira/browse/IGNITE-11124 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim This test sometimes throws OOM because IgniteTxManager::idMap contains two million instances of GridNearTxLocal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11124) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom
[ https://issues.apache.org/jira/browse/IGNITE-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11124: - Assignee: Stepachev Maksim > CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover > sometimes throwing oom > - > > Key: IGNITE-11124 > URL: https://issues.apache.org/jira/browse/IGNITE-11124 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > > This test sometimes throws OOM because IgniteTxManager::idMap > contains two million instances of GridNearTxLocal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11124) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom
[ https://issues.apache.org/jira/browse/IGNITE-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11124: -- Fix Version/s: 2.8 > CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover > sometimes throwing oom > - > > Key: IGNITE-11124 > URL: https://issues.apache.org/jira/browse/IGNITE-11124 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > This test sometimes throws OOM because IgniteTxManager::idMap > contains two million instances of GridNearTxLocal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-11641) Server node copies a lot of WAL files in WAL archive after restart
[ https://issues.apache.org/jira/browse/IGNITE-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813074#comment-16813074 ] Stepachev Maksim commented on IGNITE-11641: --- [~DmitriyGovorukhin] Could you revert FailureProcessor? Your commit changed the default behavior for IGNITE_DUMP_THREADS_ON_FAILURE. > Server node copies a lot of WAL files in WAL archive after restart > -- > > Key: IGNITE-11641 > URL: https://issues.apache.org/jira/browse/IGNITE-11641 > Project: Ignite > Issue Type: Bug >Reporter: Dmitriy Govorukhin >Assignee: Dmitriy Govorukhin >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Pre-condition: PDS is enabled, wal_path and wal_archive_path are set in > config file. > 1. Cluster is up and running. Some data uploaded into caches. > 2. Start load to generate a lot of files in wal archive (more than files in > wal directory). > 3. Stop some node and delete all files from wal archive. > 4. Start node. > In this case node copies WAL files from WAL dir into wal archive dir again > and again until the amount of files will be the same it was in wal archive > before stop. > Here is information from server node log > {code} > 10:10:17,054][INFO][main][GridCacheDatabaseSharedManager] Restoring partition > state for local groups. 
> [10:10:18,522][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal, > > dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/.wal] > [10:10:18,523][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Starting to copy WAL segment [absIdx=1, segIdx=1, > origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal, > > dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal] > [10:10:20,631][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal, > > dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0001.wal] > [10:10:20,632][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Starting to copy WAL segment [absIdx=2, segIdx=2, > origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal, > > dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal] > [10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal, > > dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0002.wal] > [10:10:23,276][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Starting to copy WAL segment [absIdx=3, segIdx=3, > origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal, > > dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal] > [10:10:23,995][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal, > > 
dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0003.wal] > [10:10:23,996][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Starting to copy WAL segment [absIdx=4, segIdx=4, > origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal, > > dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal] > [10:10:24,644][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal, > > dst=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0004.wal] > [10:10:24,645][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Starting to copy WAL segment [absIdx=5, segIdx=5, > origFile=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal, > > dstFile=/storage/ssd/avolkov/wal_archive/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal] > [10:10:25,301][INFO][wal-file-archiver%null-#64][FileWriteAheadLogManager] > Copied file > [src=/storage/ssd/avolkov/wal/node00-83c9db32-fee5-4f3e-8a1c-559221817759/0005.wal, > >
[jira] [Created] (IGNITE-11736) Make the TeamCity console quiet.
Stepachev Maksim created IGNITE-11736: - Summary: Make the TeamCity console quiet. Key: IGNITE-11736 URL: https://issues.apache.org/jira/browse/IGNITE-11736 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim Assignee: Stepachev Maksim As a result of this discussion: [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] 1. Rollover will be locked. Pros: Only one big file in an archive. Cons: Max size of the file isn't limited. 2. Run all will contain a parameter for switching off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity environment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IGNITE-10797) Remove unused methods from IgniteCacheSnapshotManager.
[ https://issues.apache.org/jira/browse/IGNITE-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820801#comment-16820801 ] Stepachev Maksim commented on IGNITE-10797: --- [~6uest] looks good, thanks. [~agoncharuk] please merge it. > Remove unused methods from IgniteCacheSnapshotManager. > -- > > Key: IGNITE-10797 > URL: https://issues.apache.org/jira/browse/IGNITE-10797 > Project: Ignite > Issue Type: Improvement > Components: persistence >Affects Versions: 2.7 >Reporter: Stanilovsky Evgeny >Assignee: Andrey Kalinin >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > Remove unused methods: > IgniteCacheSnapshotManager#flushDirtyPageHandler > IgniteCacheSnapshotManager#onPageWrite -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11736) Make the TeamCity console quiet.
[ https://issues.apache.org/jira/browse/IGNITE-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11736: -- Description: As a result of this discussion: [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max size of the file isn't limited. 2. Run all will contain a parameter for switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity environment. TC fixes: Add a checkbox into the general run window. *By default* the checkbox *is active*. If the checkbox is *active*, the TeamCity add next params for java run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* otherwise *empty params*. was: As a result of this discussion: [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] 1. Rollover will be locked. Pros: Only one big file in an archive. Cons: Max size of the file isn't limited. 2. Run all will contain a parameter for switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity environment. > Make the TeamCity console quiet. > > > Key: IGNITE-11736 > URL: https://issues.apache.org/jira/browse/IGNITE-11736 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > As a result of this discussion: > [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] > > # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max > size of the file isn't limited. 2. Run all will contain a parameter for > switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity > environment. > TC fixes: > Add a checkbox into the general run window. *By default* the checkbox *is > active*. 
If the checkbox is *active*, the TeamCity add next params for java > run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* > otherwise *empty params*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11736) Make the TeamCity console quiet.
[ https://issues.apache.org/jira/browse/IGNITE-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11736: -- Description: As a result of this discussion: [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max size of the file isn't limited. 2. Run all will contain a parameter for switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity environment. TC fixes: Add a checkbox into the general run window. *By default* the checkbox *is active*. If the checkbox is *active*, the TeamCity adds next params for java run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* otherwise *empty params*. was: As a result of this discussion: [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max size of the file isn't limited. 2. Run all will contain a parameter for switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity environment. TC fixes: Add a checkbox into the general run window. *By default* the checkbox *is active*. If the checkbox is *active*, the TeamCity add next params for java run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* otherwise *empty params*. > Make the TeamCity console quiet. > > > Key: IGNITE-11736 > URL: https://issues.apache.org/jira/browse/IGNITE-11736 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > As a result of this discussion: > [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] > > # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max > size of the file isn't limited. 2. 
Run all will contain a parameter for > switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity > environment. > TC fixes: > Add a checkbox into the general run window. *By default* the checkbox *is > active*. If the checkbox is *active*, the TeamCity adds next params for java > run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* > otherwise *empty params*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-11878) Rebuild index skips MOVING partitions during historical rebalance
Stepachev Maksim created IGNITE-11878: - Summary: Rebuild index skips MOVING partitions during historical rebalance Key: IGNITE-11878 URL: https://issues.apache.org/jira/browse/IGNITE-11878 Project: Ignite Issue Type: Bug Affects Versions: 2.7, 2.6, 2.5 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Rebuild index skips MOVING partitions during historical rebalance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-11626) InitNewCoordinatorFuture should be reported in diagnostic output
[ https://issues.apache.org/jira/browse/IGNITE-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-11626: - Assignee: Stepachev Maksim > InitNewCoordinatorFuture should be reported in diagnostic output > > > Key: IGNITE-11626 > URL: https://issues.apache.org/jira/browse/IGNITE-11626 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > > Currently {{InitNewCoordinatorFuture}} is not printed in PME diagnostic > output. This future also does not implement diagnostic aware interface and > remote information is not collected for this future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-6957) Reduce excessive int boxing when accessing cache by ID
[ https://issues.apache.org/jira/browse/IGNITE-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-6957: Assignee: Stepachev Maksim > Reduce excessive int boxing when accessing cache by ID > -- > > Key: IGNITE-6957 > URL: https://issues.apache.org/jira/browse/IGNITE-6957 > Project: Ignite > Issue Type: Task > Components: cache >Affects Versions: 2.3 >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Attachments: 2017-11-20_12-01-31.png > > > We have a number of places which lead to a large number of Integer > allocations when having a large number of caches and partitions. See the > image attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-11736) Make the TeamCity console quiet.
[ https://issues.apache.org/jira/browse/IGNITE-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11736: -- Attachment: quiet-console-checkbox.png > Make the TeamCity console quiet. > > > Key: IGNITE-11736 > URL: https://issues.apache.org/jira/browse/IGNITE-11736 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Attachments: quiet-console-checkbox.png > > Time Spent: 20m > Remaining Estimate: 0h > > As a result of this discussion: > [https://lists.apache.org/list.html?d...@ignite.apache.org:lte=1M:Make%20the%20TeamCity%20console%20quiet.] > > # Rollover will be locked. Pros: Only one big file in an archive. Cons: Max > size of the file isn't limited. 2. Run all will contain a parameter for > switch off the quiet mode. 3. New config: log4j-tc-test.xml for TeamCity > environment. > TC fixes: > Add a checkbox into the general run window. *By default* the checkbox *is > active*. If the checkbox is *active*, the TeamCity adds next params for java > run: *-DIGNITE_TEST_PROP_LOG4J_FILE=log4j-tc-test.xml -DIGNITE_QUIET=true* > otherwise *empty params*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-10983) Check that persistenceEnabled is consistent on all nodes
[ https://issues.apache.org/jira/browse/IGNITE-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-10983: - Assignee: Stepachev Maksim > Check that persistenceEnabled is consistent on all nodes > > > Key: IGNITE-10983 > URL: https://issues.apache.org/jira/browse/IGNITE-10983 > Project: Ignite > Issue Type: Task >Reporter: Stanislav Lukyanov >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently it is possible to have a cluster where the same data region is > persistent on some nodes and not persistent on others. This use case doesn't > have enough testing, so it's better to deny it for now by adding a check for > that and not allowing a node with a different persistenceEnabled value to > join the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (IGNITE-11124) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom
[ https://issues.apache.org/jira/browse/IGNITE-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim closed IGNITE-11124. - > CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover > sometimes throwing oom > - > > Key: IGNITE-11124 > URL: https://issues.apache.org/jira/browse/IGNITE-11124 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > > This test sometimes throws OOM because IgniteTxManager::idMap > contains two million instances of GridNearTxLocal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-11124) CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover sometimes throwing oom
[ https://issues.apache.org/jira/browse/IGNITE-11124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim resolved IGNITE-11124. --- Resolution: Duplicate > CacheContinuousQueryAsyncFailoverMvccTxSelfTest::testMultiThreadedFailover > sometimes throwing oom > - > > Key: IGNITE-11124 > URL: https://issues.apache.org/jira/browse/IGNITE-11124 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > > This test sometimes throws OOM because IgniteTxManager::idMap > contains two million instances of GridNearTxLocal. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-5227) StackOverflowError in GridCacheMapEntry#checkOwnerChanged()
[ https://issues.apache.org/jira/browse/IGNITE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-5227: Assignee: Stepachev Maksim (was: Mikhail Cherkasov) > StackOverflowError in GridCacheMapEntry#checkOwnerChanged() > --- > > Key: IGNITE-5227 > URL: https://issues.apache.org/jira/browse/IGNITE-5227 > Project: Ignite > Issue Type: Bug >Affects Versions: 1.6 >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Critical > > A simple test reproducing this error: > {code} > /** > * @throws Exception if failed. > */ > public void testBatchUnlock() throws Exception { >startGrid(0); >grid(0).createCache(new CacheConfiguration Integer>(DEFAULT_CACHE_NAME) > .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)); > try { > final CountDownLatch releaseLatch = new CountDownLatch(1); > IgniteInternalFuture fut = GridTestUtils.runAsync(new > Callable() { > @Override public Object call() throws Exception { > IgniteCache cache = grid(0).cache(null); > Lock lock = cache.lock("key"); > try { > lock.lock(); > releaseLatch.await(); > } > finally { > lock.unlock(); > } > return null; > } > }); > Map putMap = new LinkedHashMap<>(); > putMap.put("key", "trigger"); > for (int i = 0; i < 10_000; i++) > putMap.put("key-" + i, "value"); > IgniteCache asyncCache = > grid(0).cache(null).withAsync(); > asyncCache.putAll(putMap); > IgniteFuture resFut = asyncCache.future(); > Thread.sleep(1000); > releaseLatch.countDown(); > fut.get(); > resFut.get(); > } > finally { > stopAllGrids(); > } > {code} > We should replace a recursive call with a simple iteration over the linked > list. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-5227) StackOverflowError in GridCacheMapEntry#checkOwnerChanged()
[ https://issues.apache.org/jira/browse/IGNITE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897184#comment-16897184 ] Stepachev Maksim commented on IGNITE-5227: -- https://github.com/apache/ignite/pull/6736/files > StackOverflowError in GridCacheMapEntry#checkOwnerChanged() > --- > > Key: IGNITE-5227 > URL: https://issues.apache.org/jira/browse/IGNITE-5227 > Project: Ignite > Issue Type: Bug >Affects Versions: 1.6 >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Critical > Time Spent: 10m > Remaining Estimate: 0h > > A simple test reproducing this error: > {code} > /** > * @throws Exception if failed. > */ > public void testBatchUnlock() throws Exception { >startGrid(0); >grid(0).createCache(new CacheConfiguration Integer>(DEFAULT_CACHE_NAME) > .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)); > try { > final CountDownLatch releaseLatch = new CountDownLatch(1); > IgniteInternalFuture fut = GridTestUtils.runAsync(new > Callable() { > @Override public Object call() throws Exception { > IgniteCache cache = grid(0).cache(null); > Lock lock = cache.lock("key"); > try { > lock.lock(); > releaseLatch.await(); > } > finally { > lock.unlock(); > } > return null; > } > }); > Map putMap = new LinkedHashMap<>(); > putMap.put("key", "trigger"); > for (int i = 0; i < 10_000; i++) > putMap.put("key-" + i, "value"); > IgniteCache asyncCache = > grid(0).cache(null).withAsync(); > asyncCache.putAll(putMap); > IgniteFuture resFut = asyncCache.future(); > Thread.sleep(1000); > releaseLatch.countDown(); > fut.get(); > resFut.get(); > } > finally { > stopAllGrids(); > } > {code} > We should replace a recursive call with a simple iteration over the linked > list. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-5227) StackOverflowError in GridCacheMapEntry#checkOwnerChanged()
[ https://issues.apache.org/jira/browse/IGNITE-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897186#comment-16897186 ] Stepachev Maksim commented on IGNITE-5227: -- [~ivan.glukos] Please, make code review. > StackOverflowError in GridCacheMapEntry#checkOwnerChanged() > --- > > Key: IGNITE-5227 > URL: https://issues.apache.org/jira/browse/IGNITE-5227 > Project: Ignite > Issue Type: Bug >Affects Versions: 1.6 >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Critical > Time Spent: 10m > Remaining Estimate: 0h > > A simple test reproducing this error: > {code} > /** > * @throws Exception if failed. > */ > public void testBatchUnlock() throws Exception { >startGrid(0); >grid(0).createCache(new CacheConfiguration Integer>(DEFAULT_CACHE_NAME) > .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)); > try { > final CountDownLatch releaseLatch = new CountDownLatch(1); > IgniteInternalFuture fut = GridTestUtils.runAsync(new > Callable() { > @Override public Object call() throws Exception { > IgniteCache cache = grid(0).cache(null); > Lock lock = cache.lock("key"); > try { > lock.lock(); > releaseLatch.await(); > } > finally { > lock.unlock(); > } > return null; > } > }); > Map putMap = new LinkedHashMap<>(); > putMap.put("key", "trigger"); > for (int i = 0; i < 10_000; i++) > putMap.put("key-" + i, "value"); > IgniteCache asyncCache = > grid(0).cache(null).withAsync(); > asyncCache.putAll(putMap); > IgniteFuture resFut = asyncCache.future(); > Thread.sleep(1000); > releaseLatch.countDown(); > fut.get(); > resFut.get(); > } > finally { > stopAllGrids(); > } > {code} > We should replace a recursive call with a simple iteration over the linked > list. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
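The fix suggested in the IGNITE-5227 description — replacing the recursive call with iteration over the linked list — can be sketched as follows. Entry and the per-entry notification step are simplified stand-ins, not the actual GridCacheMapEntry internals.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the recursion-to-iteration fix; not the actual GridCacheMapEntry code.
public class OwnerChainWalk {
    static final class Entry {
        final String key;
        Entry next; // next entry in the owner-changed chain
        Entry(String key) { this.key = key; }
    }

    // Recursive variant overflows the stack on long chains:
    // static void checkOwnerChanged(Entry e) {
    //     if (e == null) return;
    //     notify(e);
    //     checkOwnerChanged(e.next); // one stack frame per entry
    // }

    // Iterative variant: constant stack depth regardless of chain length.
    static List<String> checkOwnerChanged(Entry head) {
        List<String> notified = new ArrayList<>();

        for (Entry e = head; e != null; e = e.next)
            notified.add(e.key); // stand-in for the per-entry owner notification

        return notified;
    }

    public static void main(String[] args) {
        // A chain this long would overflow a default thread stack if recursed.
        Entry head = new Entry("key-0");
        Entry cur = head;

        for (int i = 1; i < 100_000; i++)
            cur = cur.next = new Entry("key-" + i);

        System.out.println(checkOwnerChanged(head).size()); // prints 100000
    }
}
```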
[jira] [Created] (IGNITE-12062) IntMap throws NullPointerException when map is creating
Stepachev Maksim created IGNITE-12062: - Summary: IntMap throws NullPointerException when map is creating Key: IGNITE-12062 URL: https://issues.apache.org/jira/browse/IGNITE-12062 Project: Ignite Issue Type: Bug Reporter: Stepachev Maksim Assignee: Stepachev Maksim The problem is located here: compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entries.length >> 1)); scaleThreshold = (int)(entries.length * SCALE_LOAD_FACTOR); The fix looks like this: compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entriesSize >> 1)); scaleThreshold = (int)(entriesSize * SCALE_LOAD_FACTOR); -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (IGNITE-12062) IntMap throws NullPointerException when map is creating
[ https://issues.apache.org/jira/browse/IGNITE-12062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905956#comment-16905956 ] Stepachev Maksim commented on IGNITE-12062: --- https://github.com/apache/ignite/pull/6769 > IntMap throws NullPointerException when map is creating > --- > > Key: IGNITE-12062 > URL: https://issues.apache.org/jira/browse/IGNITE-12062 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > The problem is located here: > compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entries.length >> 1)); > scaleThreshold = (int)(entries.length * SCALE_LOAD_FACTOR); > The fix looks like this: > compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entriesSize >> 1)); > scaleThreshold = (int)(entriesSize * SCALE_LOAD_FACTOR); -- This message was sent by Atlassian JIRA (v7.6.14#76016)
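The bug and fix from the IGNITE-12062 description can be reproduced in isolation: the thresholds were derived from entries.length before the entries array was assigned, so the constructor dereferenced null. A sketch under those assumptions — the load-factor values are assumed, and this is not the actual IntMap source:

```java
// Sketch of the initialization-order bug described in IGNITE-12062.
public class ThresholdInitSketch {
    static final float COMPACT_LOAD_FACTOR = 0.25f; // assumed values
    static final float SCALE_LOAD_FACTOR = 0.7f;

    Object[] entries; // still null while the thresholds are computed
    final int compactThreshold;
    final int scaleThreshold;

    ThresholdInitSketch(int entriesSize) {
        // Buggy version dereferenced the not-yet-assigned array:
        // compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entries.length >> 1)); // NPE
        // Fixed version derives both thresholds from the size parameter:
        compactThreshold = (int)(COMPACT_LOAD_FACTOR * (entriesSize >> 1));
        scaleThreshold = (int)(entriesSize * SCALE_LOAD_FACTOR);

        entries = new Object[entriesSize];
    }

    public static void main(String[] args) {
        ThresholdInitSketch m = new ThresholdInitSketch(16);
        System.out.println(m.compactThreshold + " " + m.scaleThreshold); // prints 2 11
    }
}
```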
[jira] [Created] (IGNITE-12116) Cache doesn
Stepachev Maksim created IGNITE-12116: - Summary: Cache doesn Key: IGNITE-12116 URL: https://issues.apache.org/jira/browse/IGNITE-12116 Project: Ignite Issue Type: Improvement Reporter: Stepachev Maksim -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (IGNITE-12116) Cache doesn't support array as key
[ https://issues.apache.org/jira/browse/IGNITE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12116: -- Description: The Ignite cache doesn't support arrays as keys. You can't perform basic operations with them. > Cache doesn't support array as key > -- > > Key: IGNITE-12116 > URL: https://issues.apache.org/jira/browse/IGNITE-12116 > Project: Ignite > Issue Type: Improvement > Components: cache >Reporter: Stepachev Maksim >Priority: Major > > The Ignite cache doesn't support arrays as keys. You can't perform basic > operations with them. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Assigned] (IGNITE-12116) Cache doesn't support array as key
[ https://issues.apache.org/jira/browse/IGNITE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim reassigned IGNITE-12116: - Assignee: Stepachev Maksim > Cache doesn't support array as key > -- > > Key: IGNITE-12116 > URL: https://issues.apache.org/jira/browse/IGNITE-12116 > Project: Ignite > Issue Type: Improvement > Components: cache >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > > The Ignite cache doesn't support arrays as keys. You can't perform basic > operations with them. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (IGNITE-12116) Cache doesn't support array as key
[ https://issues.apache.org/jira/browse/IGNITE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12116: -- Component/s: cache > Cache doesn't support array as key > -- > > Key: IGNITE-12116 > URL: https://issues.apache.org/jira/browse/IGNITE-12116 > Project: Ignite > Issue Type: Improvement > Components: cache >Reporter: Stepachev Maksim >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (IGNITE-12116) Cache doesn't support array as key
[ https://issues.apache.org/jira/browse/IGNITE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12116: -- Summary: Cache doesn't support array as key (was: Cache doesn) > Cache doesn't support array as key > -- > > Key: IGNITE-12116 > URL: https://issues.apache.org/jira/browse/IGNITE-12116 > Project: Ignite > Issue Type: Improvement >Reporter: Stepachev Maksim >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (IGNITE-12116) Cache doesn't support array as key
[ https://issues.apache.org/jira/browse/IGNITE-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918620#comment-16918620 ] Stepachev Maksim commented on IGNITE-12116: --- [~xtern] Oh, it's my mistake. I'm going to fix it in the next ticket. Thanks! > Cache doesn't support array as key > -- > > Key: IGNITE-12116 > URL: https://issues.apache.org/jira/browse/IGNITE-12116 > Project: Ignite > Issue Type: Improvement > Components: cache >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > The Ignite cache doesn't support an array as a key. You can't perform the basic > operations with it. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (IGNITE-12123) Cache throws npe at {null, null, null} array key.
Stepachev Maksim created IGNITE-12123: - Summary: Cache throws npe at {null, null, null} array key. Key: IGNITE-12123 URL: https://issues.apache.org/jira/browse/IGNITE-12123 Project: Ignite Issue Type: Bug Affects Versions: 2.8 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (IGNITE-12123) Cache throws npe at {null, null, null} array key.
[ https://issues.apache.org/jira/browse/IGNITE-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12123: -- Description: When we put a null key into the Ignite cache, we get an NPE with the message "java.lang.NullPointerException: Ouch! Argument cannot be null: key". But when we put "new String[] {"c", *null*, "a"} > Cache throws npe at {null, null, null} array key. > - > > Key: IGNITE-12123 > URL: https://issues.apache.org/jira/browse/IGNITE-12123 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > > When we put a null key into the Ignite cache, we get an NPE with the message > "java.lang.NullPointerException: Ouch! Argument cannot be null: key". > But when we put "new String[] > {"c", *null*, "a"} -- This message was sent by Atlassian Jira (v8.3.2#803003)
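The crash with a null element can be reproduced outside Ignite: any hashing routine that dereferences each element throws an NPE on a null slot, while java.util.Arrays.hashCode treats a null element as 0. A sketch of the failure mode (naiveHash is illustrative only, not Ignite's actual hashing code):

```java
import java.util.Arrays;

public class NullElementKeyDemo {
    // Illustrative only: per-element hashing that assumes no null slots,
    // similar in spirit to the bug class behind IGNITE-12123.
    static int naiveHash(Object[] arr) {
        int h = 1;
        for (Object o : arr)
            h = 31 * h + o.hashCode(); // throws NPE when o == null
        return h;
    }

    public static void main(String[] args) {
        String[] key = {"c", null, "a"};

        // Null-safe variant: Arrays.hashCode maps a null element to 0.
        System.out.println(Arrays.hashCode(key));

        try {
            naiveHash(key);
        } catch (NullPointerException e) {
            System.out.println("NPE on null element, as reported");
        }
    }
}
```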
[jira] [Updated] (IGNITE-12123) Cache throws npe at {null, null, null} array key.
[ https://issues.apache.org/jira/browse/IGNITE-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12123: -- Ignite Flags: (was: Docs Required,Release Notes Required) > Cache throws npe at {null, null, null} array key. > - > > Key: IGNITE-12123 > URL: https://issues.apache.org/jira/browse/IGNITE-12123 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > > When we put null-key to ignite cache we get NPE with the problem description > "java.lang.NullPointerException: Ouch! Argument cannot be null: key" > But when we put "new String[] > {"c", *null*, "a"} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (IGNITE-12123) Cache throws npe at {null, null, null} array key.
[ https://issues.apache.org/jira/browse/IGNITE-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925685#comment-16925685 ] Stepachev Maksim commented on IGNITE-12123: --- [~xtern] please look at this. > Cache throws npe at {null, null, null} array key. > - > > Key: IGNITE-12123 > URL: https://issues.apache.org/jira/browse/IGNITE-12123 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > When we put null-key to ignite cache we get NPE with the problem description > "java.lang.NullPointerException: Ouch! Argument cannot be null: key" > But when we put "new String[] > {"c", *null*, "a"} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (IGNITE-12123) Cache throws npe at {null, null, null} array key.
[ https://issues.apache.org/jira/browse/IGNITE-12123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925689#comment-16925689 ] Stepachev Maksim commented on IGNITE-12123: --- [~Jokser] please make the review. > Cache throws npe at {null, null, null} array key. > - > > Key: IGNITE-12123 > URL: https://issues.apache.org/jira/browse/IGNITE-12123 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > When we put null-key to ignite cache we get NPE with the problem description > "java.lang.NullPointerException: Ouch! Argument cannot be null: key" > But when we put "new String[] > {"c", *null*, "a"} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (IGNITE-6957) Reduce excessive int boxing when accessing cache by ID
[ https://issues.apache.org/jira/browse/IGNITE-6957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900802#comment-16900802 ] Stepachev Maksim commented on IGNITE-6957: -- Hi, I added IntMap for Integers and fixed contains calls. > Reduce excessive int boxing when accessing cache by ID > -- > > Key: IGNITE-6957 > URL: https://issues.apache.org/jira/browse/IGNITE-6957 > Project: Ignite > Issue Type: Task > Components: cache >Affects Versions: 2.3 >Reporter: Alexey Goncharuk >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Attachments: 2017-11-20_12-01-31.png > > Time Spent: 53h 20m > Remaining Estimate: 0h > > We have a number of places which lead to a large number of Integer > allocations when having a large number of caches and partitions. See the > image attached. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
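For context, the allocation problem is that HashMap<Integer, V> boxes every int key on put/get/contains, which adds up with many caches and partitions. A minimal open-addressing sketch of a primitive-int-keyed map (illustrative only: fixed capacity, no removal or resizing, unlike Ignite's actual IntMap):

```java
// Sketch of a map with primitive int keys: the hot path never allocates
// Integer wrappers. Fixed capacity for brevity; a real implementation
// would resize and support removal.
public class IntMapSketch<V> {
    private final int[] keys;
    private final Object[] vals;
    private final boolean[] used;

    public IntMapSketch(int capacity) {
        keys = new int[capacity];
        vals = new Object[capacity];
        used = new boolean[capacity];
    }

    // Linear probing over primitive slots, no boxing.
    private int slot(int key) {
        int i = (key & 0x7FFFFFFF) % keys.length;
        while (used[i] && keys[i] != key)
            i = (i + 1) % keys.length;
        return i;
    }

    public void put(int key, V val) {
        int i = slot(key);
        used[i] = true;
        keys[i] = key;
        vals[i] = val;
    }

    @SuppressWarnings("unchecked")
    public V get(int key) {
        int i = slot(key);
        return used[i] ? (V) vals[i] : null;
    }

    public boolean containsKey(int key) {
        return used[slot(key)];
    }

    public static void main(String[] args) {
        IntMapSketch<String> m = new IntMapSketch<>(64);
        m.put(7, "partition-7");
        System.out.println(m.get(7));         // partition-7
        System.out.println(m.containsKey(8)); // false
    }
}
```

The "fixed contains calls" part of the comment likely refers to replacing `map.containsKey(Integer.valueOf(id))`-style lookups with primitive overloads like the one above.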
[jira] [Commented] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16887111#comment-16887111 ] Stepachev Maksim commented on IGNITE-11992: --- PR: https://github.com/apache/ignite/pull/6702/ > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > 1. ZookeaperDiscoveryImpl doesn't implement security into itself. > As a result: Caused by: class org.apache.ignite.spi.IgniteSpiException: > Security context isn't certain. > 2. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 3. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > 4. The GridRestProcessor isn't client, we can't read security subject from > node attribute. > We should transmit secCtx for fake nodes and secSubjId for real. > 5. NoOpIgniteSecurityProcessor should include a disabled processor and > validate it too if it is not null. It is important for a client node. > For example: > Into IgniteKernal#securityProcessor method createComponent return a > GridSecurityProcessor. For server nodes are enabled, but for clients aren't. > The clients aren't able to pass validation for this reason. > 6. ATTR_SECURITY_SUBJECT was removed. It broke compatibility. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (IGNITE-11992) Improvements for new security approach
Stepachev Maksim created IGNITE-11992: - Summary: Improvements for new security approach Key: IGNITE-11992 URL: https://issues.apache.org/jira/browse/IGNITE-11992 Project: Ignite Issue Type: Improvement Components: security Affects Versions: 2.8 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 1. ZookeeperDiscoveryImpl doesn't implement security itself. As a result: Caused by: class org.apache.ignite.spi.IgniteSpiException: Security context isn't certain. 2. The visor tasks lose permissions. The method VisorQueryUtils#scheduleQueryStart starts a new thread and loses the context. 3. The GridRestProcessor executes tasks outside the "withContext" section. As a result, the context is lost. 4. The GridRestProcessor isn't a client, so we can't read the security subject from a node attribute. We should transmit secCtx for fake nodes and secSubjId for real ones. 5. NoOpIgniteSecurityProcessor should include the disabled processor and validate it too if it is not null. This is important for a client node. For example: in IgniteKernal#securityProcessor, the createComponent method returns a GridSecurityProcessor. For server nodes it is enabled, but for clients it isn't. The clients aren't able to pass validation for this reason. 6. ATTR_SECURITY_SUBJECT was removed. This broke compatibility. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (IGNITE-12205) GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail rate for long time
Stepachev Maksim created IGNITE-12205: - Summary: GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail rate for long time Key: IGNITE-12205 URL: https://issues.apache.org/jira/browse/IGNITE-12205 Project: Ignite Issue Type: Bug Reporter: Stepachev Maksim Assignee: Stepachev Maksim This is a well-known failure; it needs to be investigated and fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12206) Partition state validation warns are not printed to log
Stepachev Maksim created IGNITE-12206: - Summary: Partition state validation warns are not printed to log Key: IGNITE-12206 URL: https://issues.apache.org/jira/browse/IGNITE-12206 Project: Ignite Issue Type: Bug Reporter: Stepachev Maksim Assignee: Stepachev Maksim GridDhtPartitionsExchangeFuture.java {code:java} if (grpCtx == null || grpCtx.config().isReadThrough() || grpCtx.config().isWriteThrough() || grpCtx.config().getCacheStoreFactory() != null || grpCtx.config().getRebalanceDelay() == -1 || grpCtx.config().getRebalanceMode() == CacheRebalanceMode.NONE || grpCtx.config().getExpiryPolicyFactory() == null || SKIP_PARTITION_SIZE_VALIDATION) return null;{code} Looks like a typo; it probably should be grpCtx.config().getExpiryPolicyFactory() != null -- This message was sent by Atlassian Jira (v8.3.4#803005)
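The condition above decides when partition-size validation is skipped for a cache group. The ticket's proposed change flips `== null` to `!= null`, i.e. skip validation when an expiry policy IS configured (entries may expire at different moments on different nodes, so sizes can legitimately diverge). A minimal sketch of just that clause, using a hypothetical stand-in rather than Ignite's real configuration types:

```java
public class SkipValidationCheck {
    // Hypothetical simplification of the expiry-policy clause of the real
    // check in GridDhtPartitionsExchangeFuture; this models the ticket's
    // *proposed* semantics, not the shipped Ignite code.
    static boolean skipSizeValidation(Object expiryPlcFactory) {
        // Skip when an expiry policy is configured: TTL-based expiry makes
        // per-node partition sizes diverge without any inconsistency.
        return expiryPlcFactory != null;
    }

    public static void main(String[] args) {
        System.out.println(skipSizeValidation(null));         // false: validate
        System.out.println(skipSizeValidation(new Object())); // true: skip
    }
}
```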
[jira] [Commented] (IGNITE-12205) GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail rate for long time
[ https://issues.apache.org/jira/browse/IGNITE-12205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934282#comment-16934282 ] Stepachev Maksim commented on IGNITE-12205: --- I add Sf for this test: https://github.com/apache/ignite/pull/6887/files > GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail > rate for long time > - > > Key: IGNITE-12205 > URL: https://issues.apache.org/jira/browse/IGNITE-12205 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This is well-know failure, need to investigate and fix. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11992: -- Description: 1. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 3. The GridRestProcessor does tasks outside "withContext" section. As result context loses. 4. The GridRestProcessor isn't client, we can't read security subject from node attribute. We should transmit secCtx for fake nodes and secSubjId for real. was: 1. ZookeaperDiscoveryImpl doesn't implement security into itself. As a result: Caused by: class org.apache.ignite.spi.IgniteSpiException: Security context isn't certain. 2. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 3. The GridRestProcessor does tasks outside "withContext" section. As result context loses. 4. The GridRestProcessor isn't client, we can't read security subject from node attribute. We should transmit secCtx for fake nodes and secSubjId for real. 5. NoOpIgniteSecurityProcessor should include a disabled processor and validate it too if it is not null. It is important for a client node. For example: Into IgniteKernal#securityProcessor method createComponent return a GridSecurityProcessor. For server nodes are enabled, but for clients aren't. The clients aren't able to pass validation for this reason. 6. ATTR_SECURITY_SUBJECT was removed. It broke compatibility. > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 3. 
The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > 4. The GridRestProcessor isn't client, we can't read security subject from > node attribute. > We should transmit secCtx for fake nodes and secSubjId for real. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939211#comment-16939211 ] Stepachev Maksim commented on IGNITE-11992: --- New PR:https://github.com/apache/ignite/pull/6918 > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 3. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > 4. The GridRestProcessor isn't client, we can't read security subject from > node attribute. > We should transmit secCtx for fake nodes and secSubjId for real. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12205) GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail rate for long time
[ https://issues.apache.org/jira/browse/IGNITE-12205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934341#comment-16934341 ] Stepachev Maksim commented on IGNITE-12205: --- Look at tests: [https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_DataStructures?branch=pull%2F6887%2Fhead] . It was fixed. > GridCachePartitionedSetWithClientSelfTest.testMultithreaded has 95,5% fail > rate for long time > - > > Key: IGNITE-12205 > URL: https://issues.apache.org/jira/browse/IGNITE-12205 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This is well-know failure, need to investigate and fix. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12220) Allow to use cache-related permissions both at system and per-cache levels
[ https://issues.apache.org/jira/browse/IGNITE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940735#comment-16940735 ] Stepachev Maksim commented on IGNITE-12220: --- [~RyzhovSV], Looks good to me. > Allow to use cache-related permissions both at system and per-cache levels > -- > > Key: IGNITE-12220 > URL: https://issues.apache.org/jira/browse/IGNITE-12220 > Project: Ignite > Issue Type: Task > Components: security >Affects Versions: 2.7.6 >Reporter: Andrey Kuznetsov >Assignee: Sergei Ryzhov >Priority: Major > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently, {{CACHE_CREATE}} and {{CACHE_DESTROY}} permissions are enforced to > be system-level permissions, see for instance > {{SecurityPermissionSetBuilder#appendCachePermissions}}. This looks > inflexible: Ignite Security implementations are not able to manage cache > creation and deletion permissions on per-cache basis (unlike get/put/remove > permissions). All such limitations should be found and removed in order to > allow all {{CACHE_*}} permissions to be set both at system and per-cache > levels. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11992: -- Description: 1. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 2. The GridRestProcessor does tasks outside "withContext" section. As result context loses. 3. The GridRestProcessor isn't client, we can't read security subject from node attribute. We should transmit secCtx for fake nodes and secSubjId for real. In additional: Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and CacheQueryReadEvent. "Gets security subject ID initiated this task event, if available. This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET task event. Subject ID will be set either to node ID or client ID initiated task execution." by: "Gets security subject ID initiated this task event if IgniteSecurity is enabled, otherwise returns null." was: 1. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 3. The GridRestProcessor does tasks outside "withContext" section. As result context loses. 4. The GridRestProcessor isn't client, we can't read security subject from node attribute. We should transmit secCtx for fake nodes and secSubjId for real. > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 2. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > 3. 
The GridRestProcessor isn't client, we can't read security subject from > node attribute. > We should transmit secCtx for fake nodes and secSubjId for real. > In additional: > Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and > CacheQueryReadEvent. > "Gets security subject ID initiated this task event, if available. > This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET > task event. > Subject ID will be set either to node ID or client ID initiated task > execution." > by: > "Gets security subject ID initiated this task event if IgniteSecurity is > enabled, otherwise returns null." > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940737#comment-16940737 ] Stepachev Maksim commented on IGNITE-11992: --- [~garus.d.g], please take a look at the pull request. > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 3. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > 4. The GridRestProcessor isn't client, we can't read security subject from > node attribute. > We should transmit secCtx for fake nodes and secSubjId for real. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-11992: -- Description: 1. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 2. The GridRestProcessor does tasks outside "withContext" section. As result context loses. In additional: Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and CacheQueryReadEvent. "Gets security subject ID initiated this task event, if available. This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET task event. Subject ID will be set either to node ID or client ID initiated task execution." by: "Gets security subject ID initiated this task event if IgniteSecurity is enabled, otherwise returns null." was: 1. The visor tasks lost permission. The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses context. 2. The GridRestProcessor does tasks outside "withContext" section. As result context loses. 3. The GridRestProcessor isn't client, we can't read security subject from node attribute. We should transmit secCtx for fake nodes and secSubjId for real. In additional: Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and CacheQueryReadEvent. "Gets security subject ID initiated this task event, if available. This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET task event. Subject ID will be set either to node ID or client ID initiated task execution." by: "Gets security subject ID initiated this task event if IgniteSecurity is enabled, otherwise returns null." 
> Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 2. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > In additional: > Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and > CacheQueryReadEvent. > "Gets security subject ID initiated this task event, if available. > This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET > task event. > Subject ID will be set either to node ID or client ID initiated task > execution." > by: > "Gets security subject ID initiated this task event if IgniteSecurity is > enabled, otherwise returns null." > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-11992) Improvements for new security approach
[ https://issues.apache.org/jira/browse/IGNITE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954545#comment-16954545 ] Stepachev Maksim commented on IGNITE-11992: --- Point 3 was removed. https://github.com/apache/ignite/pull/6990/files > Improvements for new security approach > -- > > Key: IGNITE-11992 > URL: https://issues.apache.org/jira/browse/IGNITE-11992 > Project: Ignite > Issue Type: Improvement > Components: security >Affects Versions: 2.8 >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 50m > Remaining Estimate: 0h > > 1. The visor tasks lost permission. > The method VisorQueryUtils#scheduleQueryStart makes a new thread and loses > context. > 2. The GridRestProcessor does tasks outside "withContext" section. As result > context loses. > In additional: > Change a java docs for TaskEvent, CacheEvent, CacheQueryExecutedEvent and > CacheQueryReadEvent. > "Gets security subject ID initiated this task event, if available. > This property is not available for GridEventType#EVT_TASK_SESSION_ATTR_SET > task event. > Subject ID will be set either to node ID or client ID initiated task > execution." > by: > "Gets security subject ID initiated this task event if IgniteSecurity is > enabled, otherwise returns null." > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12371) Explicit method for starting client nodes
[ https://issues.apache.org/jira/browse/IGNITE-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974171#comment-16974171 ] Stepachev Maksim commented on IGNITE-12371: --- [~nizhikov] I didn't. I decided that refactoring the tests would be a waste of time. > Explicit method for starting client nodes > - > > Key: IGNITE-12371 > URL: https://issues.apache.org/jira/browse/IGNITE-12371 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7.6 >Reporter: Nikolay Izhikov >Priority: Major > Labels: newbie > > Right now there is almost 500 explicit usage of {{setClientMode}} in tests. > Seems we should support the starting of client nodes in test framework. > We should introduce method {{startClientNode(String name)}} and similar. > This will simplify tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12371) Explicit method for starting client nodes
[ https://issues.apache.org/jira/browse/IGNITE-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974120#comment-16974120 ] Stepachev Maksim commented on IGNITE-12371: --- [~nizhikov] Hi, I added it. In another ticket. [https://github.com/gridgain/gridgain/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L889] and will donate it. > Explicit method for starting client nodes > - > > Key: IGNITE-12371 > URL: https://issues.apache.org/jira/browse/IGNITE-12371 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7.6 >Reporter: Nikolay Izhikov >Priority: Major > Labels: newbie > > Right now there is almost 500 explicit usage of {{setClientMode}} in tests. > Seems we should support the starting of client nodes in test framework. > We should introduce method {{startClientNode(String name)}} and similar. > This will simplify tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (IGNITE-12371) Explicit method for starting client nodes
[ https://issues.apache.org/jira/browse/IGNITE-12371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974120#comment-16974120 ] Stepachev Maksim edited comment on IGNITE-12371 at 11/14/19 10:17 AM: -- [~nizhikov] Hi, I added it in another ticket. [https://github.com/gridgain/gridgain/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L889] and will donate it. was (Author: mstepachev): [~nizhikov] Hi, I added it. In another ticket. [https://github.com/gridgain/gridgain/blob/master/modules/core/src/test/java/org/apache/ignite/testframework/junits/GridAbstractTest.java#L889] and will donate it. > Explicit method for starting client nodes > - > > Key: IGNITE-12371 > URL: https://issues.apache.org/jira/browse/IGNITE-12371 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.7.6 >Reporter: Nikolay Izhikov >Priority: Major > Labels: newbie > > Right now there is almost 500 explicit usage of {{setClientMode}} in tests. > Seems we should support the starting of client nodes in test framework. > We should introduce method {{startClientNode(String name)}} and similar. > This will simplify tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12206) Partition state validation warns are not printed to log
[ https://issues.apache.org/jira/browse/IGNITE-12206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12206: -- Description: GridDhtPartitionsExchangeFuture.java {code:java} if (grpCtx == null || grpCtx.config().isReadThrough() || grpCtx.config().isWriteThrough() || grpCtx.config().getCacheStoreFactory() != null || grpCtx.config().getRebalanceDelay() == -1 || grpCtx.config().getRebalanceMode() == CacheRebalanceMode.NONE || grpCtx.config().getExpiryPolicyFactory() == null || SKIP_PARTITION_SIZE_VALIDATION) return null;{code} Looks like a typo, probably it should be grpCtx.config().getExpiryPolicyFactory() != null was: GridDhtPartitionsExchangeFuture.java {{}} {code:java} if (grpCtx == null || grpCtx.config().isReadThrough() || grpCtx.config().isWriteThrough() || grpCtx.config().getCacheStoreFactory() != null || grpCtx.config().getRebalanceDelay() == -1 || grpCtx.config().getRebalanceMode() == CacheRebalanceMode.NONE || grpCtx.config().getExpiryPolicyFactory() == null || SKIP_PARTITION_SIZE_VALIDATION) return null;{code} {{}} Looks like a typo, probably it should be grpCtx.config().getExpiryPolicyFactory() != null > Partition state validation warns are not printed to log > --- > > Key: IGNITE-12206 > URL: https://issues.apache.org/jira/browse/IGNITE-12206 > Project: Ignite > Issue Type: Bug >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > GridDhtPartitionsExchangeFuture.java > > {code:java} > if (grpCtx == null > || grpCtx.config().isReadThrough() > || grpCtx.config().isWriteThrough() > || grpCtx.config().getCacheStoreFactory() != null > || grpCtx.config().getRebalanceDelay() == -1 > || grpCtx.config().getRebalanceMode() == > CacheRebalanceMode.NONE > || grpCtx.config().getExpiryPolicyFactory() == null > || SKIP_PARTITION_SIZE_VALIDATION) > return null;{code} > > Looks like a typo, probably it should be > grpCtx.config().getExpiryPolicyFactory() 
!= null -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-12230) Partition eviction during cache stop / deactivation may cause errors leading to node failure and storage corruption
Stepachev Maksim created IGNITE-12230: - Summary: Partition eviction during cache stop / deactivation may cause errors leading to node failure and storage corruption Key: IGNITE-12230 URL: https://issues.apache.org/jira/browse/IGNITE-12230 Project: Ignite Issue Type: Bug Components: cache Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.8 PartitionEvictionTask may produce a NullPointerException if the cache / cache group / cluster is stopping / deactivating. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12230) Partition eviction during cache stop / deactivation may cause errors leading to node failure and storage corruption
[ https://issues.apache.org/jira/browse/IGNITE-12230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937664#comment-16937664 ] Stepachev Maksim commented on IGNITE-12230: --- https://github.com/apache/ignite/pull/6906 > Partition eviction during cache stop / deactivation may cause errors leading > to node failure and storage corruption > --- > > Key: IGNITE-12230 > URL: https://issues.apache.org/jira/browse/IGNITE-12230 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > PartitionEvictionTask may produce NullPointerException if cache / cache group > / cluser is stopping / deactivating. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12230) Partition eviction during cache stop / deactivation may cause errors leading to node failure and storage corruption
[ https://issues.apache.org/jira/browse/IGNITE-12230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16937693#comment-16937693 ] Stepachev Maksim commented on IGNITE-12230: --- [~Pavlukhin], it isn't related. > Partition eviction during cache stop / deactivation may cause errors leading > to node failure and storage corruption > --- > > Key: IGNITE-12230 > URL: https://issues.apache.org/jira/browse/IGNITE-12230 > Project: Ignite > Issue Type: Bug > Components: cache >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.8 > > Time Spent: 10m > Remaining Estimate: 0h > > PartitionEvictionTask may produce a NullPointerException if the cache / cache group > / cluster is stopping / deactivating. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12621) Node leave may cause NullPointerException during IO message processing if security is enabled
[ https://issues.apache.org/jira/browse/IGNITE-12621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17032240#comment-17032240 ] Stepachev Maksim commented on IGNITE-12621: --- [~slava.koptilin] LGTM! > Node leave may cause NullPointerException during IO message processing if > security is enabled > - > > Key: IGNITE-12621 > URL: https://issues.apache.org/jira/browse/IGNITE-12621 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Fix For: 2.9 > > Time Spent: 10m > Remaining Estimate: 0h > > If a node receives an IO message from a dead node *after* receiving the > discovery message about the node failure, {{ctx.discovery().node(uuid)}} will return > {{null}}, which in turn will cause a {{NullPointerException}}. > We can fix it by peeking the disco cache history to retrieve the attributes of the > dead node. > See: > {code} > /** {@inheritDoc} */ > @Override public OperationSecurityContext withContext(UUID nodeId) { > return withContext( > secCtxs.computeIfAbsent(nodeId, > uuid -> nodeSecurityContext( > marsh, U.resolveClassLoader(ctx.config()), > ctx.discovery().node(uuid) > ) > ) > ); > } > {code} > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.ignite.internal.processors.security.SecurityUtils.nodeSecurityContext(SecurityUtils.java:135) > at > org.apache.ignite.internal.processors.security.IgniteSecurityProcessor.lambda$withContext$0(IgniteSecurityProcessor.java:112) > at > java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660) > at > org.apache.ignite.internal.processors.security.IgniteSecurityProcessor.withContext(IgniteSecurityProcessor.java:111) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
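The suggested fix — falling back to discovery history when the live topology view no longer contains the node — can be sketched with plain maps (hypothetical names throughout; this is a model of the idea, not the actual Ignite discovery API):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class SecuritySubjectLookup {
    /** Live topology view: attributes of nodes currently in the cluster. */
    private final Map<UUID, String> aliveNodes = new ConcurrentHashMap<>();
    /** Discovery history: attributes of nodes that have already left. */
    private final Map<UUID, String> historyNodes = new ConcurrentHashMap<>();
    /** Cached security contexts, analogous to the secCtxs map in the report. */
    private final Map<UUID, String> secCtxs = new ConcurrentHashMap<>();

    public void onNodeJoin(UUID id, String attrs) {
        aliveNodes.put(id, attrs);
    }

    public void onNodeLeft(UUID id) {
        String attrs = aliveNodes.remove(id);
        if (attrs != null)
            historyNodes.put(id, attrs); // remember the dead node's attributes
    }

    /** Resolves a security context even for a node that has just left. */
    public String contextFor(UUID nodeId) {
        return secCtxs.computeIfAbsent(nodeId, id -> {
            String attrs = aliveNodes.get(id);
            if (attrs == null)
                attrs = historyNodes.get(id); // fall back to history, no NPE
            if (attrs == null)
                throw new IllegalStateException("Unknown node: " + id);
            return "ctx[" + attrs + "]";
        });
    }

    public static void main(String[] args) {
        SecuritySubjectLookup lookup = new SecuritySubjectLookup();
        UUID id = UUID.randomUUID();
        lookup.onNodeJoin(id, "node-A");
        lookup.onNodeLeft(id); // node dies; an IO message from it arrives late
        // Without the history fallback this lookup would see null attributes.
        if (!"ctx[node-A]".equals(lookup.contextFor(id)))
            throw new AssertionError();
    }
}
```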
[jira] [Updated] (IGNITE-12582) It is needed to set used cache for Spring Data dynamically
[ https://issues.apache.org/jira/browse/IGNITE-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12582: -- Labels: spring-plugin (was: ) > It is needed to set used cache for Spring Data dynamically > -- > > Key: IGNITE-12582 > URL: https://issues.apache.org/jira/browse/IGNITE-12582 > Project: Ignite > Issue Type: Improvement > Components: spring >Affects Versions: 2.7.6 >Reporter: Sergey Chernolyas >Assignee: Sergey Chernolyas >Priority: Major > Labels: spring-plugin > Fix For: 2.8 > > > Hi! > My project needs to configure the cache in use via a property, like > "spring.data.mongodb.uri: > mongodb://:@:/" from Spring Data for > MongoDB. Currently, I can only set the cache for a particular repository via the > "RepositoryConfig" annotation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12582) It is needed to set used cache for Spring Data dynamically
[ https://issues.apache.org/jira/browse/IGNITE-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stepachev Maksim updated IGNITE-12582: -- Fix Version/s: 2.8 > It is needed to set used cache for Spring Data dynamically > -- > > Key: IGNITE-12582 > URL: https://issues.apache.org/jira/browse/IGNITE-12582 > Project: Ignite > Issue Type: Improvement > Components: spring >Affects Versions: 2.7.6 >Reporter: Sergey Chernolyas >Assignee: Sergey Chernolyas >Priority: Major > Fix For: 2.8 > > > Hi! > My project needs to configure the cache in use via a property, like > "spring.data.mongodb.uri: > mongodb://:@:/" from Spring Data for > MongoDB. Currently, I can only set the cache for a particular repository via the > "RepositoryConfig" annotation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12681) IgniteShutdownOnSupplyMessageFailureTest
[ https://issues.apache.org/jira/browse/IGNITE-12681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037004#comment-17037004 ] Stepachev Maksim commented on IGNITE-12681: --- [~agoncharuk] LGTM. > IgniteShutdownOnSupplyMessageFailureTest > > > Key: IGNITE-12681 > URL: https://issues.apache.org/jira/browse/IGNITE-12681 > Project: Ignite > Issue Type: Test >Reporter: Alexey Goncharuk >Assignee: Alexey Goncharuk >Priority: Major > Fix For: 2.9 > > Time Spent: 10m > Remaining Estimate: 0h > > The test checks that a node will be shut down by a failure handler by > listening for {{NODE_LEFT}} event. However, if the node shutdown happens > before a new node joins the cluster, the joining node will form a cluster by > itself with topology version = 1 and no event will be fired. > The test should be changed to specifically listen for the failure handler > invocation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12582) It is needed to set used cache for Spring Data dynamically
[ https://issues.apache.org/jira/browse/IGNITE-12582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033691#comment-17033691 ] Stepachev Maksim commented on IGNITE-12582: --- LGTM, please add the visa. [~irakov], please merge it after the visa. > It is needed to set used cache for Spring Data dynamically > -- > > Key: IGNITE-12582 > URL: https://issues.apache.org/jira/browse/IGNITE-12582 > Project: Ignite > Issue Type: Improvement > Components: spring >Affects Versions: 2.7.6 >Reporter: Sergey Chernolyas >Assignee: Sergey Chernolyas >Priority: Major > > Hi! > My project needs to configure the cache in use via a property, like > "spring.data.mongodb.uri: > mongodb://:@:/" from Spring Data for > MongoDB. Currently, I can only set the cache for a particular repository via the > "RepositoryConfig" annotation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12783) Partition state validation warnings erroneously logged when cache groups are used
[ https://issues.apache.org/jira/browse/IGNITE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067601#comment-17067601 ] Stepachev Maksim commented on IGNITE-12783: --- [~slava.koptilin] Hi, LGTM! > Partition state validation warnings erroneously logged when cache groups are > used > - > > Key: IGNITE-12783 > URL: https://issues.apache.org/jira/browse/IGNITE-12783 > Project: Ignite > Issue Type: Bug >Affects Versions: 2.8 >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Minor > Fix For: 2.9 > > Time Spent: 10m > Remaining Estimate: 0h > > It seems that IGNITE-12206 does not cover all possible cases. For instance, > the following cache configurations are still validated and therefore > may be the reason for erroneous warnings. > {code:java} > String grpName = "test-group"; > CacheConfiguration cfg1 = new CacheConfiguration<>("cache-1") > .setBackups(1) > .setGroupName(grpName); > CacheConfiguration cfg2 = new CacheConfiguration<>("cache-2") > .setBackups(1) > .setExpiryPolicyFactory(AccessedExpiryPolicy.factoryOf(new > Duration(TimeUnit.SECONDS, 1))) > .setGroupName(grpName); > {code} > The following code takes into account only the first cache configuration for > a particular cache group: > {code:java|title=GridDhtPartitionsExchangeFuture#validatePartitionsState()} > CacheGroupContext grpCtx = cctx.cache().cacheGroup(grpDesc.groupId()); > ... > // Do not validate read or write through caches or caches with disabled > rebalance > // or ExpiryPolicy is set or validation is disabled. > boolean eternalExpiryPolicy = grpCtx != null && > (grpCtx.config().getExpiryPolicyFactory() == null > || grpCtx.config().getExpiryPolicyFactory().create() instanceof > EternalExpiryPolicy); > > if (grpCtx == null > ... > || !eternalExpiryPolicy) > return null; // It means that validation should not be triggered. > {code} > The obvious way to fix the issue is to check all the configurations included > in the cache group as follows: > {code:java|title=GridDhtPartitionsExchangeFuture#validatePartitionsState()} > CacheGroupContext grpCtx = cctx.cache().cacheGroup(grpDesc.groupId()); > ... > boolean customExpiryPolicy = Optional.ofNullable(grpCtx) > .map((v) -> v.caches()) > .orElseGet(() -> Collections.emptyList()) > .stream() > .anyMatch(ctx -> ctx.expiry() != null && !(ctx.expiry() instanceof > EternalExpiryPolicy)); > if (grpCtx == null > ... > || customExpiryPolicy) > return null; // It means that validation should not be triggered. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
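The stream-based check proposed above can be exercised in isolation (a self-contained sketch; `CacheCtx` below is a stand-in for Ignite's per-cache context, and the string-valued expiry policy is an assumption made for brevity):

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class ExpiryCheckSketch {
    /** Stand-in for a per-cache context exposing its expiry policy (null = none). */
    public record CacheCtx(String name, String expiryPolicy) { }

    /** True when any cache in the group configures a non-eternal expiry policy. */
    public static boolean hasCustomExpiry(List<CacheCtx> caches) {
        return Optional.ofNullable(caches)
            .orElseGet(Collections::emptyList)
            .stream()
            .anyMatch(c -> c.expiryPolicy() != null
                && !"ETERNAL".equals(c.expiryPolicy()));
    }

    public static void main(String[] args) {
        // Same shape as the report: one plain cache plus one with expiry,
        // sharing a group — the group as a whole must skip validation.
        List<CacheCtx> group = List.of(
            new CacheCtx("cache-1", null),            // no expiry policy
            new CacheCtx("cache-2", "ACCESSED_1S"));  // custom expiry policy
        if (!hasCustomExpiry(group))
            throw new AssertionError("group with custom expiry must be skipped");
        if (hasCustomExpiry(null))
            throw new AssertionError("null group context must not match");
    }
}
```

The point of iterating all caches rather than reading only `grpCtx.config()` is that the group-level config reflects just the first cache registered in the group.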
[jira] [Created] (IGNITE-12995) Transactions hang up if node fails
Stepachev Maksim created IGNITE-12995: - Summary: Transactions hang up if node fails Key: IGNITE-12995 URL: https://issues.apache.org/jira/browse/IGNITE-12995 Project: Ignite Issue Type: Bug Affects Versions: 2.8 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Currently, if a node fails while a transaction that requires it is in progress, the transaction hangs and is rolled back only after the timeout, although we already know it cannot be committed. We should roll back active transactions on node failure. Reproducer attached. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12220) Allow to use cache-related permissions both at system and per-cache levels
[ https://issues.apache.org/jira/browse/IGNITE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17065393#comment-17065393 ] Stepachev Maksim commented on IGNITE-12220: --- The code LGTM again, but I added comments about style; please fix them. > Allow to use cache-related permissions both at system and per-cache levels > -- > > Key: IGNITE-12220 > URL: https://issues.apache.org/jira/browse/IGNITE-12220 > Project: Ignite > Issue Type: Task > Components: security >Affects Versions: 2.7.6 >Reporter: Andrey Kuznetsov >Assignee: Sergei Ryzhov >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > Currently, {{CACHE_CREATE}} and {{CACHE_DESTROY}} permissions are enforced to > be system-level permissions, see for instance > {{SecurityPermissionSetBuilder#appendCachePermissions}}. This looks > inflexible: Ignite Security implementations are not able to manage cache > creation and deletion permissions on per-cache basis (unlike get/put/remove > permissions). All such limitations should be found and removed in order to > allow all {{CACHE_*}} permissions to be set both at system and per-cache > levels. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13578) Update ignite-kafka dependencies to get rid of reported CVEs
[ https://issues.apache.org/jira/browse/IGNITE-13578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213904#comment-17213904 ] Stepachev Maksim commented on IGNITE-13578: --- These tests have a ~57% failure rate. > Update ignite-kafka dependencies to get rid of reported CVEs > > > Key: IGNITE-13578 > URL: https://issues.apache.org/jira/browse/IGNITE-13578 > Project: Ignite > Issue Type: Bug > Components: integrations >Reporter: Stepachev Maksim >Assignee: Stepachev Maksim >Priority: Major > Fix For: 2.10 > > Time Spent: 10m > Remaining Estimate: 0h > > The libraries to update: > connect-api-2.1.1.jar > kafka-clients-2.1.1.jar -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13578) Update ignite-kafka dependencies to get rid of reported CVEs
Stepachev Maksim created IGNITE-13578: - Summary: Update ignite-kafka dependencies to get rid of reported CVEs Key: IGNITE-13578 URL: https://issues.apache.org/jira/browse/IGNITE-13578 Project: Ignite Issue Type: Bug Components: integrations Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.10 The libraries to update: connect-api-2.1.1.jar kafka-clients-2.1.1.jar -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13349) Migrate TcpDiscoveryStatistics to new metrics framework
[ https://issues.apache.org/jira/browse/IGNITE-13349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178891#comment-17178891 ] Stepachev Maksim commented on IGNITE-13349: --- [~nizhikov] Hi. I haven't read the mailing-list thread about this feature. It looks fine to me. I added comments in the PR; please check them and feel free to merge. > Migrate TcpDiscoveryStatistics to new metrics framework > --- > > Key: IGNITE-13349 > URL: https://issues.apache.org/jira/browse/IGNITE-13349 > Project: Ignite > Issue Type: Improvement >Reporter: Nikolay Izhikov >Assignee: Nikolay Izhikov >Priority: Major > Labels: IEP-35 > Fix For: 2.10 > > Time Spent: 0.5h > Remaining Estimate: 0h > > TcpDiscoveryStatistics should be migrated to the new metrics framework. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13499) Reconnected node sends old id as security subject id
Stepachev Maksim created IGNITE-13499: - Summary: Reconnected node sends old id as security subject id Key: IGNITE-13499 URL: https://issues.apache.org/jira/browse/IGNITE-13499 Project: Ignite Issue Type: Bug Components: security Affects Versions: 2.8.1 Reporter: Stepachev Maksim Assignee: Stepachev Maksim Fix For: 2.9 After reconnecting, a client node sends its old security subject id. We must invalidate the internal structures. -- This message was sent by Atlassian Jira (v8.3.4#803005)