[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-02-09 Thread Anton Vinogradov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358389#comment-16358389
 ] 

Anton Vinogradov commented on IGNITE-7019:
--

[~cyberdemon], 

Please ping me once TC will be fully checked and I'll merge the changes. 

 

[~gvvinblade],

Thanks a lot for final review!

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-02-09 Thread Igor Seliverstov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358365#comment-16358365
 ] 

Igor Seliverstov commented on IGNITE-7019:
--

[~cyberdemon], looks OK

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-02-07 Thread Igor Seliverstov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355742#comment-16355742
 ] 

Igor Seliverstov commented on IGNITE-7019:
--

[~cyberdemon], I've left a couple of comments on github

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-01-29 Thread Dmitriy Sorokin (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343239#comment-16343239
 ] 

Dmitriy Sorokin commented on IGNITE-7019:
-

[~avinogradov],
Review my patch, please.

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-01-26 Thread Dmitriy Sorokin (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340947#comment-16340947
 ] 

Dmitriy Sorokin commented on IGNITE-7019:
-

Final solution which was coded is passing ReuseBag instance as parameter 
through PagesList's getPageForPut and addStripe methods to allocatePage method. 
That allows use ReuseBag's pages before trying to allocate new pages.

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2018-01-22 Thread Dmitriy Sorokin (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334085#comment-16334085
 ] 

Dmitriy Sorokin commented on IGNITE-7019:
-

We discussed possible solutions with [~mcherkasov] and [~avinogradov], and 
chose the following: first, when IOOME occured on page moving from bucket with 
lower index to higher one, we leave page on old bucket; second, we add 
periodical task for looking up such pages (placed on wrong buckets) and 
correcting its placement if possible (no IOOME on page moving).

Also we need reproducer for this bug, I'll make it at first.

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Assignee: Dmitriy Sorokin
>Priority: Critical
>  Labels: iep-7
> Fix For: 2.5
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> {code}
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> Discussion on the dev list: 
> http://apache-ignite-developers.2346864.n4.nabble.com/How-properly-handle-IgniteOOM-td25288.html



--
This message was sent by Atlassian JIRA

[jira] [Commented] (IGNITE-7019) Cluster can not survive after IgniteOOM

2017-12-13 Thread Denis Magda (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290065#comment-16290065
 ] 

Denis Magda commented on IGNITE-7019:
-

This problem is related to the discussion around Ignite internal problems and 
their possible resolution:
http://apache-ignite-developers.2346864.n4.nabble.com/Internal-problems-requiring-graceful-node-shutdown-reboot-etc-td24856.html

Referring to that discussion, I would define a special IgniteFailureAction in 
response to IgniteOOM (IgniteFailureCause in terms of the new API). The action 
can purge, wipe out the page memory or do another extra steps.

> Cluster can not survive after IgniteOOM
> ---
>
> Key: IGNITE-7019
> URL: https://issues.apache.org/jira/browse/IGNITE-7019
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.3
>Reporter: Mikhail Cherkasov
>Priority: Critical
> Fix For: 2.4
>
>
> even if we have full sync mode and transactional cache we can't add new nodes 
> if there  was IgniteOOM, after adding new nodes and re-balancing, old nodes 
> can't evict partitions:
> [2017-11-17 20:02:24,588][ERROR][sys-#65%DR1%][GridDhtPreloader] Partition 
> eviction failed, this can cause grid hang.
> class org.apache.ignite.internal.mem.IgniteOutOfMemoryException: Not enough 
> memory allocated [policyName=100MB_Region_Eviction, size=104.9 MB]
> Consider increasing memory policy size, enabling evictions, adding more nodes 
> to the cluster, reducing number of backups or reducing model size.
> at 
> org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl.allocatePage(PageMemoryNoStoreImpl.java:294)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePageNoReuse(DataStructure.java:117)
> at 
> org.apache.ignite.internal.processors.cache.persistence.DataStructure.allocatePage(DataStructure.java:105)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.addStripe(PagesList.java:413)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.getPageForPut(PagesList.java:528)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:617)
> at 
> org.apache.ignite.internal.processors.cache.persistence.freelist.FreeListImpl.addForRecycle(FreeListImpl.java:582)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.reuseFreePages(BPlusTree.java:3847)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.releaseAll(BPlusTree.java:4106)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Remove.access$6900(BPlusTree.java:3166)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doRemove(BPlusTree.java:1782)
> at 
> org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.remove(BPlusTree.java:1567)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.remove(IgniteCacheOffheapManagerImpl.java:1387)
> at 
> org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.remove(IgniteCacheOffheapManagerImpl.java:374)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.removeValue(GridCacheMapEntry.java:3233)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.clearInternal(GridDhtCacheEntry.java:588)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.clearAll(GridDhtLocalPartition.java:892)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.tryEvict(GridDhtLocalPartition.java:750)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:593)
> at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPreloader$3.call(GridDhtPreloader.java:580)
> at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6639)
> at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$2.body(GridClosureProcessor.java:967)
> at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)