[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-08 Thread Maxim Muzafarov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679644#comment-16679644
 ] 

Maxim Muzafarov commented on IGNITE-8469:
-

[~mshonichev]

If you are checking the 2.7 release branch, it seems to me that it is not 
possible to run a different cluster on the same PDS files (I don't know whether 
this is correct). That was my case.

> Non-heap memory leak for calling cluster activation multi times
> ---
>
> Key: IGNITE-8469
> URL: https://issues.apache.org/jira/browse/IGNITE-8469
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
> Fix For: 2.7
>
>
> Calling cluster activation {{ig3CB.cluster().active(true);}} multiple times 
> (with persistence enabled and client nodes started) leads to a non-heap 
> memory leak.
> Line 
> {{org/apache/ignite/internal/pagemem/impl/PageMemoryNoStoreImpl.java:234}} 
> looks suspicious: if the method 
> {{org.apache.ignite.internal.pagemem.impl.PageMemoryNoStoreImpl#start}} 
> is called multiple times (e.g. activate(true) is called multiple times), we 
> lose the information about the allocated regions.
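
The suspicion described above (a repeated start() losing track of regions that were already allocated) can be illustrated with a minimal, self-contained Java sketch. It is not the Ignite code and not the actual patch: {{RegionLeakSketch}}, {{Allocator}}, {{LeakyMemory}} and {{GuardedMemory}} are hypothetical names, and the allocator only tracks segment ids instead of touching off-heap memory.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class RegionLeakSketch {
    /** Hypothetical allocator: remembers which segment ids are still unreleased. */
    static final class Allocator {
        private long nextId = 1;
        private final Set<Long> live = new HashSet<>();

        long allocate() { long id = nextId++; live.add(id); return id; }
        void free(long id) { live.remove(id); }
        int liveCount() { return live.size(); }
    }

    /** Leaky variant: each start() replaces the list, so earlier ids can never be freed. */
    static final class LeakyMemory {
        private final Allocator alloc;
        private List<Long> segments;

        LeakyMemory(Allocator alloc) { this.alloc = alloc; }

        void start() {
            segments = new ArrayList<>(); // the previous list and its ids are dropped here
            segments.add(alloc.allocate());
        }

        void stop() {
            if (segments == null) return;
            for (long id : segments) alloc.free(id);
            segments = null;
        }
    }

    /** Guarded variant (an assumption, not the real fix): repeated start() is a no-op. */
    static final class GuardedMemory {
        private final Allocator alloc;
        private List<Long> segments;
        private boolean started;

        GuardedMemory(Allocator alloc) { this.alloc = alloc; }

        void start() {
            if (started) return; // already running: keep the existing segments
            started = true;
            segments = new ArrayList<>();
            segments.add(alloc.allocate());
        }

        void stop() {
            if (!started) return;
            for (long id : segments) alloc.free(id);
            segments = null;
            started = false;
        }
    }

    /** Calls start() the given number of times, then stop() once; returns unreleased segments. */
    static int leakedAfterRepeatedStart(boolean guarded, int starts) {
        Allocator alloc = new Allocator();
        if (guarded) {
            GuardedMemory mem = new GuardedMemory(alloc);
            for (int i = 0; i < starts; i++) mem.start();
            mem.stop();
        } else {
            LeakyMemory mem = new LeakyMemory(alloc);
            for (int i = 0; i < starts; i++) mem.start();
            mem.stop();
        }
        return alloc.liveCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky:   " + leakedAfterRepeatedStart(false, 3));
        System.out.println("guarded: " + leakedAfterRepeatedStart(true, 3));
    }
}
```

With three start() calls followed by one stop(), the leaky variant leaves two segments unreleased and the guarded variant leaves none. A guard flag is only one way to avoid the overwrite; releasing the previous regions before reallocating would be another.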



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-07 Thread Max Shonichev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678452#comment-16678452
 ] 

Max Shonichev commented on IGNITE-8469:
---

It seems that the scenario with two grids on the same PDS is no longer 
reproducible: the second node just refuses to start and the process dies, thus 
no leaks, right? :)

{noformat}
[19:12:05,058][SEVERE][main][IgniteKernal] Got exception while starting (will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1784)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1008)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2020)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1725)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1153)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1071)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:957)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:856)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:726)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:695)
    at org.apache.ignite.Ignition.start(Ignition.java:348)
    at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:301)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to acquire file lock during 10 sec, (locked by [ba1467ea-0180-4f18-8151-973cb56ef4a7][]): /opt/tmp/gg/db/ign_8469/lock
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$FileLockHolder.tryLock(GridCacheDatabaseSharedManager.java:4303)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.acquireFileLock(GridCacheDatabaseSharedManager.java:593)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.start0(GridCacheDatabaseSharedManager.java:496)
    at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:61)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:741)
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1781)
    ... 11 more
[19:12:05,059][WARNING][main][IgniteKernal] Attempt to stop starting grid. This operation cannot be guaranteed to be successful.
[19:12:05,070][INFO][main][FilePageStoreManager] Cleanup cache stores [total=0, left=0, cleanFiles=false]
[19:12:05,076][INFO][main][IgniteKernal] 

>>> +--+
>>> Ignite ver.  
>>> stopped OK
>>> +--+
>>> Grid uptime: 00:00:11.197
{noformat}

[~Mmuzaf] was this your case? Or did you start two Ignition instances manually 
within one JVM?
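
The {{Failed to acquire file lock during 10 sec}} error in the log above points at one way a 'lock exists' condition can arise: another holder already owns the lock file. The sketch below reproduces that condition with plain {{java.nio}} inside a single JVM. It is an analogue under stated assumptions, not Ignite's {{FileLockHolder}}: the class {{LockFileSketch}}, the temp-file name, the 200 ms timeout and the 50 ms retry interval are all made up for illustration.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class LockFileSketch {
    /** Retries tryLock() until the timeout elapses; returns null if the lock stays busy. */
    static FileLock tryLock(FileChannel ch, long timeoutMs)
            throws IOException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        do {
            try {
                FileLock lock = ch.tryLock();
                if (lock != null) return lock; // null means another process holds the lock
            } catch (OverlappingFileLockException e) {
                // The lock is held through another channel in this JVM: treat as busy.
            }
            Thread.sleep(50);
        } while (System.nanoTime() < deadline);
        return null;
    }

    /** Two holders compete for one lock file; returns {firstAcquired, secondAcquired}. */
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("lock_sketch", ".lock");
            try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE)) {
                FileLock held = tryLock(first, 200);
                FileLock refused = tryLock(second, 200); // expected to time out
                boolean[] result = { held != null, refused != null };
                if (refused != null) refused.release();
                if (held != null) held.release();
                return result;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first holder acquired:  " + r[0]);
        System.out.println("second holder acquired: " + r[1]);
    }
}
```

Within one JVM the second attempt surfaces as an {{OverlappingFileLockException}}, whereas a lock held by a separate process makes {{tryLock()}} return null; the sketch treats both as "busy".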



[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-07 Thread Max Shonichev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678378#comment-16678378
 ] 

Max Shonichev commented on IGNITE-8469:
---

Thank you for the feedback. I'll try to test the scenario with two clusters as 
well and report later.



[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-07 Thread Maxim Muzafarov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678367#comment-16678367
 ] 

Maxim Muzafarov commented on IGNITE-8469:
-

[~mshonichev]

Hello! Thanks for testing this issue. 
The reproducer was added to the test scope and merged with the PR. Please see 
{{PageMemoryNoStoreLeakTest}}.

As a high-level example of this memory leak, you can look at 
{{IgniteChangeGlobalStateTest.testActivateAfterFailGetLock}}. If we run it 100+ 
times on TC without this fix, it will fail with OOM. I don't remember all the 
details of the memory leak scenario, but you should configure the backup cluster 
(with or without client nodes), and it should fail on activation while the main 
cluster is online. This is the source of the memory leak.

I've done several tests with and without this fix and it seems to me that 
everything works fine.




[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-06 Thread Max Shonichev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677009#comment-16677009
 ] 

Max Shonichev commented on IGNITE-8469:
---

OK, here are the final results for the scenario with 100 activate/deactivate 
iterations:

Without the fix, total process memory slowly increases, approximately every 20 
iterations:
{noformat}
10298776
...
10283356
10284384
...
10286432
{noformat}

With the fix, total process memory changes a little during the first few runs, 
then remains constant for all 90+ iterations:
{noformat}
14501520
10099096
10232216
10232216
10297752
...
10297752
{noformat}

So it seems that this scenario also shows that the memory leak is fixed. 
[~Mmuzaf], please confirm.
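
The comparison above can also be checked mechanically. The sketch below is a hypothetical helper, not something from the Ignite test suite. It uses only the samples actually listed in this comment (the values hidden behind {{...}} are not reconstructed, and the unit is not stated) and tests whether a series is flat after a warm-up or still climbing.

```java
final class MemoryTrendSketch {
    /** Samples listed for the run with the fix; elided values are simply absent. */
    static final long[] WITH_FIX =
        { 14501520L, 10099096L, 10232216L, 10232216L, 10297752L, 10297752L };

    /** Samples listed for the run without the fix; elided values are simply absent. */
    static final long[] WITHOUT_FIX =
        { 10298776L, 10283356L, 10284384L, 10286432L };

    /** True if every sample from index warmup onward equals the last sample. */
    static boolean isFlatAfterWarmup(long[] samples, int warmup) {
        long last = samples[samples.length - 1];
        for (int i = warmup; i < samples.length; i++) {
            if (samples[i] != last) return false;
        }
        return true;
    }

    /** True if the samples from index from onward are strictly increasing. */
    static boolean isStrictlyIncreasing(long[] samples, int from) {
        for (int i = from + 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The warm-up lengths (4 and 1) are assumptions chosen to match the listed samples.
        System.out.println("with fix, flat after warm-up:     " + isFlatAfterWarmup(WITH_FIX, 4));
        System.out.println("without fix, flat after warm-up:  " + isFlatAfterWarmup(WITHOUT_FIX, 1));
        System.out.println("without fix, still increasing:    " + isStrictlyIncreasing(WITHOUT_FIX, 1));
    }
}
```

Because most samples are elided in the comment, this only confirms that the listed values are consistent with the stated conclusion; it does not prove the trend over all 100 iterations.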



[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-06 Thread Max Shonichev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676934#comment-16676934
 ] 

Max Shonichev commented on IGNITE-8469:
---

Possibly I'm seeing a different kind of leak. Regarding this scenario:
{quote}Global scenario for this leak is:

1. Start topology with client nodes, persistence enabled
2. Set active(true) for cluster. This activation should fail due to some 
circumstances (e.g. some locks exist)
3. IgniteCacheDatabaseSharedManager started and onActive method called here. New 
memory segment allocated for client node
4. Set active(true) again. Activation successful, non-heap memory leak 
introduced{quote}

1. Should I start client nodes with persistence enabled?!
2. How do I achieve the 'locks exist' condition to make activation fail?
3. Should IgniteCacheDatabaseSharedManager be started manually, or did you mean 
that it is automagically called under the hood upon active(true)?




[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-11-06 Thread Max Shonichev (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676922#comment-16676922
 ] 

Max Shonichev commented on IGNITE-8469:
---

[~Mmuzaf] what is the exact reproducer for this issue? I'm executing 100 
iterations of activate-deactivate and see a small increase in the total pages 
sum per process for both the fixed and the original versions.



[jira] [Commented] (IGNITE-8469) Non-heap memory leak for calling cluster activation multi times

2018-05-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-8469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16482554#comment-16482554
 ] 

ASF GitHub Bot commented on IGNITE-8469:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3986


> Non-heap memory leak for calling cluster activation multi times
> ---
>
> Key: IGNITE-8469
> URL: https://issues.apache.org/jira/browse/IGNITE-8469
> Project: Ignite
>  Issue Type: Bug
>  Components: persistence
>Reporter: Maxim Muzafarov
>Assignee: Maxim Muzafarov
>Priority: Major
> Fix For: 2.6