Re: Ignite High memory usage though very less disk size

2021-08-02 Thread Denis Magda
To collect what Zhenya is asking for and speed up the troubleshooting,
you can set up simple monitoring:
https://www.gridgain.com/docs/tutorials/management-monitoring/ignite-storage-monitoring


-
Denis

On Fri, Jul 30, 2021 at 2:32 PM Zhenya Stanilovsky wrote:

> Hi Devakumar J,
> There is not enough information for analysis.
> Do you have any monitoring? If not, please enable it and try to understand
> how high CPU consumption and possible GC pauses correlate with your tasks.
> Do you have enough heap (-Xmx parameter)? What kinds of processes consume
> the most heap?
> Without this information we can't move forward with the analysis.
>
> Thanks!
>
>
> Hi,
>
> We run a 3-server + 2-client cluster setup, and we have 2 completely
> separate clusters for different regions.
>
> Both have a similar set of integrations in terms of SQL queries, CQ
> listeners, and client connections.
>
> The VM hardware and OS settings are also the same.
>
> In cluster 1, though we have 20 GB on disk, the cluster performance is
> really good and heap and CPU usage are optimal.
>
> In cluster 2 we have less data on disk, but there are heavy
> fluctuations in heap usage and a lot of full GCs, pausing the JVM for
> 7 to 8 seconds every minute. Only a restart helps in this case.
>
>
> The only difference we noticed between the machines is page cache
> utilization. We cleaned up the page cache and restarted the cluster, and
> page cache utilization reached 105 GB out of 126 GB RAM within a day.
>
> Please find the metrics below and suggest any debugging steps to carry
> out or documentation to refer to.
>
>
> Cluster 1:
>
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=27529ecd, name=server-node-3, uptime=2 days, 17:48:09.803]
> ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
> ^-- CPU [cur=1.33%, avg=3.05%, GC=0%]
> ^-- PageMemory [pages=3375870]
> ^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
> ^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
> ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^--   metastoreMemPlc region [used=0MB, free=99.85%, comm=0MB]
> ^--   TxLog region [used=0MB, free=100%, comm=100MB]
> ^--   DefaultRegion region [used=13341MB, free=18.57%, comm=16384MB]
> ^-- Ignite persistence [used=20052MB]
> ^--   sysMemPlc region [used=0MB]
> ^--   metastoreMemPlc region [used=0MB]
> ^--   TxLog region [used=0MB]
> ^--   DefaultRegion region [used=20052MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=7, qSize=0]
>
> Cluster 2:
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=5905afb7, name=server-node-1, uptime=2 days, 05:49:04.925]
> ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
> ^-- CPU [cur=1.23%, avg=6.4%, GC=0%]
> ^-- PageMemory [pages=1173731]
> ^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
> ^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
> ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
> ^--   metastoreMemPlc region [used=0MB, free=99.91%, comm=0MB]
> ^--   TxLog region [used=0MB, free=100%, comm=100MB]
> ^--   DefaultRegion region [used=4638MB, free=71.69%, comm=16384MB]
> ^-- Ignite persistence [used=5423MB]
> ^--   sysMemPlc region [used=0MB]
> ^--   metastoreMemPlc region [used=0MB]
> ^--   TxLog region [used=0MB]
> ^--   DefaultRegion region [used=5422MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=5, qSize=0]
>
>
> Thanks & Regards,
> Devakumar J
>


Re: Ignite High memory usage though very less disk size

2021-07-30 Thread Zhenya Stanilovsky

Hi Devakumar J,
There is not enough information for analysis.
Do you have any monitoring? If not, please enable it and try to understand how
high CPU consumption and possible GC pauses correlate with your tasks.
Do you have enough heap (-Xmx parameter)? What kinds of processes consume the
most heap?
Without this information we can't move forward with the analysis.

Thanks!
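
As a rough first pass at the heap and GC questions, the figures already posted in this thread can be turned into two estimates: the maximum heap implied by each 'Heap' metrics line, and the share of wall-clock time lost to the reported 7 to 8 second pauses per minute. This is only a sketch. It assumes the 'free' percentage on the 'Heap' line is measured against the maximum heap (the cluster 2 line, where used/comm gives the same percentage, is consistent with that, but it is an assumption), and the function names are illustrative.

```python
def implied_max_heap_mb(used_mb: float, free_pct: float) -> float:
    """Max heap implied by 'Heap [used=..., free=...%]'.

    Assumption: free% = (max - used) / max * 100.
    """
    return used_mb / (1.0 - free_pct / 100.0)


def pause_overhead_pct(pause_s: float, window_s: float = 60.0) -> float:
    """Percentage of a time window spent in stop-the-world pauses."""
    return 100.0 * pause_s / window_s


if __name__ == "__main__":
    # (used MB, free %) copied from the 'Heap' lines in the posted metrics.
    heap_lines = {"cluster 1": (4372, 73.31), "cluster 2": (13043, 20.39)}
    for name, (used, free) in heap_lines.items():
        print(f"{name}: implied max heap ~{implied_max_heap_mb(used, free):.0f} MB")

    # "7 to 8 secs every minute", as reported for cluster 2.
    for pause in (7, 8):
        print(f"{pause}s pause per minute = {pause_overhead_pct(pause):.1f}% of wall-clock time")
```

If the assumption holds, both nodes have roughly the same maximum heap, so the difference would lie in how much of it is occupied rather than in the -Xmx setting. GC logs or a heap histogram would still be needed to confirm what is filling it.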
 
>Hi,
>
>We run a 3-server + 2-client cluster setup, and we have 2 completely
>separate clusters for different regions.
>
>Both have a similar set of integrations in terms of SQL queries, CQ
>listeners, and client connections.
>
>The VM hardware and OS settings are also the same.
>
>In cluster 1, though we have 20 GB on disk, the cluster performance is
>really good and heap and CPU usage are optimal.
>
>In cluster 2 we have less data on disk, but there are heavy
>fluctuations in heap usage and a lot of full GCs, pausing the JVM for
>7 to 8 seconds every minute. Only a restart helps in this case.
>
>
>The only difference we noticed between the machines is page cache
>utilization. We cleaned up the page cache and restarted the cluster, and
>page cache utilization reached 105 GB out of 126 GB RAM within a day.
>
>Please find the metrics below and suggest any debugging steps to carry
>out or documentation to refer to.
>
>
>Cluster 1:
>
>Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>    ^-- Node [id=27529ecd, name=server-node-3, uptime=2 days, 17:48:09.803]
>    ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
>    ^-- CPU [cur=1.33%, avg=3.05%, GC=0%]
>    ^-- PageMemory [pages=3375870]
>    ^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
>    ^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
>    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
>    ^--   metastoreMemPlc region [used=0MB, free=99.85%, comm=0MB]
>    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
>    ^--   DefaultRegion region [used=13341MB, free=18.57%, comm=16384MB]
>    ^-- Ignite persistence [used=20052MB]
>    ^--   sysMemPlc region [used=0MB]
>    ^--   metastoreMemPlc region [used=0MB]
>    ^--   TxLog region [used=0MB]
>    ^--   DefaultRegion region [used=20052MB]
>    ^-- Outbound messages queue [size=0]
>    ^-- Public thread pool [active=0, idle=0, qSize=0]
>    ^-- System thread pool [active=0, idle=7, qSize=0]
>
>Cluster 2:
>Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>    ^-- Node [id=5905afb7, name=server-node-1, uptime=2 days, 05:49:04.925]
>    ^-- H/N/C [hosts=3, nodes=5, CPUs=24]
>    ^-- CPU [cur=1.23%, avg=6.4%, GC=0%]
>    ^-- PageMemory [pages=1173731]
>    ^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
>    ^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
>    ^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
>    ^--   metastoreMemPlc region [used=0MB, free=99.91%, comm=0MB]
>    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
>    ^--   DefaultRegion region [used=4638MB, free=71.69%, comm=16384MB]
>    ^-- Ignite persistence [used=5423MB]
>    ^--   sysMemPlc region [used=0MB]
>    ^--   metastoreMemPlc region [used=0MB]
>    ^--   TxLog region [used=0MB]
>    ^--   DefaultRegion region [used=5422MB]
>    ^-- Outbound messages queue [size=0]
>    ^-- Public thread pool [active=0, idle=0, qSize=0]
>    ^-- System thread pool [active=0, idle=5, qSize=0]
>
>Thanks & Regards,
>Devakumar J

Ignite High memory usage though very less disk size

2021-07-29 Thread DK
Hi,

We run a 3-server + 2-client cluster setup, and we have 2 completely
separate clusters for different regions.

Both have a similar set of integrations in terms of SQL queries, CQ
listeners, and client connections.

The VM hardware and OS settings are also the same.

In cluster 1, though we have 20 GB on disk, the cluster performance is
really good and heap and CPU usage are optimal.

In cluster 2 we have less data on disk, but there are heavy
fluctuations in heap usage and a lot of full GCs, pausing the JVM for
7 to 8 seconds every minute. Only a restart helps in this case.


The only difference we noticed between the machines is page cache
utilization. We cleaned up the page cache and restarted the cluster, and
page cache utilization reached 105 GB out of 126 GB RAM within a day.

Please find the metrics below and suggest any debugging steps to carry
out or documentation to refer to.


Cluster 1:

Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=27529ecd, name=server-node-3, uptime=2 days, 17:48:09.803]
^-- H/N/C [hosts=3, nodes=5, CPUs=24]
^-- CPU [cur=1.33%, avg=3.05%, GC=0%]
^-- PageMemory [pages=3375870]
^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
^--   metastoreMemPlc region [used=0MB, free=99.85%, comm=0MB]
^--   TxLog region [used=0MB, free=100%, comm=100MB]
^--   DefaultRegion region [used=13341MB, free=18.57%, comm=16384MB]
^-- Ignite persistence [used=20052MB]
^--   sysMemPlc region [used=0MB]
^--   metastoreMemPlc region [used=0MB]
^--   TxLog region [used=0MB]
^--   DefaultRegion region [used=20052MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=7, qSize=0]

Cluster 2:
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=5905afb7, name=server-node-1, uptime=2 days, 05:49:04.925]
^-- H/N/C [hosts=3, nodes=5, CPUs=24]
^-- CPU [cur=1.23%, avg=6.4%, GC=0%]
^-- PageMemory [pages=1173731]
^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
^--   sysMemPlc region [used=0MB, free=99.99%, comm=100MB]
^--   metastoreMemPlc region [used=0MB, free=99.91%, comm=0MB]
^--   TxLog region [used=0MB, free=100%, comm=100MB]
^--   DefaultRegion region [used=4638MB, free=71.69%, comm=16384MB]
^-- Ignite persistence [used=5423MB]
^--   sysMemPlc region [used=0MB]
^--   metastoreMemPlc region [used=0MB]
^--   TxLog region [used=0MB]
^--   DefaultRegion region [used=5422MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=5, qSize=0]


Thanks & Regards,
Devakumar J
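
To compare the two metrics dumps above side by side, a small script can pull the top-level 'used' figures out of the log text. The following is a minimal sketch, not an Ignite tool: the regular expression and the helper name used_mb are made up for illustration, and the sample strings contain only the three top-level lines copied from each dump.

```python
import re

# Matches only the top-level lines, e.g. "^-- Heap [used=4372MB, ...".
# Indented per-region lines ("^--   DefaultRegion region ...") are skipped.
USED_RE = re.compile(r"\^--\s+(Heap|Off-heap|Ignite persistence)\s+\[used=(\d+)MB")


def used_mb(metrics_text: str) -> dict:
    """Return {section name: used MB} for the top-level metrics lines."""
    return {name: int(mb) for name, mb in USED_RE.findall(metrics_text)}


CLUSTER_1 = """
^-- Heap [used=4372MB, free=73.31%, comm=5600MB]
^-- Off-heap [used=13341MB, free=20.03%, comm=16584MB]
^-- Ignite persistence [used=20052MB]
"""

CLUSTER_2 = """
^-- Heap [used=13043MB, free=20.39%, comm=16384MB]
^-- Off-heap [used=4638MB, free=72.2%, comm=16584MB]
^-- Ignite persistence [used=5423MB]
"""

if __name__ == "__main__":
    c1, c2 = used_mb(CLUSTER_1), used_mb(CLUSTER_2)
    for section in ("Heap", "Off-heap", "Ignite persistence"):
        ratio = c2[section] / c1[section]
        print(f"{section:20s} c1={c1[section]:6d}MB  c2={c2[section]:6d}MB  c2/c1={ratio:.2f}")
```

On these numbers, cluster 2 uses roughly three times the heap of cluster 1 while holding roughly a quarter of the persisted data, which is the mismatch this thread is about. The script only restates the posted figures; it does not explain where the extra heap goes.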

