[jira] [Comment Edited] (IGNITE-16136) System Thread pool starvation and out of memory

2022-08-16 Thread David Albrecht (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580356#comment-17580356
 ] 

David Albrecht edited comment on IGNITE-16136 at 8/16/22 3:03 PM:
--

Yes, we are using thick clients (Spring applications running in Tomcat 
instances) with a single server node. *But we do not use continuous queries, 
at least not intentionally.*

So far, the problem has occurred on three different systems, each in a 
different application acting as an Ignite node. Unfortunately, we do not know 
how to reproduce the problem. 

We investigated two of the systems according to your questions and the above 
command returned:

System 1:
ignite\work\db\marshaller = 125 Files

System 2:
ignite\work\db\marshaller = 124 Files

We define the configuration using Spring configuration and properties. You can 
find the corresponding files attached.

 [^configuration.zip] 

Please let me know if that's sufficient to proceed.


was (Author: sawfish):
Yes, we are using thick clients (Spring applications running in Tomcat 
instances). *But we do not use continuous queries, at least not intentionally.*

So far, the problem has occurred on three different systems, each in a 
different application acting as an Ignite node. Unfortunately, we do not know 
how to reproduce the problem. 

We investigated two of the systems according to your questions and the above 
command returned:

System 1:
ignite\work\db\marshaller = 125 Files

System 2:
ignite\work\db\marshaller = 124 Files

We define the configuration using Spring configuration and properties. You can 
find the corresponding files attached.

 [^configuration.zip] 

Please let me know if that's sufficient to proceed.

> System Thread pool starvation and out of memory
> ---
>
> Key: IGNITE-16136
> URL: https://issues.apache.org/jira/browse/IGNITE-16136
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.7.6
> Reporter: David Albrecht
> Assignee: Maxim Muzafarov
> Priority: Critical
> Labels: ise
> Fix For: 2.14
>
> Attachments: configuration.zip, image-2021-12-15-21-13-43-775.png, 
> image-2021-12-15-21-17-47-652.png
>
>
> We are experiencing thread pool starvation and, after some time, 
> out-of-memory exceptions in some of our Ignite client nodes, while the server 
> node seems to be running without any problems. It seems that all sys threads 
> are stuck when calling MarshallerContextImpl.getClassName, which in turn 
> leads to a growing worker queue.
>  
> First warnings regarding the thread pool starvation:
> {code:java}
> 10.12.21 11:22:34.603 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:27:34.654 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:32:34.713 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:37:34.764 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:42:34.796 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:47:34.839 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> {code}
> Out of memory error leading to a crash of the application:
> {code}
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "https-openssl-nio-16443-ClientPoller"
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "ajp-nio-16009-ClientPoller"
> 11-Dec-2021 03:07:24.446 SEVERE [Catalina-utility-1] 
> org.apache.coyote.AbstractProtocol.startAsyncTimeout Error processing async 
> timeouts
>   java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: 
> Java heap space
> {code}
> The queue full of messages:
>  !image-2021-12-15-21-17-47-652.png! 
> It seems like all sys threads are stuck while waiting at:
> {code}
> sys-#170
>   at jdk.internal.misc.Unsafe.park(ZJ)V (Native Method)
>   at java.util.concurrent.locks.LockSupport.park()V 

[jira] [Comment Edited] (IGNITE-16136) System Thread pool starvation and out of memory

2022-08-16 Thread David Albrecht (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17580356#comment-17580356
 ] 

David Albrecht edited comment on IGNITE-16136 at 8/16/22 3:03 PM:
--

Yes, we are using thick clients (Spring applications running in Tomcat 
instances) with a single server node. *But we do not use continuous queries, 
at least not intentionally.*

So far, the problem has occurred on three different systems, each in a 
different application acting as an Ignite client node. Unfortunately, we do 
not know how to reproduce the problem. 

We investigated two of the systems according to your questions and the above 
command returned:

System 1:
ignite\work\db\marshaller = 125 Files

System 2:
ignite\work\db\marshaller = 124 Files

We define the configuration using Spring configuration and properties. You can 
find the corresponding files attached.

 [^configuration.zip] 

Please let me know if that's sufficient to proceed.


was (Author: sawfish):
Yes, we are using thick clients (Spring applications running in Tomcat 
instances) with a single server node. *But we do not use continuous queries, 
at least not intentionally.*

So far, the problem has occurred on three different systems, each in a 
different application acting as an Ignite node. Unfortunately, we do not know 
how to reproduce the problem. 

We investigated two of the systems according to your questions and the above 
command returned:

System 1:
ignite\work\db\marshaller = 125 Files

System 2:
ignite\work\db\marshaller = 124 Files

We define the configuration using Spring configuration and properties. You can 
find the corresponding files attached.

 [^configuration.zip] 

Please let me know if that's sufficient to proceed.


[jira] [Comment Edited] (IGNITE-16136) System Thread pool starvation and out of memory

2022-08-15 Thread Maxim Muzafarov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579835#comment-17579835
 ] 

Maxim Muzafarov edited comment on IGNITE-16136 at 8/15/22 6:06 PM:
---

Please provide additional information about the number of registered binary 
metadata types on the server nodes.
{code:java}
ls -la /ignite/work/db/marshaller/ | wc -l 
{code}
or, since 2.10:
{code:java}
select count(*) from SYS.BINARY_METADATA
{code}
[BINARY_METADATA|https://ignite.apache.org/docs/latest/monitoring-metrics/system-views#binary_metadata]
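
On hosts where {{ls}} and {{wc}} are not available (for example Windows), the same count can be taken with a few lines of Java. This is only a minimal sketch: the class name and the default directory {{ignite/work/db/marshaller}} are assumptions for illustration, so pass the real work directory as the first argument. Note that {{ls -la | wc -l}} also counts the {{total}}, {{.}} and {{..}} lines, whereas this sketch counts regular files only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Ignite: counts the files in the marshaller directory.
class MarshallerFileCount {

    /** Counts regular files directly inside {@code dir} (no recursion, subdirectories ignored). */
    static long countFiles(Path dir) throws IOException {
        // Files.list returns a lazily populated stream that must be closed.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; override it with the first command-line argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "ignite/work/db/marshaller");

        if (!Files.isDirectory(dir)) {
            System.err.println("Not a directory: " + dir);
            return;
        }

        System.out.println(dir + " = " + countFiles(dir) + " files");
    }
}
```

With Java 11 or later this can be run directly from the source file, e.g. {{java MarshallerFileCount.java <path-to-marshaller-dir>}}.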


was (Author: mmuzaf):
Please provide additional information about the number of registered binary 
metadata types on the server nodes.
{code:java}
ls -la /ignite/work/db/marshaller/ | wc -l 
{code}
or, since 2.12:
{code:java}
select count(*) from SYS.BINARY_METADATA
{code}
[BINARY_METADATA|https://ignite.apache.org/docs/latest/monitoring-metrics/system-views#binary_metadata]

> System Thread pool starvation and out of memory
> ---
>
> Key: IGNITE-16136
> URL: https://issues.apache.org/jira/browse/IGNITE-16136
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.7.6
> Reporter: David Albrecht
> Priority: Critical
> Labels: ise
> Fix For: 2.14
>
> Attachments: image-2021-12-15-21-13-43-775.png, 
> image-2021-12-15-21-17-47-652.png
>
>
> We are experiencing thread pool starvation and, after some time, 
> out-of-memory exceptions in some of our Ignite client nodes, while the server 
> node seems to be running without any problems. It seems that all sys threads 
> are stuck when calling MarshallerContextImpl.getClassName, which in turn 
> leads to a growing worker queue.
>  
> First warnings regarding the thread pool starvation:
> {code:java}
> 10.12.21 11:22:34.603 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:27:34.654 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:32:34.713 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:37:34.764 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:42:34.796 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> 10.12.21 11:47:34.839 [WARN ] 
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 3ms, is system thread pool size large enough?)
> {code}
> Out of memory error leading to a crash of the application:
> {code}
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "https-openssl-nio-16443-ClientPoller"
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "ajp-nio-16009-ClientPoller"
> 11-Dec-2021 03:07:24.446 SEVERE [Catalina-utility-1] 
> org.apache.coyote.AbstractProtocol.startAsyncTimeout Error processing async 
> timeouts
>   java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: 
> Java heap space
> {code}
> The queue full of messages:
>  !image-2021-12-15-21-17-47-652.png! 
> It seems like all sys threads are stuck while waiting at:
> {code}
> sys-#170
>   at jdk.internal.misc.Unsafe.park(ZJ)V (Native Method)
>   at java.util.concurrent.locks.LockSupport.park()V (LockSupport.java:323)
>   at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(Z)Ljava/lang/Object; (GridFutureAdapter.java:178)
>   at org.apache.ignite.internal.util.future.GridFutureAdapter.get()Ljava/lang/Object; (GridFutureAdapter.java:141)
>   at org.apache.ignite.internal.MarshallerContextImpl.getClassName(BI)Ljava/lang/String; (MarshallerContextImpl.java:379)
>   at org.apache.ignite.internal.MarshallerContextImpl.getClass(ILjava/lang/ClassLoader;)Ljava/lang/Class; (MarshallerContextImpl.java:344)
>   at 
>