[ 
https://issues.apache.org/jira/browse/IGNITE-16136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598573#comment-17598573
 ] 

Maxim Muzafarov commented on IGNITE-16136:
------------------------------------------

[~tledkov] 

You're right. The client node can't handle the mapping request because the 
SYSTEM_POOL is busy processing cache event messages. The issue described above 
applies to both marshaller mapping and binary metadata requests and leads to 
system thread pool starvation. 

On the client node, binary metadata and marshaller mappings are propagated 
through both the discovery and communication SPIs. If the client node is waiting 
for a reply from the server node and concurrently receives the required mappings 
from discovery messages, then we can safely release the waiting threads 
immediately, so no starvation will occur.
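
A minimal sketch of that idea, not the actual Ignite code: the class and 
method names below (MappingRegistry, onServerReply, onDiscoveryMessage) are 
hypothetical and only illustrate one shared future per type id that either 
delivery path may complete, so a waiting thread is released by whichever 
arrives first.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry of class-name mappings (illustration only, not Ignite's API). */
final class MappingRegistry {
    /** One shared future per type id; both delivery paths complete the same instance. */
    private final ConcurrentMap<Integer, CompletableFuture<String>> futs = new ConcurrentHashMap<>();

    /** Returns the shared future for the given type id, creating it if absent. */
    CompletableFuture<String> future(int typeId) {
        return futs.computeIfAbsent(typeId, id -> new CompletableFuture<>());
    }

    /** Assumed hook: the server node replied to an explicit mapping request. */
    void onServerReply(int typeId, String clsName) {
        // complete() is a no-op if the future was already completed by the other path.
        future(typeId).complete(clsName);
    }

    /** Assumed hook: the same mapping arrived in a discovery message. */
    void onDiscoveryMessage(int typeId, String clsName) {
        future(typeId).complete(clsName);
    }

    /** Blocks the caller until the mapping arrives by either path or the timeout expires. */
    String getClassName(int typeId, long timeoutMs) throws Exception {
        return future(typeId).get(timeoutMs, TimeUnit.MILLISECONDS);
    }
}
```

With this shape, a thread parked in getClassName no longer depends on the 
reply being processed by the same busy pool; a discovery message alone is 
enough to wake it.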

Moving the processing of such messages to a dedicated pool sounds reasonable to 
me, but it should be widely discussed with the whole Community. Currently, no 
such action is needed to fix the starvation.

> System Thread pool starvation and out of memory
> -----------------------------------------------
>
>                 Key: IGNITE-16136
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16136
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7.6
>            Reporter: David Albrecht
>            Assignee: Maxim Muzafarov
>            Priority: Critical
>              Labels: ise
>             Fix For: 2.14
>
>         Attachments: configuration.zip, image-2021-12-15-21-13-43-775.png, 
> image-2021-12-15-21-17-47-652.png
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> We are experiencing thread pool starvation and, after some time, out-of-memory 
> exceptions in some of our Ignite client nodes, while the server node seems to 
> be running without any problems. It seems like all sys threads are stuck when 
> calling MarshallerContextImpl.getClassName, which in turn leads to a growing 
> worker queue.
>  
> First warnings regarding the thread pool starvation:
> {code:java}
> 10.12.21 11:22:34.603 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> 10.12.21 11:27:34.654 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> 10.12.21 11:32:34.713 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> 10.12.21 11:37:34.764 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> 10.12.21 11:42:34.796 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> 10.12.21 11:47:34.839 [WARN ]                                         
> IgniteKernal.warning(127): Possible thread pool starvation detected (no task 
> completed in last 30000ms, is system thread pool size large enough?)
> {code}
> Out of memory error leading to a crash of the application:
> {code}
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "https-openssl-nio-16443-ClientPoller"
> Exception: java.lang.OutOfMemoryError thrown from the 
> UncaughtExceptionHandler in thread "ajp-nio-16009-ClientPoller"
> 11-Dec-2021 03:07:24.446 SEVERE [Catalina-utility-1] 
> org.apache.coyote.AbstractProtocol.startAsyncTimeout Error processing async 
> timeouts
>       java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: 
> Java heap space
> {code}
> The queue full of messages:
>  !image-2021-12-15-21-17-47-652.png! 
> It seems like all sys threads are stuck while waiting at:
> {code}
> sys-#170
>   at jdk.internal.misc.Unsafe.park(ZJ)V (Native Method)
>   at java.util.concurrent.locks.LockSupport.park()V (LockSupport.java:323)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(Z)Ljava/lang/Object;
>  (GridFutureAdapter.java:178)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get()Ljava/lang/Object;
>  (GridFutureAdapter.java:141)
>   at 
> org.apache.ignite.internal.MarshallerContextImpl.getClassName(BI)Ljava/lang/String;
>  (MarshallerContextImpl.java:379)
>   at 
> org.apache.ignite.internal.MarshallerContextImpl.getClass(ILjava/lang/ClassLoader;)Ljava/lang/Class;
>  (MarshallerContextImpl.java:344)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedMarshallerUtils.classDescriptor(Ljava/util/concurrent/ConcurrentMap;ILjava/lang/ClassLoader;Lorg/apache/ignite/marshaller/MarshallerContext;Lorg/apache/ignite/internal/marshaller/optimized/OptimizedMarshallerIdMapper;)Lorg/apache/ignite/internal/marshaller/optimized/OptimizedClassDescriptor;
>  (OptimizedMarshallerUtils.java:264)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:341)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:198)
>   at 
> java.io.ObjectInputStream.readObject(Ljava/lang/Class;)Ljava/lang/Object; 
> (ObjectInputStream.java:484)
>   at java.io.ObjectInputStream.readObject()Ljava/lang/Object; 
> (ObjectInputStream.java:451)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readFields(Ljava/lang/Object;Lorg/apache/ignite/internal/marshaller/optimized/OptimizedClassDescriptor$ClassFields;)V
>  (OptimizedObjectInputStream.java:519)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readSerializable(Ljava/lang/Class;Ljava/util/List;Ljava/lang/reflect/Method;Lorg/apache/ignite/internal/marshaller/optimized/OptimizedClassDescriptor$Fields;)Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:611)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.read(Lorg/apache/ignite/internal/marshaller/optimized/OptimizedObjectInputStream;)Ljava/lang/Object;
>  (OptimizedClassDescriptor.java:954)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:346)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:198)
>   at 
> java.io.ObjectInputStream.readObject(Ljava/lang/Class;)Ljava/lang/Object; 
> (ObjectInputStream.java:484)
>   at java.io.ObjectInputStream.readObject()Ljava/lang/Object; 
> (ObjectInputStream.java:451)
>   at 
> org.apache.ignite.internal.GridEventConsumeHandler$EventWrapper.readExternal(Ljava/io/ObjectInput;)V
>  (GridEventConsumeHandler.java:558)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readExternalizable(Ljava/lang/reflect/Constructor;Ljava/lang/reflect/Method;)Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:555)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.read(Lorg/apache/ignite/internal/marshaller/optimized/OptimizedObjectInputStream;)Ljava/lang/Object;
>  (OptimizedClassDescriptor.java:949)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:346)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:198)
>   at 
> java.io.ObjectInputStream.readObject(Ljava/lang/Class;)Ljava/lang/Object; 
> (ObjectInputStream.java:484)
>   at java.io.ObjectInputStream.readObject()Ljava/lang/Object; 
> (ObjectInputStream.java:451)
>   at 
> java.util.concurrent.ConcurrentLinkedDeque.readObject(Ljava/io/ObjectInputStream;)V
>  (ConcurrentLinkedDeque.java:1588)
>   at 
> jdk.internal.reflect.GeneratedMethodAccessor268.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (Unknown Source)
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (DelegatingMethodAccessorImpl.java:43)
>   at 
> java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
>  (Method.java:566)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readSerializable(Ljava/lang/Class;Ljava/util/List;Ljava/lang/reflect/Method;Lorg/apache/ignite/internal/marshaller/optimized/OptimizedClassDescriptor$Fields;)Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:604)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedClassDescriptor.read(Lorg/apache/ignite/internal/marshaller/optimized/OptimizedObjectInputStream;)Ljava/lang/Object;
>  (OptimizedClassDescriptor.java:954)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObject0()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:346)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedObjectInputStream.readObjectOverride()Ljava/lang/Object;
>  (OptimizedObjectInputStream.java:198)
>   at 
> java.io.ObjectInputStream.readObject(Ljava/lang/Class;)Ljava/lang/Object; 
> (ObjectInputStream.java:484)
>   at java.io.ObjectInputStream.readObject()Ljava/lang/Object; 
> (ObjectInputStream.java:451)
>   at 
> org.apache.ignite.internal.marshaller.optimized.OptimizedMarshaller.unmarshal0(Ljava/io/InputStream;Ljava/lang/ClassLoader;)Ljava/lang/Object;
>  (OptimizedMarshaller.java:228)
>   at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(Ljava/io/InputStream;Ljava/lang/ClassLoader;)Ljava/lang/Object;
>  (AbstractNodeNameAwareMarshaller.java:94)
>   at 
> org.apache.ignite.internal.binary.BinaryUtils.doReadOptimized(Lorg/apache/ignite/internal/binary/streams/BinaryInputStream;Lorg/apache/ignite/internal/binary/BinaryContext;Ljava/lang/ClassLoader;)Ljava/lang/Object;
>  (BinaryUtils.java:1762)
>   at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0()Ljava/lang/Object;
>  (BinaryReaderExImpl.java:1965)
>   at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize()Ljava/lang/Object;
>  (BinaryReaderExImpl.java:1717)
>   at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(I)Ljava/lang/Object;
>  (BinaryReaderExImpl.java:1985)
>   at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(Ljava/lang/Object;Lorg/apache/ignite/internal/binary/BinaryReaderExImpl;)V
>  (BinaryFieldAccessor.java:703)
>   at 
> org.apache.ignite.internal.binary.BinaryFieldAccessor.read(Ljava/lang/Object;Lorg/apache/ignite/internal/binary/BinaryReaderExImpl;)V
>  (BinaryFieldAccessor.java:188)
>   at 
> org.apache.ignite.internal.binary.BinaryClassDescriptor.read(Lorg/apache/ignite/internal/binary/BinaryReaderExImpl;)Ljava/lang/Object;
>  (BinaryClassDescriptor.java:875)
>   at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0()Ljava/lang/Object;
>  (BinaryReaderExImpl.java:1765)
>   at 
> org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize()Ljava/lang/Object;
>  (BinaryReaderExImpl.java:1717)
>   at 
> org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize([BLjava/lang/ClassLoader;)Ljava/lang/Object;
>  (GridBinaryMarshaller.java:313)
>   at 
> org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0([BLjava/lang/ClassLoader;)Ljava/lang/Object;
>  (BinaryMarshaller.java:102)
>   at 
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal([BLjava/lang/ClassLoader;)Ljava/lang/Object;
>  (AbstractNodeNameAwareMarshaller.java:82)
>   at 
> org.apache.ignite.internal.util.IgniteUtils.unmarshal(Lorg/apache/ignite/marshaller/Marshaller;[BLjava/lang/ClassLoader;)Ljava/lang/Object;
>  (IgniteUtils.java:10168)
>   at 
> org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$7.onMessage(Ljava/util/UUID;Ljava/lang/Object;B)V
>  (GridContinuousProcessor.java:269)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(Ljava/lang/Byte;Lorg/apache/ignite/internal/managers/communication/GridMessageListener;Ljava/util/UUID;Ljava/lang/Object;)V
>  (GridIoManager.java:1569)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(Lorg/apache/ignite/internal/managers/communication/GridIoMessage;Ljava/util/UUID;)V
>  (GridIoManager.java:1197)
>   at 
> org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(Lorg/apache/ignite/internal/managers/communication/GridIoManager;Lorg/apache/ignite/internal/managers/communication/GridIoMessage;Ljava/util/UUID;)V
>  (GridIoManager.java:127)
>   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run()V 
> (GridIoManager.java:1093)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V
>  (ThreadPoolExecutor.java:1128)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run()V 
> (ThreadPoolExecutor.java:628)
>   at java.lang.Thread.run()V (Thread.java:829)
> {code}
> Screenshot of sys threads stacktrace:
>  !image-2021-12-15-21-13-43-775.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
