[ https://issues.apache.org/jira/browse/IGNITE-12793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17062448#comment-17062448 ]
Sergey Kosarev commented on IGNITE-12793: ----------------------------------------- [~sergey-chugunov], actually I've used gidgain community-edition 8.7.10 > Deadlock in the System Pool on Metadata processing > -------------------------------------------------- > > Key: IGNITE-12793 > URL: https://issues.apache.org/jira/browse/IGNITE-12793 > Project: Ignite > Issue Type: Bug > Affects Versions: 2.7.6 > Reporter: Sergey Kosarev > Priority: Major > > I've recently tried to apply Ilya's idea > (https://issues.apache.org/jira/browse/IGNITE-12663) of minimizing thread > pools and tried to set system pool to 3 in my own tests. > It caused deadlock on a client node and I think it can happen not only on > such small pool values. > Details are following: > I'm not using persistence currently (if it matters). > On the client note I use ignite compute to call a job on every server node > (there are 3 server nodes in the tests). > Then I've found in logs: > {{[10:55:21] : [Step 1/1] [2020-03-13 10:55:21,773] {grid-timeout-worker-#8} > [WARN] [o.a.i.i.IgniteKernal] - Possible thread pool starvation detected (no > task completed in last 30000ms, is system thread pool size large enough?) > [10:55:21] : [Step 1/1] ^-- System thread pool [active=3, idle=0, qSize=9]}} > I see in threaddumps that all 3 system pool workers do the same - processing > of job responses: > {{ "sys-#26" #605 daemon prio=5 os_prio=0 tid=0x0000000064a0a800 nid=0x1f34 > waiting on condition [0x000000007b91d000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177) > at > org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.metadata(CacheObjectBinaryProcessorImpl.java:749) > at > org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.metadata(CacheObjectBinaryProcessorImpl.java:250) > at > org.apache.ignite.internal.binary.BinaryContext.metadata(BinaryContext.java:1169) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.getOrCreateSchema(BinaryReaderExImpl.java:2005) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:285) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.<init>(BinaryReaderExImpl.java:184) > at > org.apache.ignite.internal.binary.BinaryUtils.doReadObject(BinaryUtils.java:1797) > at > org.apache.ignite.internal.binary.BinaryUtils.deserializeOrUnmarshal(BinaryUtils.java:2160) > at > org.apache.ignite.internal.binary.BinaryUtils.doReadCollection(BinaryUtils.java:2091) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1914) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1714) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.readField(BinaryReaderExImpl.java:1982) > at > org.apache.ignite.internal.binary.BinaryFieldAccessor$DefaultFinalClassAccessor.read0(BinaryFieldAccessor.java:702) > at > org.apache.ignite.internal.binary.BinaryFieldAccessor.read(BinaryFieldAccessor.java:187) > at > org.apache.ignite.internal.binary.BinaryClassDescriptor.read(BinaryClassDescriptor.java:887) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1762) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1714) > at > org.apache.ignite.internal.binary.BinaryUtils.doReadObject(BinaryUtils.java:1797) > at > org.apache.ignite.internal.binary.BinaryUtils.deserializeOrUnmarshal(BinaryUtils.java:2160) > at > org.apache.ignite.internal.binary.BinaryUtils.doReadCollection(BinaryUtils.java:2091) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1914) > at > org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1714) > at > org.apache.ignite.internal.binary.GridBinaryMarshaller.deserialize(GridBinaryMarshaller.java:306) > at > org.apache.ignite.internal.binary.BinaryMarshaller.unmarshal0(BinaryMarshaller.java:100) > at > org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.unmarshal(AbstractNodeNameAwareMarshaller.java:80) > at > org.apache.ignite.internal.util.IgniteUtils.unmarshal(IgniteUtils.java:10493) > at > org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:828) > at > org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1134) > }} > As I found analyzing this stack trace, unmarshalling a user object the first > time(per type) causes Binary metadata request (despite I've registered this > type in BinaryConfiguration.setTypeConfiguration) > And all this futures will be completed after consequent > MetadataResponseMessage will be received and processed on the client node. > But MetadataResponseMessage(GridTopic.TOPIC_METADATA_REQ) is also processing > in system pool. (I see that method GridIoManager#processRegularMessage routes > it to the System Pool). > So it causes deadlock as the Sytem Pool is already full. -- This message was sent by Atlassian Jira (v8.3.4#803005)