[
https://issues.apache.org/jira/browse/IGNITE-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Bessonov updated IGNITE-17775:
-----------------------------------
Description:
h3. TL;DR
Message serialization registry behavior is inconsistent, it either throws an
AssertionError or NetworkConfigurationException if factory is not found. There
should be only one. This will simplify debugging situations where one forgot to
register a factory in the registry, as it's the case in the problem below.
There's no actual bug in messaging and mentioned exception is impossible to get
in normal circumstances.
h3. Original description
In some tests I observe network messages' deserialization errors and timeout
exceptions while waiting for response. In some cases there is negative group
type of the message, and this causes error:
{code:java}
java.lang.AssertionError: message type must not be negative, messageType=-5376
at
org.apache.ignite.network.MessageSerializationRegistryImpl.getFactory(MessageSerializationRegistryImpl.java:77)
at
org.apache.ignite.network.MessageSerializationRegistryImpl.createDeserializer(MessageSerializationRegistryImpl.java:102)
at
org.apache.ignite.internal.network.serialization.SerializationService.createDeserializer(SerializationService.java:68)
at
org.apache.ignite.internal.network.serialization.PerSessionSerializationService.createMessageDeserializer(PerSessionSerializationService.java:109)
at
org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:89)
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
{code}
When the group or message type is positive but not existing, there should be a
NetworkConfigurationException but it's not displayed in logs, however, it
causes TimeoutExceptions because of messages loss.
This reproduces in
[https://github.com/gridgain/apache-ignite-3/tree/ignite-17523-2] in
ItTablesApiTest#testGetTableFromLaggedNode
was:
In some tests I observe network messages' deserialization errors and timeout
exceptions while waiting for response. In some cases there is negative group
type of the message, and this causes error:
{code:java}
java.lang.AssertionError: message type must not be negative, messageType=-5376
at
org.apache.ignite.network.MessageSerializationRegistryImpl.getFactory(MessageSerializationRegistryImpl.java:77)
at
org.apache.ignite.network.MessageSerializationRegistryImpl.createDeserializer(MessageSerializationRegistryImpl.java:102)
at
org.apache.ignite.internal.network.serialization.SerializationService.createDeserializer(SerializationService.java:68)
at
org.apache.ignite.internal.network.serialization.PerSessionSerializationService.createMessageDeserializer(PerSessionSerializationService.java:109)
at
org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:89)
at
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
at
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
at
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833)
{code}
When the group or message type is positive but not existing, there should be a
NetworkConfigurationException but it's not displayed in logs, however, it
causes TimeoutExceptions because of messages loss.
This reproduces in
https://github.com/gridgain/apache-ignite-3/tree/ignite-17523-2 in
ItTablesApiTest#testGetTableFromLaggedNode
> Invalid data in network buffers causes message deserialization errors and
> messages loss
> ---------------------------------------------------------------------------------------
>
> Key: IGNITE-17775
> URL: https://issues.apache.org/jira/browse/IGNITE-17775
> Project: Ignite
> Issue Type: Bug
> Components: networking
> Reporter: Denis Chudov
> Priority: Major
> Labels: ignite-3
>
> h3. TL;DR
> Message serialization registry behavior is inconsistent, it either throws an
> AssertionError or NetworkConfigurationException if factory is not found.
> There should be only one. This will simplify debugging situations where one
> forgot to register a factory in the registry, as it's the case in the problem
> below. There's no actual bug in messaging and mentioned exception is
> impossible to get in normal circumstances.
> h3. Original description
> In some tests I observe network messages' deserialization errors and timeout
> exceptions while waiting for response. In some cases there is negative group
> type of the message, and this causes error:
> {code:java}
> java.lang.AssertionError: message type must not be negative, messageType=-5376
> at
> org.apache.ignite.network.MessageSerializationRegistryImpl.getFactory(MessageSerializationRegistryImpl.java:77)
> at
> org.apache.ignite.network.MessageSerializationRegistryImpl.createDeserializer(MessageSerializationRegistryImpl.java:102)
> at
> org.apache.ignite.internal.network.serialization.SerializationService.createDeserializer(SerializationService.java:68)
> at
> org.apache.ignite.internal.network.serialization.PerSessionSerializationService.createMessageDeserializer(PerSessionSerializationService.java:109)
> at
> org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:89)
> at
> io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
> at
> io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
> at
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
> at
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
> at
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
> at
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
> at
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
> at
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
> at
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
> at
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
> at
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
> {code}
> When the group or message type is positive but not existing, there should be
> a NetworkConfigurationException but it's not displayed in logs, however, it
> causes TimeoutExceptions because of messages loss.
> This reproduces in
> [https://github.com/gridgain/apache-ignite-3/tree/ignite-17523-2] in
> ItTablesApiTest#testGetTableFromLaggedNode
--
This message was sent by Atlassian Jira
(v8.20.10#820010)