[ 
https://issues.apache.org/jira/browse/IGNITE-17775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bessonov updated IGNITE-17775:
-----------------------------------
    Description: 
h3. TL;DR


Message serialization registry behavior is inconsistent: when a factory is not 
found, it throws either an AssertionError or a NetworkConfigurationException. 
It should throw only one of them. This will simplify debugging when someone 
forgets to register a factory in the registry, as happened in the problem 
below. There is no actual bug in messaging, and the mentioned exception cannot 
occur under normal circumstances.
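
A minimal sketch of the uniform behavior proposed above, assuming a simplified 
registry. SketchRegistry and MissingFactoryException are hypothetical names 
used only for illustration: this is not the code of 
MessageSerializationRegistryImpl, and the exception merely stands in for 
NetworkConfigurationException. The point is that a negative type and an 
unregistered type fail in the same way.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative registry; a sketch, not MessageSerializationRegistryImpl. */
class SketchRegistry {
    /** Hypothetical stand-in for NetworkConfigurationException. */
    static class MissingFactoryException extends RuntimeException {
        MissingFactoryException(String message) {
            super(message);
        }
    }

    private final Map<Integer, Supplier<Object>> factories = new HashMap<>();

    /** Registers a factory; this sketch assumes valid types are non-negative. */
    void register(short groupType, short messageType, Supplier<Object> factory) {
        if (groupType < 0 || messageType < 0) {
            throw new IllegalArgumentException("types must not be negative");
        }
        factories.put(key(groupType, messageType), factory);
    }

    /** Every failed lookup, negative or merely unregistered, throws the same exception type. */
    Supplier<Object> getFactory(short groupType, short messageType) {
        Supplier<Object> factory = (groupType < 0 || messageType < 0)
                ? null
                : factories.get(key(groupType, messageType));

        if (factory == null) {
            throw new MissingFactoryException(
                    "Factory not found: groupType=" + groupType + ", messageType=" + messageType);
        }
        return factory;
    }

    /** Packs two non-negative 16-bit types into one map key. */
    private static int key(short groupType, short messageType) {
        return (groupType << 16) | messageType;
    }
}
```

With a single exception type, a caller such as a decoder would need only one 
catch clause to log a missing factory, whatever the cause.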
h3. Original description

In some tests I observe deserialization errors for network messages and timeout 
exceptions while waiting for a response. In some cases the message has a 
negative group type, which causes this error:
{code:java}
java.lang.AssertionError: message type must not be negative, messageType=-5376
        at org.apache.ignite.network.MessageSerializationRegistryImpl.getFactory(MessageSerializationRegistryImpl.java:77)
        at org.apache.ignite.network.MessageSerializationRegistryImpl.createDeserializer(MessageSerializationRegistryImpl.java:102)
        at org.apache.ignite.internal.network.serialization.SerializationService.createDeserializer(SerializationService.java:68)
        at org.apache.ignite.internal.network.serialization.PerSessionSerializationService.createMessageDeserializer(PerSessionSerializationService.java:109)
        at org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:89)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:833)
{code}
When the group or message type is positive but does not exist, there should be 
a NetworkConfigurationException, but it is not displayed in the logs. It still 
causes TimeoutExceptions because messages are lost.

This reproduces in ItTablesApiTest#testGetTableFromLaggedNode at 
[https://github.com/gridgain/apache-ignite-3/tree/ignite-17523-2].

  was:
In some tests I observe deserialization errors for network messages and timeout 
exceptions while waiting for a response. In some cases the message has a 
negative group type, which causes this error:


{code:java}
java.lang.AssertionError: message type must not be negative, messageType=-5376
        at org.apache.ignite.network.MessageSerializationRegistryImpl.getFactory(MessageSerializationRegistryImpl.java:77)
        at org.apache.ignite.network.MessageSerializationRegistryImpl.createDeserializer(MessageSerializationRegistryImpl.java:102)
        at org.apache.ignite.internal.network.serialization.SerializationService.createDeserializer(SerializationService.java:68)
        at org.apache.ignite.internal.network.serialization.PerSessionSerializationService.createMessageDeserializer(PerSessionSerializationService.java:109)
        at org.apache.ignite.internal.network.netty.InboundDecoder.decode(InboundDecoder.java:89)
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:833)
{code}

When the group or message type is positive but does not exist, there should be 
a NetworkConfigurationException, but it is not displayed in the logs. It still 
causes TimeoutExceptions because messages are lost.

This reproduces in ItTablesApiTest#testGetTableFromLaggedNode at 
https://github.com/gridgain/apache-ignite-3/tree/ignite-17523-2.


> Invalid data in network buffers causes message deserialization errors and 
> messages loss
> ---------------------------------------------------------------------------------------
>
>                 Key: IGNITE-17775
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17775
>             Project: Ignite
>          Issue Type: Bug
>          Components: networking
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
