[
https://issues.apache.org/jira/browse/IGNITE-2659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yakov Zhdanov updated IGNITE-2659:
----------------------------------
Description:
During normal cluster operation we got the following error, which completely
killed all communication with this node:
{noformat}
Runtime error caught during grid runnable execution: GridWorker [name=grid-nio-worker-1, gridName=null, finished=false, isCancelled=false, hashCode=558690914, interrupted=false, runner=grid-nio-worker-1-#69%null%]
java.lang.AssertionError: null
    at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2$1.create(DirectByteBufferStreamImplV2.java:100) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2$1.create(DirectByteBufferStreamImplV2.java:98) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readArray(DirectByteBufferStreamImplV2.java:1337) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.direct.stream.v2.DirectByteBufferStreamImplV2.readByteArray(DirectByteBufferStreamImplV2.java:948) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.direct.DirectMessageReader.readByteArray(DirectMessageReader.java:173) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.managers.communication.GridIoMessage.readFrom(GridIoMessage.java:289) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridDirectParser.decode(GridDirectParser.java:76) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onMessageReceived(GridNioCodecFilter.java:104) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridConnectionBytesVerifyFilter.onMessageReceived(GridConnectionBytesVerifyFilter.java:123) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:107) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onMessageReceived(GridNioServer.java:2149) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioFilterChain.onMessageReceived(GridNioFilterChain.java:173) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processRead(GridNioServer.java:903) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeys(GridNioServer.java:1463) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1398) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1280) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) ~[ignite-core-1.5.0.final.jar:1.5.0.final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
{noformat}
Update 10 May 2017
I hit the same error and have the following suggestions:
# We cannot use assertions to validate data read from the network. All such
assertions should be removed and replaced with a runtime exception. When the
exception is thrown, it should be logged, and the connection should be closed
and then reopened (Ignite resends unacked messages automatically).
# In my case all NIO threads died, but the cluster did not kick the node out
(we need to add a test for this and fix it).
# We need to write some CRC after each message, e.g. a hash of all written
field hashes, types, and array lengths. If CRC validation fails on the
receiver, an exception should be thrown (see pt. 1) and the connection should
be restored.
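
Suggestions 1 and 3 could be sketched roughly as follows. This is only an illustration, not Ignite code: the class FramedMessageCodec, the MAX_PAYLOAD_LEN limit and the frame layout are hypothetical. It uses java.util.zip.CRC32 over the raw payload bytes rather than a hash of field hashes and types, and it assumes the whole frame is already in one buffer, whereas the real reader works incrementally. It also assumes that the failing assertion guards an array length read from the wire, which the readByteArray frames in the trace suggest but do not prove.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical framing helper for illustration; not part of Ignite. */
public final class FramedMessageCodec {
    /** Assumed upper bound for one payload; the real limit is a design decision. */
    static final int MAX_PAYLOAD_LEN = 16 * 1024 * 1024;

    private FramedMessageCodec() {
    }

    /** Encodes a payload as [int length][payload bytes][long CRC32 of payload]. */
    public static ByteBuffer encode(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);

        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length + Long.BYTES);
        buf.putInt(payload.length).put(payload).putLong(crc.getValue());
        buf.flip();

        return buf;
    }

    /**
     * Decodes one frame. Every value read from the wire is checked explicitly,
     * so corrupted input produces an IOException that the caller can log before
     * closing and reopening the connection, instead of an AssertionError that
     * kills the worker thread.
     */
    public static byte[] decode(ByteBuffer buf) throws IOException {
        if (buf.remaining() < Integer.BYTES)
            throw new IOException("Truncated frame: missing length prefix");

        int len = buf.getInt();

        // Explicit check instead of an assert: it also runs when assertions are disabled.
        if (len < 0 || len > MAX_PAYLOAD_LEN)
            throw new IOException("Invalid payload length: " + len);

        if (buf.remaining() < len + Long.BYTES)
            throw new IOException("Truncated frame: expected " + len + " payload bytes plus checksum");

        byte[] payload = new byte[len];
        buf.get(payload);

        long expected = buf.getLong();

        CRC32 crc = new CRC32();
        crc.update(payload, 0, len);

        if (crc.getValue() != expected)
            throw new IOException("Checksum mismatch: expected " + expected + ", got " + crc.getValue());

        return payload;
    }
}
```

A caller would treat the IOException as a signal to drop the connection and rely on the resend of unacked messages mentioned in pt. 1.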
> AssertionError in DirectByteBufferStreamImplV2
> ----------------------------------------------
>
> Key: IGNITE-2659
> URL: https://issues.apache.org/jira/browse/IGNITE-2659
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 1.5.0.final
> Environment: java version "1.8.0_60"
> Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
> Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
> Reporter: Avihai Berkovitz
> Priority: Critical