[ 
https://issues.apache.org/jira/browse/IGNITE-12398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161897#comment-17161897
 ] 

Aleksey Plekhanov commented on IGNITE-12398:
--------------------------------------------

[~ravimsc], can you please provide more information about your case? Where do 
you start Visor, inside AWS or outside? Do you set any additional properties 
in the Visor configuration ({{IgniteConfiguration.LocalHost}}, for example)?
I've checked the {{getAddress()}} method with the address patterns you 
provided, and it returns non-null values for them.
I did find a problem with daemon nodes (Visor uses a daemon node to join the 
cluster): they join the ring (like server nodes) instead of joining as 
clients. When a node joins the ring, {{IpFinder.registerAddresses}} is 
invoked, and if some of the addresses passed by Visor are unresolvable, an 
exception like yours can be thrown (though I can't see how that happens, 
since Ignite falls back to the IP address when the host is unresolvable). You 
can work around it: set {{ClientMode = true}} in the Visor configuration, and 
Visor will connect to the cluster as a client instead of joining the ring.
I don't think this ticket is a blocker (a workaround exists, and we can't 
reproduce the issue). I've set the priority to "critical" and targeted the 
ticket for the next release.
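
The workaround above amounts to one property in the Spring XML file that 
Visor is pointed at when it connects. A minimal sketch (the file name and 
bean id are illustrative, not from this ticket):

{noformat}
<!-- visor-config.xml: make Visor connect as a client instead of joining the ring -->
<bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Workaround: with clientMode=true the node does not join the ring,
         so IpFinder.registerAddresses is not invoked for Visor's addresses. -->
    <property name="clientMode" value="true"/>
</bean>
{noformat}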

> Apache Ignite Cluster(Amazon S3 Based Discovery) Nodes getting down if we 
> connect Ignite Visor Command Line Interface
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-12398
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12398
>             Project: Ignite
>          Issue Type: Bug
>          Components: aws, general, s3, visor
>    Affects Versions: 2.7
>         Environment: Production
>            Reporter: Ravi Kumar Powli
>            Assignee: Emmanouil Gkatziouras
>            Priority: Critical
>             Fix For: 2.10
>
>
> We have an Apache Ignite 3-node cluster set up with Amazon S3-based 
> discovery. If we connect to any one of the cluster nodes using the Ignite 
> Visor command-line interface, Visor hangs and all three cluster nodes go 
> down. Please find the exception stack trace below.
> {noformat}
> [SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] Critical system 
> error detected. Will be handled accordingly to configured handler 
> [hnd=NoOpFailureHandler [super=AbstractFailureHandler 
> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext 
> [type=SYSTEM_WORKER_TERMINATION, err=java.lang.NullPointerException]]
> java.lang.NullPointerException
> at 
> org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.key(TcpDiscoveryS3IpFinder.java:247)
> at 
> org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder.registerAddresses(TcpDiscoveryS3IpFinder.java:205)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:4616)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddedMessage(ServerImpl.java:4232)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2816)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2611)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:7188)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2700)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [10:36:54,600][SEVERE][tcp-disco-msg-worker-#2%DataStoreIgniteCache%][] 
> Critical system error detected. Will be handled accordingly to configured 
> handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler 
> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext 
> [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: GridWorker 
> [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, 
> finished=true, heartbeatTs=1574332614423]]]
> class org.apache.ignite.IgniteException: GridWorker 
> [name=tcp-disco-msg-worker, igniteInstanceName=DataStoreIgniteCache, 
> finished=true, heartbeatTs=1574332614423]
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
> at 
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
> at 
> org.apache.ignite.internal.worker.WorkersRegistry.onStopped(WorkersRegistry.java:169)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:153)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7119)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
> [10:36:59] Ignite node stopped OK [name=DataStoreIgniteCache, 
> uptime=00:01:13.934]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
