[ 
https://issues.apache.org/jira/browse/IGNITE-9354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Turik Campbell reassigned IGNITE-9354:
--------------------------------------

    Assignee: Turik Campbell

> HelloWorldGAExample hangs forever with additional nodes in topology
> -------------------------------------------------------------------
>
>                 Key: IGNITE-9354
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9354
>             Project: Ignite
>          Issue Type: Bug
>          Components: ml
>    Affects Versions: 2.6
>            Reporter: Alex Volkov
>            Assignee: Turik Campbell
>            Priority: Major
>         Attachments: log.zip
>
>
> To reproduce this issue please follow these steps:
> 1. Start two nodes using the ignite.sh script.
> For example:
> {code:java}
> bin/ignite.sh examples/config/example-ignite.xml -J-Xmx1g -J-Xms1g -J-DCONSISTENT_ID=node1 -J-DIGNITE_QUIET=false
> {code}
> 2. Run HelloWorldGAExample from IntelliJ IDEA.
> *Expected result:*
> The example runs and completes successfully.
> *Actual result:*
> The example log contains many NullPointerExceptions:
> {code:java}
> [2018-08-23 09:49:25,029][ERROR][pub-#19][GridJobWorker] Failed to execute 
> job due to unexpected runtime exception 
> [jobId=c296b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> ses=GridJobSessionImpl [ses=GridTaskSessionImpl 
> [taskName=o.a.i.ml.genetic.FitnessTask, dep=GridDeployment [ts=1535006960878, 
> depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, 
> clsLdrId=8d16b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, userVer=0, 
> loc=true, 
> sampleClsName=o.a.i.i.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap,
>  pendingUndeploy=false, undeployed=false, usage=2], 
> taskClsName=o.a.i.ml.genetic.FitnessTask, 
> sesId=b196b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> startTime=1535006964236, endTime=9223372036854775807, 
> taskNodeId=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null, 
> failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=false, 
> topPred=o.a.i.i.cluster.ClusterGroupAdapter$AttributeFilter@2d746ce4, 
> subjId=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, mapFut=GridFutureAdapter 
> [ignoreInterrupts=false, state=INIT, res=null, hash=679592043]IgniteFuture 
> [orig=], execName=null], 
> jobId=c296b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1], err=null]
> java.lang.NullPointerException
> at org.apache.ignite.ml.genetic.FitnessJob.execute(FitnessJob.java:76)
> at org.apache.ignite.ml.genetic.FitnessJob.execute(FitnessJob.java:35)
> at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
> at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6749)
> at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
> at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:491)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The example then hangs after logging the following failure:
> {code:java}
> [2018-08-23 09:49:35,229][WARN ][pub-#17][AlwaysFailoverSpi] Received 
> topology with only nodes that job had failed on (forced to fail) 
> [failedNodes=[eac48ea7-da79-453a-a94c-291039c5cc15, 
> 0907d876-e0ce-4fda-966d-ad91a03f9722, e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1]]
> class org.apache.ignite.cluster.ClusterTopologyException: Failed to failover 
> a job to another node (failover SPI returned null) 
> [job=org.apache.ignite.ml.genetic.FitnessJob@35f8a9d3, node=TcpDiscoveryNode 
> [id=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, addrs=ArrayList [0:0:0:0:0:0:0:1, 
> 127.0.0.1, 172.25.4.42, 172.25.4.92], sockAddrs=HashSet [/172.25.4.42:47502, 
> /172.25.4.92:47502, /0:0:0:0:0:0:0:1:47502, /127.0.0.1:47502], 
> discPort=47502, order=3, intOrder=3, lastExchangeTime=1535006974981, 
> loc=true, ver=2.7.0#19700101-sha1:00000000, isClient=false]]
> at org.apache.ignite.internal.util.IgniteUtils$7.apply(IgniteUtils.java:853)
> at org.apache.ignite.internal.util.IgniteUtils$7.apply(IgniteUtils.java:851)
> at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:985)
> at org.apache.ignite.internal.IgniteComputeImpl.execute(IgniteComputeImpl.java:541)
> at org.apache.ignite.ml.genetic.GAGrid.calculateFitness(GAGrid.java:102)
> at org.apache.ignite.ml.genetic.GAGrid.evolve(GAGrid.java:171)
> at org.apache.ignite.examples.ml.genetic.helloworld.HelloWorldGAExample.main(HelloWorldGAExample.java:90)
> Caused by: class 
> org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to 
> failover a job to another node (failover SPI returned null) 
> [job=org.apache.ignite.ml.genetic.FitnessJob@35f8a9d3, node=TcpDiscoveryNode 
> [id=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, addrs=ArrayList [0:0:0:0:0:0:0:1, 
> 127.0.0.1, 172.25.4.42, 172.25.4.92], sockAddrs=HashSet [/172.25.4.42:47502, 
> /172.25.4.92:47502, /0:0:0:0:0:0:0:1:47502, /127.0.0.1:47502], 
> discPort=47502, order=3, intOrder=3, lastExchangeTime=1535006974981, 
> loc=true, ver=2.7.0#19700101-sha1:00000000, isClient=false]]
> at org.apache.ignite.internal.processors.task.GridTaskWorker.checkTargetNode(GridTaskWorker.java:1235)
> at org.apache.ignite.internal.processors.task.GridTaskWorker.failover(GridTaskWorker.java:1203)
> at org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:938)
> at org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1077)
> at org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:931)
> at org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:779)
> at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:631)
> at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:491)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: class org.apache.ignite.compute.ComputeUserUndeclaredException: 
> Failed to execute job due to unexpected runtime exception 
> [jobId=f296b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> ses=GridJobSessionImpl [ses=GridTaskSessionImpl 
> [taskName=org.apache.ignite.ml.genetic.FitnessTask, dep=GridDeployment 
> [ts=1535006960878, depMode=SHARED, 
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, 
> clsLdrId=8d16b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, userVer=0, 
> loc=true, 
> sampleClsName=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap,
>  pendingUndeploy=false, undeployed=false, usage=2], 
> taskClsName=org.apache.ignite.ml.genetic.FitnessTask, 
> sesId=b196b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> startTime=1535006964236, endTime=9223372036854775807, 
> taskNodeId=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, 
> clsLdr=sun.misc.Launcher$AppClassLoader@18b4aac2, closed=false, cpSpi=null, 
> failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=false, 
> topPred=org.apache.ignite.internal.cluster.ClusterGroupAdapter$AttributeFilter@2d746ce4,
>  subjId=e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1, mapFut=GridFutureAdapter 
> [ignoreInterrupts=false, state=INIT, res=null, hash=1579959210]IgniteFuture 
> [orig=], execName=null], 
> jobId=f296b856561-e5eca24b-6f5a-4d3e-9e9e-94ad404b44d1], err=null]
> at org.apache.ignite.internal.processors.job.GridJobWorker.handleThrowable(GridJobWorker.java:689)
> at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:621)
> ... 5 more
> Caused by: java.lang.NullPointerException
> at org.apache.ignite.ml.genetic.FitnessJob.execute(FitnessJob.java:76)
> at org.apache.ignite.ml.genetic.FitnessJob.execute(FitnessJob.java:35)
> at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
> at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6749)
> at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
> ... 5 more
> {code}
> Please let me know if you need the full node and example logs.
>  
>  
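
For readers of the trace above: the AlwaysFailoverSpi warning ("Received topology with only nodes that job had failed on") suggests that the job was retried on each node in turn until none was left. The sketch below is a simplified, hypothetical model of that selection step, not Ignite's actual implementation; the class FailoverSketch and the method pickNode are illustrative names, and the node IDs are abbreviated from the failedNodes list in the log. It assumes the job throws on every node, as FitnessJob does here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified, hypothetical model of "always failover" node selection; not Ignite code. */
final class FailoverSketch {
    /**
     * Returns the first node in the topology that the job has not yet failed on,
     * or null when the job has already failed on every node.
     */
    static String pickNode(List<String> topology, Set<String> failedNodes) {
        for (String node : topology) {
            if (!failedNodes.contains(node))
                return node;
        }
        return null; // Nothing left to try: the caller has to fail the job.
    }

    public static void main(String[] args) {
        // Abbreviated node IDs taken from the failedNodes list in the log.
        List<String> topology = Arrays.asList("eac48ea7", "0907d876", "e5eca24b");
        Set<String> failed = new HashSet<>();

        String node;
        while ((node = pickNode(topology, failed)) != null) {
            // Assumption: the job throws on every node, so each attempt fails.
            failed.add(node);
        }

        System.out.println("Failover returned null after " + failed.size() + " failed nodes");
    }
}
```

Under this model, adding nodes to the topology only adds retries when the job itself fails deterministically. The sketch does not explain why the example hangs rather than exiting once the exception reaches GAGrid.calculateFitness.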



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)