[
https://issues.apache.org/jira/browse/IGNITE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099326#comment-17099326
]
Vyacheslav Daradur commented on IGNITE-12894:
---------------------------------------------
[~PetrovMikhail], the suggestion you proposed makes sense, but it covers only
one case: the race between the request sender's state and the remote node that
has to invoke the service. It lets the sender resend the request.
There are several disadvantages:
- multiple requests are sent, whereas the case could be handled on the
remote node without resending requests
- an infinite loop is possible if deployment fails; we should consider the
case when deployment has failed and there is no sense in resending the request
I think we need a mechanism for waiting for a change of the service topology on
the Ignite node, without resending the request. This approach covers the case
from the task's description, where a statically configured service is called
before its deployment has finished on a single-node cluster. The issue with the
remote proxy may also be solved the same way.
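A minimal sketch of such a waiting mechanism is below. It is only an
illustration of the idea, not Ignite code: the class ServiceTopologyWaiter and
its methods onDeploymentFinished and awaitDeployed are hypothetical names, and
the real service processor tracks topology differently. The caller blocks on
the node until the service reaches a final state, gives up after a timeout, and
fails fast when deployment has failed, so no request is resent and no infinite
loop is possible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not part of the Ignite API): waits for a service deployment result. */
final class ServiceTopologyWaiter {
    /** Final deployment states tracked per service name. */
    private enum State { DEPLOYED, FAILED }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private final Map<String, State> states = new HashMap<>();

    /** Assumed to be called by the deployment process once a service is deployed or has failed. */
    void onDeploymentFinished(String name, boolean success) {
        lock.lock();
        try {
            states.put(name, success ? State.DEPLOYED : State.FAILED);
            changed.signalAll(); // Wake up every caller waiting for a topology change.
        }
        finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until the service is deployed or the timeout expires.
     *
     * @return true if deployed, false on timeout or interruption.
     * @throws IllegalStateException if deployment failed, so waiting longer makes no sense.
     */
    boolean awaitDeployed(String name, long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (true) {
                State state = states.get(name);
                if (state == State.DEPLOYED)
                    return true;
                if (state == State.FAILED)
                    throw new IllegalStateException("Service deployment failed: " + name);
                if (nanos <= 0)
                    return false;
                try {
                    nanos = changed.awaitNanos(nanos); // Returns the remaining wait time.
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // Preserve the interrupt flag.
                    return false;
                }
            }
        }
        finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ServiceTopologyWaiter waiter = new ServiceTopologyWaiter();
        // Simulate a deployment that completes on another thread.
        new Thread(() -> waiter.onDeploymentFinished("IgniteTestService", true)).start();
        System.out.println("deployed=" + waiter.awaitDeployed("IgniteTestService", 5_000));
    }
}
```

In this sketch a proxy invocation would call awaitDeployed before looking up the
local service instance, which is why a single request is enough.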
> Cannot use IgniteAtomicSequence in Ignite services
> --------------------------------------------------
>
> Key: IGNITE-12894
> URL: https://issues.apache.org/jira/browse/IGNITE-12894
> Project: Ignite
> Issue Type: Bug
> Components: compute
> Affects Versions: 2.8
> Reporter: Alexey Kukushkin
> Assignee: Mikhail Petrov
> Priority: Major
> Labels: sbcf
>
> h2. Repro Steps
> Execute the below steps in default service deployment mode and in
> discovery-based service deployment mode.
> Use {{-DIGNITE_EVENT_DRIVEN_SERVICE_PROCESSOR_ENABLED=true}} JVM option to
> switch to the discovery-based service deployment mode.
> * Create a service initializing an {{IgniteAtomicSequence}} in method
> {{Service#init()}} and using the {{IgniteAtomicSequence}} in a business method.
> * Start an Ignite node with the service specified in the IgniteConfiguration
> * Invoke the service's business method on the Ignite node
> h3. Actual Result
> h4. In Default Service Deployment Mode
> Deadlock on the business method invocation
> h4. In Discovery-Based Service Deployment Mode
> The method invocation fails with {{IgniteException: Failed to find deployed
> service: IgniteTestService}}
> h2. Reproducer
> h3. Test.java
> {code:java}
> public interface Test {
>     String sayHello(String name);
> }
> {code}
> h3. IgniteTestService.java
> {code:java}
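> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteAtomicSequence;
> import org.apache.ignite.resources.IgniteInstanceResource;
> import org.apache.ignite.services.Service;
> import org.apache.ignite.services.ServiceContext;
>
> public class IgniteTestService implements Test, Service {
>     /** Injected by Ignite when the service is deployed. */
>     private @IgniteInstanceResource Ignite ignite;
>
>     private IgniteAtomicSequence seq;
>
>     @Override public void cancel(ServiceContext ctx) {
>     }
>
>     // The sequence is created during service initialization.
>     @Override public void init(ServiceContext ctx) throws InterruptedException {
>         seq = ignite.atomicSequence("TestSeq", 0, true);
>     }
>
>     @Override public void execute(ServiceContext ctx) {
>     }
>
>     @Override public String sayHello(String name) {
>         return "Hello, " + name + "! #" + seq.getAndIncrement();
>     }
> }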
> {code}
> h3. Reproducer.java
> {code:java}
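> import java.util.Collections;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.services.ServiceConfiguration;
> import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
> import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
>
> public class Reproducer {
>     public static void main(String[] args) {
>         // The service is configured statically, so it is deployed on node start.
>         IgniteConfiguration igniteCfg = new IgniteConfiguration()
>             .setServiceConfiguration(
>                 new ServiceConfiguration()
>                     .setName(IgniteTestService.class.getSimpleName())
>                     .setMaxPerNodeCount(1)
>                     .setTotalCount(0)
>                     .setService(new IgniteTestService())
>             )
>             .setDiscoverySpi(
>                 new TcpDiscoverySpi()
>                     .setIpFinder(new TcpDiscoveryVmIpFinder()
>                         .setAddresses(Collections.singleton("127.0.0.1:47500")))
>             );
>
>         try (Ignite ignite = Ignition.start(igniteCfg)) {
>             // Invoke the business method right after the node has started.
>             ignite.services()
>                 .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false)
>                 .sayHello("World");
>         }
>     }
> }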
> {code}
> h2. Workaround
> Specifying a service wait timeout solves the problem in the discovery-based
> service deployment mode (but not in the default deployment mode):
> {code:java}
> ignite.services()
>     .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false, 1_000)
>     .sayHello("World");
> {code}
> This workaround cannot be used in Ignite.NET clients since .NET
> {{GetServiceProxy}} API does not support the service wait timeout, which is
> hard-coded to 0 on the server side.
> h2. Full Exception in Discovery-Based Service Deployment Mode
> {noformat}
> [01:08:54,653][SEVERE][services-deployment-worker-#52][IgniteServiceProcessor]
> Failed to initialize service (service will not be deployed):
> IgniteTestService
> class org.apache.ignite.IgniteInterruptedException: Got interrupted while
> waiting for future to complete.
> at
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:888)
> at
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:886)
> at
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1062)
> at
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3999)
> at
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3985)
> at Sandbox.Net.IgniteTestService.init(IgniteTestService.java:17)
> at
> org.apache.ignite.internal.processors.service.IgniteServiceProcessor.redeploy(IgniteServiceProcessor.java:1188)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.lambda$processDeploymentActions$5(ServiceDeploymentTask.java:318)
> at java.base/java.util.HashMap.forEach(HashMap.java:1336)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.processDeploymentActions(ServiceDeploymentTask.java:302)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.init(ServiceDeploymentTask.java:262)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentManager$ServicesDeploymentWorker.body(ServiceDeploymentManager.java:475)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.base/java.lang.Thread.run(Thread.java:834)
> [01:08:54,712][SEVERE][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
> Failed to reinitialize local partitions (rebalancing will be stopped):
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], discoEvt=DiscoveryCustomEvent
> [customMsg=DynamicCacheChangeBatch
> [id=17576957171-7ae549c8-423a-40b4-9865-c28a2f4b9dd9, reqs=ArrayList
> [DynamicCacheChangeRequest
> [cacheName=ignite-sys-atomic-cache@default-ds-group, hasCfg=true,
> nodeId=5fe32117-84ee-4f1f-9e19-86b85ef8c987, clientStartOnly=false,
> stop=false, destroy=false, disabledAfterStartfalse]],
> exchangeActions=ExchangeActions
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null,
> startGrps=[default-ds-group], stopGrps=[], resetParts=null,
> stateChangeRequest=null], startCaches=false],
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT]
> class org.apache.ignite.IgniteException: Failed to validate partitions state
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> ... 8 more
> Caused by: java.lang.InterruptedException
> at
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> ... 11 more
> [01:08:54,720][SEVERE][exchange-worker-#42][GridCachePartitionExchangeManager]
> Failed to wait for completion of partition map exchange (preloading will not
> start): GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent
> [customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], crd=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false],
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=null,
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT],
> added=true, exchangeType=ALL, initFut=GridFutureAdapter
> [ignoreInterrupts=false, state=DONE, res=true, hash=429760908], init=false,
> lastVer=null, partReleaseFut=PartitionReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]], AtomicUpdateReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]],
> DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]], LocalTxReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]],
> AllTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> futures=[RemoteTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]]]]]], exchActions=ExchangeActions
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null,
> startGrps=[default-ds-group], stopGrps=[], resetParts=null,
> stateChangeRequest=null], affChangeMsg=null, centralizedAff=false,
> forceAffReassignment=false, exchangeLocE=null,
> cacheChangeFailureMsgSent=false, done=true, state=CRD,
> registerCachesFuture=GridFinishedFuture [resFlag=2], partitionsSent=false,
> partitionsReceived=false, delayedLatestMsg=null,
> afterLsnrCompleteFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE,
> res=null, hash=583816633], timeBag=o.a.i.i.util.TimeBag@5ac0d023,
> startTime=1087079935840199, initTime=1586815734527, rebalanced=false,
> evtLatch=0, remaining=HashSet [], mergedJoinExchMsgs=null, awaitMergedMsgs=0,
> super=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=class
> o.a.i.IgniteException: Failed to validate partitions state, hash=1371010775]]
> class org.apache.ignite.IgniteCheckedException: Failed to validate partitions
> state
> at
> org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7509)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:260)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:209)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:160)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3200)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate
> partitions state
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> Caused by: java.lang.InterruptedException
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> ... 3 more
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate
> partitions state
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> ... 8 more
> Caused by: java.lang.InterruptedException
> at
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> ... 11 more
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> Caused by: java.lang.InterruptedException
> [01:08:54] Ignite node stopped OK [uptime=00:00:00.219]
> Exception in thread "main" class org.apache.ignite.IgniteException: Failed to
> find deployed service: IgniteTestService
> at
> org.apache.ignite.internal.processors.service.GridServiceProxy.invokeMethod(GridServiceProxy.java:169)
> at
> org.apache.ignite.internal.processors.service.GridServiceProxy$ProxyInvocationHandler.invoke(GridServiceProxy.java:364)
> at com.sun.proxy.$Proxy25.sayHello(Unknown Source)
> at Sandbox.Net.Reproducer.main(Reproducer.java:29)
> {noformat}