[ 
https://issues.apache.org/jira/browse/IGNITE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091560#comment-17091560
 ] 

Vyacheslav Daradur edited comment on IGNITE-12894 at 4/24/20, 1:16 PM:
-----------------------------------------------------------------------

Both this issue and IGNITE-12490 can be fixed by improving our 
deployment guarantees; see this thread for details: 
[dev-list-thread|http://apache-ignite-developers.2346864.n4.nabble.com/Discovery-based-services-deployment-guarantees-question-td44866.html]

The main idea is to allow 
[GridServiceProxy#randomNodeForService|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L283]
 to wait until the service deployment finishes when the service is already 
registered in the cluster but its deployment has not finished yet.

It can be achieved in the same manner as in our ["API with a timeout" 
here|https://github.com/apache/ignite/blob/8dcd0f1d96dae965a0f5c479e6d0f4b4d50c6e2c/modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java#L821]
 (mentioned as a workaround in the current issue description).

We need to add some conditions, something like this:
{code:java}
        IgniteUuid srvcUid = lookupRegisteredServiceId(name);

        // Service is not registered in the cluster: it wasn't present in
        // cfg and wasn't deployed through the API.
        if (srvcUid == null)
            return null;

        while (true) {
            ServiceInfo srvcDesc = registeredServices.get(srvcUid);

            if (srvcDesc == null) {
                if (timeout == 0)
                    return null;

                // Otherwise wait in case someone has sent the service to
                // deploy (as in the current implementation), then re-check.
                continue;
            }

            Map<UUID, Integer> top = srvcDesc.topologySnapshot();

            if (!top.isEmpty())
                return top;

            // Wait on "servicesTopsUpdateMux" until the deployment finishes
            // and the topology becomes non-empty, or until the service is
            // removed from "registeredServices" after a deployment failure.
        }
{code}
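
For illustration, the waiting step could look roughly like the sketch below. It is a standalone model built on plain JDK types, not Ignite code: the class {{TopologyWaitSketch}} and its methods {{register}}, {{finishDeployment}}, {{failDeployment}} and {{awaitTopology}} are hypothetical, a service name stands in for the {{IgniteUuid}} key, and a plain map stands in for {{ServiceInfo}}. Only the names {{registeredServices}} and {{servicesTopsUpdateMux}} are borrowed from the snippet above. As a simplification, a service that disappears from the map while we wait is treated as a failed deployment and yields null.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Sketch only: models the wait loop with plain JDK types, not Ignite internals. */
final class TopologyWaitSketch {
    /** Stand-in for "servicesTopsUpdateMux": the monitor guarding topology updates. */
    private final Object servicesTopsUpdateMux = new Object();

    /** Stand-in for "registeredServices": service name -> (node ID -> instance count). */
    private final Map<String, Map<UUID, Integer>> registeredServices = new HashMap<>();

    /** Registers a service whose deployment has not finished yet (empty topology). */
    void register(String name) {
        synchronized (servicesTopsUpdateMux) {
            registeredServices.put(name, Collections.emptyMap());
            servicesTopsUpdateMux.notifyAll();
        }
    }

    /** Publishes the topology of a finished deployment and wakes up waiters. */
    void finishDeployment(String name, Map<UUID, Integer> top) {
        synchronized (servicesTopsUpdateMux) {
            registeredServices.put(name, new HashMap<>(top));
            servicesTopsUpdateMux.notifyAll();
        }
    }

    /** Removes a service whose deployment failed and wakes up waiters. */
    void failDeployment(String name) {
        synchronized (servicesTopsUpdateMux) {
            registeredServices.remove(name);
            servicesTopsUpdateMux.notifyAll();
        }
    }

    /**
     * Returns the topology once it is non-empty, or null if the service is not
     * registered, was removed while waiting, or the timeout elapsed.
     */
    Map<UUID, Integer> awaitTopology(String name, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;

        synchronized (servicesTopsUpdateMux) {
            while (true) {
                Map<UUID, Integer> top = registeredServices.get(name);

                if (top == null)
                    return null; // Not registered, or removed after a failed deployment.

                if (!top.isEmpty())
                    return top;

                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;

                if (remainingMs <= 0)
                    return null; // Registered, but deployment did not finish in time.

                try {
                    // Releases the monitor until an update is published or time runs out.
                    servicesTopsUpdateMux.wait(remainingMs);
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();

                    return null;
                }
            }
        }
    }
}
```

With a zero timeout the sketch returns immediately, which mirrors the current behaviour; with a positive timeout a caller that arrives between registration and the end of deployment blocks instead of failing with "Failed to find deployed service".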



> Cannot use IgniteAtomicSequence in Ignite services
> --------------------------------------------------
>
>                 Key: IGNITE-12894
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12894
>             Project: Ignite
>          Issue Type: Bug
>          Components: compute
>    Affects Versions: 2.8
>            Reporter: Alexey Kukushkin
>            Assignee: Mikhail Petrov
>            Priority: Major
>              Labels: sbcf
>
> h2. Repro Steps
> Execute the below steps in default service deployment mode and in 
> discovery-based service deployment mode. 
>  Use {{-DIGNITE_EVENT_DRIVEN_SERVICE_PROCESSOR_ENABLED=true}} JVM option to 
> switch to the discovery-based service deployment mode.
>  * Create a service initializing an {{IgniteAtomicSequence}} in method 
> {{Service#init()}} and using the {{IgniteAtomicSequence}} in a business method.
>  * Start an Ignite node with the service specified in the IgniteConfiguration
>  * Invoke the service's business method on the Ignite node
> h3. Actual Result
> h4. In Default Service Deployment Mode
> Deadlock on the business method invocation
> h4. In Discovery-Based Service Deployment Mode
> The method invocation fails with {{IgniteException: Failed to find deployed 
> service: IgniteTestService}}
> h2. Reproducer
> h3. Test.java
> {code:java}
> public interface Test {
>     String sayHello(String name);
> }
> {code}
> h3. IgniteTestService.java
> {code:java}
> public class IgniteTestService implements Test, Service {
>     private @IgniteInstanceResource Ignite ignite;
>     private IgniteAtomicSequence seq;
>     @Override public void cancel(ServiceContext ctx) {
>     }
>     @Override public void init(ServiceContext ctx) throws InterruptedException {
>         seq = ignite.atomicSequence("TestSeq", 0, true);
>     }
>     @Override public void execute(ServiceContext ctx) {
>     }
>     @Override public String sayHello(String name) {
>         return "Hello, " + name + "! #" + seq.getAndIncrement();
>     }
> }
> {code}
> h3. Reproducer.java
> {code:java}
> public class Reproducer {
>     public static void main(String[] args) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration()
>             .setServiceConfiguration(
>                 new ServiceConfiguration()
>                     .setName(IgniteTestService.class.getSimpleName())
>                     .setMaxPerNodeCount(1)
>                     .setTotalCount(0)
>                     .setService(new IgniteTestService())
>             )
>             .setDiscoverySpi(
>                 new TcpDiscoverySpi()
>                     .setIpFinder(new TcpDiscoveryVmIpFinder()
>                         .setAddresses(Collections.singleton("127.0.0.1:47500")))
>             );
>         try (Ignite ignite = Ignition.start(igniteCfg)) {
>             ignite.services()
>                 .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false)
>                 .sayHello("World");
>         }
>     }
> }
> {code}
> h2. Workaround
> Specifying a service wait timeout solves the problem in the discovery-based 
> service deployment mode (but not in the default deployment mode):
> {code:java}
>             ignite.services()
>                 .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false, 1_000)
>                 .sayHello("World");
> {code}
> This workaround cannot be used in Ignite.NET clients since .NET 
> {{GetServiceProxy}} API does not support the service wait timeout, which is 
> hard-coded to 0 on the server side.
> h2. Full Exception in Discovery-Based Service Deployment Mode
> {noformat}
> [01:08:54,653][SEVERE][services-deployment-worker-#52][IgniteServiceProcessor]
>  Failed to initialize service (service will not be deployed): 
> IgniteTestService
> class org.apache.ignite.IgniteInterruptedException: Got interrupted while 
> waiting for future to complete.
>       at 
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:888)
>       at 
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:886)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1062)
>       at 
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3999)
>       at 
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3985)
>       at Sandbox.Net.IgniteTestService.init(IgniteTestService.java:17)
>       at 
> org.apache.ignite.internal.processors.service.IgniteServiceProcessor.redeploy(IgniteServiceProcessor.java:1188)
>       at 
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.lambda$processDeploymentActions$5(ServiceDeploymentTask.java:318)
>       at java.base/java.util.HashMap.forEach(HashMap.java:1336)
>       at 
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.processDeploymentActions(ServiceDeploymentTask.java:302)
>       at 
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.init(ServiceDeploymentTask.java:262)
>       at 
> org.apache.ignite.internal.processors.service.ServiceDeploymentManager$ServicesDeploymentWorker.body(ServiceDeploymentManager.java:475)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> [01:08:54,712][SEVERE][exchange-worker-#42][GridDhtPartitionsExchangeFuture] 
> Failed to reinitialize local partitions (rebalancing will be stopped): 
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, 
> minorTopVer=1], discoEvt=DiscoveryCustomEvent 
> [customMsg=DynamicCacheChangeBatch 
> [id=17576957171-7ae549c8-423a-40b4-9865-c28a2f4b9dd9, reqs=ArrayList 
> [DynamicCacheChangeRequest 
> [cacheName=ignite-sys-atomic-cache@default-ds-group, hasCfg=true, 
> nodeId=5fe32117-84ee-4f1f-9e19-86b85ef8c987, clientStartOnly=false, 
> stop=false, destroy=false, disabledAfterStartfalse]], 
> exchangeActions=ExchangeActions 
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null, 
> startGrps=[default-ds-group], stopGrps=[], resetParts=null, 
> stateChangeRequest=null], startCaches=false], 
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, 
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, 
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], 
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, 
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, 
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, 
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, 
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, 
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT]
> class org.apache.ignite.IgniteException: Failed to validate partitions state
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
>       at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
>       at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>       at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class 
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
>       ... 8 more
> Caused by: java.lang.InterruptedException
>       at 
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
>       at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
>       at 
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
>       ... 11 more
> [01:08:54,720][SEVERE][exchange-worker-#42][GridCachePartitionExchangeManager]
>  Failed to wait for completion of partition map exchange (preloading will not 
> start): GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent 
> [customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, 
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, 
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], 
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, 
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, 
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, 
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, 
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, 
> tstamp=1586815734517]], crd=TcpDiscoveryNode 
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, 
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, 
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], 
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, 
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, 
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, 
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], 
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1, 
> minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=null, 
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode 
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987, 
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500, 
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1], 
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500, 
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500, 
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true, 
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1, 
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT, 
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT], 
> added=true, exchangeType=ALL, initFut=GridFutureAdapter 
> [ignoreInterrupts=false, state=DONE, res=true, hash=429760908], init=false, 
> lastVer=null, partReleaseFut=PartitionReleaseFuture 
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
> futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, 
> minorTopVer=1], futures=[]], AtomicUpdateReleaseFuture 
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], 
> DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, 
> minorTopVer=1], futures=[]], LocalTxReleaseFuture 
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]], 
> AllTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], 
> futures=[RemoteTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, 
> minorTopVer=1], futures=[]]]]]], exchActions=ExchangeActions 
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null, 
> startGrps=[default-ds-group], stopGrps=[], resetParts=null, 
> stateChangeRequest=null], affChangeMsg=null, centralizedAff=false, 
> forceAffReassignment=false, exchangeLocE=null, 
> cacheChangeFailureMsgSent=false, done=true, state=CRD, 
> registerCachesFuture=GridFinishedFuture [resFlag=2], partitionsSent=false, 
> partitionsReceived=false, delayedLatestMsg=null, 
> afterLsnrCompleteFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE, 
> res=null, hash=583816633], timeBag=o.a.i.i.util.TimeBag@5ac0d023, 
> startTime=1087079935840199, initTime=1586815734527, rebalanced=false, 
> evtLatch=0, remaining=HashSet [], mergedJoinExchMsgs=null, awaitMergedMsgs=0, 
> super=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=class 
> o.a.i.IgniteException: Failed to validate partitions state, hash=1371010775]]
> class org.apache.ignite.IgniteCheckedException: Failed to validate partitions 
> state
>       at 
> org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7509)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:260)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:209)
>       at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:160)
>       at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3200)
>       at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> Caused by: class 
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
>       at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate 
> partitions state
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> Caused by: java.lang.InterruptedException
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
>       at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
>       ... 3 more
> Caused by: class 
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate 
> partitions state
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
>       ... 8 more
> Caused by: java.lang.InterruptedException
>       at 
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
>       at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
>       at 
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
>       ... 11 more
> Caused by: class 
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> Caused by: java.lang.InterruptedException
> [01:08:54] Ignite node stopped OK [uptime=00:00:00.219]
> Exception in thread "main" class org.apache.ignite.IgniteException: Failed to 
> find deployed service: IgniteTestService
>       at 
> org.apache.ignite.internal.processors.service.GridServiceProxy.invokeMethod(GridServiceProxy.java:169)
>       at 
> org.apache.ignite.internal.processors.service.GridServiceProxy$ProxyInvocationHandler.invoke(GridServiceProxy.java:364)
>       at com.sun.proxy.$Proxy25.sayHello(Unknown Source)
>       at Sandbox.Net.Reproducer.main(Reproducer.java:29)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
