[
https://issues.apache.org/jira/browse/IGNITE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099638#comment-17099638
]
Mikhail Petrov commented on IGNITE-12894:
-----------------------------------------
[~daradurvs],
Frankly, I don't understand what "resend request" means. Do you mean
the [closure
invocation|https://github.com/apache/ignite/blob/8cba313c9961b16e358834216e9992310f285985/modules/core/src/main/java/org/apache/ignite/internal/processors/service/GridServiceProxy.java#L186]?
If so, it is repeated in the busy-wait loop only if the topology snapshot
returned by IgniteServiceProcessor#serviceTopology is stale (the requested
service is not deployed on the node that was obtained from the topology). So, in
this case, we will "resend the request" until the full message is received and
the topology is updated.
At the moment I am only concerned about the case where a serviceProxy method is
invoked before the requested service's topology has been initialized on the
local node (the first full message with the topology for the requested service
has not been received yet, but the service is already registered).
I propose blocking the IgniteServiceProcessor#serviceTopology call in this case
until the full message is received, and only then returning the topology
snapshot to GridServiceProxy, which will determine the node that holds the
requested service. So in this case we will wait "for a change of
service topology on Ignite node without resending of request".
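To make the idea concrete, here is a minimal, self-contained sketch of the blocking behavior, assuming the topology is a map from node ID to instance count. TopologyRegistry and onFullMessage are hypothetical names used only for illustration. This is not the code from the PR and not Ignite's actual implementation.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the service processor (illustration only, not Ignite code). */
class TopologyRegistry {
    /** Per-service future that completes when the first full topology message arrives. */
    private final Map<String, CompletableFuture<Map<UUID, Integer>>> tops = new ConcurrentHashMap<>();

    /** Called when a full message with the topology of the given service is received. */
    void onFullMessage(String name, Map<UUID, Integer> top) {
        tops.compute(name, (k, fut) -> {
            // No waiter yet, or topology already initialized: store the new snapshot.
            if (fut == null || fut.isDone())
                return CompletableFuture.completedFuture(top);

            // Callers are blocked on the pending future: release them.
            fut.complete(top);

            return fut;
        });
    }

    /** Blocks until the topology of the service is initialized or the timeout expires. */
    Map<UUID, Integer> serviceTopology(String name, long timeoutMs)
        throws InterruptedException, TimeoutException {
        CompletableFuture<Map<UUID, Integer>> fut =
            tops.computeIfAbsent(name, k -> new CompletableFuture<>());

        try {
            return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
        }
        catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With this shape the proxy asks for the topology once and simply waits. It does not have to retry while the first full message is still in flight.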
Am I missing something?
Here is a draft PR: [1]. Could you please take a look?
[1] - [https://github.com/apache/ignite/pull/7771]
> Cannot use IgniteAtomicSequence in Ignite services
> --------------------------------------------------
>
> Key: IGNITE-12894
> URL: https://issues.apache.org/jira/browse/IGNITE-12894
> Project: Ignite
> Issue Type: Bug
> Components: compute
> Affects Versions: 2.8
> Reporter: Alexey Kukushkin
> Assignee: Mikhail Petrov
> Priority: Major
> Labels: sbcf
> Time Spent: 10m
> Remaining Estimate: 0h
>
> h2. Repro Steps
> Execute the below steps in default service deployment mode and in
> discovery-based service deployment mode.
> Use {{-DIGNITE_EVENT_DRIVEN_SERVICE_PROCESSOR_ENABLED=true}} JVM option to
> switch to the discovery-based service deployment mode.
> * Create a service that initializes an {{IgniteAtomicSequence}} in
> {{Service#init()}} and uses it in a business method.
> * Start an Ignite node with the service specified in the {{IgniteConfiguration}}.
> * Invoke the service's business method on the Ignite node.
> h3. Actual Result
> h4. In Default Service Deployment Mode
> Deadlock on the business method invocation
> h4. In Discovery-Based Service Deployment Mode
> The method invocation fails with {{IgniteException: Failed to find deployed
> service: IgniteTestService}}
> h2. Reproducer
> h3. Test.java
> {code:java}
> public interface Test {
>     String sayHello(String name);
> }
> {code}
> h3. IgniteTestService.java
> {code:java}
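> import org.apache.ignite.Ignite;
> import org.apache.ignite.IgniteAtomicSequence;
> import org.apache.ignite.resources.IgniteInstanceResource;
> import org.apache.ignite.services.Service;
> import org.apache.ignite.services.ServiceContext;
>
> public class IgniteTestService implements Test, Service {
>     private @IgniteInstanceResource Ignite ignite;
>
>     private IgniteAtomicSequence seq;
>
>     @Override public void cancel(ServiceContext ctx) {
>     }
>
>     @Override public void init(ServiceContext ctx) throws InterruptedException {
>         // Creates (or gets) the atomic sequence during service initialization.
>         seq = ignite.atomicSequence("TestSeq", 0, true);
>     }
>
>     @Override public void execute(ServiceContext ctx) {
>     }
>
>     @Override public String sayHello(String name) {
>         return "Hello, " + name + "! #" + seq.getAndIncrement();
>     }
> }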
> {code}
> h3. Reproducer.java
> {code:java}
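> import java.util.Collections;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.services.ServiceConfiguration;
> import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
> import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
>
> public class Reproducer {
>     public static void main(String[] args) {
>         IgniteConfiguration igniteCfg = new IgniteConfiguration()
>             .setServiceConfiguration(
>                 new ServiceConfiguration()
>                     .setName(IgniteTestService.class.getSimpleName())
>                     .setMaxPerNodeCount(1)
>                     .setTotalCount(0)
>                     .setService(new IgniteTestService())
>             )
>             .setDiscoverySpi(
>                 new TcpDiscoverySpi()
>                     .setIpFinder(new TcpDiscoveryVmIpFinder()
>                         .setAddresses(Collections.singleton("127.0.0.1:47500")))
>             );
>
>         try (Ignite ignite = Ignition.start(igniteCfg)) {
>             ignite.services()
>                 .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false)
>                 .sayHello("World");
>         }
>     }
> }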
> {code}
> h2. Workaround
> Specifying a service wait timeout solves the problem in the discovery-based
> service deployment mode (but not in the default deployment mode):
> {code:java}
> ignite.services()
>     .serviceProxy(IgniteTestService.class.getSimpleName(), Test.class, false, 1_000)
>     .sayHello("World");
> {code}
> This workaround cannot be used in Ignite.NET clients since .NET
> {{GetServiceProxy}} API does not support the service wait timeout, which is
> hard-coded to 0 on the server side.
> h2. Full Exception in Discovery-Based Service Deployment Mode
> {noformat}
> [01:08:54,653][SEVERE][services-deployment-worker-#52][IgniteServiceProcessor]
> Failed to initialize service (service will not be deployed):
> IgniteTestService
> class org.apache.ignite.IgniteInterruptedException: Got interrupted while
> waiting for future to complete.
> at
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:888)
> at
> org.apache.ignite.internal.util.IgniteUtils$3.apply(IgniteUtils.java:886)
> at
> org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1062)
> at
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3999)
> at
> org.apache.ignite.internal.IgniteKernal.atomicSequence(IgniteKernal.java:3985)
> at Sandbox.Net.IgniteTestService.init(IgniteTestService.java:17)
> at
> org.apache.ignite.internal.processors.service.IgniteServiceProcessor.redeploy(IgniteServiceProcessor.java:1188)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.lambda$processDeploymentActions$5(ServiceDeploymentTask.java:318)
> at java.base/java.util.HashMap.forEach(HashMap.java:1336)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.processDeploymentActions(ServiceDeploymentTask.java:302)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentTask.init(ServiceDeploymentTask.java:262)
> at
> org.apache.ignite.internal.processors.service.ServiceDeploymentManager$ServicesDeploymentWorker.body(ServiceDeploymentManager.java:475)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.base/java.lang.Thread.run(Thread.java:834)
> [01:08:54,712][SEVERE][exchange-worker-#42][GridDhtPartitionsExchangeFuture]
> Failed to reinitialize local partitions (rebalancing will be stopped):
> GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], discoEvt=DiscoveryCustomEvent
> [customMsg=DynamicCacheChangeBatch
> [id=17576957171-7ae549c8-423a-40b4-9865-c28a2f4b9dd9, reqs=ArrayList
> [DynamicCacheChangeRequest
> [cacheName=ignite-sys-atomic-cache@default-ds-group, hasCfg=true,
> nodeId=5fe32117-84ee-4f1f-9e19-86b85ef8c987, clientStartOnly=false,
> stop=false, destroy=false, disabledAfterStartfalse]],
> exchangeActions=ExchangeActions
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null,
> startGrps=[default-ds-group], stopGrps=[], resetParts=null,
> stateChangeRequest=null], startCaches=false],
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT]
> class org.apache.ignite.IgniteException: Failed to validate partitions state
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> ... 8 more
> Caused by: java.lang.InterruptedException
> at
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> ... 11 more
> [01:08:54,720][SEVERE][exchange-worker-#42][GridCachePartitionExchangeManager]
> Failed to wait for completion of partition map exchange (preloading will not
> start): GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryCustomEvent
> [customMsg=null, affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], crd=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false],
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], discoEvt=DiscoveryCustomEvent [customMsg=null,
> affTopVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> super=DiscoveryEvent [evtNode=TcpDiscoveryNode
> [id=5fe32117-84ee-4f1f-9e19-86b85ef8c987,
> consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.1.2,192.168.56.1:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.1.2, 192.168.56.1],
> sockAddrs=HashSet [kukushal-pc/172.22.44.97:47500, /0:0:0:0:0:0:0:1:47500,
> /127.0.0.1:47500, /192.168.56.1:47500, /192.168.1.2:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1586815734079, loc=true,
> ver=2.8.0#20200226-sha1:341b01df, isClient=false], topVer=1,
> nodeId8=5fe32117, msg=null, type=DISCOVERY_CUSTOM_EVT,
> tstamp=1586815734517]], nodeId=5fe32117, evt=DISCOVERY_CUSTOM_EVT],
> added=true, exchangeType=ALL, initFut=GridFutureAdapter
> [ignoreInterrupts=false, state=DONE, res=true, hash=429760908], init=false,
> lastVer=null, partReleaseFut=PartitionReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]], AtomicUpdateReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]],
> DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]], LocalTxReleaseFuture
> [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], futures=[]],
> AllTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1],
> futures=[RemoteTxReleaseFuture [topVer=AffinityTopologyVersion [topVer=1,
> minorTopVer=1], futures=[]]]]]], exchActions=ExchangeActions
> [startCaches=[ignite-sys-atomic-cache@default-ds-group], stopCaches=null,
> startGrps=[default-ds-group], stopGrps=[], resetParts=null,
> stateChangeRequest=null], affChangeMsg=null, centralizedAff=false,
> forceAffReassignment=false, exchangeLocE=null,
> cacheChangeFailureMsgSent=false, done=true, state=CRD,
> registerCachesFuture=GridFinishedFuture [resFlag=2], partitionsSent=false,
> partitionsReceived=false, delayedLatestMsg=null,
> afterLsnrCompleteFut=GridFutureAdapter [ignoreInterrupts=false, state=DONE,
> res=null, hash=583816633], timeBag=o.a.i.i.util.TimeBag@5ac0d023,
> startTime=1087079935840199, initTime=1586815734527, rebalanced=false,
> evtLatch=0, remaining=HashSet [], mergedJoinExchMsgs=null, awaitMergedMsgs=0,
> super=GridFutureAdapter [ignoreInterrupts=false, state=DONE, res=class
> o.a.i.IgniteException: Failed to validate partitions state, hash=1371010775]]
> class org.apache.ignite.IgniteCheckedException: Failed to validate partitions
> state
> at
> org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7509)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.resolve(GridFutureAdapter.java:260)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:209)
> at
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:160)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3200)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3021)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate
> partitions state
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3886)
> Caused by: java.lang.InterruptedException
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3577)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3485)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1610)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:891)
> at
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3172)
> ... 3 more
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11189)
> Caused by: class org.apache.ignite.IgniteException: Failed to validate
> partitions state
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11059)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11039)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.validatePartitionsState(GridDhtPartitionsExchangeFuture.java:3848)
> ... 8 more
> Caused by: java.lang.InterruptedException
> at
> java.base/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:418)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:190)
> at
> org.apache.ignite.internal.util.IgniteUtils$Batch.result(IgniteUtils.java:11313)
> at
> org.apache.ignite.internal.util.IgniteUtils.doInParallel(IgniteUtils.java:11179)
> ... 11 more
> Caused by: class
> org.apache.ignite.internal.IgniteInterruptedCheckedException: null
> Caused by: java.lang.InterruptedException
> [01:08:54] Ignite node stopped OK [uptime=00:00:00.219]
> Exception in thread "main" class org.apache.ignite.IgniteException: Failed to
> find deployed service: IgniteTestService
> at
> org.apache.ignite.internal.processors.service.GridServiceProxy.invokeMethod(GridServiceProxy.java:169)
> at
> org.apache.ignite.internal.processors.service.GridServiceProxy$ProxyInvocationHandler.invoke(GridServiceProxy.java:364)
> at com.sun.proxy.$Proxy25.sayHello(Unknown Source)
> at Sandbox.Net.Reproducer.main(Reproducer.java:29)
> {noformat}