Hi! We have some problems with the service grid.
Configuration:
- Data nodes (cluster group): 2 nodes (CPU 8 / mem 32).
- Service nodes (cluster group): 3 nodes (CPU 8 / mem 32).

Data node configuration:

IgniteConfiguration cfg = new IgniteConfiguration();
final UUID uuid = UUID.randomUUID();

DataStorageConfiguration psCfg = new DataStorageConfiguration();
psCfg.setConcurrencyLevel(4);
psCfg.setPageSize(8192);

final DataRegionConfiguration dataRegionConfiguration = new DataRegionConfiguration();
dataRegionConfiguration.setMetricsEnabled(true);

DataRegionConfiguration regionCfg = new DataRegionConfiguration();
regionCfg.setName(format("RM-node-scope-[%s]-config", UUID.randomUUID()));
regionCfg.setMetricsEnabled(true);
regionCfg.setMaxSize(4L * 1024 * 1024 * 1024);
psCfg.setDataRegionConfigurations(regionCfg);

final LocalDeploymentSpi localDeploymentSpi = new LocalDeploymentSpi();
cfg.setDeploymentSpi(localDeploymentSpi);
cfg.setDataStorageConfiguration(psCfg);

final TcpDiscoverySpi spi = new TcpDiscoverySpi();
BasicAWSCredentials creds = new BasicAWSCredentials(s3AccessKey, s3SecretKey);
TcpDiscoveryS3IpFinder ipFinder = new TcpDiscoveryS3IpFinder();
ipFinder.setBucketEndpoint(s3Endpoint);
ipFinder.setAwsCredentials(creds);
ipFinder.setBucketName(BucketConfig);
ipFinder.setKeyPrefix(BucketKeyPrefixConfig);
spi.setIpFinder(ipFinder);
cfg.setDiscoverySpi(spi);

final ClientConnectorConfiguration clientConnectorConfiguration = new ClientConnectorConfiguration();
clientConnectorConfiguration.setPort(Port);
cfg.setClientConnectorConfiguration(clientConnectorConfiguration);

PriorityQueueCollisionSpi colSpi = new PriorityQueueCollisionSpi();
colSpi.setParallelJobsNumber(parallelJobsNumber);
cfg.setCollisionSpi(colSpi);

final HashMap<String, Boolean> dataNodeAttribute = new HashMap<>();
dataNodeAttribute.put(nodeAttribute, true);
cfg.setUserAttributes(dataNodeAttribute);

cfg.setMetricsLogFrequency(0);
cfg.setPublicThreadPoolSize(publicThreadPoolSize);
cfg.setSystemThreadPoolSize(systemThreadPoolSize);
cfg.setQueryThreadPoolSize(8);
cfg.setServiceThreadPoolSize(8);
cfg.setStripedPoolSize(8);
cfg.setDataStreamerThreadPoolSize(8);
cfg.setAsyncCallbackPoolSize(8);
cfg.setManagementThreadPoolSize(4);
cfg.setPeerClassLoadingThreadPoolSize(4);
cfg.setUtilityCachePoolSize(4);
cfg.setPeerClassLoadingEnabled(true);
cfg.setDeploymentMode(DeploymentMode.CONTINUOUS);
cfg.setIgniteInstanceName(format("data-node-[%s]", uuid));

Ignite igniteInst = Ignition.start(cfg);

Service node configuration:

IgniteConfiguration cfg = new IgniteConfiguration();

final HashMap<String, Boolean> nodeTypeAttribute = new HashMap<>();
nodeTypeAttribute.put(nodeAttribute, true);
nodeTypeAttribute.put("service.host", true);

cfg.setClientMode(true);
cfg.setUserAttributes(nodeTypeAttribute);
cfg.setPeerClassLoadingEnabled(true);
cfg.setMetricsLogFrequency(0);
cfg.setDeploymentMode(DeploymentMode.CONTINUOUS);

final TcpDiscoverySpi spi = new TcpDiscoverySpi();
BasicAWSCredentials creds = new BasicAWSCredentials(s3AccessKey, s3SecretKey);
TcpDiscoveryS3IpFinder ipFinder = new TcpDiscoveryS3IpFinder();
ipFinder.setBucketEndpoint(s3Endpoint);
ipFinder.setAwsCredentials(creds);
ipFinder.setBucketName(BucketConfig);
ipFinder.setKeyPrefix(BucketKeyPrefixConfig);
spi.setIpFinder(ipFinder);
cfg.setDiscoverySpi(spi);

cfg.setIgniteInstanceName(format("service-node-[%s]", UUID.randomUUID()));

Services:
- Loading information: 7 services.
- Updating information: 4 services.
- The rest: 12 services.
- Each service has its own node filter (NodeFilter), which implements IgnitePredicate<ClusterNode>.

Topology:
- 2019-09-30 11:59:32.648 INFO 1 --- [main] o.a.i.i.m.d.GridDiscoveryManager : Topology snapshot [ver=771, locNode=94792af7, servers=2, clients=25, state=ACTIVE, CPUs=100, offheap=21.0GB, heap=110.0GB]

Discovery service:
- TcpDiscoveryS3IpFinder

Example errors:

- 2019-09-30 10:37:29.347 [SEVERE] sn-[467] th-[77]: class org.apache.ignite.IgniteCheckedException: Query execution failed: GridCacheQueryBean [qry=GridCacheQueryAdapter [type=SCAN, clsName=null, clause=null, filter=ignitecore.services.api.limits.base.LimitServiceBase$9@7d1a8da, transform=null, part=null, incMeta=false, metrics=GridCacheQueryMetricsAdapter [minTime=9223372036854775807, maxTime=0, sumTime=0, avgTime=0.0, execs=0, completed=0, fails=0], pageSize=1024, timeout=0, incBackups=false, forceLocal=false, dedup=false, prj=null, keepBinary=true, subjId=c46ee299-589f-4272-a0b0-76ca40b91d20, taskHash=0, mvccSnapshot=null], rdc=null, trans=null]

- 2019-09-30 12:46:30.186 [SEVERE] sn-[19265] th-[98]: Remote job threw user exception (override or implement ComputeTask.result(..) method if you would like to have automatic failover for this exception): Job failed due to unexpected runtime exception [jobId=5ffde218d61-61abfb18-c74b-4d52-b9ab-551d84969d91, ses=GridJobSessionImpl [ses=GridTaskSessionImpl [taskName=place-orders-source-tasks, dep=GridDeployment [ts=1569830473589, depMode=CONTINUOUS, clsLdr=org.springframework.boot.loader.LaunchedURLClassLoader@254989ff, clsLdrId=5673c218d61-61abfb18-c74b-4d52-b9ab-551d84969d91, userVer=0, loc=true, sampleClsName=org.apache.ignite.internal.processors.cache.GridCacheAdapter$SizeTask, pendingUndeploy=false, undeployed=false, usage=2], taskClsName=ignitecore.services.spot.orders.tasks.SpotOrdersSourceTask, sesId=4ffde218d61-61abfb18-c74b-4d52-b9ab-551d84969d91, startTime=1569836790146, endTime=9223372036854775807, taskNodeId=61abfb18-c74b-4d52-b9ab-551d84969d91, clsLdr=org.springframework.boot.loader.LaunchedURLClassLoader@254989ff, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, internal=false, topPred=ContainsNodeIdsPredicate [], subjId=61abfb18-c74b-4d52-b9ab-551d84969d91, mapFut=IgniteFuture [orig=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null, hash=217837137]], execName=null], jobId=5ffde218d61-61abfb18-c74b-4d52-b9ab-551d84969d91], err=null]

- 2019-09-30 11:21:04.049 [SEVERE] sn-[295] th-[94]: Unknown task name or failed to auto-deploy task (was task (re|un)deployed?): mediaplan-channels-task

The main problem is that we cannot reliably reproduce these errors in a test environment.

Does anybody know how we can fix this?

Thank you!
Alex.