Hi Alexandr,

Thanks, I didn't know the metrics and the topology info were different. I found the issue: we were not adding the nodes to the baseline topology due to a bug. We're using v2.8.0 after the upgrade/migration; the previous version was 2.7.
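For anyone hitting the same symptom, the actual baseline (as opposed to the node metrics) can be inspected and corrected from the command line with Ignite's control.sh. A sketch; the consistent IDs and the topology version below are placeholders, not values from our cluster (the `ver=53` in your log excerpt is just used as an example):

```shell
# List the current cluster state, topology version, and which server
# nodes are inside/outside the baseline topology
./control.sh --baseline

# Add missing server nodes to the baseline by consistent ID
# (node1ConsistentId, node2ConsistentId are hypothetical)
./control.sh --baseline add node1ConsistentId,node2ConsistentId --yes

# Or, once all servers are online, reset the baseline to a topology version
./control.sh --baseline version 53 --yes
```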
Regards,
Courtney Robinson
Founder and CEO, Hypi
Tel: +44 208 123 2413 (GMT+0)
https://hypi.io

On Thu, Apr 30, 2020 at 11:36 PM Alexandr Shapkin <[email protected]> wrote:

> Hi,
>
> I believe that you need to find the following message:
>
> 2020-04-30 16:57:44.8141|INFO|Test|Topology snapshot [ver=53, locNode=26887aac, servers=3, clients=0, state=ACTIVE, CPUs=16, offheap=8.3GB, heap=15.0GB]
> 2020-04-30 16:57:44.8141|INFO|Test| ^-- Baseline [id=0, size=3, online=3, offline=0]
>
> Metrics don't tell you the actual topology and baseline snapshot.
> A node might be running but not included in the baseline; that might be the reason in your case.
>
> Also, what Ignite version do you use?
>
> From: Courtney Robinson <[email protected]>
> Sent: Thursday, April 30, 2020 7:58 PM
> To: [email protected]
> Subject: Re: Backups not being done for SQL caches
>
> Hi Ilya,
>
> Yes, we have persistence enabled in this cluster. This is also a change from our current production deployment, where we have our own CacheStore with read-through and write-through enabled. In this test cluster, Ignite's native persistence is being used without any external or custom CacheStore implementation.
> From the Ignite logs it says all 3 nodes are present:
>
> 2020-04-30 16:53:20.468 INFO 9 --- [orker-#23%hypi%] o.a.ignite.internal.IgniteKernal%hypi :
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
> ^-- Node [id=e0b6889f, name=hypi, uptime=19:15:06.473]
> ^-- H/N/C [hosts=3, nodes=3, CPUs=3]
> ^-- CPU [cur=-100%, avg=-100%, GC=0%]
> ^-- PageMemory [pages=975]
> ^-- Heap [used=781MB, free=92.37%, comm=4912MB]
> ^-- Off-heap [used=3MB, free=99.91%, comm=4296MB]
> ^-- sysMemPlc region [used=0MB, free=99.98%, comm=100MB]
> ^-- metastoreMemPlc region [used=0MB, free=99.95%, comm=0MB]
> ^-- TxLog region [used=0MB, free=100%, comm=100MB]
> ^-- hypi region [used=3MB, free=99.91%, comm=4096MB]
> ^-- Ignite persistence [used=3MB]
> ^-- sysMemPlc region [used=0MB]
> ^-- metastoreMemPlc region [used=0MB]
> ^-- TxLog region [used=0MB]
> ^-- hypi region [used=3MB]
> ^-- Outbound messages queue [size=0]
> ^-- Public thread pool [active=0, idle=0, qSize=0]
> ^-- System thread pool [active=0, idle=6, qSize=0]
>
> Regards,
> Courtney Robinson
> Founder and CEO, Hypi
> Tel: +44 208 123 2413 (GMT+0)
> https://hypi.io
>
> On Thu, Apr 30, 2020 at 3:12 PM Ilya Kasnacheev <[email protected]> wrote:
>
> Hello!
>
> Do you have persistence? If so, are you sure that all 3 of your nodes are in baseline topology?
>
> Regards,
> --
> Ilya Kasnacheev
>
> On Thu, Apr 30, 2020
at 16:09, Courtney Robinson <[email protected]> wrote:
>
> We're continuing migration from using the Java API to purely SQL and have encountered a situation on our development cluster where, even though ALL tables are created with backups=2, as in
>
> template=partitioned,backups=2,affinity_key=instanceId,atomicity=ATOMIC,cache_name=<some name here>
>
> in the logs, with 3 nodes in this test environment, we have:
>
> 2020-04-29 22:55:50.083 INFO 9 --- [orker-#40%hypi%] o.apache.ignite.internal.exchange.time : Started exchange init [topVer=AffinityTopologyVersion [topVer=27, minorTopVer=1], crd=true, evt=DISCOVERY_CUSTOM_EVT, evtNode=e0b6889f-219b-4686-ab52-725bfe7848b2, customEvt=DynamicCacheChangeBatch [id=a81a0e7c171-3f0fbbc0-b996-448c-98f7-119d7e485f04, reqs=ArrayList [DynamicCacheChangeRequest [cacheName=hypi_whatsapp_Item, hasCfg=true, nodeId=e0b6889f-219b-4686-ab52-725bfe7848b2, clientStartOnly=false, stop=false, destroy=false, disabledAfterStart=false]], exchangeActions=ExchangeActions [startCaches=[hypi_whatsapp_Item], stopCaches=null, startGrps=[hypi_whatsapp_Item], stopGrps=[], resetParts=null, stateChangeRequest=null], startCaches=false], allowMerge=false, exchangeFreeSwitch=false]
> 2020-04-29 22:55:50.280 INFO 9 --- [orker-#40%hypi%] o.a.i.i.p.cache.GridCacheProcessor : Started cache [name=hypi_whatsapp_Item, id=1391701259, dataRegionName=hypi, mode=PARTITIONED, atomicity=ATOMIC, backups=2, mvcc=false]
> 2020-04-29 22:55:50.289 INFO 9 --- [ sys-#648%hypi%] o.a.i.i.p.a.GridAffinityAssignmentCache : Local node affinity assignment distribution is not ideal [cache=hypi_whatsapp_Item, expectedPrimary=1024.00, actualPrimary=0, expectedBackups=2048.00, actualBackups=0, warningThreshold=50.00%]
> 2020-04-29 22:55:50.293 INFO 9 --- [orker-#40%hypi%] .c.d.d.p.GridDhtPartitionsExchangeFuture : Finished waiting for partition release future [topVer=AffinityTopologyVersion [topVer=27, minorTopVer=1],
waitTime=0ms, futInfo=NA, mode=DISTRIBUTED]
> 2020-04-29 22:55:50.330 INFO 9 --- [orker-#40%hypi%] .c.d.d.p.GridDhtPartitionsExchangeFuture : Finished waiting for partitions release latch: ServerLatch [permits=0, pendingAcks=HashSet [], super=CompletableLatch [id=CompletableLatchUid [id=exchange, topVer=AffinityTopologyVersion [topVer=27, minorTopVer=1]]]]
>
> You can see the line
>
> Local node affinity assignment distribution is not ideal
>
> but it's clear that backups=2 is there. To verify, I stopped 2 of the three nodes and, sure enough, I get the exception
>
> Failed to find data nodes for cache: InstanceMapping
>
> Is there some additional configuration needed for partitioned SQL caches to have the backups as configured?
> Until now we used the Java API with put/get and didn't have an issue with backups.
>
> Full exception below:
>
> org.apache.ignite.cache.CacheServerNotFoundException: Failed to find data nodes for cache: InstanceMapping
> at org.apache.ignite.internal.processors.query.h2.twostep.ReducePartitionMapper.stableDataNodes(ReducePartitionMapper.java:197)
> at org.apache.ignite.internal.processors.query.h2.twostep.ReducePartitionMapper.nodesForPartitions(ReducePartitionMapper.java:119)
> at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:466)
> at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$7.iterator(IgniteH2Indexing.java:1687)
> at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iter(QueryCursorImpl.java:106)
> at org.apache.ignite.internal.processors.cache.query.RegisteredQueryCursor.iter(RegisteredQueryCursor.java:66)
> at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:96)
> at io.hypi.arc.os.ignite.IgniteRepo.findInstanceCtx(IgniteRepo.java:140)
> at io.hypi.arc.os.handlers.BaseHandler.evaluateQuery(BaseHandler.java:70)
> at
io.hypi.arc.os.handlers.HttpHandler.runQuery(HttpHandler.java:141)
> at io.hypi.arc.os.handlers.HttpHandler.graphql(HttpHandler.java:135)
> at jdk.internal.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
> at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
> at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
> at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:892)
> at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797)
> at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
> at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1039)
> at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:942)
> at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1005)
> at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:908)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:660)
> at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:882)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at
org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at io.hypi.arc.os.config.CorsConfiguration$1.doFilter(CorsConfiguration.java:60)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at org.springframework.boot.actuate.web.trace.servlet.HttpTraceFilter.doFilterInternal(HttpTraceFilter.java:88)
> at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:118)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.filterAndRecordMetrics(WebMvcMetricsFilter.java:114)
> at org.springframework.boot.actuate.metrics.web.servlet.WebMvcMetricsFilter.doFilterInternal(WebMvcMetricsFilter.java:104)
> at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:118)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:200)
> at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:118)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
> at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
> at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:526)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
> at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408)
> at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
> at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:860)
> at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1587)
> at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> at java.base/java.lang.Thread.run(Thread.java:834)
>
> Regards,
> Courtney Robinson
> Founder and CEO, Hypi
> Tel: +44 208 123 2413 (GMT+0)
> https://hypi.io
