[
https://issues.apache.org/jira/browse/IGNITE-13093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120160#comment-17120160
]
Stanilovsky Evgeny commented on IGNITE-13093:
---------------------------------------------
Can you share the part of your code which uses Ignite?
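For example, how the IgniteDataStreamer instances are created, configured and closed per loader thread. A minimal sketch of that kind of loader, assuming plain key/value pairs and the PERSON.PTINTN cache named in the trace (illustrative only, not your actual code):

    import java.util.Map;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteDataStreamer;

    public class LoaderSketch {
        /** Streams pre-built key/value pairs into the PERSON.PTINTN cache, keyed by primary key. */
        static void load(Ignite ignite, Iterable<Map.Entry<Long, Object>> rows) {
            // One streamer per loader thread. allowOverwrite is false by default,
            // so existing keys are not overwritten (putIfAbsent-like behaviour).
            try (IgniteDataStreamer<Long, Object> streamer = ignite.dataStreamer("PERSON.PTINTN")) {
                for (Map.Entry<Long, Object> row : rows)
                    streamer.addData(row.getKey(), row.getValue());
                streamer.flush(); // push any remaining buffered entries before close()
            }
        }
    }

Judging by the second stack trace in the quoted log, the streamer was cancelled from DataStreamProcessor.onKernalStop() while StopNodeFailureHandler was stopping the node, which is why the later addData() calls fail with "Data streamer has been closed".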
> Unidentified Apache Ignite worker blocked when inserting large amount of
> records to the persistent storage
> ----------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-13093
> URL: https://issues.apache.org/jira/browse/IGNITE-13093
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 2.8.1
> Environment: Java 1.8.0_231
> Apache Ignite 2.8.1
> Windows 10, 64G memory
> Java settings
> -Xms1024m -Xmx50g -Xss1024m
> -Xverify:none
> -server
> -DIGNITE_QUIET=true
> -XX:+UseG1GC
> -XX:+DisableExplicitGC
> -Djava.net.preferIPv4Stack=true
> -XX:+AlwaysPreTouch
> -XX:+ScavengeBeforeFullGC
> -XX:+AggressiveOpts
> Reporter: Tomasz Grygo
> Priority: Blocker
> Attachments: 20_05_29__14_51.out.log, ignite.xml, log4j2.xml,
> thread_dump.txt
>
>
> I'm looking at Apache Ignite to use as a fast database. Performance is very
> important; I need to build it as fast as possible with the resources available.
> First I copy all 450 million records from my original test database into Ignite
> caches through IgniteDataStreamer, using the primary key as the cache key. The
> database does not fit in memory, so I have disk persistence enabled and eviction
> disabled. Data is inserted in parallel using 8 threads. I have only one, but
> fairly powerful, Windows PC doing all the work; there is no separate Ignite
> cluster. I'm not interested in cache recovery, so the WAL is disabled. Everything
> goes well until I hit around 310 million entries (2 hours of work). At this point
> Ignite starts to choke: inserts slow down and then stop with exceptions. The
> exception is triggered by the systemWorkerBlockedTimeout setting, which is set to
> 5 minutes; extending this timeout does not help at all. Based on a heap dump I
> tried adding -DIGNITE_PAGES_LIST_DISABLE_ONHEAP_CACHING=true; the job failed
> slightly later but still could not finish. I have read the performance guides and
> tried tweaking other Ignite settings too, but didn't see any impact. How can I
> find which worker is being blocked, and why?
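(For readers without the attached ignite.xml: the setup described above, with persistence on, eviction disabled, WAL off and systemWorkerBlockedTimeout at 5 minutes, roughly corresponds to a node configuration like the sketch below. The class name and the data region size are assumptions for illustration, not values taken from the report.)

    import org.apache.ignite.configuration.DataRegionConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.configuration.WALMode;

    public class ConfigSketch {
        static IgniteConfiguration configuration() {
            DataStorageConfiguration storage = new DataStorageConfiguration()
                .setWalMode(WALMode.NONE)                       // cache recovery not needed, WAL disabled
                .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
                    .setPersistenceEnabled(true)                // 450M records do not fit in memory
                    .setMaxSize(16L * 1024 * 1024 * 1024));     // assumed region size, not from the report

            return new IgniteConfiguration()
                .setDataStorageConfiguration(storage)
                .setSystemWorkerBlockedTimeout(5 * 60 * 1000L); // 5-minute blocked-worker timeout from the report
            // JVM option tried later in the report: -DIGNITE_PAGES_LIST_DISABLE_ONHEAP_CACHING=true
        }
    }

Eviction is simply left at its default here; with native persistence enabled, Ignite relies on page replacement rather than entry eviction.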
> 2020-05-27 21:54:26,176 [Storage2 ] [ERROR] - DTR_0030 worker Storage2 had error: FATAL ERROR java.lang.IllegalStateException: Data streamer has been closed.
> java.lang.IllegalStateException: Data streamer has been closed.
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.closedException(DataStreamerImpl.java:1095)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.lock(DataStreamerImpl.java:446)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addDataInternal(DataStreamerImpl.java:646)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addDataInternal(DataStreamerImpl.java:631)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addData(DataStreamerImpl.java:753)
>     at com.sc.extr.cache.PureIgniteDynamicRowStorage.putIfAbsent(PureIgniteDynamicRowStorage.java:83)
>     at com.sc.extr.cache.PureIgniteDynamicRowStorage.addRowOnKey(PureIgniteDynamicRowStorage.java:160)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.addRootRowToCache(MultiCacheTreeBuilder.java:409)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.parentRev1to1(MultiCacheTreeBuilder.java:237)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.addRowToCache(MultiCacheTreeBuilder.java:333)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.parentRev(MultiCacheTreeBuilder.java:274)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.addRow(MultiCacheTreeBuilder.java:379)
>     at com.sc.extr.tree.MultiCacheTreeBuilder.process(MultiCacheTreeBuilder.java:206)
>     at com.sc.bi.workflow.WorkTransformer.processOne(WorkTransformer.java:84)
>     at com.sc.bi.workflow.WorkTransformer.doWork(WorkTransformer.java:145)
>     at com.sc.bi.workflow.WorkTransformer.processQueue(WorkTransformer.java:210)
>     at com.sc.bi.workflow.WorkTransformer.run(WorkTransformer.java:169)
> Caused by: class org.apache.ignite.IgniteCheckedException: Data streamer has
> been cancelled: DataStreamerImpl [bufLdrSzPerThread=4096,
> rcvr=org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$IsolatedUpdater@381b03ed,
> ioPlcRslvr=null, cacheName=PERSON.PTINTN, bufSize=512, parallelOps=0,
> timeout=-1, autoFlushFreq=0, bufMappings=ConcurrentHashMap
> {03e74462-12ec-4140-b9fb-a975572ac3bb=Buffer [node=TcpDiscoveryNode
> [id=03e74462-12ec-4140-b9fb-a975572ac3bb,
> consistentId=b01eb38b-7728-4e43-a697-0bc52f872e44, addrs=ArrayList
> [127.0.0.1, 172.27.179.112], sockAddrs=HashSet
> [SOFTBI-DEV.sc.com/172.27.179.112:47500, /127.0.0.1:47500], discPort=47500,
> order=1, intOrder=1, lastExchangeTime=1590614830815, loc=true,
> ver=2.8.1#20200521-sha1:86422096, isClient=false], isLocNode=true, idGen=0,
> sem=java.util.concurrent.Semaphore@2a869d9[Permits = 64],
> perNodeParallelOps=64, entriesCnt=2048, locFutsSize=0, reqsSize=0]},
> cacheObjProc=GridProcessorAdapter [],
> cacheObjCtx=org.apache.ignite.internal.processors.cache.CacheObjectContext@2a5313b0,
> cancelled=true, cancellationReason=null, failCntr=0,
> activeFuts=GridConcurrentHashSet [GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=2102798044], GridFutureAdapter
> [ignoreInterrupts=false, state=INIT, res=null, hash=1195632760],
> GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null,
> hash=370791970], GridFutureAdapter [ignoreInterrupts=false, state=INIT,
> res=null, hash=420732031], GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=1453517070]], jobPda=null, depCls=null,
> fut=DataStreamerFuture [super=GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=1165180540]], publicFut=IgniteFuture
> [orig=DataStreamerFuture [super=GridFutureAdapter [ignoreInterrupts=false,
> state=INIT, res=null, hash=1165180540]]], disconnectErr=null, closed=true,
> lastFlushTime=1590629894701, skipStore=false, keepBinary=false,
> maxRemapCnt=32, remapSem=java.util.concurrent.Semaphore@6e6f060b[Permits =
> 2147483647], remapOwning=false]
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.closeEx(DataStreamerImpl.java:1347)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.closeEx(DataStreamerImpl.java:1318)
>     at org.apache.ignite.internal.processors.datastreamer.DataStreamProcessor.onKernalStop(DataStreamProcessor.java:155)
>     at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2551)
>     at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2499)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2650)
>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2613)
>     at org.apache.ignite.internal.IgnitionEx.stop(IgnitionEx.java:339)
>     at org.apache.ignite.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
>     at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian Jira
(v8.3.4#803005)