Hi, write throttling is disabled in the Ignite config, as mentioned in the previous reply. What do you suggest to make Ignite's checkpointing keep up with the SSD limits?
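For context, this is roughly where the settings discussed in this thread live; a minimal sketch using the public DataStorageConfiguration API, with placeholder values rather than the actual configuration of this cluster:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class StorageConfigSketch {
        public static void main(String[] args) {
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();

            // Write throttling disabled, as described above.
            storageCfg.setWriteThrottlingEnabled(false);

            // Knobs commonly looked at when checkpoints cannot keep up with the
            // ingest rate (values here are placeholders, not recommendations).
            storageCfg.setCheckpointFrequency(180_000L);               // time-based checkpoint every 3 minutes
            storageCfg.setMaxWalArchiveSize(4L * 1024 * 1024 * 1024);  // cap WAL archive growth at 4 GB
            storageCfg.getDefaultDataRegionConfiguration()
                      .setPersistenceEnabled(true);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg);

            Ignition.start(cfg);
        }
    }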
On Mon, 12 Sept 2022, 11:32 Zhenya Stanilovsky via user <[email protected]> wrote:

> > We have observed one interesting issue with checkpointing. We are using 64 GB RAM / 12 CPU nodes with 3K IOPS, 128 MBps SSDs. Our application fills up the WAL directory really fast, and hence the RAM. We made the following observations.
> >
> > 0. The not-so-bad news first: the node resumes processing after getting stuck for several minutes.
> >
> > 1. WAL and WAL archive writes are a lot faster than writes to the work directory through checkpointing. Very curious to know why this is the case. Checkpointing writes never exceed 15 MBps, while the WAL and WAL archive go really high, up to the max limits of the SSD.
>
> Very simple example: sequentially change a single key. The WAL then receives every change, while the checkpoint (in your terms) writes only the one changed key.
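To make the one-key example above concrete, here is a rough illustration (the cache name, value size, and config path are made up): every put appends a new record to the WAL, but when a checkpoint runs, only the single dirty page holding that key is written to the work directory, so WAL traffic can be far higher than checkpoint traffic.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cluster.ClusterState;

    public class WalVsCheckpointSketch {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start("ignite-config.xml")) {
                ignite.cluster().state(ClusterState.ACTIVE);
                IgniteCache<Integer, byte[]> cache = ignite.getOrCreateCache("demo");

                // 100_000 updates of the same key: roughly 100_000 WAL records are
                // appended, but only one page is dirty when the next checkpoint runs.
                for (int i = 0; i < 100_000; i++)
                    cache.put(1, new byte[512]);
            }
        }
    }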
> > 2. We observed that when off-heap memory usage tends to zero, checkpointing takes minutes to complete, sometimes 30+ minutes, which stalls application writes completely on all nodes. It means the whole cluster freezes.
>
> It seems Ignite enables throttling in such a case; you need some system and cluster tuning.
>
> > 3. The thread gets stuck at the checkpoint pages future's get() and, after several minutes, it logs this error and the grid resumes processing:
> >
> > "sys-stripe-0-#1" #19 prio=5 os_prio=0 cpu=86537.69ms elapsed=2166.63s tid=0x00007fa52a6f1000 nid=0x3b waiting on condition [0x00007fa4c58be000]
> >    java.lang.Thread.State: WAITING (parking)
> >     at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
> >     at java.util.concurrent.locks.LockSupport.park([email protected]/Unknown Source)
> >     at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> >     at org.apache.ignite.internal.util.future.GridFutureAdapter.getUninterruptibly(GridFutureAdapter.java:146)
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:144)
> >     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:317)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1151)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:592)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:110)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:309)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1908)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1529)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:242)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1422)
> >     at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
> >     at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:569)
> >     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> >     at java.lang.Thread.run([email protected]/Unknown Source)
> >
> > The code it is blocked in, from CheckpointTimeoutLock.checkpointReadLock(), is:
> >
> >     CheckpointProgress pages = checkpointer.scheduleCheckpoint(0, "too many dirty pages");
> >
> >     checkpointReadWriteLock.readUnlock();
> >
> >     if (timeout > 0 && U.currentTimeMillis() - start >= timeout)
> >         failCheckpointReadLock();
> >
> >     try {
> >         pages
> >             .futureFor(LOCK_RELEASED)
> >             .getUninterruptibly();
> >     }
> >
> > [2022-09-09 18:58:35,148][ERROR][sys-stripe-9-#10][CheckpointTimeoutLock] Checkpoint read lock acquisition has been timed out.
> > class org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.failCheckpointReadLock(CheckpointTimeoutLock.java:210)
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:108)
> >     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
> >
> > [2022-09-09 18:58:35,148][INFO ][sys-stripe-7-#8][FailureProcessor] Thread dump is hidden due to throttling settings. Set IGNITE_DUMP_THREADS_ON_FAILURE_THROTTLING_TIMEOUT property to 0 to see all thread dumps.
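A side note on the "too many dirty pages" trigger visible in that code path: the dirty-page count per data region is exposed through the public DataRegionMetrics API, so it can be watched while the problem reproduces. A rough sketch (config path and polling interval are arbitrary, and the data region needs metrics enabled for the counters to be meaningful):

    import org.apache.ignite.DataRegionMetrics;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;

    public class DirtyPagesMonitorSketch {
        public static void main(String[] args) throws InterruptedException {
            // Assumes a node started from an existing config with
            // DataRegionConfiguration.setMetricsEnabled(true) on the region.
            Ignite ignite = Ignition.start("ignite-config.xml");

            while (true) {
                for (DataRegionMetrics m : ignite.dataRegionMetrics()) {
                    System.out.printf("region=%s dirtyPages=%d physicalMemoryPages=%d%n",
                        m.getName(), m.getDirtyPages(), m.getPhysicalMemoryPages());
                }
                Thread.sleep(10_000); // poll every 10 seconds (arbitrary)
            }
        }
    }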
> > 4. Other nodes print the logs below during the window in which the problematic node is stuck at checkpointing:
> >
> > [2022-09-09 18:58:35,153][WARN ][push-metrics-exporter-#80][G] >>> Possible starvation in striped pool.
> >     Thread name: sys-stripe-5-#6
> >     Queue: [o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@eb9f832,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[351148]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=2, arr=[273841,273843]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearSingleGetRequest [futId=1662749921887, key=BinaryObjectImpl [arr= true, ctx=false, start=0], flags=1, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0], subjId=12746da1-ac0d-4ba1-933e-5aa3f92d2f68, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[351149]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@110ec0fa,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=10, arr=[414638,414655,414658,414661,414662,414663,414666,414668,414673,414678]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@63ae8204,
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@2d3cc0b,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[414667]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=4, arr=[351159,351162,351163,351164]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[290762]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[400357]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@71887193,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicSingleUpdateRequest [key=BinaryObjectImpl [arr= true, ctx=false, start=0], val=BinaryObjectImpl [arr= true, ctx=false, start=0], prevVal=null, super=GridDhtAtomicAbstractUpdateRequest [onRes=false, nearNodeId=null, nearFutId=0, flags=]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearAtomicSingleUpdateRequest [key=BinaryObjectImpl [arr= true, ctx=false, start=0], parent=GridNearAtomicAbstractSingleUpdateRequest [nodeId=null, futId=1324019, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0], parent=GridNearAtomicAbstractUpdateRequest [res=null, flags=]]]]]]
> >     Deadlock: false
> >     Completed: 205703
> >
> > On Wed, Sep 7, 2022 at 4:25 PM Zhenya Stanilovsky via user <[email protected]> wrote:
> >
> > > Ok, Raymond, I understand. But it seems no one has a good answer here; it depends on the particular file system and the layer underneath it (probably cloud). If you do not observe "throttling" messages (described in the previous link), it all seems OK, but of course you can benchmark your I/O yourself with a third-party tool.
> > >
> > > > Thanks Zhenya.
> > > >
> > > > I have seen that the link you provided has a lot of good information on this system, but it does not talk about the checkpoint writers in any detail.
> > > >
> > > > I appreciate this may not be a bottleneck; my question is more: "If I have more checkpointing threads, will checkpoints take less time?" In our case we use AWS EFS, so if each checkpoint thread spends a relatively long time blocking on write I/O to the persistent store, then more checkpoint threads allow more concurrent writes to take place. Of course, if the checkpoint threads themselves use async I/O tasks and interleave I/O activities on that basis, then there may not be an opportunity for performance improvement, but I am not an expert in the Ignite code base :)
> > > >
> > > > Raymond.
> > > >
> > > > On Wed, Sep 7, 2022 at 7:51 PM Zhenya Stanilovsky via user <[email protected]> wrote:
> > > >
> > > > > No, there are no log or metrics suggestions, and as I said earlier, this place can't become a bottleneck. If you have any performance problems, describe them in more detail. There is also interesting reading here [1].
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood
> > > > >
> > > > > > Thanks Zhenya.
> > > > > >
> > > > > > Is there any logging or metrics that would indicate whether there is value in increasing the size of this pool?
> > > > > >
> > > > > > On Fri, 2 Sept 2022 at 8:20 PM, Zhenya Stanilovsky via user <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Raymond,
> > > > > > >
> > > > > > > The checkpoint threads are responsible for dumping modified pages, so you may consider it an I/O-bound-only operation, and the pool size is the number of disk-writing workers. I think the default is enough and there is no need to raise it, but that is also up to you.
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am looking at our configuration of the Ignite checkpointing system to ensure we have it tuned correctly.
> > > > > > > >
> > > > > > > > There is a checkpointing thread pool defined, which defaults to 4 threads in size. I have not been able to find much of a discussion on when/how this pool size should be changed to reflect the node size Ignite is running on.
> > > > > > > >
> > > > > > > > In our case, we are running 16-core servers with 128 GB RAM, with persistence on an NFS storage layer.
> > > > > > > >
> > > > > > > > Given the number of cores, and the relative latency of NFS compared to local SSD, is 4 checkpointing threads appropriate, or are we likely to see better performance if we increased it to 8 (or more)?
> > > > > > > >
> > > > > > > > If there is a discussion related to this, a pointer to it would be good (it's not really covered in the performance tuning section).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Raymond.
> > > > > > > > --
> > > > > > > > Raymond Wilson
> > > > > > > > Trimble Distinguished Engineer, Civil Construction Software (CCS)
> > > > > > > > 11 Birmingham Drive | Christchurch, New Zealand
> > > > > > > > [email protected]
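For reference, the pool Raymond asks about above is configured through DataStorageConfiguration; a minimal sketch, where 8 is simply the number floated in his question rather than a recommendation:

    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class CheckpointThreadsSketch {
        public static IgniteConfiguration configure() {
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();

            // Default is 4; each thread writes dirty pages to the work directory
            // during a checkpoint, so on high-latency storage (NFS/EFS) more
            // threads mean more concurrent in-flight writes.
            storageCfg.setCheckpointThreads(8);

            return new IgniteConfiguration().setDataStorageConfiguration(storageCfg);
        }
    }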
