Hi, write throttling is disabled in the Ignite config, as mentioned in the previous reply. What do you suggest to make Ignite's checkpointing keep up with the SSD limits?
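For context, this is roughly where the settings discussed in this thread live; a minimal sketch using the public DataStorageConfiguration API, with placeholder values rather than the actual configuration of this cluster:

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class StorageConfigSketch {
        public static void main(String[] args) {
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();

            // Write throttling disabled, as described above.
            storageCfg.setWriteThrottlingEnabled(false);

            // Knobs commonly looked at when checkpoints cannot keep up with the
            // ingest rate (values here are placeholders, not recommendations).
            storageCfg.setCheckpointFrequency(180_000L);               // time-based checkpoint every 3 minutes
            storageCfg.setMaxWalArchiveSize(4L * 1024 * 1024 * 1024);  // cap WAL archive growth at 4 GB
            storageCfg.getDefaultDataRegionConfiguration()
                      .setPersistenceEnabled(true);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg);

            Ignition.start(cfg);
        }
    }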
On Mon, 12 Sept 2022, 11:32 Zhenya Stanilovsky via user <[email protected]> wrote:

> > We have observed one interesting issue with checkpointing. We are using 64 GB RAM / 12 CPU nodes with 3K IOPS, 128 MBps SSDs. Our application fills up the WAL directory really fast, and hence the RAM. We made the following observations.
> >
> > 0. The not-so-bad news first: the node resumes processing after getting stuck for several minutes.
> >
> > 1. WAL and WAL archive writes are a lot faster than writes to the work directory through checkpointing. Very curious to know why this is the case. Checkpointing writes never exceed 15 MBps, while the WAL and WAL archive go really high, up to the max limits of the SSD.
>
> Very simple example: sequentially change a single key. The WAL then receives every change, while the checkpoint (in your terms) writes only the one changed key.
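To make the one-key example above concrete, here is a rough illustration (the cache name, value size, and config path are made up): every put appends a new record to the WAL, but when a checkpoint runs, only the single dirty page holding that key is written to the work directory, so WAL traffic can be far higher than checkpoint traffic.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cluster.ClusterState;

    public class WalVsCheckpointSketch {
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start("ignite-config.xml")) {
                ignite.cluster().state(ClusterState.ACTIVE);
                IgniteCache<Integer, byte[]> cache = ignite.getOrCreateCache("demo");

                // 100_000 updates of the same key: roughly 100_000 WAL records are
                // appended, but only one page is dirty when the next checkpoint runs.
                for (int i = 0; i < 100_000; i++)
                    cache.put(1, new byte[512]);
            }
        }
    }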
> > 2. We observed that when off-heap memory usage tends to zero, checkpointing takes minutes to complete, sometimes 30+ minutes, which stalls application writes completely on all nodes. It means the whole cluster freezes.
>
> It seems Ignite enables throttling in such a case; you need some system and cluster tuning.
>
> > 3. The thread gets stuck at the checkpoint pages future's get() and, after several minutes, it logs this error and the grid resumes processing:
> >
> > "sys-stripe-0-#1" #19 prio=5 os_prio=0 cpu=86537.69ms elapsed=2166.63s tid=0x00007fa52a6f1000 nid=0x3b waiting on condition [0x00007fa4c58be000]
> >    java.lang.Thread.State: WAITING (parking)
> >     at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
> >     at java.util.concurrent.locks.LockSupport.park([email protected]/Unknown Source)
> >     at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
> >     at org.apache.ignite.internal.util.future.GridFutureAdapter.getUninterruptibly(GridFutureAdapter.java:146)
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:144)
> >     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:317)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1151)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:592)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:393)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:319)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:110)
> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:309)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1908)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1529)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:242)
> >     at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1422)
> >     at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
> >     at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:569)
> >     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> >     at java.lang.Thread.run([email protected]/Unknown Source)
> >
> > The code it is blocked in, from CheckpointTimeoutLock.checkpointReadLock(), is:
> >
> >     CheckpointProgress pages = checkpointer.scheduleCheckpoint(0, "too many dirty pages");
> >
> >     checkpointReadWriteLock.readUnlock();
> >
> >     if (timeout > 0 && U.currentTimeMillis() - start >= timeout)
> >         failCheckpointReadLock();
> >
> >     try {
> >         pages
> >             .futureFor(LOCK_RELEASED)
> >             .getUninterruptibly();
> >     }
> >
> > [2022-09-09 18:58:35,148][ERROR][sys-stripe-9-#10][CheckpointTimeoutLock] Checkpoint read lock acquisition has been timed out.
> > class org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.failCheckpointReadLock(CheckpointTimeoutLock.java:210)
> >     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:108)
> >     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1613)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processDhtAtomicUpdateRequest(GridDhtAtomicCache.java:3313)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$600(GridDhtAtomicCache.java:143)
> >     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$7.apply(GridDhtAtomicCache.java:322)
> >
> > [2022-09-09 18:58:35,148][INFO ][sys-stripe-7-#8][FailureProcessor] Thread dump is hidden due to throttling settings. Set IGNITE_DUMP_THREADS_ON_FAILURE_THROTTLING_TIMEOUT property to 0 to see all thread dumps.
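A side note on the "too many dirty pages" trigger visible in that code path: the dirty-page count per data region is exposed through the public DataRegionMetrics API, so it can be watched while the problem reproduces. A rough sketch (config path and polling interval are arbitrary, and the data region needs metrics enabled for the counters to be meaningful):

    import org.apache.ignite.DataRegionMetrics;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;

    public class DirtyPagesMonitorSketch {
        public static void main(String[] args) throws InterruptedException {
            // Assumes a node started from an existing config with
            // DataRegionConfiguration.setMetricsEnabled(true) on the region.
            Ignite ignite = Ignition.start("ignite-config.xml");

            while (true) {
                for (DataRegionMetrics m : ignite.dataRegionMetrics()) {
                    System.out.printf("region=%s dirtyPages=%d physicalMemoryPages=%d%n",
                        m.getName(), m.getDirtyPages(), m.getPhysicalMemoryPages());
                }
                Thread.sleep(10_000); // poll every 10 seconds (arbitrary)
            }
        }
    }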
> > 4. Other nodes print the logs below during the window in which the problematic node is stuck at checkpointing:
> >
> > [2022-09-09 18:58:35,153][WARN ][push-metrics-exporter-#80][G] >>> Possible starvation in striped pool.
> >     Thread name: sys-stripe-5-#6
> >     Queue: [o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@eb9f832,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[351148]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=2, arr=[273841,273843]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearSingleGetRequest [futId=1662749921887, key=BinaryObjectImpl [arr= true, ctx=false, start=0], flags=1, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0], subjId=12746da1-ac0d-4ba1-933e-5aa3f92d2f68, taskNameHash=0, createTtl=-1, accessTtl=-1, txLbl=null, mvccSnapshot=null]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[351149]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@110ec0fa,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=10, arr=[414638,414655,414658,414661,414662,414663,414666,414668,414673,414678]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@63ae8204,
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@2d3cc0b,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[414667]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=4, arr=[351159,351162,351163,351164]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[290762]]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicDeferredUpdateResponse [futIds=GridLongList [idx=1, arr=[400357]]]]],
> >       o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$DeferredUpdateTimeout@71887193,
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtAtomicSingleUpdateRequest [key=BinaryObjectImpl [arr= true, ctx=false, start=0], val=BinaryObjectImpl [arr= true, ctx=false, start=0], prevVal=null, super=GridDhtAtomicAbstractUpdateRequest [onRes=false, nearNodeId=null, nearFutId=0, flags=]]]],
> >       Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridNearAtomicSingleUpdateRequest [key=BinaryObjectImpl [arr= true, ctx=false, start=0], parent=GridNearAtomicAbstractSingleUpdateRequest [nodeId=null, futId=1324019, topVer=AffinityTopologyVersion [topVer=14, minorTopVer=0], parent=GridNearAtomicAbstractUpdateRequest [res=null, flags=]]]]]]
> >     Deadlock: false
> >     Completed: 205703
> >
> > On Wed, Sep 7, 2022 at 4:25 PM Zhenya Stanilovsky via user <[email protected]> wrote:
> >
> > > Ok, Raymond, I understand. But it seems no one has a good answer here; it depends on the particular file system and the layer underneath it (probably cloud). If you do not observe "throttling" messages (described in the previous link), it all seems OK, but of course you can benchmark your I/O yourself with a third-party tool.
> > >
> > > > Thanks Zhenya.
> > > >
> > > > I have seen that the link you provided has a lot of good information on this system, but it does not talk about the checkpoint writers in any detail.
> > > >
> > > > I appreciate this may not be a bottleneck; my question is more: "If I have more checkpointing threads, will checkpoints take less time?" In our case we use AWS EFS, so if each checkpoint thread spends a relatively long time blocking on write I/O to the persistent store, then more checkpoint threads allow more concurrent writes to take place. Of course, if the checkpoint threads themselves use async I/O tasks and interleave I/O activities on that basis, then there may not be an opportunity for performance improvement, but I am not an expert in the Ignite code base :)
> > > >
> > > > Raymond.
> > > >
> > > > On Wed, Sep 7, 2022 at 7:51 PM Zhenya Stanilovsky via user <[email protected]> wrote:
> > > >
> > > > > No, there are no log or metrics suggestions, and as I said earlier, this place can't become a bottleneck. If you have any performance problems, describe them in more detail. There is also interesting reading here [1].
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood
> > > > >
> > > > > > Thanks Zhenya.
> > > > > >
> > > > > > Is there any logging or metrics that would indicate whether there is value in increasing the size of this pool?
> > > > > >
> > > > > > On Fri, 2 Sept 2022 at 8:20 PM, Zhenya Stanilovsky via user <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Raymond,
> > > > > > >
> > > > > > > The checkpoint threads are responsible for dumping modified pages, so you may consider it an I/O-bound-only operation, and the pool size is the number of disk-writing workers. I think the default is enough and there is no need to raise it, but that is also up to you.
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I am looking at our configuration of the Ignite checkpointing system to ensure we have it tuned correctly.
> > > > > > > >
> > > > > > > > There is a checkpointing thread pool defined, which defaults to 4 threads in size. I have not been able to find much of a discussion on when/how this pool size should be changed to reflect the node size Ignite is running on.
> > > > > > > >
> > > > > > > > In our case, we are running 16-core servers with 128 GB RAM, with persistence on an NFS storage layer.
> > > > > > > >
> > > > > > > > Given the number of cores, and the relative latency of NFS compared to local SSD, is 4 checkpointing threads appropriate, or are we likely to see better performance if we increased it to 8 (or more)?
> > > > > > > >
> > > > > > > > If there is a discussion related to this, a pointer to it would be good (it's not really covered in the performance tuning section).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Raymond.
> > > > > > > > --
> > > > > > > > Raymond Wilson
> > > > > > > > Trimble Distinguished Engineer, Civil Construction Software (CCS)
> > > > > > > > 11 Birmingham Drive | Christchurch, New Zealand
> > > > > > > > [email protected]
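For reference, the pool Raymond asks about above is configured through DataStorageConfiguration; a minimal sketch, where 8 is simply the number floated in his question rather than a recommendation:

    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class CheckpointThreadsSketch {
        public static IgniteConfiguration configure() {
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();

            // Default is 4; each thread writes dirty pages to the work directory
            // during a checkpoint, so on high-latency storage (NFS/EFS) more
            // threads mean more concurrent in-flight writes.
            storageCfg.setCheckpointThreads(8);

            return new IgniteConfiguration().setDataStorageConfiguration(storageCfg);
        }
    }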
