Re: [PR] IGNITE-23115 Checkpoint single partition from a single thread [ignite-3]

via GitHub Thu, 12 Sep 2024 04:53:03 -0700


tkalkirill commented on code in PR #4379:
URL: https://github.com/apache/ignite-3/pull/4379#discussion_r1756708716



##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/persistence/checkpoint/CheckpointPagesWriter.java:
##########
@@ -176,90 +166,85 @@ public void run() {
         }
     }
 
-    /**
-     * Writes dirty pages.
-     *
-     * @param writePageIds Queue of dirty page IDs to write.
-     * @return pagesToRetry Queue dirty page IDs which should be retried.
-     */
-    private IgniteConcurrentMultiPairQueue<PersistentPageMemory, FullPageId> 
writePages(
-            IgniteConcurrentMultiPairQueue<PersistentPageMemory, FullPageId> 
writePageIds
-    ) throws IgniteInternalCheckedException {
-        Map<PersistentPageMemory, List<FullPageId>> pageIdsToRetry = new 
HashMap<>();
-
-        Map<PersistentPageMemory, PageStoreWriter> pageStoreWriters = new 
HashMap<>();
+    private void writeDirtyPages(PersistentPageMemory pageMemory, 
GroupPartitionId partitionId) throws IgniteInternalCheckedException {
+        CheckpointDirtyPagesView checkpointDirtyPagesView = 
checkpointDirtyPagesView(pageMemory, partitionId);
 
-        // Page store writers for checkpoint buffer pages are located in a 
separate map, because they would not add elements to a
-        // "pageIdsToRetry" map. Instead, they would ignore unsuccessful write 
lock acquisitions. It's implemented this way in order to
-        // avoid duplicates in "pageIdsToRetry", there must only be a single 
source of pages to achieve that.
-        Map<PersistentPageMemory, PageStoreWriter> cpBufferPageStoreWriters = 
new HashMap<>();
+        checkpointProgress.onStartPartitionProcessing(partitionId);
 
-        ByteBuffer tmpWriteBuf = threadBuf.get();
+        try {
+            ByteBuffer tmpWriteBuf = threadBuf.get();
 
-        Result<PersistentPageMemory, FullPageId> queueResult = new Result<>();
+            if (shouldWriteMetaPage(partitionId)) {
+                writePartitionMeta(pageMemory, partitionId, 
tmpWriteBuf.rewind());
+            }
 
-        GroupPartitionId partitionId = null;
+            var pageIdsToRetry = new ArrayList<FullPageId>();
 
-        try {
-            // TODO https://issues.apache.org/jira/browse/IGNITE-23115 Try to 
write file per thread.
-            while (!shutdownNow.getAsBoolean() && 
writePageIds.next(queueResult)) {
+            for (int i = 0; i < checkpointDirtyPagesView.size() && 
!shutdownNow.getAsBoolean(); i++) {
                 updateHeartbeat.run();
 
-                FullPageId fullId = queueResult.getValue();
+                FullPageId pageId = checkpointDirtyPagesView.get(i);
 
-                PersistentPageMemory pageMemory = queueResult.getKey();
+                if (pageId.pageIdx() == 0) {
+                    // Skip meta-pages, they are written by 
"writePartitionMeta".
+                    continue;
+                }
 
-                if (hasPartitionChanged(partitionId, fullId)) {
-                    GroupPartitionId newPartitionId = toPartitionId(fullId);
+                writeDirtyPage(pageMemory, pageId, tmpWriteBuf, 
pageIdsToRetry);

Review Comment:
   > In old implementation, we had a log for every retry round. Is it still in 
the code?
   
   Yes, see in `CheckpointPagesWriter#writeRetryDirtyPages`.
   
   > If it is, there's a risk of polluting a log file.
   
   I didn't think about log pollution.
   And it could theoretically happen, since we do repeats for each partition 
and log about it.
   
   Maybe we should go back to the previous implementation and accumulate 
repeats and then repeat for everyone at once, what do you think?
   Maybe we should make it shared? i.e. also make a shared repeat queue and 
have free threads clean it up, what do you think? It would be better to make it 
a separate ticket.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] IGNITE-23115 Checkpoint single partition from a single thread [ignite-3]

Reply via email to