baiyangtx commented on code in PR #7866:
URL: https://github.com/apache/paimon/pull/7866#discussion_r3295861538


##########
paimon-core/src/main/java/org/apache/paimon/operation/ManifestFileMerger.java:
##########
@@ -272,27 +350,276 @@ public static Optional<List<ManifestFileMeta>> 
tryFullCompaction(
         return Optional.of(result);
     }
 
-    private static FullCompactionReadResult readForFullCompaction(
-            ManifestFileMeta file,
-            ManifestFile manifestFile,
+    private static final NormalizedKeyComputer NO_NORMALIZED_KEY_COMPUTER =
+            new NormalizedKeyComputer() {
+                @Override
+                public void putKey(InternalRow record, MemorySegment target, 
int offset) {
+                    // no-op
+                }
+
+                @Override
+                public int compareKey(
+                        MemorySegment segI, int offsetI, MemorySegment segJ, 
int offsetJ) {
+                    return 0;
+                }
+
+                @Override
+                public void swapKey(
+                        MemorySegment segI, int offsetI, MemorySegment segJ, 
int offsetJ) {
+                    // no-op
+                }
+
+                @Override
+                public int getNumKeyBytes() {
+                    return 0;
+                }
+
+                @Override
+                public boolean isKeyFullyDetermines() {
+                    return false;
+                }
+
+                @Override
+                public boolean invertKey() {
+                    return false;
+                }
+            };
+
+    private static int mergeUnsorted(
+            List<ManifestFileMeta> toBeMerged,
             Filter<ManifestFileMeta> mustChange,
-            Set<FileEntry.Identifier> deleteEntries) {
-        List<ManifestEntry> entries = new ArrayList<>();
-        boolean requireChange = mustChange.test(file);
-        for (ManifestEntry entry :
-                manifestFile.read(
-                        file.fileName(),
-                        file.fileSize(),
-                        FileEntry.addFilter(),
-                        Filter.alwaysTrue())) {
-            if (deleteEntries.contains(entry.identifier())) {
-                requireChange = true;
+            Set<FileEntry.Identifier> deleteEntries,
+            ManifestFile manifestFile,
+            RollingFileWriter<ManifestEntry, ManifestFileMeta> writer,
+            List<ManifestFileMeta> result)
+            throws Exception {
+        int actualRewriteCount = 0;
+        for (ManifestFileMeta file : toBeMerged) {
+            List<ManifestEntry> entries = new ArrayList<>();
+            boolean requireChange = mustChange.test(file);
+            for (ManifestEntry entry : manifestFile.read(file.fileName(), 
file.fileSize())) {
+                if (entry.kind() == FileKind.DELETE) {
+                    continue;
+                }
+
+                if (deleteEntries.contains(entry.identifier())) {
+                    requireChange = true;
+                } else {
+                    entries.add(entry);
+                }
+            }
+
+            if (requireChange) {
+                writer.write(entries);
+                actualRewriteCount++;
             } else {
-                entries.add(entry);
+                result.add(file);
             }
         }
+        return actualRewriteCount;
+    }
+
+    private static int mergeSortedByPartition(
+            List<ManifestFileMeta> toBeMerged,

Review Comment:
   @discivigour Good point — my approach indeed doesn't handle existing 
out-of-order manifests. I think our two PRs are actually complementary: yours 
handles the bootstrap/rewrite path for historical manifests, mine maintains 
sorted order going forward via inline compaction. I'd like to collaborate and 
align the configuration surface first so users see one coherent "manifest sort" 
feature rather than two separate mechanisms. WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to