n3nash commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files
URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r379558732
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##########
 @@ -268,6 +270,19 @@ public Path getArchiveFilePath() {
     return archiveFilePath;
   }
 
+  private void writeHeaderBlock(Schema wrapperSchema, List<HoodieInstant> instants) throws Exception {
+    if (!instants.isEmpty()) {
+      Collections.sort(instants, HoodieInstant.COMPARATOR);
+      HoodieInstant minInstant = instants.get(0);
+      HoodieInstant maxInstant = instants.get(instants.size() - 1);
+      Map<HeaderMetadataType, String> metadataMap = Maps.newHashMap();
+      metadataMap.put(HeaderMetadataType.SCHEMA, wrapperSchema.toString());
+      metadataMap.put(HeaderMetadataType.MIN_INSTANT_TIME, minInstant.getTimestamp());
+      metadataMap.put(HeaderMetadataType.MAX_INSTANT_TIME, maxInstant.getTimestamp());
+      this.writer.appendBlock(new HoodieAvroDataBlock(Collections.emptyList(), metadataMap));
+    }
+  }
+
+
   private void writeToFile(Schema wrapperSchema, List<IndexedRecord> records) throws Exception {
 
 Review comment:
   You are right that the file is closed after archiving all the instants that qualify in that archiving run. But the next time archival kicks in, it will check whether the archive file has grown past a certain size (say, 1 GB); if not, it will append the next archival blocks to the same file.
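   To make the consequence concrete, here is a minimal sketch of the append-or-roll-over behavior described above. This is not Hudi's actual implementation; the class name, the threshold constant, and the `appendOrRoll` method are all illustrative assumptions. The point is that header blocks may be appended to an existing, previously closed file, so a single archive file can contain multiple header blocks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the size-check logic described in the comment
// above; names and the threshold are assumptions, not Hudi's real API.
public class ArchiveFileRollover {
    // The comment above uses 1 GB as an example threshold.
    static final long MAX_ARCHIVE_FILE_SIZE_BYTES = 1L << 30;

    private long currentFileSizeBytes;
    private final List<String> blocks = new ArrayList<>();

    ArchiveFileRollover(long currentFileSizeBytes) {
        this.currentFileSizeBytes = currentFileSizeBytes;
    }

    /**
     * Appends a block to the current archive file, rolling over to a fresh
     * file first only if the current one has already reached the threshold.
     * Returns true if a roll-over happened.
     */
    boolean appendOrRoll(String block, long blockSizeBytes) {
        boolean rolled = false;
        if (currentFileSizeBytes >= MAX_ARCHIVE_FILE_SIZE_BYTES) {
            // Roll over: subsequent blocks go to a new, empty file.
            currentFileSizeBytes = 0;
            blocks.clear();
            rolled = true;
        }
        // Below the threshold, the block lands in the SAME file as earlier runs.
        blocks.add(block);
        currentFileSizeBytes += blockSizeBytes;
        return rolled;
    }
}
```

   Under this sketch, a small existing file keeps accumulating blocks across archival runs, which is why each appended batch needs its own min/max header rather than relying on one header per file.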

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services