yihua commented on code in PR #13171:
URL: https://github.com/apache/hudi/pull/13171#discussion_r2053095626


##########
hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroReaderContext.java:
##########
@@ -198,4 +243,57 @@ private Object getFieldValueFromIndexedRecord(
     int pos = field.pos();
     return record.get(pos);
   }
+
+  /**
+   * Iterator that traverses the skeleton file and the base file in tandem.
+   * The iterator will only extract the fields requested in the provided schemas.
+   */
+  private static class BootstrapIterator implements ClosableIterator<IndexedRecord> {
+    private final ClosableIterator<IndexedRecord> skeletonFileIterator;
+    private final Schema skeletonRequiredSchema;
+    private final ClosableIterator<IndexedRecord> dataFileIterator;
+    private final Schema dataRequiredSchema;
+    private final Schema mergedSchema;
+    private final int skeletonFields;
+
+    public BootstrapIterator(ClosableIterator<IndexedRecord> skeletonFileIterator, Schema skeletonRequiredSchema,
+                             ClosableIterator<IndexedRecord> dataFileIterator, Schema dataRequiredSchema) {
+      this.skeletonFileIterator = skeletonFileIterator;
+      this.skeletonRequiredSchema = skeletonRequiredSchema;
+      this.dataFileIterator = dataFileIterator;
+      this.dataRequiredSchema = dataRequiredSchema;
+      this.mergedSchema = AvroSchemaUtils.mergeSchemas(skeletonRequiredSchema, dataRequiredSchema);
+      this.skeletonFields = skeletonRequiredSchema.getFields().size();
+    }
+
+    @Override
+    public void close() {
+      skeletonFileIterator.close();
+      dataFileIterator.close();
+    }
+
+    @Override
+    public boolean hasNext() {
+      checkState(dataFileIterator.hasNext() == skeletonFileIterator.hasNext(),
+          "Bootstrap data-file iterator and skeleton-file iterator have to be in-sync!");
+      return skeletonFileIterator.hasNext();

Review Comment:
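   This assumes `skeletonFileIterator.hasNext()` is idempotent, since it is called twice here: once inside `checkState` and again in the `return`. To be safe, should `skeletonFileIterator.hasNext()` be called only once and its result reused, in case an implementation is not idempotent?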
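
A minimal sketch of the single-call variant, offered as an illustration rather than the change this PR should make. `TandemIterator` is a hypothetical stand-in for `BootstrapIterator`, it uses `java.util.Iterator` instead of `ClosableIterator`, and a plain `IllegalStateException` stands in for `checkState`:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for BootstrapIterator; all names here are illustrative. */
class TandemIterator<T> implements Iterator<List<T>> {
  private final Iterator<T> skeletonFileIterator;
  private final Iterator<T> dataFileIterator;

  TandemIterator(Iterator<T> skeletonFileIterator, Iterator<T> dataFileIterator) {
    this.skeletonFileIterator = skeletonFileIterator;
    this.dataFileIterator = dataFileIterator;
  }

  @Override
  public boolean hasNext() {
    // Call hasNext() exactly once on each underlying iterator and reuse the
    // results, so correctness does not depend on hasNext() being idempotent.
    boolean skeletonHasNext = skeletonFileIterator.hasNext();
    boolean dataHasNext = dataFileIterator.hasNext();
    if (skeletonHasNext != dataHasNext) {
      throw new IllegalStateException(
          "Bootstrap data-file iterator and skeleton-file iterator have to be in-sync!");
    }
    return skeletonHasNext;
  }

  @Override
  public List<T> next() {
    // Pair up one record from each file; the real iterator merges them instead.
    return Arrays.asList(skeletonFileIterator.next(), dataFileIterator.next());
  }
}
```

With this shape, each call to the outer `hasNext()` triggers one `hasNext()` call per underlying iterator, and the in-sync check still fails when one file runs out before the other.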



