suryaprasanna commented on code in PR #8915:
URL: https://github.com/apache/hudi/pull/8915#discussion_r1225136598


##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadataUtil.java:
##########
@@ -1361,7 +1363,13 @@ public static Set<String> 
getValidInstantTimestamps(HoodieTableMetaClient dataMe
         });
 
     // SOLO_COMMIT_TIMESTAMP is used during bootstrap so it is a valid 
timestamp
-    validInstantTimestamps.add(createIndexInitTimestamp(SOLO_COMMIT_TIMESTAMP, 
PARTITION_INITIALIZATION_TIME_SUFFIX));
+    List<String> metadataInitializationTimestamps = 
metadataMetaClient.getActiveTimeline()

Review Comment:
   Valid instants are mainly needed for reading the log files. 
   Whereas, first bootstrap commit on the partition, will create one set of 
dummy delete log blocks and a base file. So, all the data written into 
partition on the boostrap commit will be present in the base file. Even if dont 
include these SOLO_COMMIT_TIMESTAMP we should be ok I guess.
   But it will be good to include them to keep it consistent.



##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java:
##########
@@ -471,12 +471,9 @@ public Pair<HoodieMetadataLogRecordReader, Long> 
getLogRecordScanner(List<Hoodie
 
     // Only those log files which have a corresponding completed instant on 
the dataset should be read
     // This is because the metadata table is updated before the dataset 
instants are committed.
-    Set<String> validInstantTimestamps = HoodieTableMetadataUtil
-        .getValidInstantTimestamps(dataMetaClient, metadataMetaClient);
-
+    Set<String> validInstantTimestamps = 
HoodieTableMetadataUtil.getValidInstantTimestamps(dataMetaClient, 
metadataMetaClient);
     Option<HoodieInstant> latestMetadataInstant = 
metadataMetaClient.getActiveTimeline().filterCompletedInstants().lastInstant();
-    String latestMetadataInstantTime = 
latestMetadataInstant.map(HoodieInstant::getTimestamp).orElse(SOLO_COMMIT_TIMESTAMP);
-
+    String latestMetadataInstantTime = 
latestMetadataInstant.map(HoodieInstant::getTimestamp).orElse(HoodieTableMetadataUtil.createIndexInitTimestamp(SOLO_COMMIT_TIMESTAMP,
 0));

Review Comment:
   Yeah, I agree, we should not be in that state. So, throwing an exception 
makes sense.
   Also, this is under a public API, the construction of this class is also not 
safe. Since, we are allowing the object to be created even if the metadata 
table is not there. So, should we also throw exception in initIfNeeded method?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to