danny0405 commented on code in PR #18965:
URL: https://github.com/apache/hudi/pull/18965#discussion_r3395205665
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -649,14 +649,39 @@ public static Stream<HoodieRecord>
createPartitionStatsRecords(String partitionP
*/
public static HoodieRecord<HoodieMetadataPayload>
createRecordIndexUpdate(String recordKey, String partition,
String fileId, String instantTime, int fileIdEncoding) {
+ return createRecordIndexUpdate(recordKey, partition, fileId,
parseRecordIndexInstantTime(instantTime), fileIdEncoding);
+ }
- HoodieKey key = new HoodieKey(recordKey,
MetadataPartitionType.RECORD_INDEX.getPartitionPath());
- long instantTimeMillis = -1;
+ /**
+ * Parses an instant time into the epoch millis stored in record index
entries.
+ * <p>
+ * Callers creating record index updates for many records of the same commit
should parse the
+ * instant time once with this method and use the millis-based
+ * {@link #createRecordIndexUpdate(String, String, String, long, int)}
overload per record.
+ */
+ public static long parseRecordIndexInstantTime(String instantTime) {
try {
- instantTimeMillis =
TimelineUtils.parseDateFromInstantTime(instantTime).getTime();
+ return TimelineUtils.parseDateFromInstantTime(instantTime).getTime();
} catch (Exception e) {
throw new HoodieMetadataException("Failed to create metadata payload for
record index. Instant time parsing for " + instantTime + " failed ", e);
}
+ }
+
+ /**
+ * Create and return a {@code HoodieMetadataPayload} to insert or update an
entry for the record index.
+ * <p>
+ * Same as {@link #createRecordIndexUpdate(String, String, String, String,
int)} but takes the
+ * instant time already parsed to epoch millis, so per-commit callers parse
it only once.
+ *
+ * @param recordKey Key of the record
+ * @param partition Name of the partition which contains the record
+ * @param fileId fileId which contains the record
+ * @param instantTimeMillis epoch millis of the instant when the record was
added
+ */
+ public static HoodieRecord<HoodieMetadataPayload>
createRecordIndexUpdate(String recordKey, String partition,
Review Comment:
Could we keep the same fail-fast guarantee on this overload? The old String
overload always parsed the instant before building the payload, so callers
could not persist an invalid record-index instant. This new public long
overload will accept values like -1L (used as a sentinel in nearby delete-only
code) and write them into HoodieRecordIndexInfo, which would later decode to a
bogus timeline instant. Please add a validation here, for example rejecting
negative millis, so future callers cannot accidentally create corrupt RLI
entries.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]