yihua commented on code in PR #11947:
URL: https://github.com/apache/hudi/pull/11947#discussion_r1801779609
##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/CompletionTimeQueryView.java:
##########
@@ -189,46 +193,78 @@ public Option<String> getCompletionTime(String beginTime) {
* <p>By default, assumes there is at most 1 day time of duration for an instant to accelerate the queries.
*
* @param timeline The timeline.
- * @param rangeStart The query range start completion time.
- * @param rangeEnd The query range end completion time.
+ * @param rangeStartCompletionTime The query range start completion time.
+ * @param rangeEndCompletionTime The query range end completion time.
* @param rangeType The range type.
*
* @return The sorted instant time list.
*/
public List<String> getStartTimes(
HoodieTimeline timeline,
- Option<String> rangeStart,
- Option<String> rangeEnd,
+ Option<String> rangeStartCompletionTime,
+ Option<String> rangeEndCompletionTime,
InstantRange.RangeType rangeType) {
// assumes any instant/transaction lasts at most 1 day to optimize the query efficiency.
- return getStartTimes(timeline, rangeStart, rangeEnd, rangeType, s -> HoodieInstantTimeGenerator.instantTimeMinusMillis(s, MILLI_SECONDS_IN_ONE_DAY));
+ return getStartTimes(
+ timeline,
+ rangeStartCompletionTime,
+ rangeEndCompletionTime,
+ rangeType,
+ GET_INSTANT_ONE_DAY_BEFORE);
}
/**
* Queries the instant start time with given completion time range.
*
- * @param rangeStart The query range start completion time.
- * @param rangeEnd The query range end completion time.
+ * @param rangeStartCompletionTime The query range start completion time.
+ * @param rangeEndCompletionTime The query range end completion time.
* @param earliestInstantTimeFunc The function to generate the earliest start time boundary
* with the minimum completion time.
*
* @return The sorted instant time list.
*/
@VisibleForTesting
public List<String> getStartTimes(
- String rangeStart,
- String rangeEnd,
+ String rangeStartCompletionTime,
+ String rangeEndCompletionTime,
Function<String, String> earliestInstantTimeFunc) {
- return getStartTimes(metaClient.getCommitsTimeline().filterCompletedInstants(), Option.ofNullable(rangeStart), Option.ofNullable(rangeEnd),
- InstantRange.RangeType.CLOSED_CLOSED, earliestInstantTimeFunc);
+ return getStartTimes(
+ metaClient.getCommitsTimeline().filterCompletedInstants(),
+ Option.ofNullable(rangeStartCompletionTime),
+ Option.ofNullable(rangeEndCompletionTime),
+ InstantRange.RangeType.CLOSED_CLOSED,
+ earliestInstantTimeFunc);
+ }
+
+ /**
+ * Queries the instant start time with given completion time range.
+ *
+ * @param rangeStartCompletionTime The query range start completion time.
+ * @param rangeEndCompletionTime The query range end completion time.
+ * @param rangeType The range type.
+ * with the minimum completion time.
+ *
+ * @return The sorted instant time list.
+ */
+ @VisibleForTesting
+ public List<String> getStartTimes(
Review Comment:
It looks like this overload of `getStartTimes` is no longer used.
##########
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieDefaultTimeline.java:
##########
@@ -629,16 +649,23 @@ private Set<String> getOrCreatePendingClusteringInstantSet() {
/**
* Returns the first non savepoint commit on the timeline.
*/
- private static Option<HoodieInstant> findFirstNonSavepointCommit(List<HoodieInstant> instants) {
+ private static Option<HoodieInstant> findFirstNonSavepointCommit(
+ List<HoodieInstant> instants,
+ boolean byCompletionTime) {
Set<String> savepointTimestamps = instants.stream()
.filter(entry -> entry.getAction().equals(HoodieTimeline.SAVEPOINT_ACTION))
- .map(HoodieInstant::getTimestamp)
+ .map(byCompletionTime
+ ? HoodieInstant::getCompletionTime
+ : HoodieInstant::getTimestamp)
.collect(Collectors.toSet());
if (!savepointTimestamps.isEmpty()) {
// There are chances that there could be holes in the timeline due to archival and savepoint interplay.
// So, the first non-savepoint commit is considered as beginning of the active timeline.
return Option.fromJavaOptional(instants.stream()
- .filter(entry -> !savepointTimestamps.contains(entry.getTimestamp()))
+ .filter(entry -> !savepointTimestamps.contains(
Review Comment:
This still assumes that `instants` is sorted by completion time in order to
give correct results, but the passed-in `instants` is always sorted by instant
time.
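
To illustrate the concern, here is a minimal, self-contained sketch. `InstantSketch` and `FirstNonSavepointSketch` are hypothetical stand-ins, not Hudi classes, and the savepoint filter is simplified to a check on the action name. It shows that `findFirst()` on a list ordered by instant time can return a different instant than an explicit minimum by completion time:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for HoodieInstant; only the fields needed here.
final class InstantSketch {
  final String action;
  final String requestedTime;   // instant (start) time
  final String completionTime;

  InstantSketch(String action, String requestedTime, String completionTime) {
    this.action = action;
    this.requestedTime = requestedTime;
    this.completionTime = completionTime;
  }
}

final class FirstNonSavepointSketch {
  // Same shape as the code under review: the result depends on list order.
  static Optional<InstantSketch> firstByListOrder(List<InstantSketch> instants) {
    return instants.stream()
        .filter(i -> !i.action.equals("savepoint"))
        .findFirst();
  }

  // Order-independent alternative: take the minimum completion time explicitly.
  static Optional<InstantSketch> firstByCompletionTime(List<InstantSketch> instants) {
    return instants.stream()
        .filter(i -> !i.action.equals("savepoint"))
        .min(Comparator.comparing((InstantSketch i) -> i.completionTime));
  }

  public static void main(String[] args) {
    // Sorted by instant time: the first instant starts earlier but completes later.
    List<InstantSketch> sortedByInstantTime = Arrays.asList(
        new InstantSketch("commit", "001", "005"),
        new InstantSketch("commit", "002", "003"));

    System.out.println(firstByListOrder(sortedByInstantTime).get().requestedTime);
    System.out.println(firstByCompletionTime(sortedByInstantTime).get().requestedTime);
  }
}
```

Under these assumptions, one possible fix is to take the minimum by completion time (or sort by it) when `byCompletionTime` is true, rather than relying on the order of the input list.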