[ 
https://issues.apache.org/jira/browse/GOBBLIN-1838?focusedWorklogId=864405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-864405
 ]

ASF GitHub Bot logged work on GOBBLIN-1838:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Jun/23 09:57
            Start Date: 08/Jun/23 09:57
    Worklog Time Spent: 10m 
      Work Description: wsarecv commented on code in PR #3701:
URL: https://github.com/apache/gobblin/pull/3701#discussion_r1222735916


##########
gobblin-completeness/src/main/java/org/apache/gobblin/completeness/verifier/KafkaAuditCountVerifier.java:
##########
@@ -120,27 +140,16 @@ public boolean isComplete(String datasetName, long beginInMillis, long endInMill
    */
   private double getCompletenessPercentage(String datasetName, long beginInMillis, long endInMillis) throws IOException {
     Map<String, Long> countsByTier = getTierAndCount(datasetName, beginInMillis, endInMillis);
-    log.info(String.format("Audit counts map for %s for range [%s,%s]", datasetName, beginInMillis, endInMillis));
-    countsByTier.forEach((x,y) -> log.info(String.format(" %s : %s ", x, y)));
+    validateTierCounts(datasetName, beginInMillis, endInMillis, countsByTier, this.srcTier, this.refTiers);
     if (countsByTier.isEmpty() && this.returnCompleteOnNoCounts) {

Review Comment:
   Good catch. Fixed it and added a new unit test case.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 864405)
    Time Spent: 50m  (was: 40m)

> Introduce total count based completion watermark
> ------------------------------------------------
>
>                 Key: GOBBLIN-1838
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1838
>             Project: Apache Gobblin
>          Issue Type: New Feature
>            Reporter: Andy Jiang
>            Priority: Major
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, the completion watermark is determined by the completeness
> percentage "max of srcCount / refCount over all refTiers".
> This change introduces a new "total count based completion watermark", which
> is determined by a new completeness percentage: "srcCount / sum of refCount
> over all refTiers".
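
To make the difference between the two percentages in the description concrete, here is a minimal Java sketch. It is not the code from PR #3701: the class name, method names, tier names, and counts are all hypothetical, and treating a missing or zero reference count as contributing nothing is an assumption rather than the verifier's documented behavior.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: names and zero-count handling are assumptions,
// not taken from KafkaAuditCountVerifier.
public class CompletenessSketch {

  // Existing metric as described in the issue:
  // max over reference tiers of srcCount / refCount.
  public static double maxRatioPercentage(
      Map<String, Long> countsByTier, String srcTier, Collection<String> refTiers) {
    long srcCount = countsByTier.getOrDefault(srcTier, 0L);
    double best = 0.0;
    for (String refTier : refTiers) {
      long refCount = countsByTier.getOrDefault(refTier, 0L);
      if (refCount > 0) { // assumption: skip tiers with no counts
        best = Math.max(best, (double) srcCount / refCount);
      }
    }
    return best;
  }

  // New metric as described in the issue:
  // srcCount divided by the sum of refCount across all reference tiers.
  public static double totalCountPercentage(
      Map<String, Long> countsByTier, String srcTier, Collection<String> refTiers) {
    long srcCount = countsByTier.getOrDefault(srcTier, 0L);
    long refTotal = 0L;
    for (String refTier : refTiers) {
      refTotal += countsByTier.getOrDefault(refTier, 0L);
    }
    // assumption: report 0.0 when no reference counts are available
    return refTotal > 0 ? (double) srcCount / refTotal : 0.0;
  }

  public static void main(String[] args) {
    // Hypothetical tier names and counts.
    Map<String, Long> counts = new HashMap<>();
    counts.put("src", 900L);
    counts.put("refA", 1000L);
    counts.put("refB", 500L);

    System.out.println(maxRatioPercentage(counts, "src", Arrays.asList("refA", "refB")));
    System.out.println(totalCountPercentage(counts, "src", Arrays.asList("refA", "refB")));
  }
}
```

With these made-up counts the max-ratio metric is driven by the smallest reference tier (900 / 500), while the total-count metric compares the source against everything the reference tiers saw together (900 / 1500), so the two can disagree on whether a window is complete.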



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
