Blazer-007 commented on code in PR #4151:
URL: https://github.com/apache/gobblin/pull/4151#discussion_r2482088728
##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/CommitActivityImpl.java:
##########
@@ -290,4 +303,13 @@ private static String calcCommitId(WUProcessingSpec workSpec) {
private static WorkUnitStream createEmptyWorkUnitStream() {
return new BasicWorkUnitStream.Builder(Lists.newArrayList()).build();
}
+
+ private List<DatasetTaskSummary> convertDatasetStatsToTaskSummaries(Map<String, DatasetStats> datasetStats) {
+ List<DatasetTaskSummary> datasetTaskSummaries = Lists.newArrayList();
+ for (Map.Entry<String, DatasetStats> entry : datasetStats.entrySet()) {
+ datasetTaskSummaries.add(new DatasetTaskSummary(entry.getKey(), entry.getValue().getRecordsWritten(), entry.getValue().getBytesWritten(), entry.getValue().isSuccessfullyCommitted(), entry.getValue().getDataQualityCheckStatus()));
+ }
+ log.info("Converted dataset stats to task summaries: {}", datasetTaskSummaries);
Review Comment:
Should we make this consistent with `CommitStats` / `ExecGobblinStats` and
emit only a single row (with combined values) into
`GaaSJobObservabilityEvent.datasetsMetrics` as well?
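To illustrate what "combined values" could mean here, a minimal sketch follows. It is not code from the PR: `DatasetStatsLite`, `Totals`, and `combine` are hypothetical stand-ins, and only the two numeric fields visible in the diff (`recordsWritten`, `bytesWritten`) are modeled. It assumes the combined row is a plain sum across datasets.

```java
import java.util.Map;

// Hypothetical sketch: collapse per-dataset stats into one combined row.
final class CombinedStatsSketch {

  /** Stand-in for DatasetStats (assumption: only the numeric fields matter here). */
  static final class DatasetStatsLite {
    final long recordsWritten;
    final long bytesWritten;

    DatasetStatsLite(long recordsWritten, long bytesWritten) {
      this.recordsWritten = recordsWritten;
      this.bytesWritten = bytesWritten;
    }
  }

  /** Single combined row, mirroring the recordsWritten / bytesWritten pair in the diff. */
  static final class Totals {
    final long recordsWritten;
    final long bytesWritten;

    Totals(long recordsWritten, long bytesWritten) {
      this.recordsWritten = recordsWritten;
      this.bytesWritten = bytesWritten;
    }
  }

  /** Sums records and bytes over all datasets; an empty map yields zeros. */
  static Totals combine(Map<String, DatasetStatsLite> datasetStats) {
    long records = 0L;
    long bytes = 0L;
    for (DatasetStatsLite stats : datasetStats.values()) {
      records += stats.recordsWritten;
      bytes += stats.bytesWritten;
    }
    return new Totals(records, bytes);
  }
}
```

Whether a single summed row is the right shape for `datasetsMetrics` depends on how downstream consumers read that field, which this sketch does not address.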
##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/work/CommitStats.java:
##########
@@ -41,11 +41,12 @@
@NoArgsConstructor // IMPORTANT: for jackson (de)serialization
@RequiredArgsConstructor
public class CommitStats {
- @NonNull private Map<String, DatasetStats> datasetStats;
@NonNull private int numCommittedWorkUnits;
+ @NonNull private long recordsWritten;
+ @NonNull private long bytesWritten;
@NonNull private Optional<FailedDatasetUrnsException> optFailure;
Review Comment:
`DatasetStats` also has `dataQualityCheckStatus`; we should add that to
`CommitStats` and `ExecGobblinStats` too.
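If the status is carried as a single combined value, one possible roll-up rule is sketched below. Everything in it is an assumption for discussion: `QualityStatus` is a hypothetical enum rather than the actual type behind `dataQualityCheckStatus`, and the "worst status wins" rule is just one candidate policy.

```java
import java.util.Collection;

// Hypothetical sketch: reduce per-dataset quality statuses to one value.
final class QualityRollupSketch {

  /** Hypothetical status values; the real enum may differ. */
  enum QualityStatus { PASSED, NOT_EVALUATED, FAILED }

  /**
   * Assumed policy: any FAILED makes the result FAILED; otherwise any
   * NOT_EVALUATED makes it NOT_EVALUATED; otherwise PASSED.
   * An empty input is treated as NOT_EVALUATED.
   */
  static QualityStatus combine(Collection<QualityStatus> perDataset) {
    if (perDataset.isEmpty()) {
      return QualityStatus.NOT_EVALUATED;
    }
    QualityStatus result = QualityStatus.PASSED;
    for (QualityStatus status : perDataset) {
      if (status == QualityStatus.FAILED) {
        return QualityStatus.FAILED;
      }
      if (status == QualityStatus.NOT_EVALUATED) {
        result = QualityStatus.NOT_EVALUATED;
      }
    }
    return result;
  }
}
```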