Will-Lo commented on code in PR #3667:
URL: https://github.com/apache/gobblin/pull/3667#discussion_r1160947814


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventExperimental.avsc:
##########
@@ -188,6 +188,38 @@
           }
         }
       ]
-    }]
+    },
+    {
+      "name": "datasetsWritten",
+      "type": [
+        "null",
+        {
+          "type": "array",
+          "items": {
+            "type": "record",
+            "name": "DatasetMetric",
+            "doc": "DatasetMetric contains bytes and records written by Gobblin writers for the dataset URN.",
+            "fields": [
+              {
+                "name": "datasetUrn",
+                "type": "string",
+                "doc": "URN of the dataset"
+              },
+              {
+                "name": "bytesWritten",
+                "type": "long",
+                "doc": "Number of bytes written for the dataset"

Review Comment:
   For the jobs where this applies: it's not usable for retention, but it 
should be compatible with any writer that reports its records/bytes written, 
e.g. Distcp, Hdfs -> Mysql. So it depends on the writer implementation, but 
almost every Gobblin writer implements `recordsWritten()` and 
`bytesWritten()`, as it's enforced in the `Writer` interface.
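
   To make the dependency on the writer concrete, here is a minimal sketch of how per-dataset totals could be rolled up from those two counters. It is not the code in this PR: `CountingWriter`, `DatasetMetricSketch` and `DatasetMetricAggregator` are hypothetical names, and the `recordsWritten` field is assumed from the record's doc string, since the diff excerpt above stops at `bytesWritten`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the two counters a writer exposes.
interface CountingWriter {
  long recordsWritten();
  long bytesWritten();
}

// Hypothetical plain-Java mirror of the DatasetMetric record in the schema.
// recordsWritten is an assumption; the diff excerpt only shows bytesWritten.
final class DatasetMetricSketch {
  final String datasetUrn;
  final long bytesWritten;
  final long recordsWritten;

  DatasetMetricSketch(String datasetUrn, long bytesWritten, long recordsWritten) {
    this.datasetUrn = datasetUrn;
    this.bytesWritten = bytesWritten;
    this.recordsWritten = recordsWritten;
  }
}

// Sums the counters of every writer that wrote to the same dataset URN.
final class DatasetMetricAggregator {
  // Value layout: [0] = bytes written, [1] = records written.
  private final Map<String, long[]> totals = new LinkedHashMap<>();

  void add(String datasetUrn, CountingWriter writer) {
    long[] t = totals.computeIfAbsent(datasetUrn, k -> new long[2]);
    t[0] += writer.bytesWritten();
    t[1] += writer.recordsWritten();
  }

  // Returns one metric per dataset URN, in first-seen order.
  List<DatasetMetricSketch> build() {
    List<DatasetMetricSketch> out = new ArrayList<>();
    for (Map.Entry<String, long[]> e : totals.entrySet()) {
      out.add(new DatasetMetricSketch(e.getKey(), e.getValue()[0], e.getValue()[1]));
    }
    return out;
  }
}
```

   A writer that never updates its counters would simply contribute zeros here, which is why the metric is only as good as the writer implementation.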



