phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1040570689


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",

Review Comment:
   it's wise for us to iterate after achieving e2e functionality, so I
   suggest a top-level "doc" string that states the clear intent to supersede
   this schema.
   
   an alternative suffix (in place of `V0`) that just occurred to me is `Experimental`
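   
   to make the suggestion concrete, here is a minimal sketch of such a top-level "doc". the wording of the doc string is only a placeholder of mine, not agreed text, and the field list is abbreviated to the first field of this PR:

   ```json
   {
     "type": "record",
     "name": "GaaSObservabilityEventV0",
     "namespace": "org.apache.gobblin.metrics",
     "doc": "Experimental first iteration of the GaaS observability event; expected to be superseded once e2e functionality is in place.",
     "fields": [
       {
         "name": "timestamp",
         "type": "long",
         "doc": "Time at which event was created in millis"
       }
     ]
   }
   ```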



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",

Review Comment:
   minor temptation to call this "flowGraphEdgeId" for clarity



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",

Review Comment:
   nit: where the job *ran*?



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",
+      "doc": "The ID of the spec executor that ran the job",
+      "compliance": "NONE"
+    },
+    {
+      "name": "issues",
+      "type": {
+        "type": "array",
+        "items": {
+          "type": "record",
+          "name": "Issue",
+          "doc": "Issue describes a specific unique problem in the job or application.\n\nIssue can be generated from log entries, health checks, and other places.",
+          "fields": [
+            {
+              "name": "time",
+              "type": {
+                "type": "typeref",
+                "name": "Timestamp",
+                "doc": "Epoch/UNIX time in milliseconds\n\nRepresents the number of milliseconds since the epoch of 1970-01-01T00:00:00Z",
+                "ref": "long"
+              },
+              "doc": "Time when the issue have occurred"

Review Comment:
   nit: strike "have" (perhaps carried over from `Issue.pdl`...)



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   would this and `executorUrl` (currently `executorLink`) still be set when the
   job was not orchestrated / sent to an executor, or would they then need to be
   `null`?  if you'd rather always indicate the executor that *would* be used
   (i.e. where the job would execute, even if it was never submitted), then just
   update the doc string to say so
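   
   if `null` turns out to be the answer, one way to model it in Avro is a union with `"null"` listed first and a `null` default. this is only a sketch: the extended doc wording is my assumption about the intended semantics, not something this PR states:

   ```json
   {
     "name": "executorId",
     "type": ["null", "string"],
     "default": null,
     "doc": "The ID of the spec executor that ran the job; null if the job was never sent to an executor",
     "compliance": "NONE"
   }
   ```

   the same shape would apply to the executor link field if it can also be absent.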



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",

Review Comment:
   nit: `executorUrl`



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],

Review Comment:
   I'm not convinced by the modeling, so I suggest omitting it during our
   experimentation phase.  we can refine the event with what we've learned
   about processing once we graduate beyond that phase.



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."

Review Comment:
   this boolean looks suspicious; perhaps it could be derived from other, more
   informative fields.  do you see it as a sub-category of `FAILED` status?
   
   maybe a `jobOrchestratedTime`?
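   
   roughly what I have in mind, as a sketch only: `jobOrchestratedTime` is just the name floated above, and the nullable modeling and doc wording are my assumptions, not an agreed design:

   ```json
   {
     "name": "jobOrchestratedTime",
     "type": ["null", "long"],
     "default": null,
     "doc": "Time in millis at which the job was sent to a job executor; null if it was never sent",
     "compliance": "NONE"
   }
   ```

   a consumer could then treat a non-null value as "sent to executor", which would make the separate boolean redundant.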



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
