[
https://issues.apache.org/jira/browse/GOBBLIN-2079?focusedWorklogId=923802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-923802
]
ASF GitHub Bot logged work on GOBBLIN-2079:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 17/Jun/24 20:52
Start Date: 17/Jun/24 20:52
Worklog Time Spent: 10m
Work Description: Will-Lo commented on code in PR #3962:
URL: https://github.com/apache/gobblin/pull/3962#discussion_r1643418257
##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSFlowObservabilityEvent.avsc:
##########
@@ -0,0 +1,82 @@
+{
+ "type": "record",
+ "name": "GaaSFlowObservabilityEvent",
+ "namespace": "org.apache.gobblin.metrics",
+ "doc": "An event schema for GaaS to emit during and after a flow is executed.",
+ "fields": [
+ {
+ "name": "eventTimestamp",
+ "type": "long",
+ "doc": "Time at which event was created in milliseconds from Unix Epoch"
+ },
+ {
+ "name": "flowGroup",
+ "type": "string",
+ "doc": "Flow group for the GaaS flow",
+ "compliance": "NONE"
+ },
+ {
+ "name": "flowName",
+ "type": "string",
+ "doc": "Flow name for the GaaS flow",
+ "compliance": "NONE"
+ },
+ {
+ "name": "flowExecutionId",
+ "type": "long",
+ "doc": "Flow execution id for the GaaS flow",
+ "compliance": "NONE"
+ },
+ {
+ "name": "lastFlowModificationTimestamp",
+ "type": "long",
+ "doc": "Timestamp in millis since Epoch when the flow config was last modified"
+ },
+ {
+ "name": "sourceNode",
+ "type": "string",
+ "doc": "Source node for the flow edge",
+ "compliance": "NONE"
+ },
+ {
+ "name": "destinationNode",
+ "type": "string",
+ "doc": "Destination node for the flow edge",
+ "compliance": "NONE"
+ },
+ {
+ "name": "flowStatus",
+ "type": {
+ "type": "enum",
+ "name": "FlowStatus",
+ "symbols": [
+ "SUCCEEDED",
+ "COMPILATION_FAILURE",
+ "SUBMISSION_FAILURE",
+ "EXECUTION_FAILURE",
+ "CANCELLED"
+ ],
+ "doc": "Final flow status for the GaaS flow",
+ "compliance": "NONE"
+ }
+ },
+ {
+ "name": "effectiveUserUrn",
+ "type": [
+ "null",
+ "string"
+ ],
+ "doc": "User URN (if applicable) whose identity was used to run the underlying Gobblin job e.g. myGroup",
+ "compliance": "NONE"
+ },
+ {
+ "name": "gaasId",
+ "type": [
+ "null",
+ "string"
+ ],
+ "default": null,
+ "doc": "The deployment ID of GaaS that is sending the event (if multiple GaaS instances are running)"
+ }
+ ]
Review Comment:
Will add flow start/end timestamps.
The flow properties themselves are not that useful, since they can be queried
via the API, right? They won't represent the properties running on the actual
executor; those would be the job properties from the job-level event.
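A minimal sketch of how the start/end timestamps mentioned above might be
declared, following the conventions of the fields already in this schema. The
field names `flowStartTimestamp` and `flowEndTimestamp` are assumptions for
illustration and do not come from the PR. The end timestamp is shown as
nullable on the assumption that the event can also be emitted while the flow
is still running.

```json
[
  {
    "name": "flowStartTimestamp",
    "type": "long",
    "doc": "(hypothetical field) Time at which the flow started, in milliseconds from Unix Epoch"
  },
  {
    "name": "flowEndTimestamp",
    "type": [
      "null",
      "long"
    ],
    "default": null,
    "doc": "(hypothetical field) Time at which the flow finished, in milliseconds from Unix Epoch; null while the flow is still running"
  }
]
```

These two entries would be appended to the `fields` array. Giving the nullable
field a `null` default mirrors how `gaasId` is declared in the schema above.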
Issue Time Tracking
-------------------
Worklog Id: (was: 923802)
Time Spent: 1h (was: 50m)
> GaaSObservabilityEvents should have a dedicated flow level event
> ----------------------------------------------------------------
>
> Key: GOBBLIN-2079
> URL: https://issues.apache.org/jira/browse/GOBBLIN-2079
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: William Lo
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> GaaSJobObservabilityEvents currently encapsulate both jobs and flows in GaaS.
> This is fine for single-hop flows, but flows that encapsulate multiple jobs
> now have a mix of job-level events, which carry the majority of the metadata,
> and flow-level events, which give users a better view when their flow fails
> at any given point.
> Since the data in the two events differs vastly, with most metadata making
> sense only in the context of the job-level event, we should separate
> job-level and flow-level events into their own respective event types.