[ 
https://issues.apache.org/jira/browse/GOBBLIN-2186?focusedWorklogId=950623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-950623
 ]

ASF GitHub Bot logged work on GOBBLIN-2186:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Jan/25 05:34
            Start Date: 02/Jan/25 05:34
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #4089:
URL: https://github.com/apache/gobblin/pull/4089#discussion_r1900542681


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/GenerateWorkUnitsImpl.java:
##########
@@ -150,26 +156,28 @@ public GenerateWorkUnitsResult generateWorkUnits(Properties jobProps, EventSubmi
   protected List<WorkUnit> generateWorkUnitsForJobStateAndCollectCleanupPaths(JobState jobState, EventSubmitterContext eventSubmitterContext, Closer closer,
       Set<String> pathsToCleanUp)
       throws ReflectiveOperationException {
+    // report (timer) metrics for "Work Discovery", *planning only* - NOT including WU prep, like serialization, `DestinationDatasetHandlerService`ing, etc.
+    // IMPORTANT: for accurate timing, SEPARATELY emit `.createWorkPreparationTimer`, to record time prior to measuring the WU size required for that one

Review Comment:
   originally, in `AbstractJobLauncher` the "WU creation timer" measured only 
the planning - 
https://github.com/apache/gobblin/blob/7dbeebf7fecc748ea3ef90cc318214cf26ba5afa/gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java#L476
   
   that is what's included in the `GaaSJobObservabilityEvent`.
   
   the timer for WU prep happens a bit later - 
https://github.com/apache/gobblin/blob/7dbeebf7fecc748ea3ef90cc318214cf26ba5afa/gobblin-runtime/src/main/java/org/apache/gobblin/runtime/AbstractJobLauncher.java#L549
   
   so in this comment:
   > "Work Discovery", *planning only* - NOT including WU prep, like 
serialization, ...
   
   I just meant that we're timing only planning/creation, not the preparation 
such as serialization.
   
   as for WU serialization, there is no existing, historical event strictly for 
that.  typically it takes a long time only when memory-constrained and 
GC-bound.  although we could consider adding a new event to time it, for 
purposes of right-sizing, GC stats are more interesting than the duration 
serialization happens to take.  if anything, GC stats are what I'd prioritize.
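   
   to illustrate the split described above, here is a minimal, self-contained 
sketch of timing planning and preparation as two separate spans.  every name 
in it (`PlanningTimerSketch`, `Span`, `timed`) is hypothetical and stands in 
for the real timer/event classes; it is not the code in this PR, only the 
shape of the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch: none of these names are Gobblin APIs. */
public class PlanningTimerSketch {

  /** A recorded (name, start, end) span, standing in for an emitted timing event. */
  public static final class Span {
    public final String name;
    public final long startMillis;
    public final long endMillis;

    Span(String name, long startMillis, long endMillis) {
      this.name = name;
      this.startMillis = startMillis;
      this.endMillis = endMillis;
    }

    public long durationMillis() {
      return endMillis - startMillis;
    }
  }

  private final LongSupplier clock;
  private final List<Span> emitted = new ArrayList<>();

  /** The clock is injected so the sketch can be exercised deterministically. */
  public PlanningTimerSketch(LongSupplier clock) {
    this.clock = clock;
  }

  /** Runs the step and records its span, even when the step throws. */
  public void timed(String name, Runnable step) {
    long start = clock.getAsLong();
    try {
      step.run();
    } finally {
      emitted.add(new Span(name, start, clock.getAsLong()));
    }
  }

  public List<Span> emitted() {
    return emitted;
  }

  public static void main(String[] args) {
    PlanningTimerSketch timers = new PlanningTimerSketch(System::currentTimeMillis);
    // planning only: this is the span that would supply the planning start/end timestamps
    timers.timed("planning", () -> { /* work discovery would run here */ });
    // preparation (serialization, etc.) starts only after the planning span has closed
    timers.timed("preparation", () -> { /* WU serialization would run here */ });
    for (Span s : timers.emitted()) {
      System.out.println(s.name + " took " + s.durationMillis() + "ms");
    }
  }
}
```

   the point of keeping the two spans separate is that time spent on 
serialization never inflates the planning duration.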





Issue Time Tracking
-------------------

    Worklog Id:     (was: 950623)
    Time Spent: 1h  (was: 50m)

> Ensure GoT jobs record Work Discovery planning timing for populating the 
> `GaaSJobObservabilityEvent` fields `jobPlanning{Start,End}Timestamp`
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-2186
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2186
>             Project: Apache Gobblin
>          Issue Type: New Feature
>          Components: gobblin-core
>            Reporter: Kip Kohn
>            Assignee: Abhishek Tiwari
>            Priority: Minor
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> `GaaSJobObservabilityEvent`s for Gobblin-on-Temporal jobs have no values set 
> for the fields `jobPlanningStartTimestamp` and `jobPlanningEndTimestamp` 
> because no `TimingEvent.LauncherTimings.WORK_UNITS_CREATION` GTE (to record 
> those values) is emitted by `GenerateWorkUnitsImpl`



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
