[ 
https://issues.apache.org/jira/browse/GOBBLIN-2166?focusedWorklogId=938748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-938748
 ]

ASF GitHub Bot logged work on GOBBLIN-2166:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Oct/24 17:48
            Start Date: 17/Oct/24 17:48
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #4067:
URL: https://github.com/apache/gobblin/pull/4067#discussion_r1805175206


##########
gobblin-yarn/src/main/java/org/apache/gobblin/yarn/GobblinYarnAppLauncher.java:
##########
@@ -173,6 +174,8 @@ public class GobblinYarnAppLauncher {
 
   private static final String GOBBLIN_YARN_APPLICATION_TYPE = "GOBBLIN_YARN";
 
+  private static final String APPLICATION_TAGS_KEY = "hadoop-inject.mapreduce.job.tags";

Review Comment:
   I suppose you could even link here - 
https://github.com/azkaban/azkaban/blob/6db750049f6fdf7842e18b8d533a3b736429bdf4/az-hadoop-jobtype-plugin/src/main/java/azkaban/jobtype/AbstractHadoopJavaProcessJob.java#L96





Issue Time Tracking
-------------------

    Worklog Id:     (was: 938748)
    Time Spent: 0.5h  (was: 20m)

> GoT must fill in info required for RMAppSummaryEvent fields - azkabanexecid, 
> azkabanprojectname, azkabanflowid, azkabanjobid
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-2166
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2166
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Abhishek Jain
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> right now, it's not possible to find `RMAppSummaryEvent`s by any of the 
> fields named above, even though the `GaaS-Gobblin-Temporal-Azkaban` project 
> is used by GoT execs.
> because `azkabanprojectname` is not populated in events for any GoT execution 
> (the way it IS for GoMR executions), the only way to locate 
> `RMAppSummaryEvent`s for GoT executions is by `appid`.
> *why does this matter?*
> a significant consequence of missing these fields is that it thwarts joining 
> `GaaSJobObservabilityEvent`s to `RMAppSummaryEvent`s.  this severely 
> complicates analysis, because the GaaS obs. event does NOT contain the YARN 
> appid, only the AZ flow ID.
> since there is clearly an AZ execution involved, the solution is for GoT to 
> set whatever props are required on the YARN app side, so YARN will emit 
> fully-populated `RMAppSummaryEvent`s, with all of their `azkaban*` fields set.
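
A minimal sketch of the idea, not the code from PR #4067: it assumes the tags arrive as a comma-separated value under the `hadoop-inject.mapreduce.job.tags` key shown in the diff. The class name `AppTagsSketch`, the helper `parseApplicationTags`, and the sample tag values are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;

public class AppTagsSketch {
  // Key copied from the diff; the comma-separated value format is an assumption.
  static final String APPLICATION_TAGS_KEY = "hadoop-inject.mapreduce.job.tags";

  /** Splits the tag property into trimmed, non-empty, de-duplicated tags (insertion order kept). */
  static Set<String> parseApplicationTags(Properties props) {
    String raw = props.getProperty(APPLICATION_TAGS_KEY, "");
    return Arrays.stream(raw.split(","))
        .map(String::trim)
        .filter(tag -> !tag.isEmpty())
        .collect(Collectors.toCollection(LinkedHashSet::new));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Hypothetical tag values, for illustration only.
    props.setProperty(APPLICATION_TAGS_KEY, "exampleTagA, exampleTagB");
    Set<String> tags = parseApplicationTags(props);
    System.out.println(tags);
    // With the Hadoop YARN client on the classpath, the set could then be handed to
    // ApplicationSubmissionContext#setApplicationTags(Set<String>) before submission.
  }
}
```

Whether YARN then fills in the `azkaban*` fields of `RMAppSummaryEvent` depends on how the cluster interprets those tags, which this sketch does not cover.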



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
