[ 
https://issues.apache.org/jira/browse/GOBBLIN-1806?focusedWorklogId=856175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-856175
 ]

ASF GitHub Bot logged work on GOBBLIN-1806:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Apr/23 16:44
            Start Date: 11/Apr/23 16:44
    Worklog Time Spent: 10m 
      Work Description: phet commented on PR #3667:
URL: https://github.com/apache/gobblin/pull/3667#issuecomment-1503758833

   > Responding to the top-level comment: `would we wish to pre-aggregate that 
within the event itself`, I think aggregation is needed because large pipelines 
can have thousands of tasks, and serializing every task state into one event 
would produce large Kafka events, which we want to avoid. Also, I think most 
clients wouldn't care much about the inner details of every individual 
task/mapper; handling those correctly is mainly the Gobblin framework's concern.
   
   Does this mean you'll add a summary field that aggregates measurements 
across all datasets?
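
   To make the trade-off concrete, here is a minimal sketch of the kind of aggregation under discussion: per-task writer stats are collapsed into one total per dataset, and a job-wide summary is derived from those totals. This is not the PR's code. `TaskStats`, `summarize_by_dataset`, `summarize_job`, and the field names are hypothetical, and Python is used only for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskStats:
    """Hypothetical per-task writer metrics (not a Gobblin class)."""
    dataset_urn: str
    records_written: int
    bytes_written: int


def summarize_by_dataset(tasks):
    """Collapse per-task stats into one records/bytes total per dataset."""
    totals = defaultdict(lambda: [0, 0])
    for t in tasks:
        totals[t.dataset_urn][0] += t.records_written
        totals[t.dataset_urn][1] += t.bytes_written
    return {
        urn: {"records_written": r, "bytes_written": b}
        for urn, (r, b) in totals.items()
    }


def summarize_job(per_dataset):
    """Job-wide totals across all datasets (the summary field asked about)."""
    return {
        "records_written": sum(d["records_written"] for d in per_dataset.values()),
        "bytes_written": sum(d["bytes_written"] for d in per_dataset.values()),
    }
```

   With this shape, the event payload grows with the number of datasets rather than the number of tasks.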




Issue Time Tracking
-------------------

    Worklog Id:     (was: 856175)
    Time Spent: 2h 10m  (was: 2h)

> Create a GTE for recording bytes/records written for each dataset in a 
> Gobblin job
> ----------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1806
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1806
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-core
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Gobblin collects a lot of writer metrics on number of bytes and records 
> written to the sinks, but does not emit these metrics as part of a 
> GobblinTrackingEvent.
> We want to emit these in a GobblinTrackingEvent so that it can be ingested by 
> monitoring systems and GaaS.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
