[ 
https://issues.apache.org/jira/browse/TEZ-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240740#comment-14240740
 ] 

Rajesh Balamohan edited comment on TEZ-1610 at 12/10/14 7:00 AM:
-----------------------------------------------------------------

Had an offline discussion with [~sseth]. Restricting to the following set of 
counters for this JIRA. 
{code}
  /**
   * Time taken to shuffle data. This includes time taken to fetch the data
   * & merging the data in parallel to fetching when needed.  This also includes any
   * waiting time related to event delays from source.
   *
   * Represented in milliseconds.
   */
  SHUFFLE_PHASE_TIME,

  /**
   * Time taken to merge data retrieved during shuffle.
   *
   * Relative to task start time and expressed in milliseconds.
   */
  MERGE_PHASE_TIME,

  /**
   * First event received from source relative to task start time.
   *
   * Represented in milliseconds
   */
  FIRST_EVENT_RECEIVED,

  /**
   * Last event received from source relative to task start time.
   *
   * Represented in milliseconds
   */
  LAST_EVENT_RECEIVED
{code}

These counters require "tez.task.generate.counters.per.io=true".  TEZ-1829 
makes this the default option.
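
As a rough illustration of how the counters listed above could be used to see when shuffle started and ended for different sources, here is a minimal sketch. It assumes the per-source values have already been read into plain maps; the `ShuffleCounter` enum, the `ShuffleCounterSketch` class, and both helper methods are hypothetical stand-ins made up for this example and are not part of the Tez API or the attached patches.

```java
import java.util.Map;

// Hypothetical stand-in for the four counters listed above; not the Tez enum itself.
enum ShuffleCounter {
  SHUFFLE_PHASE_TIME,
  MERGE_PHASE_TIME,
  FIRST_EVENT_RECEIVED,
  LAST_EVENT_RECEIVED
}

final class ShuffleCounterSketch {
  private ShuffleCounterSketch() {}

  // Milliseconds between the first and last event received from one source.
  // Both values are offsets from task start, so their difference is a duration.
  // A missing counter is treated as 0 (an assumption made for this sketch).
  static long eventWindowMillis(Map<ShuffleCounter, Long> counters) {
    long first = counters.getOrDefault(ShuffleCounter.FIRST_EVENT_RECEIVED, 0L);
    long last = counters.getOrDefault(ShuffleCounter.LAST_EVENT_RECEIVED, 0L);
    return Math.max(0L, last - first);
  }

  // Name of the source whose last event arrived latest, or null if there are no sources.
  static String slowestSource(Map<String, Map<ShuffleCounter, Long>> perSource) {
    String slowest = null;
    long latest = Long.MIN_VALUE;
    for (Map.Entry<String, Map<ShuffleCounter, Long>> entry : perSource.entrySet()) {
      long last = entry.getValue().getOrDefault(ShuffleCounter.LAST_EVENT_RECEIVED, 0L);
      if (last > latest) {
        latest = last;
        slowest = entry.getKey();
      }
    }
    return slowest;
  }
}
```

With one map per source (keyed by a source name of your choosing), `eventWindowMillis` gives the span over which events arrived from that source, and `slowestSource` points at the source that kept the task waiting longest.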



was (Author: rajesh.balamohan):
Had an offline discussion with [~sseth]. Restricting to the following set of 
counters for this JIRA. 
{code}
  /**
   * Time taken to shuffle data. This includes time taken to fetch the data
   * & merging the data in parallel to fetching when needed.  This also includes any
   * waiting time related to event delays from source.
   *
   * Represented in milliseconds.
   */
  SHUFFLE_PHASE_TIME,

  /**
   * Time taken to merge data retrieved during shuffle.
   *
   * Relative to task start time and expressed in milliseconds.
   */
  MERGE_PHASE_TIME,

  /**
   * First event received from source relative to task start time.
   *
   * Represented in milliseconds
   */
  FIRST_EVENT_RECEIVED,

  /**
   * Last event received from source relative to task start time.
   *
   * Represented in milliseconds
   */
  LAST_EVENT_RECEIVED
{code}

These counters require "tez.task.generate.counters.per.io=true".  Will create a 
separate JIRA to make this the default option.


> additional task counters for fetchers
> -------------------------------------
>
>                 Key: TEZ-1610
>                 URL: https://issues.apache.org/jira/browse/TEZ-1610
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: TEZ-1610.1.patch, TEZ-1610.2.patch, TEZ-1610.4.patch
>
>
> - ShuffleFinishTime (per source)
> - Merge time (depending on broadcast/scatter-gather shuffle)
> This would be helpful in determining when shuffle started/ended for different 
> sources in a task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
