Thanks! By the way, if we want to count task/vertex output, which counter should we look at? HDFS_BYTES_WRITTEN + OUTPUT_BYTES_PHYSICAL?
Xiaoyong

From: Rajesh Balamohan [mailto:[email protected]]
Sent: Monday, July 13, 2015 1:46 PM
To: [email protected]
Cc: Hitesh Shah; Yifung Lin; Zhaomin Xu; Joe Zhang (SDE)
Subject: Re: Tez Counter question

For skew analysis, "SHUFFLE_BYTES (fetched from the previous vertex) + HDFS_BYTES_READ (read from HDFS)" can be used. Along with this, REDUCE_INPUT_GROUPS and REDUCE_INPUT_RECORDS can give details on data skew.

For example, consider "Map 1" and "Map 7" sending output to "Reducer 2".

TaskCounter_Reducer_2_INPUT_Map_1 (i.e., Reducer 2 getting input from Map 1)
    REDUCE_INPUT_GROUPS     271
    REDUCE_INPUT_RECORDS    16,084,685,867
    SHUFFLE_BYTES           60,903,100,935

TaskCounter_Reducer_2_INPUT_Map_7 (i.e., Reducer 2 getting input from Map 7)
    REDUCE_INPUT_GROUPS     879
    REDUCE_INPUT_RECORDS    1,696
    SHUFFLE_BYTES           59,539

In this case, it is clear that there is data skew in the input from Map 1 to Reducer 2. Now one can drill down into "Map 1" to understand which task (or set of tasks) is generating the most data for "Reducer 2".

Other points which might be useful for skew analysis:

1. If the ratio REDUCE_INPUT_GROUPS / REDUCE_INPUT_RECORDS is approximately 1.0, you can possibly increase the number of reducers for the vertex (if the vertex is slow).
2. If the ratio REDUCE_INPUT_GROUPS / REDUCE_INPUT_RECORDS is a lot less than 0.2 (~20%) and almost all the records are processed by this reducer, it could mean data skew.
3. In some cases, the REDUCE_INPUT_GROUPS / REDUCE_INPUT_RECORDS ratio might be in between (i.e., 0.3 - 0.8). In such cases, if most of the records are processed by this reducer (as compared to the overall number of records in the vertex), you might want to check the partition logic.

~Rajesh.B

On Mon, Jul 13, 2015 at 10:49 AM, Xiaoyong Zhu <[email protected]> wrote:

So - if we want to know whether a vertex has a data skew issue or not, which counter should we use?
Xiaoyong

-----Original Message-----
From: Hitesh Shah [mailto:[email protected]]
Sent: Thursday, July 9, 2015 1:39 PM
To: [email protected]
Cc: Xiaoyong Zhu; Yifung Lin; Zhaomin Xu
Subject: Re: Tez Counter question

For data skew, you may also want to consider enabling "tez.task.generate.counters.per.io". This enables counters on a per-edge basis, which is more helpful for complex DAGs.

- Hitesh

On Jul 8, 2015, at 10:29 PM, Joe Zhang (SDE) <[email protected]> wrote:

> Hi Rajesh:
>
> Thanks for your reply. I would like to know more details; see inline.
>
> Sorry that I didn't explain why I care so much about those counters. I am trying to analyze the data skew issue for a Tez vertex. Now I can get several related counter values, including FILE_BYTES_READ, HDFS_BYTES_READ, SHUFFLE_BYTES and so on. So I want to know which counter value is meaningful for analyzing data skew?
>
> Best wishes
> Joe zhang
>
> From: Rajesh Balamohan [mailto:[email protected]]
> Sent: Wednesday, July 8, 2015 4:57 PM
> To: [email protected]
> Cc: Xiaoyong Zhu; Yifung Lin
> Subject: Re: Tez Counter question
>
> FILE_BYTES_READ - Represents the data read from local disk
> >>>>>>>>>>Joezhang: When, or in which cases, does a mapper or reducer vertex need to read from local disk or write to local disk? I am wondering why a reducer in Tez has data both read from local disk and shuffled from the parent node. As far as I know, the traditional reducer in MR1 only reads shuffle data (in memory and on the shuffle local disk). Does the Tez engine do some optimization for this?
>
> HDFS_BYTES_READ - Represents data read from HDFS (does not include data read from disk)
> >>>>>>>>>>Joezhang: When, or in which cases, does a mapper or reducer vertex need to read from HDFS or write to HDFS?
>
> SHUFFLE_BYTES - Represents the data that was transferred over the wire while doing shuffle. Downloaded data goes either into memory or onto disk (depending on memory availability). So, SHUFFLE_BYTES_TO_MEM and SHUFFLE_BYTES_TO_DISK would correlate with SHUFFLE_BYTES. This does not have a direct relationship with FILE_BYTES_READ. However, in case of spills and merges, FILE_BYTES_READ can be incremented correspondingly.
>
> ~Rajesh.B
>
> On Wed, Jul 8, 2015 at 1:25 PM, Joe Zhang (SDE) <[email protected]> wrote:
> Hi Tez experts:
>
> I am using the Tez REST API to get information about running Tez tasks, but I am confused about some concepts in the counters.
>
> <1> For File system counters:
>
> counterName: FILE_BYTES_READ ? Does it mean data read from local disk or somewhere else?
>
> HDFS_BYTES_READ ? Is it included in FILE_BYTES_READ?
>
> <2> For org.apache.tez.common.counters.TaskCounter:
>
> counterName: SHUFFLE_BYTES ? Does it have some relationship with FILE_BYTES_READ? Which data should be included in it?
>
> Best wishes
> Joe zhang
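
The three ratio-based points from Rajesh's reply in this thread could be sketched in Python roughly as follows. This is only an illustration, not part of Tez: the function name, the returned labels, and every numeric cutoff are assumptions. The thread only says "approximately 1.0", "a lot less than 0.2", "0.3 - 0.8", "almost all" and "most", so the sketch picks 0.9 for "approximately 1.0", 0.9 for "almost all" and 0.5 for "most". The sample numbers are the per-edge counters for Reducer 2 quoted in that reply, and the total is simply their sum.

```python
def classify_reducer_input(groups, records, total_records,
                           near_one=0.9, low_ratio=0.2,
                           mid_range=(0.3, 0.8),
                           almost_all=0.9, most=0.5):
    """Apply the three heuristics from the thread to one set of counters.

    groups        -- REDUCE_INPUT_GROUPS
    records       -- REDUCE_INPUT_RECORDS
    total_records -- overall record count to compare against (e.g. the vertex)
    All thresholds are assumed values, not Tez defaults.
    """
    if records == 0 or total_records == 0:
        return "no-input"

    ratio = groups / records          # distinct keys per record
    share = records / total_records   # fraction of all records seen here

    # Point 1: nearly every record is its own group.
    if ratio >= near_one:
        return "consider-more-reducers"
    # Point 2: very few groups, and almost all records land here.
    if ratio < low_ratio and share >= almost_all:
        return "possible-data-skew"
    # Point 3: ratio in between, and most records land here.
    if mid_range[0] <= ratio <= mid_range[1] and share >= most:
        return "check-partition-logic"
    return "no-signal"


# Counters for Reducer 2 as quoted in the thread.
map_1 = {"groups": 271, "records": 16_084_685_867}
map_7 = {"groups": 879, "records": 1_696}
total = map_1["records"] + map_7["records"]

print(classify_reducer_input(map_1["groups"], map_1["records"], total))
# prints "possible-data-skew"
print(classify_reducer_input(map_7["groups"], map_7["records"], total))
# prints "no-signal"
```

With these assumed cutoffs, the Map 1 input is flagged as possible skew, which matches the conclusion drawn in the reply; different cutoffs may be more appropriate for other workloads.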
