[jira] [Updated] (MAPREDUCE-7065) Improve information stored in ATSv2 for MR jobs

2018-03-12 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-7065:
--
Description: 
While exploring the possibility of retrieving every piece of information that 
JHS presents today through ATSv2, I found a few improvements we can make.

1) MR tasks are split by type in JHS, map tasks or reduce tasks. They are 
indistinguishably stored as entities of type MR_TASK. We can split MR_TASK into 
MR_REDUCE_TASK and MR_MAP_TASK, and split MR_TASK_ATTEMPT similarly.

2) Task attempt final states are stored in the events, so we cannot use 
infofilter to group task attempts by final state, which is what JHS does.

3) Display names of counters are not stored in JHS. We are currently storing 
(counter name, display name, value) as a metric (counter name, value). We can 
potentially store (counter name, display name) as an info. The same applies to 
the sources of Job configuration properties.

4) Job level counters and configuration properties are stored both in 
ApplicationTable and EntityTable. It's probably safe just to store MR specific 
counters in EntityTable.
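
As a rough illustration of improvements 1) and 2) above, here is a minimal 
sketch in plain Python rather than the actual ATSv2 client API. It assumes 
entities are simple dicts; the info key FINAL_STATE and the attempt type names 
MR_MAP_TASK_ATTEMPT / MR_REDUCE_TASK_ATTEMPT are hypothetical and not defined 
in this issue.

```python
from collections import defaultdict

# Hypothetical info key; this issue does not name one.
FINAL_STATE_KEY = "FINAL_STATE"


def entity_type_for(task_type, attempt=False):
    """Return a type-specific entity type instead of the generic MR_TASK."""
    if task_type not in ("MAP", "REDUCE"):
        raise ValueError("unknown task type: %r" % (task_type,))
    # The attempt type names are an assumed extension of the same scheme.
    suffix = "_TASK_ATTEMPT" if attempt else "_TASK"
    return "MR_" + task_type + suffix


def build_attempt_entity(attempt_id, task_type, final_state):
    """Build an attempt entity that carries its final state as an info field."""
    return {
        "type": entity_type_for(task_type, attempt=True),
        "id": attempt_id,
        "info": {FINAL_STATE_KEY: final_state},
    }


def group_by_final_state(entities):
    """Group attempt ids by final state, the grouping an info filter would allow."""
    groups = defaultdict(list)
    for entity in entities:
        groups[entity["info"][FINAL_STATE_KEY]].append(entity["id"])
    return dict(groups)
```

With the final state held in the info map rather than only in events, grouping 
becomes a lookup on one field instead of a scan over each attempt's events.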

 

One general problem I see around this area in MR:

1) We can precompute # of failed/killed/successful map/reduce task attempts and 
average map/reduce/shuffle/merge time in the AM. This would avoid iterating 
over all task attempts when JHS serves the Job Overview Page.
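
The precomputation described above could look roughly like the following 
sketch. It is plain Python, not MR AM code; the class name JobOverviewStats, 
its methods, and the choice to average phase times over successful attempts 
only are all assumptions made for illustration.

```python
from collections import defaultdict


class JobOverviewStats:
    """Running per-job aggregates, updated once per finished task attempt."""

    def __init__(self):
        # (task_type, final_state) -> number of attempts
        self.attempt_counts = defaultdict(int)
        # phase name -> total milliseconds and number of samples
        self._time_totals = defaultdict(float)
        self._time_counts = defaultdict(int)

    def record_attempt(self, task_type, final_state, phase_times_ms=None):
        """Fold one finished attempt into the aggregates."""
        self.attempt_counts[(task_type, final_state)] += 1
        # Assumption: only successful attempts contribute to average times.
        if final_state == "SUCCEEDED" and phase_times_ms:
            for phase, ms in phase_times_ms.items():
                self._time_totals[phase] += ms
                self._time_counts[phase] += 1

    def average_ms(self, phase):
        """Return the average time for a phase, or None if no samples exist."""
        count = self._time_counts.get(phase, 0)
        if count == 0:
            return None
        return self._time_totals[phase] / count
```

Because each attempt is folded in as it finishes, a reader of the job overview 
only needs these few aggregates and never has to iterate over the attempts.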

 

To fully replace JHS with ATSv2, three functionalities need to be supported by 
ATSv2:

1) /apps/ query so that a list of all jobs can be retrieved (YARN-6058)

2) support streaming api to get all generic entities (YARN-5627)

3) support per-app data retention policy. Likely a setting in TimelineWriter 
that allows admins to specify how long information of a given application should 
be kept, in the form of a TTL in HBase.

  was:
While exploring the possibility of retrieving every piece of information that 
JHS presents today through ATSv2, I found a few improvements we can make.

1) MR tasks are split by type in JHS, map tasks or reduce tasks. They are 
indistinguishably stored as entities of type MR_TASK. We can split MR_TASK into 
MR_REDUCE_TASK and MR_MAP_TASK, and split MR_TASK_ATTEMPT similarly.

2) Task attempt final states are stored in the events, so we cannot use 
infofilter to group task attempts by final state, which is what JHS does.

3) Display names of counters are not stored in JHS. We are currently storing 
(counter name, display name, value) as a metric (counter name, value). We can 
potentially store (counter name, display name) as an info. The same applies to 
the sources of Job configuration properties.

4) Job level counters and configuration properties are stored both in 
ApplicationTable and EntityTable. It's probably safe just to store MR specific 
counters in EntityTable.

 

One general problem I see around this area in MR:

1) We can precompute # of failed/killed/successful map/reduce task attempts and 
average map/reduce/shuffle/merge time in the AM. This would avoid iterating 
over all task attempts when JHS serves the Job Overview Page.

 

To fully replace JHS with ATSv2, three functionalities need to be supported by 
ATSv2:

1) /apps/ query so that a list of all jobs can be retrieved

2) support streaming api to get all generic entities (YARN-5627)

3) support per-app data retention policy. Likely a setting in TimelineWriter 
that allows admins to specify how long information of a given application should 
be kept, in the form of a TTL in HBase.


> Improve information stored in ATSv2 for MR jobs
> ---
>
> Key: MAPREDUCE-7065
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7065
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Major

[jira] [Updated] (MAPREDUCE-7065) Improve information stored in ATSv2 for MR jobs

2018-03-12 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-7065:
--
Description: 
While exploring the possibility of retrieving every piece of information that 
JHS presents today through ATSv2, I found a few improvements we can make.

1) MR tasks are split by type in JHS, map tasks or reduce tasks. They are 
indistinguishably stored as entities of type MR_TASK. We can split MR_TASK into 
MR_REDUCE_TASK and MR_MAP_TASK, and split MR_TASK_ATTEMPT similarly.

2) Task attempt final states are stored in the events, so we cannot use 
infofilter to group task attempts by final state, which is what JHS does.

3) Display names of counters are not stored in JHS. We are currently storing 
(counter name, display name, value) as a metric (counter name, value). We can 
potentially store (counter name, display name) as an info. The same applies to 
the sources of Job configuration properties.

4) Job level counters and configuration properties are stored both in 
ApplicationTable and EntityTable. It's probably safe just to store MR specific 
counters in EntityTable.

 

One general problem I see around this area in MR:

1) We can precompute # of failed/killed/successful map/reduce task attempts and 
average map/reduce/shuffle/merge time in the AM. This would avoid iterating 
over all task attempts when JHS serves the Job Overview Page.

 

To fully replace JHS with ATSv2, three functionalities need to be supported by 
ATSv2:

1) /apps/ query so that a list of all jobs can be retrieved

2) support streaming api to get all generic entities (YARN-5627)

3) support per-app data retention policy. Likely a setting in TimelineWriter 
that allows admins to specify how long information of a given application should 
be kept, in the form of a TTL in HBase.

  was:
While exploring the possibility of retrieving every piece of information that 
JHS presents today through ATSv2, I found a few improvements we can make.

1) MR tasks are split by type in JHS, map tasks or reduce tasks. They are 
indistinguishably stored as entities of type MR_TASK. We can split MR_TASK into 
MR_REDUCE_TASK and MR_MAP_TASK, and split MR_TASK_ATTEMPT similarly.

2) Task attempt final states are stored in the events, so we cannot use 
infofilter to group task attempts by final state, which is what JHS does.

3) Display names of counters are not stored in JHS. We are currently storing 
(counter name, display name, value) as a metric (counter name, value). We can 
potentially store (counter name, display name) as an info. The same applies to 
the sources of Job configuration properties.

4) Job level counters and configuration properties are stored both in 
ApplicationTable and EntityTable. It's probably safe just to store MR specific 
counters in EntityTable.

 

One general problem I see around this area in MR:

1) We can precompute # of failed/killed/successful map/reduce task attempts and 
average map/reduce/shuffle/merge time in the AM. This would avoid iterating 
over all task attempts when JHS serves the Job Overview Page.

 

To fully replace JHS with ATSv2, three functionalities need to be supported by 
ATSv2:

1) /apps/ query so that a list of all jobs can be retrieved

2) support streaming api to get all generic entities (YARN-5672)

3) support per-app data retention policy. Likely a setting in TimelineWriter 
that allows admins to specify how long information of a given application should 
be kept, in the form of a TTL in HBase.


> Improve information stored in ATSv2 for MR jobs
> ---
>
> Key: MAPREDUCE-7065
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7065
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Priority: Major