[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-13 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625703#comment-14625703
 ] 

Junping Du commented on YARN-3815:
--

Hi [~sjlee0], sorry for the late reply to your comments. I have been busy 
delivering a quick PoC patch for app-level aggregation (system metrics only, 
not including the parts where ideas still conflict) in YARN-3816. I will get 
back to your questions once that is figured out.

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> 
>
> Key: YARN-3815
> URL: https://issues.apache.org/jira/browse/YARN-3815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf, aggregation-design-discussion.pdf, 
> hbase-schema-proposal-for-aggregation.pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is that queries for stats can happen at the following levels:
> - Application level, expected return: an application with aggregated stats
> - Flow level, expected return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expected return: aggregated stats for applications submitted by 
> a user
> - Queue level, expected return: aggregated stats for applications within the 
> queue
> Application states are the basic building block for aggregations at all other 
> levels. We can provide Flow/User/Queue level aggregated statistics info 
> based on application states (a dedicated table for application states is 
> needed, which is missing from previous design documents such as the 
> HBase/Phoenix schema design). 
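
As a rough illustration of the description above, the following Python sketch 
rolls per-application stats up to the flow, user, and queue levels. It is only 
a sketch under stated assumptions, not the timeline service implementation: the 
`roll_up` helper, the record layout, and the metric name `vcore_seconds` are 
all hypothetical, and a plain sum is assumed as the aggregation rule.

```python
from collections import defaultdict


def roll_up(apps, key):
    """Sum app-level metrics per value of `key` ("flow", "user" or "queue").

    `apps` is a list of dicts; this record layout is hypothetical.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for app in apps:
        for name, value in app["metrics"].items():
            totals[app[key]][name] += value
    # Convert nested defaultdicts to plain dicts for readability.
    return {group: dict(metrics) for group, metrics in totals.items()}


# Hypothetical application-level records (the "building block").
apps = [
    {"app_id": "app_1", "flow": "daily_etl", "user": "alice",
     "queue": "prod", "metrics": {"vcore_seconds": 120}},
    {"app_id": "app_2", "flow": "daily_etl", "user": "alice",
     "queue": "prod", "metrics": {"vcore_seconds": 80}},
    {"app_id": "app_3", "flow": "adhoc", "user": "bob",
     "queue": "prod", "metrics": {"vcore_seconds": 50}},
]

by_flow = roll_up(apps, "flow")
by_user = roll_up(apps, "user")
by_queue = roll_up(apps, "queue")
print(by_flow, by_user, by_queue)
```

The same application records feed all three higher levels, which is why a 
dedicated application table is a natural starting point.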



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612798#comment-14612798
 ] 

Sangjin Lee commented on YARN-3815:
---

{quote}
I don't think we have to do it at the container level, but it is also not 
necessary for the AM to retain and aggregate these values. The AM could help 
forward the values to the per-app timeline collector without having to 
aggregate them. Vinod had more ideas on this in an offline discussion. 
[~vinodkv], can you comment on this?
{quote}

Interesting. Could you or [~vinodkv] shed light on the idea? It would still 
need to be captured in an entity or entities, right? I would think sending it 
as part of the container entities would be simpler and more consistent (in that 
the per-app collector can simply look at all container metrics as subject to 
aggregation). I'd love to hear more about this.

{quote}
I think "per-container averages" are not the same as per-container resource 
usage. Understanding an application's real resource consumption/usage has been 
one of the core use cases for the new timeline service from the beginning, so I 
don't think we should rule out anything important here.
{quote}

How is the per-container resource usage different than the per-container 
average described in the summary? Could you kindly provide its definition?

No doubt understanding applications' real resource consumption/usage is 
critical. Between the individual container resource usage (which are all 
captured), the aggregated resource usage at the app/flow level (which the basic 
real time aggregation addresses), and the running averages/max of the 
aggregated resource usage at the app/flow level, I think it definitely covers 
that need. What would be the gap that's not addressed by the above data?



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612687#comment-14612687
 ] 

Junping Du commented on YARN-3815:
--

Thanks [~sjlee0] for the comments!
bq. I think it is pretty natural and straightforward for AMs to aggregate and 
retain values at the app level, but even if they set it at the container level, 
it could work.
I would rather say it was "natural" before timeline service v2 came out. :) I 
don't think we have to do it at the container level, but it is also not 
necessary for the AM to retain and aggregate these values. The AM could help 
forward the values to the per-app timeline collector without having to 
aggregate them. Vinod had more ideas on this in an offline discussion. 
[~vinodkv], can you comment on this?

bq. Note that we're not proposing to keep the average as a time series. So I'm 
not sure if that is feasible.
If not, we may consider changing the proposal to support a time series, given 
that the amount of data here is not large.

bq. We also ruled out per-container averages (explained in the summary), so 
per-task resource usage is not an example we're looking for.
I think "per-container averages" are not the same as per-container resource 
usage. Understanding an application's real resource consumption/usage has been 
one of the core use cases for the new timeline service from the beginning, so I 
don't think we should rule out anything important here.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612684#comment-14612684
 ] 

Sangjin Lee commented on YARN-3815:
---

{quote}
The use case here should be obvious. A quick real-life example is Google Borg, 
a cluster management tool 
(http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/43438.pdf),
 which aggregates per-task resource usage information for usage-based charging, 
job debugging, and long-term capacity planning.
{quote}

Thanks [~djp]. What I'm looking for are somewhat more specific examples. That's 
why we spent some time during the discussion to define precisely what we mean 
by "averages". We discovered that there were already two different definitions 
of the average for gauges. We also ruled out per-container averages (explained 
in the summary), so per-task resource usage is not an example we're looking for.

So as for the moving (but aggregate) average, are there other examples? What we 
discussed during the meeting (also in the summary) was the total CPU 
utilization of an app/flow. Are there other examples, and how might they be 
useful, or is that pretty much the best example?



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612680#comment-14612680
 ] 

Sangjin Lee commented on YARN-3815:
---

bq. This approach sounds very clever. In addition, if we need the resource 
consumption for any point in time or time window (t1 to t2), we can simply 
compute Avg(t2) * t2 - Avg(t1) * t1. This is much better than aggregating the 
values for each point in time at query time.

Note that we're not proposing to keep the average as a *time series*. So I'm 
not sure if that is feasible.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612678#comment-14612678
 ] 

Sangjin Lee commented on YARN-3815:
---

{quote}
We may consider providing two ways here:
- For legacy applications such as MR, the AM already aggregates these counters 
itself.
- For new applications built against YARN after timeline service v2, the AM can 
delegate aggregation to the YARN timeline service instead of doing it itself. 
Our data model and aggregation mechanism should ensure that the YARN timeline 
service can aggregate these framework-specific metrics without them being 
predefined.
{quote}

I think it's a little more complicated than that. If a new YARN application 
wants to delegate aggregation to the YARN timeline service, it still needs to 
do at least the following:
- add the framework-specific metrics to the YARN container
- do *not* add any of those metrics to the YARN application

The framework-specific metrics set on the containers would still be transmitted 
by the AM (not by the node managers). Then, the YARN timeline service could 
look at *any* container metrics and apply the uniform aggregation rules.

Hopefully YARN apps can add metric values to container entities (there should 
be a natural mapping from unit of work to containers), otherwise it won't work 
for them...

I think it is pretty natural and straightforward for AMs to aggregate and 
retain values at the app level, but even if they set it at the container level, 
it could work.

On the other hand, if your app wants to own aggregation, then it should not set 
the metrics on the containers, or it would be done twice.
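
To illustrate the point about aggregation being done twice, here is a minimal 
Python sketch. It is an assumption-laden illustration, not timeline service 
code: the function names, the dict-based entities, and the metric names are 
hypothetical. The collector applies one uniform rule (a sum) to every metric it 
finds on container entities, so any metric that the AM also sets on the 
application entity ends up aggregated in two places.

```python
from collections import defaultdict


def aggregate_container_metrics(containers):
    """Uniform rule: sum every metric found on any container entity."""
    totals = defaultdict(int)
    for metrics in containers.values():
        for name, value in metrics.items():
            totals[name] += value
    return dict(totals)


def doubly_aggregated(containers, app_metrics):
    """Metric names set by the AM on the app AND aggregated by the collector."""
    collected = aggregate_container_metrics(containers)
    return sorted(set(collected) & set(app_metrics))


# Hypothetical container entities; the AM transmits the framework-specific
# metric (MAP_OUTPUT_RECORDS) on the containers to delegate aggregation.
containers = {
    "container_1": {"MAP_OUTPUT_RECORDS": 100, "CPU_MILLIS": 40},
    "container_2": {"MAP_OUTPUT_RECORDS": 250, "CPU_MILLIS": 60},
}

totals = aggregate_container_metrics(containers)
print(totals)

# Delegating app: nothing set at the app level, so no overlap.
print(doubly_aggregated(containers, {}))

# App that also keeps its own total: the same metric is aggregated twice.
print(doubly_aggregated(containers, {"MAP_OUTPUT_RECORDS": 350}))
```

An app that wants to own aggregation would leave the metric off the container 
entities instead, so the first function never sees it.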



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612648#comment-14612648
 ] 

Junping Du commented on YARN-3815:
--

bq. Also, it would be GREAT if you could give a clear and compelling use case 
(a real life example) on why such support would be crucial. Thanks!
The use case here should be obvious. A quick real-life example is Google Borg, 
a cluster management tool 
(http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/43438.pdf),
 which aggregates per-task resource usage information for usage-based charging, 
job debugging, and long-term capacity planning.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612629#comment-14612629
 ] 

Sangjin Lee commented on YARN-3815:
---

For gauges and their averages and max in particular, [~vinodkv], [~gtCarrera9], 
[~djp], could you please confirm what I captured in that document is exactly 
what we want to support? Could you please comment on that?

Also, it would be *GREAT* if you could give a clear and compelling use case (a 
real life example) on why such support would be crucial. Thanks!



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612620#comment-14612620
 ] 

Junping Du commented on YARN-3815:
--

bq.  app-level aggregation for framework-specific metrics will be done by the 
AM.
I think there is a slight misunderstanding here. As I mentioned above, AMs 
should/could be relieved of aggregating counters themselves after timeline 
service v2. Legacy AMs could still push aggregated counters to the backend 
storage, though. Others who were also in the room, any comments here? 



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612592#comment-14612592
 ] 

Sangjin Lee commented on YARN-3815:
---

Here is my take on what's consensus, what's not, and what's currently out of 
scope. I may have misread the discussion and your impression/understanding may 
be different, so please feel free to chime in and comment on this!

(consensus or not controversial)
- applications table will be split from the main entities table
- app-level aggregation for framework-specific metrics will be done by the AM
- app-level aggregation for YARN-system container metrics will be done by the 
per-app timeline collector
- real-time aggregation does simple sum for all types of metrics
- metrics API will be updated to differentiate gauges and counters (the type 
information will need to be persisted in the storage)
- for gauges, in addition to the simple sum-based aggregation, support average 
and max
- the flow-run table will be created to handle app-to-flow-run ("real-time") 
aggregation as proposed in the native HBase schema design
- auxiliary tables will be implemented as proposed in the native HBase schema 
design
- time-based aggregation (daily, weekly, monthly, etc.) will be done via 
phoenix tables to enable ad-hoc queries
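
To make the items on metric types more concrete, here is a minimal Python 
sketch of app-level aggregation that sums all metrics across containers and, 
for gauges only, additionally tracks an average and a max. This is not YARN 
code: `MetricType`, `AppLevelMetric`, and their methods are hypothetical names. 
It also assumes that "average" means the time-weighted average of the summed 
gauge, which is only one possible definition and is still listed below as 
something to confirm.

```python
from enum import Enum


class MetricType(Enum):
    COUNTER = "counter"
    GAUGE = "gauge"


class AppLevelMetric:
    """Aggregates one metric across the containers of one application."""

    def __init__(self, metric_type):
        self.metric_type = metric_type
        self.latest = {}        # container id -> latest reported value
        self.max_sum = None     # gauges only: max of the summed value
        self._area = 0.0        # gauges only: integral of the sum over time
        self._start = None
        self._last_time = None

    def current_sum(self):
        """Simple sum across containers, used for every metric type."""
        return sum(self.latest.values())

    def update(self, container_id, value, now):
        is_gauge = self.metric_type is MetricType.GAUGE
        if is_gauge:
            if self._start is None:
                self._start = now
            else:
                # The previous sum held from the last update until now.
                self._area += self.current_sum() * (now - self._last_time)
            self._last_time = now
        self.latest[container_id] = value
        if is_gauge:
            total = self.current_sum()
            if self.max_sum is None or total > self.max_sum:
                self.max_sum = total

    def time_average(self, now):
        """Time-weighted average of the summed gauge since the first update."""
        if self.metric_type is not MetricType.GAUGE or self._start is None:
            return None
        if now <= self._start:
            return None
        area = self._area + self.current_sum() * (now - self._last_time)
        return area / (now - self._start)


# Hypothetical CPU gauge (in vcores) reported by two containers.
cpu = AppLevelMetric(MetricType.GAUGE)
cpu.update("container_1", 2.0, now=0)
cpu.update("container_2", 1.0, now=10)
cpu.update("container_1", 0.0, now=20)
print(cpu.current_sum(), cpu.max_sum, cpu.time_average(now=40))
```

Persisting the metric type matters here because the extra gauge state (area, 
start time, max) would be meaningless for a counter.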

(questions remaining or undecided)
- for the average/max support for gauges (see above), confirm that's exactly 
what we want to support
- how to implement app-to-flow-run aggregation for gauges
- how to perform the time-based aggregation (mapreduce, using co-processor 
endpoints, etc.)
- how to handle long-running apps for time-based aggregation
- considering adopting "null delimiters" (or other phoenix-friendly tools) to 
support phoenix reading data from the native HBase tables
- using flow collectors, user collectors, and queue collectors as means of 
performing (higher-level) aggregation

(out of scope)
- support per-container averages for gauges
- any aggregation other than time-based aggregation for flows, users, and queues
- creating a dependency on the explicit YARN flow API



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612591#comment-14612591
 ] 

Junping Du commented on YARN-3815:
--

Thanks [~sjlee0] for the nice writeup of the discussions.
Most parts look good to me. Some comments on app-level aggregation:

bq. Framework‐specific metrics will be sent to the per‐app collector aggregated 
by the AM itself.
We may consider providing two ways here:
- For legacy applications such as MR, the AM already aggregates these counters 
itself.
- For new applications built against YARN after timeline service v2, the AM can 
delegate aggregation to the YARN timeline service instead of doing it itself. 
Our data model and aggregation mechanism should ensure that the YARN timeline 
service can aggregate these framework-specific metrics without them being 
predefined.

bq. time average & max: the average multiplied by the elapsed time of the 
application represents the total resource usage over time.
This approach sounds very clever. In addition, if we need the resource 
consumption for any point in time or time window (t1 to t2), we can simply 
compute Avg(t2) * t2 - Avg(t1) * t1. This is much better than aggregating the 
values for each point in time at query time.
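
The window formula can be checked with a small Python sketch. It is only an 
illustration under stated assumptions: the helper names and the sample usage 
numbers are made up, usage is treated as piecewise constant, and the running 
average is assumed to be available at both t1 and t2 (that is, stored at more 
than one point in time).

```python
def running_averages(intervals):
    """Return [(elapsed_time, running_average)] after each interval.

    `intervals` is a list of (duration, usage) pairs starting at t = 0,
    with usage assumed constant within each interval.
    """
    series = []
    elapsed = 0.0
    total = 0.0  # integral of usage over time so far
    for duration, usage in intervals:
        elapsed += duration
        total += duration * usage
        series.append((elapsed, total / elapsed))
    return series


def window_usage(avg_t1, t1, avg_t2, t2):
    """Usage accumulated between t1 and t2: Avg(t2) * t2 - Avg(t1) * t1."""
    return avg_t2 * t2 - avg_t1 * t1


# Hypothetical app: 2 vcores for 10 s, 4 vcores for 10 s, 1 vcore for 20 s.
series = running_averages([(10, 2.0), (10, 4.0), (20, 1.0)])
(t1, avg1), (t2, avg2) = series[0], series[2]

# Usage between t = 10 and t = 40, derived from the two averages alone.
print(window_usage(avg1, t1, avg2, t2))

# Direct computation over the same window for comparison.
print(4.0 * 10 + 1.0 * 20)
```

Because Avg(t) * t is the cumulative usage up to t, subtracting the two 
products leaves exactly the usage inside the window.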




[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-07-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612529#comment-14612529
 ] 

Sangjin Lee commented on YARN-3815:
---

Some of us ([~gtCarrera9], [~vinodkv], [~djp], [~zjshen], [~vrushalic], and 
[~sjlee0]) had a face-to-face design discussion on the aggregation. I am going 
to post the summary of that discussion along with a proposal for an expanded 
native HBase schema to support aggregation.

I believe we are much closer to a consensus on the aggregation design, but some 
important questions still remain. For the sake of public discussion and 
inviting more participants and comments, we should follow up here on this JIRA.

> [Aggregation] Application/Flow/User/Queue Level Aggregations
> 
>
> Key: YARN-3815
> URL: https://issues.apache.org/jira/browse/YARN-3815
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: Timeline Service Nextgen Flow, User, Queue Level 
> Aggregations (v1).pdf
>
>
> Per previous discussions in some design documents for YARN-2928, the basic 
> scenario is the query for stats can happen on:
> - Application level, expect return: an application with aggregated stats
> - Flow level, expect return: aggregated stats for a flow_run, flow_version 
> and flow 
> - User level, expect return: aggregated stats for applications submitted by 
> user
> - Queue level, expect return: aggregated stats for applications within the 
> Queue
> Application states is the basic building block for all other level 
> aggregations. We can provide Flow/User/Queue level aggregated statistics info 
> based on application states (a dedicated table for application states is 
> needed which is missing from previous design documents like HBase/Phoenix 
> schema design). 





[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-24 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600411#comment-14600411
 ] 

Li Lu commented on YARN-3815:
-

Sorry, I mistakenly assigned this JIRA to myself. I've assigned it back. 



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598781#comment-14598781
 ] 

Ted Yu commented on YARN-3815:
--

[~jrottinghuis]:
Your description makes sense.
Cell tags are supported in HBase 0.98 and later, so we can use them to mark completion.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-23 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598759#comment-14598759
 ] 

Joep Rottinghuis commented on YARN-3815:


Thanks [~ted_yu] for that link. I did find that code and I'm reading through it.
Yes, it uses a coprocessor on the reading side to "collapse" values together and 
permanently "collapses" them together on compaction.

I want to use a similar approach here. We cannot use the delta write directly 
as-is for the following reasons:
- For running applications, if we wanted to write only the increment, the AM (or 
ATS writer) would have to keep track of the previous values. When the AM 
crashes and/or the ATS writer restarts, we won't know what previous value we 
had written (and what has already been aggregated). So, we'd have to write the 
increment plus the latest value.
- Ergo, why don't we just write the latest value to begin with and leave off 
the increment? Now we cannot "collapse" the deltas / latest value until the 
application is done. Otherwise we would again lose track of what was 
previously aggregated.
So the new approach would be to write the latest value for an app and indicate 
(using a cell tag) that the app is done and that it can be collapsed. We 
would use the coprocessor only on the read side, just like with the delta write; 
that coprocessor would aggregate values on the fly for reads and collapse 
during writes. Those writes would be limited to a single row, so we wouldn't 
have any weird cross-region locking issues, nor delays and hiccups in the write 
throughput.
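
The read-side behavior described above can be sketched in plain Python (this is 
not HBase coprocessor code). The Cell layout, the "done" flag standing in for 
the cell tag, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Cell(NamedTuple):
    app_id: str   # application that wrote the cell
    value: int    # latest cumulative value reported by that application
    done: bool    # stands in for the "application is done" cell tag


def latest_per_app(cells: List[Cell]) -> Dict[str, int]:
    """Keep only the most recently written value for each application."""
    latest: Dict[str, int] = {}
    for cell in cells:  # cells are assumed to be in write order
        latest[cell.app_id] = cell.value
    return latest


def read_aggregate(collapsed: int, cells: List[Cell]) -> int:
    """Read side: collapsed sum plus the latest value of every uncollapsed app."""
    return collapsed + sum(latest_per_app(cells).values())


def collapse(collapsed: int, cells: List[Cell]) -> Tuple[int, List[Cell]]:
    """Fold finished apps into the collapsed sum; keep cells of running apps."""
    done_apps = {cell.app_id for cell in cells if cell.done}
    latest = latest_per_app(cells)
    folded = collapsed + sum(latest[app] for app in done_apps)
    remaining = [cell for cell in cells if cell.app_id not in done_apps]
    return folded, remaining
```

Because only finished apps are folded in, a read returns the same total before 
and after a collapse, and a restarted writer can simply rewrite its latest value.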



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598590#comment-14598590
 ] 

Sangjin Lee commented on YARN-3815:
---

Moving from offline discussions...

Now aggregation of *time series metrics* is rather tricky, and needs to be 
defined. Would an aggregated metric (e.g. at the flow level) of time series 
metrics (e.g. at the app level) be a time series itself? I see several problems 
with defining that as a time series. Individual app time series may be sampled 
at different times, and it's not clear what time series the aggregated flow 
metric would be.

I think it might be simpler to say that an aggregated flow metric of time 
series may not need to be a time series itself.

More generally, there is the question of what time the aggregated values 
belong to, regardless of whether they are time series or not. If all leaf values 
are recorded at the same time, it would be unambiguous: the aggregated metric 
value is of the same time. However, that is rarely the case.

I think the current implicit behavior in hadoop is simply to take the latest 
values and add them up. One example is the MR counters (task level and job 
level). The task level counters are obtained at different times. Still, the 
corresponding job counters are simply sums of all the latest task counters, 
although they may have been taken at different times. We're basically taking 
that as an approximation that's "good enough". In the end, the final numbers 
will become accurate. In other words, the final values would truly be the 
accurate aggregate values.

The time series basically adds another wrinkle to this. In case of a simple 
value, the final values are going to be correct, so this problem is less of an 
issue, but time series will retain intermediate values. Furthermore, their 
publishing interval may have no relationship with the publishing interval of 
the leaf values. I think the baseline approach should be either (1) do not use 
time series for the aggregated metrics, or (2) just do the best-effort 
approximation by adding up the latest leaf values and storing the result with 
its own timestamp.
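
Option (2) could look roughly like the sketch below. The data layout (a list of 
(timestamp, value) samples per app) and the function name are hypothetical; the 
point is only that the aggregate carries its own timestamp, independent of when 
the leaf samples were taken.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, metric value)


def aggregate_latest(app_series: Dict[str, List[Sample]], now_ms: int) -> Sample:
    """Sum the most recent sample of every app and stamp the result with now_ms."""
    total = 0.0
    for samples in app_series.values():
        if samples:
            # Pick the sample with the largest timestamp for this app.
            total += max(samples, key=lambda s: s[0])[1]
    return (now_ms, total)
```

The leaf samples may be from different times, so intermediate results are an 
approximation; once all apps have finished, the sum is exact.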



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598577#comment-14598577
 ] 

Sangjin Lee commented on YARN-3815:
---

{quote}
About flow online aggregation, I am not quite sure on requirement yet. Do we 
really want real time for flow aggregated data or some fine-grained time 
interval (like 15 secs) should be good enough - if we want to show some nice 
metrics chart for flow, this should be fine.
{quote}

Yes, I agree with that. When I said "real time", it doesn't mean real time in 
the sense that every metric is accurate to the second. Most likely raw data 
themselves (e.g. container data) are written on an interval anyway. Some type 
of time interval for aggregation is implied.

{quote}
Any special reason not to handle it in the same way above - as HBase 
coprocessor? It just sound like gross-grained time interval. Isn't it?
{quote}

I do see your point in that what I called the "real time" aggregation can be 
considered the same type of aggregation as the "offline" aggregation only on a 
shorter time interval. However, we also need to think about the use cases of 
such aggregated data.

The former type of aggregation is very much something that can be plugged into 
a UI such as the RM UI or Ambari to show more immediate data. These data may 
change as the user refreshes the UI. So this is closer to the raw data.

On the other hand, the latter type of aggregation lends itself to more 
analytical and ad-hoc analysis of data. These can be used for calculating 
chargebacks, usage trending, reporting, etc. Perhaps it could even contain more 
detailed info than the "real time" aggregated data for the reporting/data 
mining purposes. And that's where we would like to consider using phoenix to 
enable arbitrary ad-hoc SQL queries.

One analogy [~jrottinghuis] brings up regarding this is OLTP v. OLAP.

That's why we also think it makes sense to do only "offline" (time-based) 
aggregation for users and queues. At least in our case with hRaven, there 
hasn't been a compelling reason to show user- or queue-aggregated data in 
semi-real time. It has been perfectly adequate to show time-based aggregation, 
as data like this tend to be used more for reporting and analysis.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14598556#comment-14598556
 ] 

Sangjin Lee commented on YARN-3815:
---

{quote}
AM currently leverage YARN's AppTimelineCollector to forward entities to 
backend storage, so making AM talk directly to backend storage is not 
considered to be safe.
{quote}

Just to be clear, I'm *not* proposing AMs writing directly to the backend 
storage. AMs continue to write through the app-level timeline collector. My 
proposal is that the AMs are responsible for setting the aggregated 
framework-specific metric values on the *YARN application entities*.

Let's consider the example of MR. MR itself would have its own entities such as 
job, tasks, and task attempts. These are distinct entities from the YARN 
entities such as application, app attempts, and containers. We can either (1) 
have the MR AM set framework-specific metric values at the YARN container 
entities and have YARN aggregate them to applications, or (2) have the MR AM 
set the aggregated values on the applications for itself.

I feel the latter approach is conceptually cleaner. The framework is ultimately 
responsible for its metrics (YARN doesn't even know what metrics there are). We 
could decide that YARN would look at the framework-specific metrics at the app 
level and aggregate them from the app level onward to flows, user, and queue.

In addition, most frameworks already have an aggregated view of the metrics. It 
would be very straightforward to emit them at the app level.

In summary, option (1) asks the framework to write metrics on its own entities 
(job, tasks, task attempts) plus YARN container entities. Option (2) asks the 
framework to write metrics on its own entities (job, tasks, task attempts) plus 
YARN app entities. IMO, the latter is a more reliable approach. We can discuss 
this further...



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596616#comment-14596616
 ] 

Ted Yu commented on YARN-3815:
--

bq. in the spirit of readless increments as used in Tephra

The readless increment feature is implemented in CDAP, where it is called delta write.
Please take a look at:
cdap-hbase-compat-0.98/src/main/java/co/cask/cdap/data2/increment/hbase98/IncrementHandler.java
cdap-hbase-compat-0.98/src/main/java/co/cask/cdap/data2/increment/hbase98/IncrementSummingScanner.java

The implementation uses an HBase coprocessor, BTW.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596173#comment-14596173
 ] 

Ted Yu commented on YARN-3815:
--

My comment is related to the usage of HBase.
bq. under framework_specific_metrics column family
Since the column family name appears in every KeyValue, it would be better to use 
a very short column family name, e.g. f_m for framework metrics.
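
As a rough back-of-the-envelope illustration of the point above, the sketch 
below counts the bytes spent on the family name alone. The cell count of one 
million is an arbitrary assumption, and the figure ignores data block encoding 
and compression, which may shrink the repeated bytes on disk.

```python
def family_name_overhead(family: str, num_cells: int) -> int:
    """Bytes spent on the family name across num_cells unencoded KeyValues."""
    return len(family.encode("utf-8")) * num_cells


NUM_CELLS = 1_000_000  # hypothetical number of stored cells

long_bytes = family_name_overhead("framework_specific_metrics", NUM_CELLS)
short_bytes = family_name_overhead("f_m", NUM_CELLS)
saved = long_bytes - short_bytes  # 23 bytes saved per cell
```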



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-22 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14596129#comment-14596129
 ] 

Junping Du commented on YARN-3815:
--

Thanks [~sjlee0] and [~jrottinghuis] for the review and the detailed comments. 
[~jrottinghuis]'s comments are pretty long, so I can only reply to part of them 
now and will finish the rest tomorrow. :)

bq. For framework-specific metrics, I would say this falls on the individual 
frameworks. The framework AM usually already aggregates them in memory 
(consider MR job counters for example). So for them it is straightforward to 
write them out directly onto the YARN app entities. Furthermore, it is 
problematic to add them to the sub-app YARN entities and ask YARN to aggregate 
them to the application. Framework’s sub-app entities may not even align with 
YARN’s sub-app entities. For example, in case of MR, there is a reasonable 
one-to-one mapping between a mapper/reducer task attempt and a container, but 
for other applications that may not be true. Forcing all frameworks to hang 
values at containers may not be practical. I think it’s far easier for 
frameworks to write aggregated values to the YARN app entities.
The AM currently leverages YARN's AppTimelineCollector to forward entities to 
backend storage, so making the AM talk directly to backend storage is not 
considered safe. It is also not necessary, because the real difficulty 
here is aggregating framework-specific metrics at the other levels (flow, user 
and queue): that goes beyond the life cycle of the framework, so YARN has to 
take care of it. Instead of asking frameworks to handle specific metrics 
themselves, I would like to propose treating these metrics as "anonymous": the 
framework would pass both the metric name and value to YARN's collector, and 
YARN's collector could aggregate them and store them as dynamic columns (under 
the framework_specific_metrics column family) in the app states table. 
Aggregation of framework metrics at the other levels (flow, user, etc.) could 
then be based on this.
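
A minimal sketch of the "anonymous" metrics idea, assuming every metric is 
additive (which may not hold for all framework metrics). The function name and 
the "family:qualifier" string form are illustrative only and not an actual 
collector or HBase API.

```python
from collections import defaultdict
from typing import Dict, Iterable

FAMILY = "framework_specific_metrics"  # column family name from the proposal


def aggregate_anonymous(container_metrics: Iterable[Dict[str, int]]) -> Dict[str, int]:
    """Sum metrics by name without knowing what each metric means."""
    totals: Dict[str, int] = defaultdict(int)
    for metrics in container_metrics:
        for name, value in metrics.items():
            totals[name] += value
    # One dynamic column per metric name under the shared column family.
    return {f"{FAMILY}:{name}": value for name, value in totals.items()}
```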

bq. app-to-flow online aggregation. This is more or less live aggregated 
metrics at the flow level. This will still be based on the native HBase schema.
About flow online aggregation, I am not quite sure about the requirement yet. Do 
we really want real time for flow aggregated data, or would some fine-grained 
time interval (like 15 secs) be good enough? If we want to show some nice 
metrics chart for a flow, this should be fine. Even for real time, we don't have 
to aggregate everything from the raw entity table, and we don't have to count 
metrics again for finished apps, do we?

bq. (3) time-based flow aggregation: This is different than the online 
aggregation in the sense that it is aggregated along the time boundary (e.g. 
“daily”, “weekly”, etc.). This can be based on the Phoenix schema. This can be 
populated in an offline fashion (e.g. running a mapreduce job).
Is there any special reason not to handle it in the same way as above, as an 
HBase coprocessor? It just sounds like a coarse-grained time interval, doesn't it?

bq. This is another “offline” aggregation type. Also, I believe we’re talking 
about only time-based aggregation. In other words, we would aggregate values 
for users only with a well-defined time window. There won’t be a “real-time” 
aggregation of values, similar to the flow aggregation.
I would also call for a fine-grained time interval (close to real time), 
because the aggregated per-user resource metrics could be used for billing Hadoop 
usage in a shared environment (whether a private or public cloud), so users need 
to know more details about resource consumption, especially at random peak 
times.

bq. Very much agree with separation into 2 categories "online" versus 
"periodic". I think this will be natural split between the native HBase tables 
for the former and the Phoenix approach for the latter to each emphasize their 
relative strengths.
I would again question the necessity of "online" if this means "real time" 
instead of a fine-grained time interval. Actually, as a building block, all 
container metrics (cpu, memory, etc.) are generated on a time interval rather 
than in real time. As a result, we never know the exact snapshot of the whole 
system at a precise time; we can only try to get closer.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593890#comment-14593890
 ] 

Joep Rottinghuis commented on YARN-3815:


For flow-level aggregates I'll separately write up ideas about how to do that.
In short we need to focus on write performance, plus the fact that we have to 
deal with the need to aggregate increments to aggregates from running 
applications. This makes it tricky to do correctly, specifically when apps (and 
ATS writers) can crash and need to restart. We'll have to keep track of the 
last values written. Initially I thought that using a coprocessor to do this 
server side solves the problem. The challenge is that it will be invoked in the 
write path of individual stats, so slow writes to a second region server 
(hosting the agg table/row) can have a ripple effect on many writes. Even 
worse, we can end up with a deadlock under load when the 
agg table/row happens to be hosted on the same region server and the current 
write is blocked on the completion of the coprocessor, which needs to write but 
is blocked on a full queue on its own region server.

I think the solution will be to do something in the spirit of readless 
increments as used in Tephra. Similarly, we'd collapse values only when flushes 
or compactions happen, and then aggregation is restricted to a single row which 
is locked without issues. On reads we collapse the pre-aggregated values plus 
the values from currently running jobs. The significant difference will be that 
we can compact only when jobs are complete. I'll try to write up a more 
detailed design for this.

If we follow [~sjlee0]'s suggestion to make all the other aggregates periodic, 
then we can use mapreduce for those. The big advantage is that we can then use 
control records like we do in hRaven to efficiently keep track of what we have 
already aggregated. The tricky ones will be the long running ones we have to 
keep getting back to. Ideally we should be able to read the raw values once and 
then "spray" them out to the various aggregate tables (cluster, queue, user) 
per time period. Otherwise we end up scanning over the raw values over and over 
again.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-19 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593867#comment-14593867
 ] 

Joep Rottinghuis commented on YARN-3815:


Very much agree with the separation into 2 categories, "online" versus 
"periodic". I think this will be a natural split between the native HBase 
tables for the former and the Phoenix approach for the latter, each emphasizing 
its relative strengths.

A few thoughts around time-based aggregations:
- If the aggregation time is smaller than the runtime of apps/flows we need to 
consider what that means for an aggregate. As an extreme example consider 
hourly aggregates for applications that take hours to complete. What do we 
actually count in that one hour? Do we only attribute to that hour the specific 
total metric that came in at that time, or do we try to apportion part of the 
increment to what happened only in that one hour? Ditto goes for daily 
aggregates when we have long running jobs. In hRaven we simply don't deal with 
this at all by making the simplifying assumption that all metrics and usage all 
happen in the instant that the job is completed. With ATSv2 being (near) 
real-time that will simply not work, so we need to consider what that means. 
Are we requiring apps to write at least once within each aggregation period?
- If we store aggregates in columns (hourly columns, daily columns) we need to 
limit the growth of # columns by making the next level aggregate part of the 
rowkey. This would limit 24 hourly columns to a single day row. Similarly we'd 
have 7 dailies in a week, or perhaps just up to 31 dailies in a month. All of 
these considerations come from a strong need to be able to limit the range over 
which we scan in order to get a reasonable performance in the face of lots of 
data.
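
The second point above (hourly columns bounded by a daily row) can be sketched 
as follows. The "!" separator, the "h" column prefix, the use of UTC, and the 
function name are assumptions for illustration, not a proposed schema.

```python
from datetime import datetime, timezone
from typing import Tuple


def hourly_cell_location(cluster: str, flow: str, ts_ms: int) -> Tuple[str, str]:
    """Map a timestamp to (row key, column qualifier) for an hourly aggregate."""
    t = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
    row_key = f"{cluster}!{flow}!{t:%Y%m%d}"  # the day is part of the row key
    column = f"h{t:%H}"                       # at most 24 hourly columns per row
    return row_key, column
```

With the day in the row key, a query for one week touches seven rows with a 
bounded number of columns each, instead of one row whose column count grows 
without limit.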

{quote}
Flow level:
○ expect return: aggregated stats for a flow_run, flow_version and flow
{quote}
I think "flow" level aggregations should really only mean flow-run level 
aggregation in the sense of the separation that [~sjlee0] mentioned above for 
HBase native online aggregations. I'm not sure that flow_version rollups even 
make sense. The flow_version is important to be able to pass in as a filter: give 
me stats for this flow only matching this version. This is useful for cases 
such as reducer estimation, where a job can make effective use of previous 
run data only if the version of the flow hasn't changed. The fact that there were 
three versions of a Hive query is good to know. Knowing when each version first 
appeared is good to know. Knowing the total cost for version 2 is probably less 
useful.
Flow level aggregates are useful only with a particular timerange in mind. What 
was the cost for the DailyActiveUsers job (no matter the version) for the last 
week? How many bytes did the SearchJob read from HDFS in the last month?

Thoughts around queue level aggregation (in addition to Sangjin's comments that 
these should be time-based):
Queue level aggregates have additional complexities. First queues can come and 
go very quickly and apps can be moved from queue to queue. For the purpose of 
normal shorter lived applications it might be tempting to use the final queue 
that a job ran in (this is the assumption we make in hRaven). With long running 
apps this assumption breaks down.
Now if an app runs for an hour and accumulates some value X for a metric Y it 
will be recorded as such in the original queue agg. Now the application gets 
moved and the new value of metric Y is now Z. Are we going to aggregate Z-X in 
the new queue, or simply all of Z? The sums of all metrics Z in the queues will 
not be the same as the sums of all apps or flows.
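
To make the X / Z question concrete, here is a small sketch of the two 
attribution choices when an app moves between queues. The function and 
parameter names are hypothetical.

```python
from typing import Tuple


def attribute_on_move(value_at_move: int, final_value: int,
                      incremental: bool) -> Tuple[int, int]:
    """Return (old queue total, new queue total) for one moved app.

    value_at_move is X (recorded in the original queue) and final_value is Z.
    """
    old_queue = value_at_move
    if incremental:
        new_queue = final_value - value_at_move  # attribute only Z - X
    else:
        new_queue = final_value                  # attribute all of Z
    return old_queue, new_queue
```

With X = 100 and Z = 250, the incremental choice yields (100, 150), which sums 
to the app's own total of 250. Attributing all of Z yields (100, 250), so the 
queue totals add up to 350 and no longer match the app or flow sums.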

In addition, queues can grow and shrink on the fly. Are we going to record 
that? At the very least we need to prefix the rowkey with the cluster so that we 
can differentiate queues from different clusters.

And then there are hierarchical queues. Are we thinking of rolling stats up to 
each level, or just in the individual leaf queue? Will we structure the rowkeys 
so that we can do prefix scans for queues called /cluster/parent/childa and 
/cluster/parent/childb?
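
A prefix scan over hierarchical queue rowkeys could behave like the sketch 
below, which uses a sorted Python list in place of an HBase table. The key 
format follows the example paths above; the function name is an assumption.

```python
from bisect import bisect_left
from typing import List


def prefix_scan(sorted_keys: List[str], prefix: str) -> List[str]:
    """Return all keys starting with prefix from a lexicographically sorted list."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches
```

Scanning with the prefix "/cluster/parent/" returns both child queues. Note that 
a prefix without the trailing separator, such as "/cluster/parent", would also 
match a sibling queue named "/cluster/parent2", so the separator likely needs to 
be part of the scan prefix.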




[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-19 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593649#comment-14593649
 ] 

Sangjin Lee commented on YARN-3815:
---

Thanks [~djp] for putting this together. I added comments in the offline doc, 
but I'll move the main ones (the high-level comments) over here.

(0) on “aggregation”
Like you mentioned, I think it is helpful to distinguish between the different 
types of aggregation we’re talking about here. These are somewhat separate 
functionalities. My sense of the types of aggregation is similar to yours, but 
not exactly the same. It would be good if we can converge on their definitions.

I see 4 types of aggregation:
- app-level aggregation
- app-to-flow aggregation (“online” or “real time”)
- time-based flow aggregation (“batch” or “periodic”)
- user/queue aggregation

I’ll explain my definitions in more detail below.

(1) app-level aggregation
This is aggregating metrics from sub-app entities (e.g. containers) to the YARN 
application. This can include both framework-specific metrics (e.g. HDFS bytes 
written for mapreduce) and YARN-system metrics (e.g. container CPU %).

It would be ideal for app entities to have values for these metrics aggregated 
from sub-app entities. How we do that is going to be different between 
framework-specific metrics and YARN-system metrics.

For framework-specific metrics, I would say this falls on the individual 
frameworks. The framework AM usually already aggregates them in memory 
(consider MR job counters for example). So for them it is straightforward to 
write them out directly onto the YARN app entities. Furthermore, it is 
problematic to add them to the sub-app YARN entities and ask YARN to aggregate 
them to the application. Framework’s sub-app entities may not even align with 
YARN’s sub-app entities. For example, in case of MR, there is a reasonable 
one-to-one mapping between a mapper/reducer task attempt and a container, but 
for other applications that may not be true. Forcing all frameworks to hang 
values at containers may not be practical. I think it’s far easier for 
frameworks to write aggregated values to the YARN app entities.

For YARN-system metrics, this would need to be done by YARN. I think we can 
have the timeline collector aggregate the values in memory and write them out 
periodically. The details need to be worked out, but that is definitely one way 
to go. The only tricky part is that the container metrics then have to flow 
through the per-app timeline collector, and cannot come from the RM timeline 
collector (Junping pointed that out already).
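
As a rough illustration of aggregating in memory and writing out periodically, 
here is a Python sketch. The class, its methods and the metric name are 
hypothetical and are not the timeline collector API; the periodic write is 
represented only by taking a snapshot.

```python
class AppLevelAggregator:
    """Keeps the latest value of each metric per container and sums across them."""

    def __init__(self):
        self._latest = {}  # (container_id, metric) -> latest reported value

    def report(self, container_id, metric, value):
        """Record the most recent sample for one container metric."""
        self._latest[(container_id, metric)] = value

    def snapshot(self):
        """Return {metric: sum over containers}, i.e. what a periodic flush would write."""
        totals = {}
        for (_container, metric), value in self._latest.items():
            totals[metric] = totals.get(metric, 0) + value
        return totals


agg = AppLevelAggregator()
agg.report("container_1", "memory_mb", 1024)
agg.report("container_2", "memory_mb", 2048)
agg.report("container_1", "memory_mb", 1536)  # newer sample replaces the older one
app_metrics = agg.snapshot()
```

Summing is only one possible aggregation function; a metric such as CPU % 
might call for a different one.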

(2) app-to-flow online aggregation
This is more or less live aggregated metrics at the flow level. This will still 
be based on the native HBase schema.

Actually, doing the above for the app-level aggregation makes app-to-flow 
online aggregation simpler. It now only has to look at app entities to collect 
the data.

Initially we were thinking of leveraging an HBase coprocessor, but there are 
some technical challenges with that. We had a discussion on possible ways of 
doing this, and [~jrottinghuis] has a proposal for this. I’ll let Joep chime in 
on this.

(3) time-based flow aggregation
This is different than the online aggregation in the sense that it is 
aggregated along the time boundary (e.g. “daily”, “weekly”, etc.).

This can be based on the Phoenix schema. This can be populated in an offline 
fashion (e.g. running a mapreduce job).
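
A minimal sketch of the bucketing such an offline job would do, written as 
plain Python instead of a mapreduce job or a Phoenix query. The record layout, 
the use of UTC day boundaries and the function names are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_flow_totals(app_records):
    """Bucket app-level values by flow and UTC day.

    app_records: iterable of (flow_name, finish_epoch_seconds, value).
    Returns {(flow_name, 'YYYY-MM-DD'): summed value}.
    """
    totals = defaultdict(int)
    for flow, finish_ts, value in app_records:
        day = datetime.fromtimestamp(finish_ts, tz=timezone.utc).strftime("%Y-%m-%d")
        totals[(flow, day)] += value
    return dict(totals)


def ts(year, month, day, hour):
    """Helper: epoch seconds for a UTC wall-clock time."""
    return datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp()


records = [
    ("SearchJob", ts(2015, 6, 19, 1), 100),
    ("SearchJob", ts(2015, 6, 19, 23), 50),
    ("SearchJob", ts(2015, 6, 20, 0), 70),
]
totals = daily_flow_totals(records)
```

A weekly aggregate could be built the same way by changing the bucket key, or 
by summing the daily buckets.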

(4) user/queue aggregation
This is another “offline” aggregation type. Also, I believe we’re talking about 
only time-based aggregation. In other words, we would aggregate values for 
users only with a well-defined time window. There won’t be a “real-time” 
aggregation of values, similar to the flow aggregation.



[jira] [Commented] (YARN-3815) [Aggregation] Application/Flow/User/Queue Level Aggregations

2015-06-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589880#comment-14589880
 ] 

Junping Du commented on YARN-3815:
--

Attached a proposal for the first version. Comments are welcome!



