[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vrushali C updated YARN-3817:
-----------------------------
    Parent Issue: YARN-7055  (was: YARN-5355)

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: YARN-5355
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf,
> YARN-3817-poc-v1.patch, YARN-3817-poc-v1-rebase.patch
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joep Rottinghuis updated YARN-3817:
-----------------------------------
    Parent Issue: YARN-5355  (was: YARN-2928)

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: YARN-5355
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf,
> YARN-3817-poc-v1-rebase.patch, YARN-3817-poc-v1.patch
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joep Rottinghuis updated YARN-3817:
-----------------------------------
    Labels: YARN-5355  (was: )

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>              Labels: YARN-5355
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf,
> YARN-3817-poc-v1-rebase.patch, YARN-3817-poc-v1.patch
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lu updated YARN-3817:
------------------------
    Attachment: YARN-3817-poc-v1-rebase.patch

Rebase the POC patch to the feature-YARN-2928 branch. The new patch is based on YARN-3816-feature-YARN-2928.v4.1.patch.

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf,
> YARN-3817-poc-v1-rebase.patch, YARN-3817-poc-v1.patch
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lu updated YARN-3817:
------------------------
    Attachment: YARN-3817-poc-v1.patch

I'm attaching the first POC patch of our Phoenix based offline aggregator. The current patch adds a mapreduce based offline aggregator that will gather information from our HBase storage, perform the flow and user based aggregation, and write the aggregated data back to Phoenix.

Generally, the expected input to the offline aggregator is a list of flows (active flows of the past time period, or a specially created list of flows within a given time window). The offline aggregator will first aggregate all flow run data for each flow in both the mapper and the reducer, then write the results back into Phoenix. Meanwhile, the aggregated data is passed along to the user level aggregation. The user level aggregation performs aggregations similar to the flow level ones. There is a TimelineEntityWritable class to transfer TimelineEntities.

Some TODOs:
1. Centralize some of the HBase reader related code for both the aggregation hbase reader and the hbase reader.
2. Create a "trigger" to launch the aggregator in a timely or ad-hoc fashion.
3. Separate configs.
4. Support aggregation on a specific time period.
5. More tests.

Future TODOs:
Reorganize our storage package and unit tests.

Some extra work performed in this patch:
1. No longer storing info fields in Phoenix writer.
2. Escaping special characters in Phoenix writer by quoting all column names (according to Phoenix team's suggestion).
3. Centralizing tests for aggregation and Phoenix.
4. Remove unused TestTimelineWriterUtil.

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf,
> YARN-3817-poc-v1.patch
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.
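The two-stage data flow described in the comment above (flow run data rolled up per flow, then the flow totals passed along to a per-user roll-up) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in the POC patch: it does not touch HBase, Phoenix, or MapReduce, and the record shape, the metric names, and the functions `sum_metrics`, `aggregate_flows`, and `aggregate_users` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (user, flow_name, flow_run_id, metrics).
# The real patch moves TimelineEntity objects; plain tuples stand in here.
FlowRun = Tuple[str, str, int, Dict[str, int]]
Metrics = Dict[str, int]


def sum_metrics(target: Metrics, source: Metrics) -> None:
    """Add every metric in source into target, in place."""
    for name, value in source.items():
        target[name] = target.get(name, 0) + value


def aggregate_flows(runs: Iterable[FlowRun]) -> Dict[Tuple[str, str], Metrics]:
    """Flow level: sum the metrics of all runs sharing (user, flow_name)."""
    flows: Dict[Tuple[str, str], Metrics] = defaultdict(dict)
    for user, flow_name, _run_id, metrics in runs:
        sum_metrics(flows[(user, flow_name)], metrics)
    return dict(flows)


def aggregate_users(flows: Dict[Tuple[str, str], Metrics]) -> Dict[str, Metrics]:
    """User level: sum the already aggregated flow totals per user."""
    users: Dict[str, Metrics] = defaultdict(dict)
    for (user, _flow_name), metrics in flows.items():
        sum_metrics(users[user], metrics)
    return dict(users)


# Made-up sample data, for illustration only.
runs = [
    ("alice", "etl", 1, {"memory_mb_seconds": 100, "vcore_seconds": 10}),
    ("alice", "etl", 2, {"memory_mb_seconds": 50, "vcore_seconds": 5}),
    ("alice", "report", 1, {"memory_mb_seconds": 20, "vcore_seconds": 2}),
    ("bob", "etl", 1, {"memory_mb_seconds": 70, "vcore_seconds": 7}),
]
flow_totals = aggregate_flows(runs)
user_totals = aggregate_users(flow_totals)
```

Because the user level consumes the flow level output rather than rereading the raw runs, each flow run is summed only once, which mirrors the "passed along" step in the comment.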
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lu updated YARN-3817:
------------------------
    Description:
We need time-based flow/user level aggregation to present flow/user related states to end users.
Flow level represents summary info of a specific flow.
User level aggregation represents summary info of a specific user, it should include summary info of accumulated and statistic means (by two levels: application and flow), like: number of Flows, applications, resource consumption, resource means per app or flow, etc.

  was:
We need flow/user level aggregation to present flow/user related states to end users.
Flow level aggregation involve three levels aggregations:
- The first level is Flow_run level which represents one execution of a flow and shows exactly aggregated data for a run of flow.
- The 2nd level is Flow_version level which represents summary info of a version of flow.
- The 3rd level is Flow level which represents summary info of a specific flow.
User level aggregation represents summary info of a specific user, it should include summary info of accumulated and statistic means (by two levels: application and flow), like: number of Flows, applications, resource consumption, resource means per app or flow, etc.

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Li Lu
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf
>
>
> We need time-based flow/user level aggregation to present flow/user related
> states to end users.
> Flow level represents summary info of a specific flow. User level aggregation
> represents summary info of a specific user, it should include summary info of
> accumulated and statistic means (by two levels: application and flow), like:
> number of Flows, applications, resource consumption, resource means per app
> or flow, etc.
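To make the user level summary in the description concrete (number of flows, number of applications, accumulated resource consumption, and means per app or per flow), here is a minimal sketch. It assumes a single hypothetical metric, `memory_mb_seconds`, and a hypothetical record shape; `UserSummary` and `summarize_user` are illustrative names that do not come from the design document or the patch.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical application record: (user, flow_name, app_id, memory_mb_seconds).
AppRecord = Tuple[str, str, str, int]


@dataclass
class UserSummary:
    num_flows: int
    num_apps: int
    total_memory: int            # accumulated consumption
    mean_memory_per_app: float   # mean at the application level
    mean_memory_per_flow: float  # mean at the flow level


def summarize_user(user: str, apps: Iterable[AppRecord]) -> UserSummary:
    """Build the accumulated totals and the two means for one user."""
    flows = set()
    num_apps = 0
    total = 0
    for rec_user, flow_name, _app_id, memory in apps:
        if rec_user != user:
            continue
        flows.add(flow_name)
        num_apps += 1
        total += memory
    if num_apps == 0:
        # No applications for this user: avoid dividing by zero.
        return UserSummary(0, 0, 0, 0.0, 0.0)
    return UserSummary(
        num_flows=len(flows),
        num_apps=num_apps,
        total_memory=total,
        mean_memory_per_app=total / num_apps,
        mean_memory_per_flow=total / len(flows),
    )


# Made-up sample data, for illustration only.
apps = [
    ("alice", "etl", "app_1", 100),
    ("alice", "etl", "app_2", 50),
    ("alice", "report", "app_3", 30),
    ("bob", "etl", "app_4", 70),
]
summary = summarize_user("alice", apps)
```

The same total is divided by two different counts, which is what "by two levels: application and flow" amounts to in this sketch.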
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-3817:
-----------------------------
    Attachment: Detail Design for Flow and User Level Aggregation.pdf

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: Detail Design for Flow and User Level Aggregation.pdf
>
>
> We need flow/user level aggregation to present flow/user related states to
> end users.
> Flow level aggregation involve three levels aggregations:
> - The first level is Flow_run level which represents one execution of a flow
> and shows exactly aggregated data for a run of flow.
> - The 2nd level is Flow_version level which represents summary info of a
> version of flow.
> - The 3rd level is Flow level which represents summary info of a specific
> flow.
> User level aggregation represents summary info of a specific user, it should
> include summary info of accumulated and statistic means (by two levels:
> application and flow), like: number of Flows, applications, resource
> consumption, resource means per app or flow, etc.
[jira] [Updated] (YARN-3817) [Aggregation] Flow and User level aggregation on Application States table
     [ https://issues.apache.org/jira/browse/YARN-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-3817:
-----------------------------
    Summary: [Aggregation] Flow and User level aggregation on Application States table  (was: Flow and User level aggregation on Application States table)

> [Aggregation] Flow and User level aggregation on Application States table
> -------------------------------------------------------------------------
>
>                 Key: YARN-3817
>                 URL: https://issues.apache.org/jira/browse/YARN-3817
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Junping Du
>            Assignee: Junping Du
>
> We need flow/user level aggregation to present flow/user related states to
> end users.
> Flow level aggregation involve three levels aggregations:
> - The first level is Flow_run level which represents one execution of a flow
> and shows exactly aggregated data for a run of flow.
> - The 2nd level is Flow_version level which represents summary info of a
> version of flow.
> - The 3rd level is Flow level which represents summary info of a specific
> flow.
> User level aggregation represents summary info of a specific user, it should
> include summary info of accumulated and statistic means (by two levels:
> application and flow), like: number of Flows, applications, resource
> consumption, resource means per app or flow, etc.