[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369748#comment-15369748
 ] 

Hudson commented on YARN-4700:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-4700. ATS storage has one extra record each time the RM got (sjlee: rev d56dde490bea7dca4fcd589afd9b41a21dd9b4b1)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/TimelineStorageUtils.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRun.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/reader/TestTimelineReaderWebServicesHBaseStorage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestFlowDataGenerator.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKey.java


> ATS storage has one extra record each time the RM got restarted
> ---
>
> Key: YARN-4700
> URL: https://issues.apache.org/jira/browse/YARN-4700
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Li Lu
> Assignee: Naganarasimha G R
> Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4700-YARN-2928.v1.001.patch,
> YARN-4700-YARN-2928.v1.002.patch, YARN-4700-YARN-2928.v1.003.patch,
> YARN-4700-YARN-2928.v1.004.patch, YARN-4700-YARN-2928.wip.patch
>
>
> When testing the new web UI for ATS v2, I noticed that we create one
> extra record for each finished application (one still held in the RM state
> store) each time the RM is restarted. It is quite possible that we add the
> cluster start timestamp to the default cluster id, so each restart creates
> a new record for the same application (the cluster id is part of the row
> key). We need to fix this behavior, probably by choosing a better default
> cluster id.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-06 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182247#comment-15182247
 ] 

Naganarasimha G R commented on YARN-4700:
-

Thanks for the review and the commit, [~varun_saxena], [~vrushalic] and [~sjlee0].



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-04 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179926#comment-15179926
 ] 

Varun Saxena commented on YARN-4700:


I have committed this to branch YARN-2928.
Thanks [~gtCarrera9] for reporting the issue and [~Naganarasimha] for the 
contribution.
Thanks to [~vrushalic] and [~sjlee0] for the reviews.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177826#comment-15177826
 ] 

Varun Saxena commented on YARN-4700:


Thanks [~Naganarasimha] for the latest patch.
+1, looks good to me.

I will wait a while in case others have comments, and will commit it later
today.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177119#comment-15177119
 ] 

Hadoop QA commented on YARN-4700:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 8m 32s | YARN-2928 passed |
| +1 | compile | 0m 17s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | compile | 0m 20s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 16s | YARN-2928 passed |
| +1 | mvnsite | 0m 27s | YARN-2928 passed |
| +1 | mvneclipse | 0m 17s | YARN-2928 passed |
| +1 | findbugs | 0m 36s | YARN-2928 passed |
| +1 | javadoc | 0m 15s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 19s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 20s | the patch passed |
| +1 | compile | 0m 15s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 0m 15s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 13s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice: patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | mvnsite | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 18s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 3m 23s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 3m 30s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 18s | Patch does not generate ASF License warnings. |
| | | 22m 49s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12791090/YARN-4700-YARN-2928.v1.004.patch |
| JIRA Issue | YARN-4700 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 391c03561be0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | 

[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176611#comment-15176611
 ] 

Sangjin Lee commented on YARN-4700:
---

It seems that the unit test failures are real. So is the javadoc error. Could 
you please look into it? Thanks!



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176399#comment-15176399
 ] 

Hadoop QA commented on YARN-4700:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 8m 50s | YARN-2928 passed |
| +1 | compile | 0m 17s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | compile | 0m 21s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 16s | YARN-2928 passed |
| +1 | mvnsite | 0m 28s | YARN-2928 passed |
| +1 | mvneclipse | 0m 16s | YARN-2928 passed |
| +1 | findbugs | 0m 37s | YARN-2928 passed |
| +1 | javadoc | 0m 17s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | javadoc | 0m 19s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 22s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 17s | the patch passed |
| -1 | checkstyle | 0m 14s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice: patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | mvnsite | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 46s | the patch passed |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.8.0_72 |
| -1 | javadoc | 1m 48s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 16s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 3m 25s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_72. |
| -1 | unit | 3m 8s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 22m 53s | |

|| Reason || Tests ||
| JDK v1.8.0_72 Failed junit tests | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage |
|   | 

[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176364#comment-15176364
 ] 

Vrushali C commented on YARN-4700:
--

Thanks [~Naganarasimha Garla] for the updated patch. Overall it looks good.

I have one extremely minor comment. Please make the change _only_ if you plan
to upload another patch; otherwise we can make it later.
- Lines 195 and 200 in TestFlowDataGenerator are commented out in the patch.
We can remove them.

+1 otherwise.




[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175802#comment-15175802
 ] 

Varun Saxena commented on YARN-4700:


[~Naganarasimha], I had a glance at the patch. It looks good to me in general.

# The changes in TestTimelineReaderWebServicesHBaseStorage at line 801 are not
required.
# I think the javadoc issue should be fixable.
# In FlowActivityEntityRowKey#getRowKey, the javadoc says we are passing the
top of the day timestamp, but we are not; we calculate it inside the method.
We can change the param name and description (say, to something like event
timestamp).
# The created time should be fine, but should we use the event timestamp in
both places, just for consistency?
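
To illustrate point 3, here is a minimal sketch of a getRowKey whose javadoc matches what the method does: the caller passes the raw event timestamp and the top of the day value is derived inside. The class name FlowActivityKeySketch, the string-based key and the "!" separator are assumptions made for illustration, not the actual FlowActivityRowKey code, and the day boundary is assumed to be UTC midnight.

```java
/** Illustrative only; not the Hadoop FlowActivityRowKey implementation. */
final class FlowActivityKeySketch {
  private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

  /**
   * Builds an illustrative flow activity row key.
   *
   * @param clusterId cluster identifier
   * @param eventTs timestamp of the application event in epoch milliseconds;
   *     the top of the day value is derived here, so callers pass the raw
   *     event time
   * @param userId user who ran the flow
   * @param flowName name of the flow
   * @return key of the form clusterId!dayTs!userId!flowName (assumed layout)
   */
  static String getRowKey(String clusterId, long eventTs, String userId,
      String flowName) {
    // Round down to the start of the (UTC) day that contains eventTs.
    long dayTs = eventTs - Math.floorMod(eventTs, MILLIS_PER_DAY);
    return String.join("!", clusterId, Long.toString(dayTs), userId, flowName);
  }

  public static void main(String[] args) {
    long morning = 2 * MILLIS_PER_DAY + 5;
    long evening = 3 * MILLIS_PER_DAY - 1;
    // Both events fall on the same day, so both calls yield the same key.
    System.out.println(getRowKey("c1", morning, "u", "f"));
    System.out.println(getRowKey("c1", evening, "u", "f"));
  }
}
```

With this contract the param name and the javadoc describe the same thing, and two events on the same day map to the same key.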



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-02 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175689#comment-15175689
 ] 

Naganarasimha G R commented on YARN-4700:
-

As [~varun_saxena] pointed out offline, and as the test results show, some HDFS
modifications I made to get my mini HBase cluster running have got into the
patch. I will re-upload the patch without these changes.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175020#comment-15175020
 ] 

Hadoop QA commented on YARN-4700:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 49s | YARN-2928 passed |
| +1 | compile | 6m 30s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | compile | 7m 10s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 6s | YARN-2928 passed |
| +1 | mvnsite | 1m 17s | YARN-2928 passed |
| +1 | mvneclipse | 0m 30s | YARN-2928 passed |
| +1 | findbugs | 2m 26s | YARN-2928 passed |
| +1 | javadoc | 1m 20s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | javadoc | 2m 8s | YARN-2928 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 5s | the patch passed |
| +1 | compile | 6m 2s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 6m 2s | the patch passed |
| +1 | compile | 7m 5s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 7m 5s | the patch passed |
| -1 | checkstyle | 1m 7s | root: patch generated 1 new + 80 unchanged - 0 fixed = 81 total (was 80) |
| +1 | mvnsite | 1m 17s | the patch passed |
| +1 | mvneclipse | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 56s | the patch passed |
| -1 | javadoc | 0m 15s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_72. |
| -1 | javadoc | 4m 49s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 2m 3s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 53m 26s | hadoop-hdfs in the patch failed with JDK v1.8.0_72. |
| +1 | unit | 3m 1s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_72. |
| -1 | unit | 49m 38s | hadoop-hdfs in the patch failed with JDK v1.7.0_95. |
| +1 | unit | 3m 8s | 

[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-03-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173510#comment-15173510
 ] 

Varun Saxena commented on YARN-4700:


Thanks [~Naganarasimha] for the patch.
I have nothing further to add beyond what Vrushali said about the changes in
the main code; I have the same comments.

I looked at the test failures.
For TestHBaseStorageFlowActivity, the FlowActivityRowKey constructor is used
while parsing the row key, so I don't think we should change it.
I think we can just change the timestamps of the app events and, as Vrushali
suggested, keep all of them within one day, so that we can test that different
apps on a single day generate one flow for that day. Currently 4 flow activity
entries are produced because the app event timestamps map to 4 different top
of the day timestamps.
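
A minimal sketch of why the test currently sees 4 entries, assuming the top of the day timestamp is the event time rounded down to UTC midnight. The class and method names are hypothetical and this is not the Hadoop code; it only counts how many distinct day buckets a set of app event timestamps falls into.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.TimeUnit;

/** Illustrative only; not the Hadoop implementation. */
final class TopOfDaySketch {
  private static final long MILLIS_PER_DAY = TimeUnit.DAYS.toMillis(1);

  /** Rounds an epoch-millisecond timestamp down to the start of its UTC day. */
  static long topOfTheDay(long ts) {
    return ts - Math.floorMod(ts, MILLIS_PER_DAY);
  }

  /** Returns the distinct day buckets covered by the given event timestamps. */
  static Set<Long> dayBuckets(long... eventTimestamps) {
    Set<Long> days = new TreeSet<>();
    for (long ts : eventTimestamps) {
      days.add(topOfTheDay(ts));
    }
    return days;
  }

  public static void main(String[] args) {
    long day = 16_861L * MILLIS_PER_DAY; // an arbitrary UTC midnight
    // Four app events within one day collapse into a single bucket.
    System.out.println(dayBuckets(
        day + 1, day + 1_000, day + 60_000, day + MILLIS_PER_DAY - 1).size());
    // Four app events on four different days give four buckets.
    System.out.println(dayBuckets(
        day, day + MILLIS_PER_DAY, day + 2 * MILLIS_PER_DAY,
        day + 3 * MILLIS_PER_DAY).size());
  }
}
```

If the day is part of the row key, each bucket corresponds to one flow activity row, which is why keeping the test timestamps within one day should bring the expected count back to one.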

For the other test case failure, i.e. in
TestTimelineReaderWebServicesHBaseStorage, you will have to change the
daterange queries, because those REST queries are based on the current
timestamp.






[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172651#comment-15172651
 ] 

Vrushali C commented on YARN-4700:
--

Hi [~Naganarasimha]

Thanks for the patch. I believe the constructor for FlowActivityRowKey should 
change to correctly calculate the top-of-the-day timestamp from the input 
timestamp. I think that is the reason the unit test is failing, since the 
FlowActivityRowKey is constructed with 
FlowActivityRowKey.getRowKey(clusterStop, appCreatedTime, user, flow). 

Also, I think we can remove the function 
FlowActivityRowKey#getRowKey(String clusterId, String userId, String flowName) 
and keep only 
FlowActivityRowKey#getRowKey(String clusterId, long dayTs, String userId, 
String flowName). That way it's easier to clean up the unit tests as well.

And I think you can change the unit test to use different timestamps (but keep 
the same semantics, i.e. the min start time should actually be the lowest one, 
etc.); that way it will be easier to refactor the unit test. Let me know if 
this helps. Right now the unit test checks that the flow activity table has one 
entry for all 4 of these application entities, so you can use timestamps that 
belong to exactly the same day. Or, if you use timestamps belonging to 
different days, change the test to look for that many entries.

Another thing: it looks like the event timestamp being used is 
timelineEvents.next().getTimestamp(). It might be more explicit to fetch the 
exact created (or finished) event from the TimelineEntity and use the timestamp 
that belongs to either ApplicationMetricsConstants.CREATED_EVENT_TYPE or 
ApplicationMetricsConstants.FINISHED_EVENT_TYPE. That way, we use the accurate 
event time to make an entry in the flow activity table. You can use the 
TimelineStorageUtils#getApplicationFinishedTime() function to get the timestamp 
for the FINISHED event. You would have to write a new function to do a similar 
thing for the CREATED event timestamp (or refactor further and use the same 
function to get the right event's timestamp).
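
The "refactor further" option above could look roughly like the sketch below. It is only an illustration: the Event class is a hypothetical stand-in for a timeline event, the two id strings are placeholders for the ApplicationMetricsConstants values, and the -1 sentinel for a missing event is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class EventTimestampSketch {
  // Placeholder ids; the real values live in ApplicationMetricsConstants.
  static final String CREATED = "APP_CREATED";
  static final String FINISHED = "APP_FINISHED";

  // Hypothetical stand-in for a timeline event: an id plus a timestamp.
  static final class Event {
    final String id;
    final long timestamp;

    Event(String id, long timestamp) {
      this.id = id;
      this.timestamp = timestamp;
    }
  }

  // Returns the timestamp of the first event with the given id,
  // or -1 if the entity carries no such event.
  static long eventTimestamp(List<Event> events, String eventId) {
    for (Event e : events) {
      if (eventId.equals(e.id)) {
        return e.timestamp;
      }
    }
    return -1L;
  }

  public static void main(String[] args) {
    List<Event> events = Arrays.asList(
        new Event("SOME_OTHER_EVENT", 50L),
        new Event(CREATED, 100L),
        new Event(FINISHED, 900L));
    System.out.println(eventTimestamp(events, CREATED));
    System.out.println(eventTimestamp(events, FINISHED));
  }
}
```

Looking the event up by id avoids depending on whichever event the iterator happens to return first.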

Hope this helps.. Let me know..

thanks
Vrushali



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172617#comment-15172617
 ] 

Hadoop QA commented on YARN-4700:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 56s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 4s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 2s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 34s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_72 Failed junit tests | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage |
|   | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage |
|   | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172453#comment-15172453
 ] 

Naganarasimha G R commented on YARN-4700:
-

Maybe a small correction is needed in the patch: since 
{{storeInFlowActivityTable}} is used for both app created and app finished, do 
I need to take the time from the first event of the entity?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172364#comment-15172364
 ] 

Naganarasimha G R commented on YARN-4700:
-

Oops, *TimelineStorageUtils.getTopOfTheDayTimestamp()* is not called; we need 
to push that call into {{FlowActivityRowKey.getRowKey(clusterId, 
te.getCreatedTime(), userId, flowName)}}.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172357#comment-15172357
 ] 

Naganarasimha G R commented on YARN-4700:
-

Hi [~sjlee0],
Based on the points from [~vrushalic] and [~varun_saxena], I was creating a 
patch such that {{HBaseTimelineWriterImpl.storeInFlowActivityTable}} uses 
{{FlowActivityRowKey.getRowKey(clusterId, te.getCreatedTime(), userId, 
flowName)}} instead of the other overloaded method, which doesn't take the 
timestamp.
This would take care of calling 
{{TimelineStorageUtils.getTopOfTheDayTimestamp()}}, right?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172335#comment-15172335
 ] 

Vrushali C commented on YARN-4700:
--

Hi [~sjlee0]
Yes, the flow activity table's row key always needs to use a top-of-the-day 
timestamp. But the event timestamp should be used to find the top of that day.

bq. If they meant that we would use the actual event timestamps as is for the 
row key, I'm not as sure.
No, we can't use the event timestamp as is. It needs to be the top of the day 
of that timestamp, which is what I said in the previous comment: "the entry for 
that flow should go into THAT older day's row, hence we should use the event 
timestamp." 

You are right, the code in FlowActivityRowKey#getRowKey() needs to change to 
take the event timestamp, not the current time. I thought we were sending in 
null for the timestamp and hence using the current time, but it looks like it's 
directly using the current time here. 

{code}
  long dayTs =
      TimelineStorageUtils.getTopOfTheDayTimestamp(System.currentTimeMillis());
{code}
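
As a rough illustration of why the current time causes the extra record on replay, consider the sketch below. It is not the actual FlowActivityRowKey code: the string key with "!" separators is a hypothetical stand-in for the real byte-encoded row key, and the rounding rule (timestamp minus its remainder modulo one day) is an assumption about what getTopOfTheDayTimestamp() does.

```java
final class FlowActivityKeySketch {
  static final long MILLIS_ONE_DAY = 86_400_000L;

  // Illustrative string key only; the real row key has its own byte encoding.
  static String rowKey(String clusterId, long ts, String userId,
      String flowName) {
    long dayTs = ts - (ts % MILLIS_ONE_DAY); // assumed top-of-the-day rule
    return clusterId + "!" + dayTs + "!" + userId + "!" + flowName;
  }

  public static void main(String[] args) {
    long createdTs = 16860L * MILLIS_ONE_DAY + 3_600_000L; // app created event
    long replayTime = createdTs + 2 * MILLIS_ONE_DAY; // RM restart, 2 days on

    String original = rowKey("cluster1", createdTs, "user1", "flow1");
    // Replay keyed by the event timestamp: same row as the original write.
    String replayWithEventTs = rowKey("cluster1", createdTs, "user1", "flow1");
    // Replay keyed by the wall clock: a new row for a later day.
    String replayWithClock = rowKey("cluster1", replayTime, "user1", "flow1");

    System.out.println(original.equals(replayWithEventTs));
    System.out.println(original.equals(replayWithClock));
  }
}
```

With the event timestamp as input, a replayed event overwrites the row it originally created instead of adding a row for the day of the restart.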





[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172332#comment-15172332
 ] 

Naganarasimha G R commented on YARN-4700:
-

It is mostly a consequence of the events being asynchronous; also, ATS is not 
necessarily running initially and may start up only after the RM fails over. 
But in any case, it would not be better to link this up with asynchronous and 
synchronous events for V2.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-29 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172304#comment-15172304
 ] 

Sangjin Lee commented on YARN-4700:
---

I may have misread the comments in haste last Friday. If the comments meant 
that we would use the event timestamps instead of the current time and 
calculate the top-of-the-day timestamps from them, then I concur. If they meant 
that we would use the actual event timestamps *as is* for the row key, I'm not 
as sure.

My main concern there is it might make some of the queries we want to do 
against this table in the future harder or make them perform more poorly. For 
example, we could do a query like "return all flow activities in the last 7 
days". With a top-of-the-day timestamps, it would be a simple partial row key 
matching. With variable timestamps, it would become more of a range query. Are 
my concerns overblown?

If the solution we're discussing is the former, then I think it's quite 
straightforward. We need a little bit of change in 
{{FlowActivityRowKey.getRowKey()}} where we should apply 
{{TimelineStorageUtils.getTopOfTheDayTimestamp()}} on the provided timestamp.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-27 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170704#comment-15170704
 ] 

Vrushali C commented on YARN-4700:
--

Hi [~sjlee0] 
Let me add some more explanation.

bq. Wait, I think we're using the day timestamp for a reason as this table is 
supposed to be a flow (daily) activity table.

Yes, the flow activity table indicates which apps were running at what time. If 
an event arrives late (or, in this case, a replay causes it to arrive at a 
later time), it still belongs to the day the app ran on. So the entry for that 
flow should go into THAT older day's row, hence we should use the event 
timestamp.

bq.  And some considerations are given to long running apps that will cross the 
day boundaries. 
For long running apps, we would most likely be making a snapshot entry that 
belongs to the day on which the app was running.

bq. I'd like us to stick with that unless there is a compelling reason not to?
So we are not changing the semantics here by using the event timestamp. We are 
actually making an explicit entry for the actual day on which the app ran, 
rather than relying on when the event reached the backend.

We can chat further on Monday. 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170397#comment-15170397
 ] 

Varun Saxena commented on YARN-4700:


This event can be emitted on recovery for RUNNING apps as well.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170387#comment-15170387
 ] 

Varun Saxena commented on YARN-4700:


If we do not fix it in the RM, as was the conclusion in those 2 JIRAs, and we 
use the current timestamp, we will have to peek into the app table to see 
whether the app already exists and, if so, not make an entry in the activity 
table.
The reason, as it seems from the discussion on those JIRAs, is that ATS events 
are asynchronous (because we use a dispatcher in between), so it's better to 
replay the events.

Maybe we can give priority to app start and finish events and make them 
synchronous for V2, as the app collector will be running within the RM, but 
this would block app finishing until the ATS event completes.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170376#comment-15170376
 ] 

Varun Saxena commented on YARN-4700:


bq.  I think we're using the day timestamp for a reason as this table is 
supposed to be a flow (daily) activity table. And some considerations are given 
to long running apps that will cross the day boundaries.
Let us assume the RM does not restart. In that case, we will get the start 
event and the finish event only once each, and the event timestamp will be 
close to the current timestamp.
And if those are the only events we get, the issue with long-running apps 
(extending over more than 2 days) will be there anyway. For instance, if we get 
the start event on day 1 and the finish event on day 3, and there is no other 
app for this flow, there will be no activity on day 2 even if we use the 
current timestamp.
YARN-4069 was filed for this issue and is assigned to me. I was thinking of 
scheduling a global timer in the RM that emits ATS events for all the running 
apps at certain points in time. This should resolve the long-running app issue. 
It is not marked for the 1st milestone, though, so no progress has been made.
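
The day 1 / day 3 example above can be sketched as follows. This is only an illustration with hypothetical names; it assumes non-negative UTC millisecond timestamps and identifies a day by its index since the epoch.

```java
import java.util.ArrayList;
import java.util.List;

final class LongRunningGapSketch {
  static final long MILLIS_ONE_DAY = 86_400_000L;

  // Day indexes strictly between the start and finish events. No event falls
  // on these days, so nothing records activity for them.
  static List<Long> daysWithoutEvents(long startTs, long finishTs) {
    List<Long> gaps = new ArrayList<>();
    long startDay = startTs / MILLIS_ONE_DAY;
    long finishDay = finishTs / MILLIS_ONE_DAY;
    for (long d = startDay + 1; d < finishDay; d++) {
      gaps.add(d);
    }
    return gaps;
  }

  public static void main(String[] args) {
    long start = 16860L * MILLIS_ONE_DAY + 3_600_000L;  // "day 1"
    long finish = 16862L * MILLIS_ONE_DAY + 3_600_000L; // "day 3"
    // Prints the day indexes that have no flow activity entry.
    System.out.println(daysWithoutEvents(start, finish));
  }
}
```

A periodic event for running apps, as proposed in YARN-4069, would be one way to fill such gaps.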



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170102#comment-15170102
 ] 

Li Lu commented on YARN-4700:
-

bq. In the code that writes to the flow activity table, can we check the 
application status and make a decision not to write them?
I'm not very familiar with the replayed event sequence, but will we receive an 
application finished event for each of the finished applications? If so, it 
will make distinguishing the real events from the replayed events very 
difficult? [~Naganarasimha] what's your experience working with the SMP? 
Thanks! 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170065#comment-15170065
 ] 

Sangjin Lee commented on YARN-4700:
---

Wait, I think we're using the day timestamp for a reason as this table is 
supposed to be a flow (daily) activity table. And some considerations are given 
to long running apps that will cross the day boundaries. I'd like us to stick 
with that unless there is a compelling reason not to?

In the code that writes to the flow activity table, can we check the 
application status and make a decision not to write them?

cc [~jrottinghuis]



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169939#comment-15169939
 ] 

Vrushali C commented on YARN-4700:
--

This was a good catch, thanks [~gtCarrera], [~varun_saxena] and 
[~Naganarasimha]! Let me know if I can be of any help. 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169923#comment-15169923
 ] 

Vrushali C commented on YARN-4700:
--

I see. So when events are being replayed, we are making new entries in the flow 
activity table since we are using the current time.

Yes, I think we should use the event timestamp. It should be a simple enough 
fix: take the event timestamp for the CREATED or the FINISHED event and use 
that instead of null in the HBaseTimelineWriterImpl#storeInFlowActivityTable 
function. 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169764#comment-15169764
 ] 

Naganarasimha G R commented on YARN-4700:
-

Yes, these were the JIRAs where we discussed it, but the 
[conclusion|https://issues.apache.org/jira/browse/YARN-4392?focusedCommentId=15033961=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15033961]
 was that we were OK with republishing the events with the exact same data 
rather than not publishing at all, because it's not guaranteed that ATS events 
for apps in the state store were successfully published. 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169756#comment-15169756
 ] 

Naganarasimha G R commented on YARN-4700:
-

Thanks [~varun_saxena], I was also thinking along the same lines: using the 
event's timestamp rather than the current timestamp.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169734#comment-15169734
 ] 

Varun Saxena commented on YARN-4700:


When we recover an app, we can also get its start and finish times from the 
state store. These are then updated in RMAppImpl.
So when the event is replayed and reported to ATSv2, the app start event would 
carry the same time as the time at which the app originally started.

Currently, for the flow activity table, the inverted top-of-the-day timestamp 
is generated based on the current system time. I think we can instead use the 
timestamp coming in the event.
That should resolve this issue.
cc [~sjlee0], [~vrushalic]. I do not see any issue in using the event's 
timestamp. Do you see any?
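
To illustrate the point about the inverted top-of-the-day timestamp, here is a small sketch. It assumes "top of the day" means midnight UTC and that inversion is `Long.MAX_VALUE - timestamp`; the class and method names are hypothetical and the dates are made up for the example.

```java
import java.time.Instant;

final class FlowActivityDaySketch {
  private static final long MILLIS_PER_DAY = 86_400_000L;

  /** Truncates an epoch-millisecond timestamp to midnight UTC of that day. */
  static long topOfDay(long epochMillis) {
    return epochMillis - (epochMillis % MILLIS_PER_DAY);
  }

  /** Inverts a timestamp so that newer days sort first in a byte-ordered key. */
  static long invert(long timestamp) {
    return Long.MAX_VALUE - timestamp;
  }

  public static void main(String[] args) {
    long appCreated = Instant.parse("2016-02-09T18:30:00Z").toEpochMilli();
    long rmRestart = Instant.parse("2016-02-24T09:00:00Z").toEpochMilli();

    long firstPublish = invert(topOfDay(appCreated));
    long replayUsingClock = invert(topOfDay(rmRestart));
    long replayUsingEvent = invert(topOfDay(appCreated));

    // A clock-based replay on a later day lands in a different row.
    System.out.println("clock-based replay adds a row: "
        + (replayUsingClock != firstPublish));
    // An event-based replay maps to the row written originally.
    System.out.println("event-based replay reuses row: "
        + (replayUsingEvent == firstPublish));
  }
}
```

Because the day component is part of the row key, any replay that derives it from the system clock on a later day necessarily produces a new key, while deriving it from the event's timestamp keeps replays idempotent.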



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169721#comment-15169721
 ] 

Varun Saxena commented on YARN-4700:


Hmm... I think the JIRAs are YARN-3127 and YARN-4392. Let me see how we can 
solve this problem.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-26 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169531#comment-15169531
 ] 

Li Lu commented on YARN-4700:
-

I think the redundant events are coming from the work-preserving RM restart, 
where the RM tries to "replay" application lifecycle events in the state store. 
I don't remember the JIRA number for fixing this for SMP (but I do remember 
[~Naganarasimha] was involved in the discussion), but it seems the conclusion 
was to handle this on the SMP/storage side rather than the RM side. For us, 
most of the tables are fine, but for the flow activity table we need to 
distinguish a "real" activity from a replayed activity. 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168515#comment-15168515
 ] 

Varun Saxena commented on YARN-4700:


[~Naganarasimha], I remember you had a JIRA regarding duplicate events being 
sent from the RM (the issue is not specific to this branch). What happened to 
that?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168443#comment-15168443
 ] 

Naganarasimha G R commented on YARN-4700:
-

Oops, my mistake. I didn't check the default values set in the code and just 
went by the discussion we had. Yes, we resend the started and stopped events on 
recovery, but with the same set of data. I will try to verify what is making 
them enter as a new row with different values.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168173#comment-15168173
 ] 

Sangjin Lee commented on YARN-4700:
---

I see. This is regarding the flow activity table, right? Writes to the flow 
activity table are done when an APPLICATION_CREATED_EVENT is handled. Does an 
RM restart trigger a duplicate APPLICATION_CREATED_EVENT for existing apps?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168104#comment-15168104
 ] 

Li Lu commented on YARN-4700:
-

[~sjlee0] I checked the flow activity table and I can see that the row keys for 
the same flow look like this:
{code}
Current count: 1, row: 
yarn_cluster!\x7F\xFF\xFE\xAC\xEFl\xBB\xFF!llu!flow_1445894691726_1
...
Current count: 10, row: 
yarn_cluster!\x7F\xFF\xFE\xAD7\x85\xC3\xFF!llu!flow_1445894691726_1
...
Current count: 19, row: 
yarn_cluster!\x7F\xFF\xFE\xAD<\xAC\x1F\xFF!llu!flow_1445894691726_1
...
{code}

From FlowActivityTable, I can see the row key contains inv_top_of_day, which 
is the reason we have duplicated flows. However, I ran each of the flows only 
once and never touched them again. By design, this should not generate any new 
flow activities. Is this related to the RM restart? 
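
The inv_top_of_day component of the three row keys quoted above can be decoded with a short sketch. It assumes the eight bytes between the first two `!` separators are a big-endian long holding `Long.MAX_VALUE - timestamp`; printable characters in the shell output (`l`, `7`, `<`) are written here as their byte values. The class and helper names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

final class InvertedDayDecoder {
  /** Builds a byte array from int literals to avoid per-element casts. */
  static byte[] bytes(int... values) {
    byte[] out = new byte[values.length];
    for (int i = 0; i < values.length; i++) {
      out[i] = (byte) values[i];
    }
    return out;
  }

  /** Recovers the epoch-millisecond timestamp from its 8-byte inverted form. */
  static long decode(byte[] invertedBigEndian) {
    long inverted = ByteBuffer.wrap(invertedBigEndian).getLong();
    return Long.MAX_VALUE - inverted;
  }

  public static void main(String[] args) {
    byte[][] keys = {
      bytes(0x7F, 0xFF, 0xFE, 0xAC, 0xEF, 0x6C, 0xBB, 0xFF), // ...\xEFl\xBB\xFF
      bytes(0x7F, 0xFF, 0xFE, 0xAD, 0x37, 0x85, 0xC3, 0xFF), // ...\xAD7\x85...
      bytes(0x7F, 0xFF, 0xFE, 0xAD, 0x3C, 0xAC, 0x1F, 0xFF), // ...\xAD<\xAC...
    };
    for (byte[] key : keys) {
      System.out.println(Instant.ofEpochMilli(decode(key)));
    }
    // prints:
    // 2016-02-24T00:00:00Z
    // 2016-02-10T00:00:00Z
    // 2016-02-09T00:00:00Z
  }
}
```

Under that assumption the three rows correspond to midnight UTC on three different days, which is consistent with a new day-level row being written for the same flow on days other than the one on which it originally ran.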



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168080#comment-15168080
 ] 

Sangjin Lee commented on YARN-4700:
---

These are the current default values that are being set as the context info 
(which is then used as part of the row key):
{code:title=AppLevelTimelineCollector.java}
context.setClusterId(conf.get(YarnConfiguration.RM_CLUSTER_ID,
    YarnConfiguration.DEFAULT_RM_CLUSTER_ID));
// Set the default values, which will be updated with an RPC call to get the
// context info from NM.
// Current user usually is not the app user, but keep this field non-null
context.setUserId(UserGroupInformation.getCurrentUser().getShortUserName());
// Use app ID to generate a default flow name for orphan app
context.setFlowName(
    TimelineUtils.generateDefaultFlowNameBasedOnAppId(appId));
// Set the flow version to string 1 if it's an orphan app
context.setFlowVersion("1");
// Set the flow run ID to 1 if it's an orphan app
context.setFlowRunId(1L);
context.setAppId(appId.toString());
{code}

The flow name, version, and the run id may be overridden if the application 
sets the YARN tag.



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168063#comment-15168063
 ] 

Sangjin Lee commented on YARN-4700:
---

[~gtCarrera9], can you provide the exact row keys in this problem scenario? 
From what I can see in the code, if you're not setting the cluster id in the 
configuration (under key "yarn.resourcemanager.cluster-id"), it should be the 
default value ("yarn_cluster"). I'm not sure where the duplicate keys are 
coming from. Once we see the duplicate row keys for the same app, it should 
become clearer.

Are you seeing this with a DS app or an MR app?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-25 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167741#comment-15167741
 ] 

Naganarasimha G R commented on YARN-4700:
-

Any suggestions on the above approach, [~gtCarrera9]/[~sjlee0]?



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-21 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156208#comment-15156208
 ] 

Li Lu commented on YARN-4700:
-

[~Naganarasimha] sure, please go ahead and take it. Thanks! 



[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-21 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156126#comment-15156126
 ] 

Naganarasimha G R commented on YARN-4700:
-

Hi [~gtCarrera9], I thought of taking this issue up; please reassign if you 
have already started working on it!
I think it would be better to make the default cluster ID a static constant 
like "mycluster" / "yarncluster"... thoughts?
cc [~sjlee0]
