[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369764#comment-15369764 ]

Hudson commented on YARN-3904:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-3904. Refactor timelineservice.storage to add support to online and (sjlee: rev 102b56ee96f0723dcc97d7f51b9ee910d8cd782b)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/PhoenixTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/PhoenixOfflineAggregationWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/OfflineAggregationInfo.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestPhoenixOfflineAggregationWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/TimelineSchemaCreator.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestPhoenixTimelineWriterImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/OfflineAggregationWriter.java

> Refactor timelineservice.storage to add support to online and offline aggregation writers
> -
>
> Key: YARN-3904
> URL: https://issues.apache.org/jira/browse/YARN-3904
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Li Lu
> Assignee: Li Lu
> Fix For: YARN-2928
>
> Attachments: YARN-3904-YARN-2928.001.patch, YARN-3904-YARN-2928.002.patch, YARN-3904-YARN-2928.003.patch, YARN-3904-YARN-2928.004.patch, YARN-3904-YARN-2928.005.patch, YARN-3904-YARN-2928.006.patch, YARN-3904-YARN-2928.007.patch, YARN-3904-YARN-2928.008.patch, YARN-3904-YARN-2928.009.patch
>
>
> Now that we have finished the design for time-based aggregation, we can adopt our
> existing Phoenix storage as the storage for the aggregated data. In this
> JIRA, I'm proposing to refactor writers to add support for aggregation
> writers. Offline aggregation writers typically have less contextual
> information. We can distinguish these writers by special naming. We can also
> use CollectorContexts to model all contextual information and use it in our
> writer interfaces.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700247#comment-14700247 ]

Sangjin Lee commented on YARN-3904:
---

The latest patch (v.9) LGTM. Any other comments?
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700250#comment-14700250 ]

Vrushali C commented on YARN-3904:
--

Thanks Li, latest patch looks good to me.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697549#comment-14697549 ]

Vrushali C commented on YARN-3904:
--

A couple more things that came to mind. We need not change the patch just for these, but I wanted to say what's on my mind.

- Do we want to provide a dropTable API? I think we should not. In a production situation, this can be a costly mistake if someone is testing their code on the cluster. A drop table should be a very manual command so that one is aware that they are running it.
- Are the '?' and ',' special characters in this line? If so, we don't have to change this right now, but the next time this code is being looked at, could we make them into constants?
{code}
String sql = "UPSERT INTO " + info.getTableName()
    + " (" + StringUtils.join(info.getPrimaryKeyList(), ",")
    + ", created_time, modified_time, metric_names) "
    + "VALUES ("
    + StringUtils.repeat("?,", info.getPrimaryKeyList().length)
    + "?, ?, ?)";
{code}

The patch looks good overall. Thanks [~gtCarrera9]
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697537#comment-14697537 ]

Vrushali C commented on YARN-3904:
--

A very minor comment: I think there is a typo in the PHEONIX_OFFLINE_STORAGE_CONN_STR_DEFAULT variable name in YarnConfiguration.java (PHEONIX should presumably read PHOENIX).
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697816#comment-14697816 ]

Vrushali C commented on YARN-3904:
--

+1, yes, we can move ahead.

I am quite curious: how is the accessibility being restricted? The method has no access modifier, so that means it is visible at package level, no? Also, @Private and @VisibleForTesting are only annotations; they don't really affect the private/public accessibility of the function. Or am I mistaken?

That said, let's go ahead with the patch. My question is only for discussion purposes.
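To make the access question above concrete, here is a minimal, self-contained sketch. It is not code from the patch: the `Writer` class, its `dropTable` method body, and the `TestOnly` annotation are hypothetical stand-ins (the real code uses Hadoop's and Guava's annotations). It uses reflection to show that a method declared without a modifier is package-private, and that an annotation on it is metadata only and does not change that.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class AccessSketch {

  // Hypothetical marker annotation, standing in for annotations such as
  // @VisibleForTesting. It documents intent; the compiler enforces nothing.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface TestOnly {}

  // Hypothetical writer with a "dangerous" helper.
  static final class Writer {
    @TestOnly
    void dropTable(String tableName) {
      // No access modifier: callable from any class in the same package.
    }
  }

  // Maps reflection modifier bits to the Java access level name.
  static String describeAccess(Method m) {
    int mods = m.getModifiers();
    if (Modifier.isPublic(mods)) return "public";
    if (Modifier.isProtected(mods)) return "protected";
    if (Modifier.isPrivate(mods)) return "private";
    return "package-private";
  }

  private static Method dropTableMethod() {
    try {
      return Writer.class.getDeclaredMethod("dropTable", String.class);
    } catch (NoSuchMethodException e) {
      throw new IllegalStateException(e);
    }
  }

  static String dropTableAccess() {
    return describeAccess(dropTableMethod());
  }

  static boolean dropTableIsAnnotated() {
    return dropTableMethod().isAnnotationPresent(TestOnly.class);
  }

  public static void main(String[] args) {
    System.out.println("access: " + dropTableAccess());
    System.out.println("annotated: " + dropTableIsAnnotated());
  }
}
```

Under these assumptions the annotation is present, yet the reported access level is still package-private. Any stronger restriction would have to come from where the method lives, for example by moving it into test code.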
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697818#comment-14697818 ]

Hadoop QA commented on YARN-3904:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 7s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 11s | The applied patch generated 1 new checkstyle issues (total was 214, now 214). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 42s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 25s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 43m 0s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750594/YARN-3904-YARN-2928.009.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / f40c735 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8846/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8846/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8846/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8846/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8846/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697758#comment-14697758 ]

Li Lu commented on YARN-3904:
-

Thanks [~vrushalic]! I agree we should not make a public dropTable API. Actually, in my code I'm restricting the accessibility of this method to tests only.

About the special characters: the commas and question marks are used for prepared SQL statements in JDBC, which should be quite stable by now. But I agree that we should clean up the SQL statements when we touch this part in the future.

For now, if it's fine with all of us, maybe we can put this in and move forward with the offline aggregation implementations? Thanks!
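As a rough illustration of the cleanup discussed above (a sketch only, not what the patch does), the separator and placeholder characters could be pulled into named constants and the statement text built from the column list. The constant names, the `buildUpsert` helper, and the table and key names in `main` are all hypothetical. `String.join` requires Java 8, whereas the QA builds in this thread ran on Java 1.7, so the real code would more likely keep using `StringUtils`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class UpsertSqlSketch {

  // Hypothetical constant names for the "special characters".
  private static final String COLUMN_SEPARATOR = ",";
  private static final String PLACEHOLDER = "?";

  // Columns written after the primary key, as in the statement quoted in the thread.
  private static final String[] FIXED_COLUMNS =
      {"created_time", "modified_time", "metric_names"};

  // Builds an UPSERT with one JDBC placeholder per column.
  static String buildUpsert(String tableName, String[] primaryKeys) {
    List<String> columns = new ArrayList<>(Arrays.asList(primaryKeys));
    columns.addAll(Arrays.asList(FIXED_COLUMNS));
    String placeholders = String.join(
        COLUMN_SEPARATOR, Collections.nCopies(columns.size(), PLACEHOLDER));
    return "UPSERT INTO " + tableName
        + " (" + String.join(COLUMN_SEPARATOR, columns) + ")"
        + " VALUES (" + placeholders + ")";
  }

  public static void main(String[] args) {
    // Hypothetical table and key names, for demonstration only.
    System.out.println(
        buildUpsert("flow_aggregation", new String[] {"user", "cluster", "flow_name"}));
  }
}
```

The resulting string would be passed to `Connection.prepareStatement`, with values bound by position. Deriving the placeholder count from the column list keeps the two from drifting apart when columns are added.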
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697824#comment-14697824 ]

Li Lu commented on YARN-3904:
-

Oh, right now the test is using this utility method, so it has to have default (package-level) access. We're adding the annotations to keep it out of any public javadocs or API lists. This is also an agreement among the reviewers. I agree it's not quite enough, and I'm considering moving this dangerous part to the test component in the offline aggregator JIRA.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693858#comment-14693858 ]

Hadoop QA commented on YARN-3904:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 19s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 10m 7s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 12m 46s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 27s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 30s | The applied patch generated 1 new checkstyle issues (total was 214, now 214). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 49s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 52s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 8s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 31s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 45s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 51m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749133/YARN-3904-YARN-2928.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / bcd755e |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8836/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8836/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8836/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8836/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8836/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661086#comment-14661086 ]

Hadoop QA commented on YARN-3904:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 40s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 8m 5s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 58s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 9s | The applied patch generated 1 new checkstyle issues (total was 214, now 214). |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 29s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 1m 1s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 28s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 44m 18s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749133/YARN-3904-YARN-2928.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 895ccfa |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8784/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8784/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8784/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8784/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8784/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659016#comment-14659016 ]

Li Lu commented on YARN-3904:
-

Thanks [~sjlee0]! Any other comments from anyone? This JIRA is currently blocking the POC patch of YARN-3817.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652849#comment-14652849 ]

Hadoop QA commented on YARN-3904:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 17s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 52s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 54s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 10s | The applied patch generated 1 new checkstyle issues (total was 214, now 214). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 41s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 31s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 43m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748554/YARN-3904-YARN-2928.007.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8758/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8758/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8758/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8758/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8758/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652862#comment-14652862 ]

Sangjin Lee commented on YARN-3904:

{quote}
I agree it is appealing to centralize table creations. After giving this some thought, I think what we really want is a centralized workflow for storage schema creation. That is to say, when setting up a v2 timeline server, users can simply run the data schema creator once to create the necessary data storage schemas. With this in mind, I added Phoenix schema creation to the existing data schema creator, with a separate option -p. However, I'm keeping the SQL statements for table creation inside the writer file so that we also have a centralized place for the Phoenix storage schema.
{quote}
I'm fine with that approach.

{quote}
We can definitely reuse this PreparedStatement (as well as the connections) after we integrate the aggregation writer with the aggregator. My plan is to use this (relatively) stable writer to unblock the future patch on flow and user level offline aggregation. After we have the whole workflow, we can gradually add optimizations. Thoughts?
{quote}
Yes, that sounds fine. Thanks!
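A minimal sketch of the "run the schema creator once, with -p for Phoenix" workflow quoted above. Only the -p option comes from the thread; the class name SchemaCreatorSketch, the Target enum, and the rule that the raw entity schema is always selected are assumptions made for illustration, and the real TimelineSchemaCreator may parse its options differently.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch: one entry point that decides which schemas to create. */
final class SchemaCreatorSketch {

  /** Storage schemas the tool knows about (illustrative names). */
  enum Target { HBASE_ENTITY, PHOENIX_AGGREGATION }

  private SchemaCreatorSketch() { }

  /**
   * Maps command-line arguments to the schemas to create. Assumption: the
   * raw entity schema is always selected, and "-p" additionally selects
   * the Phoenix aggregation schema.
   */
  static Set<Target> parseTargets(String[] args) {
    Set<Target> targets = EnumSet.of(Target.HBASE_ENTITY);
    for (String arg : args) {
      if ("-p".equals(arg)) {
        targets.add(Target.PHOENIX_AGGREGATION);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    // A real tool would issue the CREATE TABLE statements here; this sketch
    // only reports which schemas it would create.
    for (Target target : parseTargets(args)) {
      System.out.println("would create schema: " + target);
    }
  }
}
```

Keeping the decision in one place means an aggregation job never has to create tables itself; it can assume the operator already ran the creator.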
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14652280#comment-14652280 ]

Sangjin Lee commented on YARN-3904:

Sorry [~gtCarrera9] it took me a while to catch up with this. Thanks for the updated patch!

(OfflineAggregationWriter.java)
- Actually I'd like to ask whether this needs to be a service. :) Note that it is possible (or likely) that the writer will be executed in a mapreduce task. If this is executed within a mapper task, what does it mean for it to be a service? It would not add much value, and at minimum the initialization/start/stop should be possible outside the service framework, right?

(PhoenixOfflineAggregationWriterImpl.java)
- Should the primary key be user and then cluster, or cluster and user? I think it might be better if it is cluster and user, although that differs from the entity table. [~vrushalic]?
- For the user aggregation tables, I believe the cluster needs to be included in the row key. Note that multiple clusters may write to the same HBase cluster...
- As for createTables(), I'm also of the opinion that it might be better if we moved it to a dedicated creator class. Again, in the context of a mapreduce job, it would mean that multiple mappers would compete to create this table. The concurrency will be sorted out by Phoenix, but it doesn't seem very necessary. Not only the first time, but every time the aggregation job runs, it would do the "create if not exists" work unnecessarily. Also, note that dropTables() can only be accessed by a separate java main process anyway. So it might be good to have a separate class that lets you create or drop tables explicitly. I don't know if it should be part of the existing schema creator or a separate one, and I can see pros and cons of either option.
- l.156: My JDBC knowledge is a bit outdated, but do you want to prepare the statement every time a write is done? Don't you want to prepare it once and reuse it? Will that optimization follow later?

(OfflineAggregationInfo.java)
- l.52-54: I would enforce the notion that this is a read-only object by making the members final.
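The last review point (a read-only object with final members) and the key-order question (cluster before user) can be illustrated together. This is a sketch under stated assumptions: AggregationInfoSketch, its two fields, and the table and column names are hypothetical and are not taken from the actual OfflineAggregationInfo class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical read-only descriptor of one aggregation table. */
final class AggregationInfoSketch {
  // final members: the object cannot be re-pointed after construction.
  private final String tableName;
  private final List<String> primaryKeyColumns;

  AggregationInfoSketch(String tableName, List<String> primaryKeyColumns) {
    this.tableName = tableName;
    // Copy first, then wrap, so later changes to the caller's list are not
    // visible here and callers cannot modify the stored list.
    this.primaryKeyColumns =
        Collections.unmodifiableList(new ArrayList<>(primaryKeyColumns));
  }

  String getTableName() {
    return tableName;
  }

  List<String> getPrimaryKeyColumns() {
    return primaryKeyColumns;
  }

  /** User-level aggregation keyed by cluster first, then user (illustrative names). */
  static AggregationInfoSketch userAggregation() {
    return new AggregationInfoSketch(
        "user_aggregation", Arrays.asList("cluster", "user"));
  }
}
```

Putting the cluster first in the key keeps rows from different clusters apart when several clusters write to the same HBase instance, which is the concern raised in the review.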
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650069#comment-14650069 ]

Zhijie Shen commented on YARN-3904:

bq. I'm not 100% sure if that's what we would like to do. Maybe we would like to decouple the offline aggregation module from our normal entity storage. Therefore, maybe it's also appealing to allow users specify if they need to create data schema in the offline aggregation process? Such as, setting one flag in the offline aggregator to create data schema?

Makes sense, but can we still keep table creation centralized? I think we can add an option to create the raw entity tables and the aggregation tables separately. Thoughts?

bq. After the changes in this JIRA, we will only have two types of TimelineWriters, one for FS (test only) and one for HBase. The setting on the offline storage should be independent from this setting, I assume?

Yeah, I meant that we currently have TIMELINE_SERVICE_READER|WRITER_CLASS pointing to a specific reader/writer implementation. However, it's better to have a config such as blah.blah.backend.type. When backend.type = hbase, users can access HBase both directly and via Phoenix, and we allow aggregation. This may not need to be part of this JIRA; I'm just thinking out loud.
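A small sketch of the backend.type idea from the comment above: one setting names the backend, and capabilities such as aggregation follow from it. The property key, the default value, and the BackendTypeSketch enum are all hypothetical; the thread only says "blah.blah.backend.type". Plain java.util.Properties stands in for the Hadoop configuration so the sketch stays self-contained.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical backend selector: one setting instead of per-class reader/writer settings. */
enum BackendTypeSketch {
  FS(false),     // file-system backend (test only, per the thread): no aggregation
  HBASE(true);   // HBase backend, reachable directly or via Phoenix: aggregation allowed

  /** Assumed property name; the thread does not define a real key. */
  static final String KEY = "example.timeline.backend.type";

  private final boolean aggregationAllowed;

  BackendTypeSketch(boolean aggregationAllowed) {
    this.aggregationAllowed = aggregationAllowed;
  }

  boolean allowsAggregation() {
    return aggregationAllowed;
  }

  /** Reads the backend type, defaulting to "hbase" (an assumption) when unset. */
  static BackendTypeSketch fromProperties(Properties conf) {
    String value = conf.getProperty(KEY, "hbase");
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }
}
```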
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650098#comment-14650098 ]

Hadoop QA commented on YARN-3904:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 21m 55s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 11m 34s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 35s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 29s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 32s | The applied patch generated 1 new checkstyle issues (total was 214, now 214). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 52s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 43s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 43s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 29s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 34s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | | 54m 43s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748260/YARN-3904-YARN-2928.006.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8743/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8743/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8743/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8743/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8743/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648457#comment-14648457 ]

Zhijie Shen commented on YARN-3904:

[~gtCarrera9], thanks for the patch. Below are my comments:

bq. The two failed tests passed on my local machine, and the failures appeared to be irrelevant. This said, we may still need to fix those intermittent test failures.

Do we plan to fix it in this patch?

Some high level comments:
1. As is also mentioned in YARN-3049, how about refactoring the reader/writer method signatures in a separate jira to avoid conflicts?
2. I suggest moving the table creation stuff into TimelineSchemaCreator.
3. As the HBase backend is accessed both directly and via Phoenix, it's good for us to clean up the configuration to say we're using the HBase backend (as opposed to the FS backend) instead of specifically the HBase or Phoenix writer/reader.

Other patch details:
1. Make OfflineAggregationWriter extend Service, so that you don't need to define init.
2. Now we're working towards a production-standard patch. Would you please write some javadoc to explain the schema of the aggregation tables, like what we did for the HBase tables?
3. The connection config should be moved to YarnConfiguration.
4. Why is the info column family kept? I expect the aggregation table will only have metrics data.
5. Let's also have a default PhoenixOfflineAggregationWriterImpl constructor to be used in the production code.
6. {{Class.forName(DRIVER_CLASS_NAME);}} doesn't need to be invoked every time we get a connection.
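Point 6 above (load the JDBC driver class once rather than on every connection) could be handled along these lines. This is a sketch, not the patch's code: DriverLoaderSketch, its method names, and the load counter (present only so the behaviour can be observed) are hypothetical. A static initializer in the writer class would be a simpler alternative when only one driver is involved.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that loads each JDBC driver class at most once. */
final class DriverLoaderSketch {
  // Names of driver classes that were already loaded successfully.
  private static final Set<String> LOADED = new HashSet<>();
  // Number of times Class.forName was actually invoked successfully.
  private static int loadCount;

  private DriverLoaderSketch() { }

  /** Loads the driver class on the first call for a name; later calls return immediately. */
  static synchronized void ensureLoaded(String driverClassName) {
    if (LOADED.contains(driverClassName)) {
      return;
    }
    try {
      Class.forName(driverClassName);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(
          "JDBC driver not on classpath: " + driverClassName, e);
    }
    LOADED.add(driverClassName);
    loadCount++;
  }

  static synchronized int getLoadCount() {
    return loadCount;
  }
}
```

A getConnection() method would call ensureLoaded(DRIVER_CLASS_NAME) before DriverManager.getConnection(...); only the first call pays for the class lookup.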
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647132#comment-14647132 ]

Li Lu commented on YARN-3904:

The two failed tests passed on my local machine, and the failures appeared to be irrelevant. This said, we may still need to fix those intermittent test failures.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647074#comment-14647074 ]

Hadoop QA commented on YARN-3904:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 15m 36s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. |
| {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 14s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 46s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests | 7m 58s | Tests failed in hadoop-yarn-server-timelineservice. |
| | | | 44m 48s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineWriterImpl |
| | hadoop.yarn.server.timelineservice.storage.TestPhoenixOfflineAggregationWriterImpl |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747900/YARN-3904-YARN-2928.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8713/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8713/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8713/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645313#comment-14645313 ]

Sangjin Lee commented on YARN-3904:

I went through the patch at a high level. I need to do another more detailed pass later, but wanted to share the high level feedback first.

Regarding {{PhoenixOfflineAggregatorWriterImpl}}, does it have to implement the {{TimelineWriter}} interface? It is no longer plugged into the real-time write path, and as such, implementing {{TimelineWriter}} seems unnecessary. If we envision using this in a separate mechanism such as mapreduce, I think we ought to come up with a new interface for aggregation.

For example, in l.106-114, seeing which field in {{TimelineCollectorContext}} is not null and triggering aggregation that way seems pretty awkward. That might be because we're trying to use the {{TimelineWriter}} interface and work with {{TimelineCollectorContext}}, but {{TimelineCollectorContext}} wasn't really designed for that purpose, and we would be using it in an unexpected manner. This to me is an unnecessary constraint.

Also, the actual work of reading the HBase tables (eventually the flow run table) and invoking the offline aggregator is not captured here. I presume it would come later?
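One way to picture the "new interface for aggregation" suggested above: the caller states the aggregation level explicitly instead of the writer inferring it from which context fields are non-null. Everything here is hypothetical and for illustration only, including the names AggregationLevelSketch, OfflineAggregationWriterSketch and InMemoryAggregationWriter, the method signature, and the "cluster!flow" row-key format; the interface actually adopted in the patch may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical aggregation levels handled by the offline path. */
enum AggregationLevelSketch { FLOW, USER }

/** Hypothetical writer interface dedicated to offline aggregation. */
interface OfflineAggregationWriterSketch {
  /** Writes one aggregated row; the level is explicit, not inferred from null fields. */
  void writeAggregatedMetrics(
      AggregationLevelSketch level, String rowKey, Map<String, Long> metrics);
}

/** In-memory stand-in for a Phoenix-backed implementation, used only for illustration. */
final class InMemoryAggregationWriter implements OfflineAggregationWriterSketch {
  private final List<String> rows = new ArrayList<>();

  @Override
  public void writeAggregatedMetrics(
      AggregationLevelSketch level, String rowKey, Map<String, Long> metrics) {
    // TreeMap gives a stable, sorted rendering of the metric names.
    rows.add(level + "/" + rowKey + " " + new TreeMap<>(metrics));
  }

  List<String> getRows() {
    return rows;
  }
}
```

Because the interface has no dependency on the collector context or a service lifecycle, an implementation could be constructed directly inside a mapreduce task.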
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645348#comment-14645348 ]

Li Lu commented on YARN-3904:

Hi [~sjlee0], thanks so much for the review! Some quick comments:

bq. Regarding PhoenixOfflineAggregatorWriterImpl, does it have to implement the TimelineWriter interface? It is no longer plugged into the real-time write path, and as such, implementing TimelineWriter seems unnecessary.

That's actually exactly what I'm debating with myself! The more I work on the offline aggregator, the more I feel that it is not really beneficial to implement our offline storage as a {{TimelineWriter}}. However, the offline writer *is* actually a timeline writer. The natural distinction between the Phoenix writer and the HBase writer is whether a writer works in the realtime or the offline workflow. Maybe we'd like to have something like {{TimelineRealTimeWriters}} and {{TimelineOfflineWriters}} (or {{TimelineOfflineStorage}} to accommodate both read and write code paths)? Realtime writers should focus on writing raw entity data with full context info as well as performing realtime aggregations. Offline writers can focus on offline aggregation storage. Thoughts?

bq. If we envision using this in a separate mechanism such as mapreduce, I think we ought to come up with a new interface for aggregation.

Yes. If we're separating realtime and offline writers, we have more freedom to design aggregation-specific writer interfaces.

bq. Also, the actual work of reading the HBase tables (eventually the flow run table) and invoking the offline aggregator is not captured here.

I'm planning to include the HBase aggregation table reader as part of YARN-3817, if that POC patch is not too big (so far I don't believe that's the case). Invoking the offline aggregator will probably come separately, since we may need some further changes in the RM to post active flows. Does this plan work?
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644998#comment-14644998 ]

Li Lu commented on YARN-3904:

Could anyone please review this patch? Thanks!
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645005#comment-14645005 ]

Sangjin Lee commented on YARN-3904:

Thanks for the patch [~gtCarrera9]! I'll take a look at it today.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643666#comment-14643666 ]

Hadoop QA commented on YARN-3904:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 17m 20s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. |
| {color:green}+1{color} | javac | 7m 54s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 16s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 1m 22s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | | 40m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747454/YARN-3904-YARN-2928.004.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8687/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8687/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8687/console |

This message was automatically generated.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643734#comment-14643734 ]

Vrushali C commented on YARN-3904:

bq. One thing pending discussion is about the aggregation method. I feel this method is a little bit outdated. Could anyone remind me the assumed use case for it? Will it fit for real-time aggregations only?

IIRC we added it so that we could trigger aggregation explicitly from the collector/caller, in addition to the background aggregation processing. This was provided so that aggregation is not just a behind-the-scenes processing effort but can also be triggered on demand. I am thinking this would apply only to app-to-flow aggregation, not the offline ones. But yes, it is probably outdated and we should update it as we see fit.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643842#comment-14643842 ]

Li Lu commented on YARN-3904:

Thanks for the info! I'll keep the {{aggregate}} method intact. I assume we can fix that part in the real-time aggregation implementation?
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641139#comment-14641139 ]

Li Lu commented on YARN-3904:

Will finish the refactoring work after the bug fix patch is in.
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641180#comment-14641180 ]

Li Lu commented on YARN-3904:

One question about our current writer design: do we have a designated use case for the {{aggregate}} method? I remember that when we were designing the writer interface, there were no such concepts as real-time or time-based aggregation. For time-based aggregation writers, the current {{aggregate}} lacks the cluster/user information that forms the primary keys of the aggregated entities. So, do we want to keep this method for real-time aggregation, or do we want to slightly modify it to accommodate both real-time and time-based aggregation? Is the current {{aggregate}} interface good enough for real-time aggregation? ([~vrushalic] am I missing anything here?)
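
The concern that {{aggregate}} lacks the cluster/user information needed for primary keys can be illustrated with a minimal sketch. Everything here is hypothetical and only for illustration: the class {{AggregationKeySketch}}, the {{Context}} and {{Level}} types, the use of a flow name for flow-level keys, and the "!" separator are assumptions, not the actual timelineservice.storage API or the Phoenix schema. The sketch only shows that a writer which receives a context object can build the key of an aggregated entity, whereas a method that receives the entity alone cannot.

```java
import java.util.ArrayList;
import java.util.List;

public class AggregationKeySketch {

  /** Hypothetical aggregation levels for offline (time-based) aggregation. */
  enum Level { FLOW, USER }

  /** Hypothetical stand-in for a collector context; a field is null when unknown. */
  static final class Context {
    final String clusterId;
    final String userId;
    final String flowName;

    Context(String clusterId, String userId, String flowName) {
      this.clusterId = clusterId;
      this.userId = userId;
      this.flowName = flowName;
    }
  }

  /**
   * Builds the primary key of an aggregated entity from the context.
   * Throws IllegalArgumentException if a field required at this level is missing.
   */
  static String primaryKey(Context ctx, Level level) {
    List<String> parts = new ArrayList<>();
    parts.add(require(ctx.clusterId, "clusterId"));
    parts.add(require(ctx.userId, "userId"));
    if (level == Level.FLOW) {
      // Assumption: flow-level aggregation also keys on the flow name.
      parts.add(require(ctx.flowName, "flowName"));
    }
    return String.join("!", parts);
  }

  private static String require(String value, String name) {
    if (value == null) {
      throw new IllegalArgumentException("missing " + name);
    }
    return value;
  }

  public static void main(String[] args) {
    Context flowCtx = new Context("cluster1", "alice", "daily_etl");
    System.out.println(primaryKey(flowCtx, Level.FLOW));

    // An offline writer may know less: user-level aggregation needs no flow name.
    Context userCtx = new Context("cluster1", "alice", null);
    System.out.println(primaryKey(userCtx, Level.USER));
  }
}
```

Passing the context explicitly is one way to let the same key-building logic serve both online writers (which have full context) and offline writers (which have less); whether the real interfaces should take this shape is the open question in this thread.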