[jira] [Commented] (YARN-3409) Add constraint node labels
[ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537676#comment-14537676 ] Dian Fu commented on YARN-3409: --- Just to post the requirements discussed in YARN-3557 here: Constraint node labels should be supported to be added from both the RM and the NM. Some labels, such as TRUSTED/UNTRUSTED described in YARN-3557, need to be added from the RM, while other labels, such as GPU, FPGA, LINUX, and WINDOWS, are more suitable to be added from the NM. A large cluster may have all these kinds of labels coexisting. > Add constraint node labels > -- > > Key: YARN-3409 > URL: https://issues.apache.org/jira/browse/YARN-3409 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Wangda Tan >Assignee: Wangda Tan > > Specifying only one label for each node (in other words, partitioning a cluster) is a way to > determine how the resources of a particular set of nodes can be shared by a > group of entities (like teams, departments, etc.). Partitions of a cluster > have the following characteristics: > - The cluster is divided into several disjoint sub-clusters. > - ACLs/priority can apply to a partition (only the market team has > priority to use the partition). > - Percentages of capacity can apply to a partition (the market team has a 40% > minimum capacity and the dev team has a 60% minimum capacity of the partition). > Constraints are orthogonal to partitions; they describe attributes of a > node's hardware/software just for affinity. Some examples of constraints: > - glibc version > - JDK version > - Type of CPU (x86_64/i686) > - Type of OS (windows, linux, etc.) > With this, an application can ask for resources that have (glibc.version >= > 2.20 && JDK.version >= 8u20 && x86_64). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
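The request syntax quoted in the description ((glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64)) is informal. One toy way to model such constraint matching, with entirely hypothetical class names, support for dotted numeric versions only (not forms like 8u20), and no relation to any actual YARN API, is:

```java
import java.util.Map;
import java.util.function.Predicate;

// Toy model of constraint matching (purely illustrative; not the YARN API):
// each node exposes a map of constraint labels, and a resource request
// carries a predicate over that map.
public class ConstraintMatch {

    // Dotted-version comparison, so "2.20" >= "2.9" holds numerically
    // (string comparison would get this wrong).
    public static boolean versionAtLeast(String actual, String required) {
        String[] a = actual.split("\\."), r = required.split("\\.");
        for (int i = 0; i < Math.max(a.length, r.length); i++) {
            int av = i < a.length ? Integer.parseInt(a[i]) : 0;
            int rv = i < r.length ? Integer.parseInt(r[i]) : 0;
            if (av != rv) {
                return av > rv;
            }
        }
        return true; // versions are equal
    }

    // A node matches a request if its label map satisfies the predicate.
    public static boolean matches(Map<String, String> nodeLabels,
                                  Predicate<Map<String, String>> constraint) {
        return constraint.test(nodeLabels);
    }
}
```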
[jira] [Commented] (YARN-3557) Support Intel Trusted Execution Technology(TXT) in YARN scheduler
[ https://issues.apache.org/jira/browse/YARN-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537681#comment-14537681 ] Dian Fu commented on YARN-3557: --- Hi [~leftnoteasy], I have posted the requirements about supporting the configuration of constraint node labels from both the RM and the NM on YARN-3409. Regarding support for script-based node label configuration on the RM side, what are your thoughts? > Support Intel Trusted Execution Technology(TXT) in YARN scheduler > - > > Key: YARN-3557 > URL: https://issues.apache.org/jira/browse/YARN-3557 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Dian Fu > Attachments: Support TXT in YARN high level design doc.pdf > > > Intel TXT defines platform-level enhancements that provide the building > blocks for creating trusted platforms. A TXT-aware YARN scheduler can > schedule security-sensitive jobs on TXT-enabled nodes only. YARN-2492 > provides the capability to restrict YARN applications to run only on cluster > nodes that have a specified node label. This is a good mechanism that can be > utilized for a TXT-aware YARN scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537726#comment-14537726 ] Tsuyoshi Ozawa commented on YARN-3170: -- [~brahmareddy] thank you for updating. {quote} We call MapReduce running on YARN "MapReduce 2.0 (MRv2). {quote} A trailing double quotation is missing. Please add it before the period. > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3615) Yarn and Mapred queue CLI command support for Fairscheduler
Bibin A Chundatt created YARN-3615: -- Summary: Yarn and Mapred queue CLI command support for Fairscheduler Key: YARN-3615 URL: https://issues.apache.org/jira/browse/YARN-3615 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler, scheduler Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Add support for CLI commands when the Fair Scheduler is configured. Listed below are a few commands which need updating: ./yarn queue -status *Current output* {code} Queue Name : root.sls_queue_2 State : RUNNING Capacity : 100.0% Current Capacity : 100.0% Maximum Capacity : -100.0% Default Node Label expression : Accessible Node Labels : {code} ./mapred queue -info ./mapred queue -list All of the above commands currently display information based on the Capacity Scheduler only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3170: --- Attachment: YARN-3170-004.patch > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537772#comment-14537772 ] Brahma Reddy Battula commented on YARN-3170: [~ozawa] Updated the patch. Kindly review. Thanks! > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537780#comment-14537780 ] Hadoop QA commented on YARN-3170: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 57s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 13s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731885/YARN-3170-004.patch | | Optional Tests | site | | git revision | trunk / 3fa2efc | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7860/console | This message was automatically generated. > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3513) Remove unused variables in ContainersMonitorImpl and add debug log for overall resource usage by all containers
[ https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3513: Attachment: YARN-3513.20150511-1.patch Ok [~devaraj.k], updated the patch as per your suggestion. > Remove unused variables in ContainersMonitorImpl and add debug log for > overall resource usage by all containers > > > Key: YARN-3513 > URL: https://issues.apache.org/jira/browse/YARN-3513 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Trivial > Labels: BB2015-05-TBR, newbie > Attachments: YARN-3513.20150421-1.patch, YARN-3513.20150503-1.patch, > YARN-3513.20150506-1.patch, YARN-3513.20150507-1.patch, > YARN-3513.20150508-1.patch, YARN-3513.20150508-1.patch, > YARN-3513.20150511-1.patch > > > Some local variables in MonitoringThread.run() : {{vmemStillInUsage and > pmemStillInUsage}} are not used and just updated. > Instead we need to add debug log for overall resource usage by all containers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-817) If input path does not exist application/job id is getting assigned.
[ https://issues.apache.org/jira/browse/YARN-817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith resolved YARN-817. - Resolution: Invalid > If input path does not exist application/job id is getting assigned. > > > Key: YARN-817 > URL: https://issues.apache.org/jira/browse/YARN-817 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 2.0.1-alpha >Reporter: Nishan Shetty >Priority: Minor > > 1. Run a job, giving as input some path which does not exist. > 2. An application/job id is getting assigned. > 2013-06-12 16:00:24,494 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 12 > Suggestion: > Before assigning the job/app id, an input path check can be made. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-817) If input path does not exist application/job id is getting assigned.
[ https://issues.apache.org/jira/browse/YARN-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537817#comment-14537817 ] Rohith commented on YARN-817: - The input path is used by the application JVM. The application client should handle this before submitting the application to YARN. Closing as Invalid; reopen if there are any concerns about this. > If input path does not exist application/job id is getting assigned. > > > Key: YARN-817 > URL: https://issues.apache.org/jira/browse/YARN-817 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 2.0.1-alpha >Reporter: Nishan Shetty >Priority: Minor > > 1. Run a job, giving as input some path which does not exist. > 2. An application/job id is getting assigned. > 2013-06-12 16:00:24,494 INFO > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new > applicationId: 12 > Suggestion: > Before assigning the job/app id, an input path check can be made. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
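The client-side check Rohith suggests could look roughly like the following. This is an illustrative sketch using plain java.nio in place of Hadoop's FileSystem API; the class and method names are made up:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch: validate the input path on the client, before the
// application is ever submitted and an application id allocated.
public class InputPathValidator {

    // Throws if the input path does not exist, so submission fails fast
    // on the client side with a clear error.
    public static void checkInputPath(String input) {
        Path p = Paths.get(input);
        if (!Files.exists(p)) {
            throw new IllegalArgumentException(
                "Input path does not exist: " + input);
        }
    }
}
```

A real MapReduce client would perform the equivalent check against the cluster's distributed filesystem rather than the local one.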
[jira] [Commented] (YARN-3513) Remove unused variables in ContainersMonitorImpl and add debug log for overall resource usage by all containers
[ https://issues.apache.org/jira/browse/YARN-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537832#comment-14537832 ] Hadoop QA commented on YARN-3513: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 33s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 36s | The applied patch generated 1 new checkstyle issues (total was 27, now 27). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 1s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 5m 57s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 41m 44s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731901/YARN-3513.20150511-1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3fa2efc | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7861/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7861/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7861/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7861/console | This message was automatically generated. > Remove unused variables in ContainersMonitorImpl and add debug log for > overall resource usage by all containers > > > Key: YARN-3513 > URL: https://issues.apache.org/jira/browse/YARN-3513 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Trivial > Labels: BB2015-05-TBR, newbie > Attachments: YARN-3513.20150421-1.patch, YARN-3513.20150503-1.patch, > YARN-3513.20150506-1.patch, YARN-3513.20150507-1.patch, > YARN-3513.20150508-1.patch, YARN-3513.20150508-1.patch, > YARN-3513.20150511-1.patch > > > Some local variables in MonitoringThread.run() : {{vmemStillInUsage and > pmemStillInUsage}} are not used and just updated. > Instead we need to add debug log for overall resource usage by all containers -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3614) FileSystemRMStateStore throw exception when failed to remove application, that cause resourcemanager to crash
[ https://issues.apache.org/jira/browse/YARN-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537837#comment-14537837 ] nijel commented on YARN-3614: - hi @lachisis bq.when standby resourcemanager try to transitiontoActive, it will cost more than ten minutes to load applications Is this a secure cluster? > FileSystemRMStateStore throw exception when failed to remove application, > that cause resourcemanager to crash > - > > Key: YARN-3614 > URL: https://issues.apache.org/jira/browse/YARN-3614 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.5.0 >Reporter: lachisis >Priority: Critical > > FileSystemRMStateStore is only an accessory plug-in of the rmstore. > When it fails to remove an application, I think a warning is enough, but > currently the resourcemanager crashes. > Recently, I configured > "yarn.resourcemanager.state-store.max-completed-applications" to limit > the number of applications in the rmstore. When the number of applications exceeds the limit, > some old applications will be removed. If the removal fails, the resourcemanager > will crash.
> The following is log: > 2015-05-11 06:58:43,815 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Removing > info for app: application_1430994493305_0053 > 2015-05-11 06:58:43,815 INFO > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore: > Removing info for app: application_1430994493305_0053 at: > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > 2015-05-11 06:58:43,816 ERROR > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Error > removing app: application_1430994493305_0053 > java.lang.Exception: Failed to delete > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:806) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:879) > at > 
org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:874) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > 2015-05-11 06:58:43,819 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received a > org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type > STATE_STORE_OP_FAILED. Cause: > java.lang.Exception: Failed to delete > /hadoop/rmstore/FSRMStateRoot/RMAppRoot/application_1430994493305_0053 > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.deleteFile(FileSystemRMStateStore.java:572) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.removeApplicationStateInternal(FileSystemRMStateStore.java:471) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:185) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$RemoveAppTransition.transition(RMStateStore.java:171) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at >
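The behavior lachisis proposes, warning and continuing instead of raising a fatal event, might be sketched as follows. The class and method names are illustrative only, and plain java.nio stands in for the actual FileSystemRMStateStore code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Logger;

// Sketch of the proposed behavior (not the real FileSystemRMStateStore):
// a failed delete of a completed application's state is logged as a
// warning and reported to the caller, rather than escalated to a fatal
// STATE_STORE_OP_FAILED event that brings down the RM.
public class StateStoreCleanup {
    private static final Logger LOG =
        Logger.getLogger(StateStoreCleanup.class.getName());

    // Returns true if the app state path was removed; on failure, logs a
    // warning and returns false instead of propagating an exception.
    public static boolean tryRemoveAppState(Path appStatePath) {
        try {
            return Files.deleteIfExists(appStatePath);
        } catch (IOException e) {
            LOG.warning("Failed to delete " + appStatePath + ": " + e);
            return false;
        }
    }
}
```

Whether a store failure should ever be non-fatal is the design question under discussion here; this sketch only shows the warn-and-continue alternative.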
[jira] [Commented] (YARN-3409) Add constraint node labels
[ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537858#comment-14537858 ] David Villegas commented on YARN-3409: -- Thanks for your comment, Wangda. I agree that loadAvg may not be useful in all cases. The main idea behind dynamic label values is that the system would be more extensible, and that human error would be reduced, if some of the labels can be automatically populated. An example that comes to mind, based on Dian's comment, is the NodeManager's operating system. Rather than having an administrator set it, it could be pre-set to the actual OS by the NM. > Add constraint node labels > -- > > Key: YARN-3409 > URL: https://issues.apache.org/jira/browse/YARN-3409 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, capacityscheduler, client >Reporter: Wangda Tan >Assignee: Wangda Tan > > Specifying only one label for each node (in other words, partitioning a cluster) is a way to > determine how the resources of a particular set of nodes can be shared by a > group of entities (like teams, departments, etc.). Partitions of a cluster > have the following characteristics: > - The cluster is divided into several disjoint sub-clusters. > - ACLs/priority can apply to a partition (only the market team has > priority to use the partition). > - Percentages of capacity can apply to a partition (the market team has a 40% > minimum capacity and the dev team has a 60% minimum capacity of the partition). > Constraints are orthogonal to partitions; they describe attributes of a > node's hardware/software just for affinity. Some examples of constraints: > - glibc version > - JDK version > - Type of CPU (x86_64/i686) > - Type of OS (windows, linux, etc.) > With this, an application can ask for resources that have (glibc.version >= > 2.20 && JDK.version >= 8u20 && x86_64). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
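The NM-side auto-detection David describes could be sketched like this. The helper below is hypothetical, not YARN code, and simply keys off the JVM's os.name property:

```java
import java.util.Locale;

// Hypothetical sketch: a NodeManager-side helper that derives a coarse OS
// constraint label automatically, instead of requiring an administrator
// to configure it by hand.
public class OsLabelDetector {

    // Map an os.name value to a coarse label such as LINUX or WINDOWS.
    public static String detectOsLabel(String osName) {
        String normalized = osName.toLowerCase(Locale.ROOT);
        if (normalized.contains("windows")) {
            return "WINDOWS";
        } else if (normalized.contains("linux")) {
            return "LINUX";
        } else if (normalized.contains("mac")) {
            return "OSX";
        }
        return "OTHER";
    }

    // Detect the label for the JVM the NM itself is running in.
    public static String detectOsLabel() {
        return detectOsLabel(System.getProperty("os.name"));
    }
}
```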
[jira] [Resolved] (YARN-3599) Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn
[ https://issues.apache.org/jira/browse/YARN-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-3599. -- Resolution: Duplicate > Fix the javadoc of DelegationTokenSecretManager in hadoop-yarn > -- > > Key: YARN-3599 > URL: https://issues.apache.org/jira/browse/YARN-3599 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Reporter: Gabor Liptak >Priority: Trivial > Attachments: YARN-3599.1.patch, YARN-3599.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3587: - Hadoop Flags: Reviewed > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. > - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3598) Fix the javadoc of DelegationTokenSecretManager in hadoop-mapreduce
[ https://issues.apache.org/jira/browse/YARN-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-3598. -- Resolution: Duplicate > Fix the javadoc of DelegationTokenSecretManager in hadoop-mapreduce > --- > > Key: YARN-3598 > URL: https://issues.apache.org/jira/browse/YARN-3598 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Reporter: Gabor Liptak >Priority: Trivial > Attachments: YARN-3598.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3596) Fix the javadoc of DelegationTokenSecretManager in hadoop-common
[ https://issues.apache.org/jira/browse/YARN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-3596. -- Resolution: Duplicate > Fix the javadoc of DelegationTokenSecretManager in hadoop-common > > > Key: YARN-3596 > URL: https://issues.apache.org/jira/browse/YARN-3596 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Reporter: Gabor Liptak >Priority: Trivial > Attachments: YARN-3596.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3597) Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs
[ https://issues.apache.org/jira/browse/YARN-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du resolved YARN-3597. -- Resolution: Duplicate > Fix the javadoc of DelegationTokenSecretManager in hadoop-hdfs > -- > > Key: YARN-3597 > URL: https://issues.apache.org/jira/browse/YARN-3597 > Project: Hadoop YARN > Issue Type: Sub-task > Components: documentation >Reporter: Gabor Liptak >Priority: Trivial > Attachments: YARN-3597.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537869#comment-14537869 ] Hudson commented on YARN-3587: -- FAILURE: Integrated in Hadoop-trunk-Commit #7790 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7790/]) YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. projects. Contributed by Gabor Liptak. (junping_du: rev 7e543c27fa2881aa65967be384a6203bd5b2304f) * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. 
> - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3587: - Summary: Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. (was: Fix the javadoc of DelegationTokenSecretManager in yarn project) > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. > - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
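Applying the two suggested fixes (seconds corrected to milliseconds, and a time unit on every parameter) to the quoted javadoc might yield something like the following sketch; the committed patch may word it differently:

```java
/**
 * Create a secret manager.
 * @param delegationKeyUpdateInterval the number of milliseconds for rolling
 *        new secret keys
 * @param delegationTokenMaxLifetime the maximum lifetime of the delegation
 *        tokens, in milliseconds
 * @param delegationTokenRenewInterval how often the tokens must be renewed,
 *        in milliseconds
 * @param delegationTokenRemoverScanInterval how often the tokens are
 *        scanned for expired tokens, in milliseconds
 */
```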
[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings
[ https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537911#comment-14537911 ] Junping Du commented on YARN-3276: -- Thanks [~zjshen] for the review and comments! bq. TimelineServiceUtils -> TimelineServiceHelper? Sure. Will update it. bq. Is mapreduce using it? Maybe simply @Private In my understanding, @Private could mean it can be used by "Common", "HDFS", "MapReduce", and "YARN", so it could be broader than the current limitation? I didn't remove MapReduce here because, judging from other places, it seems we always keep MapReduce there as a practice even when there is no obvious reference from the MR project. Maybe it is better to keep it here as it is? bq. TimelineEvent are not covered? Nice catch! Will update it. bq. AllocateResponsePBImpl change is not related? Yes. There are several findbugs warnings (this and the change in TimelineMetric.java) involved in the previous patch on branch YARN-2928. I think it would be overkill to file a separate JIRA to fix these simple issues, so I put the fix here and updated the title a little bit. Does that make sense? > Refactor and fix null casting in some map cast for TimelineEntity (old and > new) and fix findbug warnings > > > Key: YARN-3276 > URL: https://issues.apache.org/jira/browse/YARN-3276 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-3276-YARN-2928.v3.patch, > YARN-3276-YARN-2928.v4.patch, YARN-3276-v2.patch, YARN-3276-v3.patch, > YARN-3276.patch > > > Per discussion in YARN-3087, we need to refactor some similar logic to cast > map to hashmap and get rid of the NPE issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings
[ https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3276: - Attachment: YARN-3276-YARN-2928.v5.patch Fixed most of [~zjshen]'s comments in the v5 patch. > Refactor and fix null casting in some map cast for TimelineEntity (old and > new) and fix findbug warnings > > > Key: YARN-3276 > URL: https://issues.apache.org/jira/browse/YARN-3276 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-3276-YARN-2928.v3.patch, > YARN-3276-YARN-2928.v4.patch, YARN-3276-YARN-2928.v5.patch, > YARN-3276-v2.patch, YARN-3276-v3.patch, YARN-3276.patch > > > Per discussion in YARN-3087, we need to refactor some similar logic to cast > map to hashmap and get rid of NPE issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-401) ClientRMService.getQueueInfo can return stale application reports
[ https://issues.apache.org/jira/browse/YARN-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-401. - Resolution: Duplicate This was fixed by YARN-2978. > ClientRMService.getQueueInfo can return stale application reports > - > > Key: YARN-401 > URL: https://issues.apache.org/jira/browse/YARN-401 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 0.23.6 >Reporter: Jason Lowe >Priority: Minor > > ClientRMService.getQueueInfo is modifying a QueueInfo object when application > reports are requested. Unfortunately this QueueInfo object could be a > persisting object in the scheduler, and modifying it in this way can lead to > stale application reports being returned to the client. Here's an example > scenario with CapacityScheduler: > # A client asks for queue info on queue X with application reports > # ClientRMService.getQueueInfo modifies the queue's QueueInfo object and sets > application reports on it > # Another client asks for recursive queue info from the root queue without > application reports > # Since the old application reports are still attached to queue X's QueueInfo > object, these stale reports appear in the QueueInfo data for queue X in the > results > Normally if the client is not asking for application reports it won't be > looking for and act upon any application reports that happen to appear in the > queue info result. However we shouldn't be returning application reports in > the first place, and when we do, they shouldn't be stale. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3422) relatedentities always return empty list when primary filter is set
[ https://issues.apache.org/jira/browse/YARN-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537962#comment-14537962 ] Billie Rinaldi commented on YARN-3422: -- That's true, changing the name to indicate direction would also be helpful. I think that fixing this limitation would complicate the write path significantly and is probably not worthwhile in ATS v1. If someone were to implement it, we would need to take before and after performance measurements and possibly make the new feature optional. > relatedentities always return empty list when primary filter is set > --- > > Key: YARN-3422 > URL: https://issues.apache.org/jira/browse/YARN-3422 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Chang Li >Assignee: Chang Li > Attachments: YARN-3422.1.patch > > > When you curl for ats entities with a primary filter, the relatedentities > fields always return empty list -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings
[ https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537978#comment-14537978 ] Hadoop QA commented on YARN-3276: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 36s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 9m 12s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 20s | The applied patch generated 2 new checkstyle issues (total was 105, now 107). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 50s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. 
| | | | 47m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731922/YARN-3276-YARN-2928.v5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / b3b791b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7862/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7862/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7862/console | This message was automatically generated. > Refactor and fix null casting in some map cast for TimelineEntity (old and > new) and fix findbug warnings > > > Key: YARN-3276 > URL: https://issues.apache.org/jira/browse/YARN-3276 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-3276-YARN-2928.v3.patch, > YARN-3276-YARN-2928.v4.patch, YARN-3276-YARN-2928.v5.patch, > YARN-3276-v2.patch, YARN-3276-v3.patch, YARN-3276.patch > > > Per discussion in YARN-3087, we need to refactor some similar logic to cast > map to hashmap and get rid of NPE issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-3360: - Attachment: YARN-3360.002.patch Updated patch to trunk. > Add JMX metrics to TimelineDataManager > -- > > Key: YARN-3360 > URL: https://issues.apache.org/jira/browse/YARN-3360 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-TBR > Attachments: YARN-3360.001.patch, YARN-3360.002.patch > > > The TimelineDataManager currently has no metrics, outside of the standard JVM > metrics. It would be very useful to at least log basic counts of method > calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538017#comment-14538017 ] Akira AJISAKA commented on YARN-3587: - Agree with [~djp]. Late +1 from me. Thanks [~djp], [~jianhe], and [~gliptak] for contribution! > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. > - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538030#comment-14538030 ] Junping Du commented on YARN-3587: -- Thanks [~ajisakaa]! :) > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. > - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538034#comment-14538034 ] Hudson commented on YARN-3587: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #192 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/192/]) YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. projects. Contributed by Gabor Liptak. (junping_du: rev 7e543c27fa2881aa65967be384a6203bd5b2304f) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. 
> - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc.
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538046#comment-14538046 ] Hudson commented on YARN-3587: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2140 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2140/]) YARN-3587. Fix the javadoc of DelegationTokenSecretManager in yarn, etc. projects. Contributed by Gabor Liptak. (junping_du: rev 7e543c27fa2881aa65967be384a6203bd5b2304f) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/RMDelegationTokenSecretManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/token/delegation/DelegationTokenSecretManager.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/JHSDelegationTokenSecretManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java > Fix the javadoc of DelegationTokenSecretManager in projects of yarn, etc. 
> - > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538058#comment-14538058 ] Junping Du commented on YARN-3044: -- Sorry for coming to this late. The latest patch LGTM too. [~sjlee0], feel free to go ahead and commit this! However, [~vinodkv]'s comment "We can take a dual pronged approach here? That or we make the RM-publisher itself a distributed push." sounds reasonable to me but hasn't been fully addressed in this JIRA. Shall we open a new JIRA for further discussion on this? > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538078#comment-14538078 ] Hadoop QA commented on YARN-3360: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 39s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 28s | The applied patch generated 19 new checkstyle issues (total was 7, now 26). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 47s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 3m 8s | Tests passed in hadoop-yarn-server-applicationhistoryservice. 
| | | | 38m 45s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731939/YARN-3360.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 7e543c2 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7863/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/7863/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7863/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7863/console | This message was automatically generated. > Add JMX metrics to TimelineDataManager > -- > > Key: YARN-3360 > URL: https://issues.apache.org/jira/browse/YARN-3360 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-TBR > Attachments: YARN-3360.001.patch, YARN-3360.002.patch > > > The TimelineDataManager currently has no metrics, outside of the standard JVM > metrics. It would be very useful to at least log basic counts of method > calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2942) Aggregated Log Files should be combined
[ https://issues.apache.org/jira/browse/YARN-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538087#comment-14538087 ] Jason Lowe commented on YARN-2942: -- My apologies for taking so long to respond. I took a look at the v6 and v7 proposals. If I understand them correctly they both propose that the NMs upload the original per-node aggregated log to HDFS and then something (either the NMs or the RM) later comes along and creates the aggregate-of-aggregates log with a side-index for faster searching and ability to correct for failed appends. These are reasonable ideas, and I prefer the simpler approach. However I didn't see details on solving the race condition where a log reader comes along, sees from the index file that the desired log isn't in the aggregate-of-aggregates, then opens the log and reads from it just as the log is deleted by the entity appending to the aggregate-of-aggregates. Since we don't have UNIX-style refcounting of open files in HDFS, deleting the log while the reader is trying to read from it is going to be disruptive. One thing to consider in the proposals -- do we want a threshold for a per-node log file where we do not try to append it to the aggregate-of-aggregates file? We have an internal solution where we create per-application har files of the logs, and that process intentionally skips files that are already "big enough" on their own. Saves significant time and network traffic aggregating files that are already beefy enough on their own to justify their existence, as we're primarily concerned with cleaning up the tiny logs per node, per app. Another issue from log aggregation we've seen in practice is that the proposals don't address the significant write load the per-node aggregate files place on the namenode. 
This isn't an absolute requirement for the design, but we've noticed it's not just about the number of files and blocks being created but also the overall write load associated with those files. It would be really nice to reduce that load significantly. Thinking off the top of my head, one possibility is to have the RM coordinate log aggregation across the nodes. It would work something like this:
- NMs do not upload logs for an application to the aggregate file until told to do so by the RM (probably in the NM heartbeat response)
- NMs provide periodic progress reports in their heartbeat on how aggregation is proceeding and when it succeeds/fails.
- The RM coordinates and tracks the aggregation process (which NM is "active", revoking NMs that have taken too long without progress, etc.)
- Logs would remain on NM local disk and be served from there until they are uploaded into the app aggregate file, similar to how they work today with the per-node aggregate file
This has the advantages of only uploading the logs to HDFS once, only as a single aggregate file (plus index), and doesn't require ZooKeeper. A significant downside is that it delays the average time until the logs are available on HDFS for an application, due to the serialized upload process.
> Aggregated Log Files should be combined > --- > > Key: YARN-2942 > URL: https://issues.apache.org/jira/browse/YARN-2942 > Project: Hadoop YARN > Issue Type: New Feature >Affects Versions: 2.6.0 >Reporter: Robert Kanter >Assignee: Robert Kanter > Attachments: CombinedAggregatedLogsProposal_v3.pdf, > CombinedAggregatedLogsProposal_v6.pdf, CombinedAggregatedLogsProposal_v7.pdf, > CompactedAggregatedLogsProposal_v1.pdf, > CompactedAggregatedLogsProposal_v2.pdf, > ConcatableAggregatedLogsProposal_v4.pdf, > ConcatableAggregatedLogsProposal_v5.pdf, YARN-2942-preliminary.001.patch, > YARN-2942-preliminary.002.patch, YARN-2942.001.patch, YARN-2942.002.patch, > YARN-2942.003.patch > > > Turning on log aggregation allows users to easily store container logs in > HDFS and subsequently view them in the YARN web UIs from a central place. > Currently, there is a separate log file for each Node Manager. This can be a > problem for HDFS if you have a cluster with many nodes as you’ll slowly start > accumulating many (possibly small) files per YARN application. The current > “solution” for this problem is to configure YARN (actually the JHS) to > automatically delete these files after some amount of time. > We should improve this by compacting the per-node aggregated log files into > one log file per application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
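The size-threshold idea raised in the comment above (skip per-node logs that are already "big enough" on their own when building the combined file) can be sketched as follows. The class name, method, and threshold value are illustrative assumptions, not part of any posted patch:

```java
// Illustrative sketch: decide whether a per-node aggregated log should be
// appended to the aggregate-of-aggregates file, or left standalone because
// it is already large enough to justify its own existence.
class CombineThresholdPolicy {
  private final long minStandaloneSizeBytes;

  CombineThresholdPolicy(long minStandaloneSizeBytes) {
    this.minStandaloneSizeBytes = minStandaloneSizeBytes;
  }

  /** Combine only the small per-node logs; large ones stay standalone. */
  boolean shouldCombine(long perNodeLogSizeBytes) {
    return perNodeLogSizeBytes < minStandaloneSizeBytes;
  }
}
```

This keeps the combine pass focused on the tiny per-node, per-app files that actually hurt the namenode, and avoids re-copying already-beefy logs.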
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538146#comment-14538146 ] Zhijie Shen commented on YARN-3044: --- [~sjlee0], would you mind holding the commit for a while? I want to take a look at the last patch too:-) > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538152#comment-14538152 ] Sangjin Lee commented on YARN-3044: --- No problem. Take your time. [~djp], I'll file a JIRA. > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3616) determine how to generate YARN container events
Sangjin Lee created YARN-3616: - Summary: determine how to generate YARN container events Key: YARN-3616 URL: https://issues.apache.org/jira/browse/YARN-3616 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee The initial design called for the node manager to write YARN container events to take advantage of the distributed writes. RM acting as a sole writer of all YARN container events would have significant scalability problems. Still, there are some types of events that are not captured by the NM. The current implementation has both: RM writing container events and NM writing container events. We need to sort this out, and decide how we can write all needed container events in a scalable manner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538160#comment-14538160 ] Sangjin Lee commented on YARN-3044: --- YARN-3616 filed. > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538162#comment-14538162 ] Allen Wittenauer commented on YARN-3170: I'll be honest: I greatly dislike that entire first paragraph. MRv2 needs to be struck from the vocabulary. It was a marketing ploy to get YARN's acceptance into Hadoop as a subproject. It also helps underscore the problems of what to call the "new" MR API in the actual MR subproject. I'm inclined to think the entire paragraph should just get deleted. The second paragraph should be rewritten. There's little value in comparing YARN to earlier versions of Hadoop at this point. Don't describe YARN in terms of the JobTracker. If I'm new to Hadoop, I have no idea what the heck a JT even is. {code} An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs.{code} * Drop "in the classical sense of Map-Reduce jobs". {code} The ResourceManager and per-node slave, the NodeManager (*NM*), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. {code} * Drop "per-node slave," * Drop "(*NM*)" * Add a description of the node manager after the description of the resource manager in this paragraph. > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538165#comment-14538165 ] Junping Du commented on YARN-3505: -- bq. If this happens, that means the log aggregation still happens in some of NMs. I see. Agree that we don't need to do any cleanup in this case. Some minor comments on the updated patch: In aggregateLogReport.java,
{code}
-    if (report.getDiagnosticMessage() != null
-        && !report.getDiagnosticMessage().isEmpty()) {
-      curReport
-          .setDiagnosticMessage(curReport.getDiagnosticMessage() == null
-              ? report.getDiagnosticMessage() : curReport
-                  .getDiagnosticMessage() + report.getDiagnosticMessage());
+    if (!curReport.getLogAggregationStatus().equals(
+          LogAggregationStatus.SUCCEEDED)
+        && !curReport.getLogAggregationStatus().equals(
+          LogAggregationStatus.FAILED)
+        && (report.getLogAggregationStatus().equals(
+          LogAggregationStatus.SUCCEEDED)
+        || report.getLogAggregationStatus().equals(
+          LogAggregationStatus.FAILED))) {
+      statusChanged = true; // anchor 1 for comments
+    }
+    if (report.getLogAggregationStatus() != LogAggregationStatus.RUNNING
+        || curReport.getLogAggregationStatus() !=
+            LogAggregationStatus.RUNNING_WITH_FAILURE) {
+      curReport.setLogAggregationStatus(report
+          .getLogAggregationStatus()); // anchor 2 for comments
+    }
{code}
Are we missing a curReport.setLogAggregationStatus() call at anchor 1 above? Shouldn't we set SUCCEEDED or FAILED on curReport there? In addition, why don't we set statusChanged at anchor 2 above? If statusChanged is only meant to signal that the status moved to a final state (SUCCEEDED or FAILED), then we should rename it to something like stateChangedToFinal, which sounds more obvious. BTW, can we simplify the logic here so that the status gets updated except in only two cases: 1. curReport.getLogAggregationStatus() == report.getLogAggregationStatus(); 2. curReport.getLogAggregationStatus() == RUNNING_WITH_FAILURE && report.getLogAggregationStatus() == RUNNING. In updateLogAggregationDiagnosticMessages(),
{code}
+    if (report.getLogAggregationStatus()
+          == LogAggregationStatus.RUNNING || report.getLogAggregationStatus()
+          == LogAggregationStatus.SUCCEEDED || report.getLogAggregationStatus()
+          == LogAggregationStatus.FAILED) {
{code}
Why doesn't the case "report.getLogAggregationStatus() == LogAggregationStatus.FAILED" go to the other branch, like LogAggregationStatus.RUNNING_WITH_FAILURE?
{code}
+    LogAggregationDiagnosticsForNMs.put(nodeId, diagnostics);
{code}
Move this into the "diagnostics == null" block, right after "diagnostics = new ArrayList();", because we only need to call it the first time we put the diagnostics info. The same applies to failureMessages. > Node's Log Aggregation Report with SUCCEED should not cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch > > > Per discussions in YARN-1402, we shouldn't cache all node's log aggregation > reports in RMApps for always, especially for those finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
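The simplification suggested above can be sketched as a small helper. This is a hypothetical stand-alone sketch (the enum and method names stand in for YARN's real LogAggregationStatus and report classes): update the current status from the incoming report except in the two listed cases.

```java
// Stand-in for org.apache.hadoop.yarn.api.records.LogAggregationStatus.
enum LogAggStatus { RUNNING, RUNNING_WITH_FAILURE, SUCCEEDED, FAILED }

class StatusMerge {
    // Hypothetical helper: return the status curReport should take, updating
    // from the incoming report except in the two cases listed in the comment:
    // (1) both statuses are already equal;
    // (2) cur is RUNNING_WITH_FAILURE while the report still says RUNNING
    //     (don't let a stale RUNNING overwrite a failure observation).
    static LogAggStatus merge(LogAggStatus cur, LogAggStatus report) {
        boolean skipUpdate = cur == report
            || (cur == LogAggStatus.RUNNING_WITH_FAILURE
                && report == LogAggStatus.RUNNING);
        return skipUpdate ? cur : report;
    }
}
```

With this shape, statusChanged falls out as `merge(cur, report) != cur` rather than a separate hand-maintained flag.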
[jira] [Commented] (YARN-3606) Spark container fails to launch if spark-assembly.jar file has different timestamp
[ https://issues.apache.org/jira/browse/YARN-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538190#comment-14538190 ] Steve Loughran commented on YARN-3606: -- Looking at the timestamp is the strategy chosen based on a key assumption: there is a single artifact to localise by downloading from a single shared filesystem. Trying to use local filesystems, each with a cached copy of the artifact, isn't what the NM expects to be doing. If it is, then the normal localisation checks aren't valid. I think the checksum is probably omitted because you have to read the whole file to see whether it has changed; plus there's the cost of actually recalculating that checksum prior to launching every container. Timestamps aren't too great though: the check as it stands will reject the same file with two different times *or* two differently sized files with the same timestamp. > Spark container fails to launch if spark-assembly.jar file has different > timestamp > -- > > Key: YARN-3606 > URL: https://issues.apache.org/jira/browse/YARN-3606 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.6.0 > Environment: YARN 2.6.0 > Spark 1.3.1 >Reporter: Michael Le >Priority: Minor > > In a YARN cluster, when submitting a Spark job, the Spark job will fail to > run because YARN fails to launch containers on the other nodes (not the node > where the job submission took place). > YARN checks for a matching spark-assembly.jar file by looking at the timestamps. > This check will fail when the spark-assembly.jar is the same but was copied to > the location at a different time. 
> YARN throws this exception: > 15/05/07 20:13:22 INFO yarn.ExecutorRunnable: Setting up executor with > commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill > %p', -Xms1024m, -Xmx1024m, -Djava.io.tmpdir={{PWD}}/tmp, > '-Dspark.driver.port=52357', -Dspark.yarn.app.container.log.dir=, > org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, > akka.tcp://sparkDriver@xxx:52357/user/CoarseGrainedScheduler, --executor-id, > 4, --hostname, xxx, --cores, 1, --app-id, application_1431047540996_0001, > --user-class-path, file:$PWD/__app__.jar, 1>, /stdout, 2>, > /stderr) > 15/05/07 20:13:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : > xxx:34165 > 15/05/07 20:13:27 INFO yarn.YarnAllocator: Completed container > container_1431047540996_0001_02_05 (state: COMPLETE, exit status: -1000) > 15/05/07 20:13:27 INFO yarn.YarnAllocator: Container marked as failed: > container_1431047540996_0001_02_05. Exit status: -1000. Diagnostics: > Resource > file:/home/spark/spark-1.3.1-bin-hadoop2.6/lib/spark-assembly-1.3.1-hadoop2.6.0.jar > changed on src filesystem (expected 1430944255000, was 1430944249000 > java.io.IOException: Resource > file:/home/spark/spark-1.3.1-bin-hadoop2.6/lib/spark-assembly-1.3.1-hadoop2.6.0.jar > changed on src filesystem (expected 1430944255000, was 1430944249000 > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) > at > org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60) > at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Problem can be easily replicated by setting up two nodes and copying the > spark-assembly.jar to each node but changing the timestamp of the file on one > of the nodes. Then execute spark-shell --master yarn-client. Observe the > nodemanager log on the other node to find the error. > Work around is to make sure the jar file has the same timestamp. But it looks > like perhaps the function that does the copy and check of the jar file > (org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) should > check for file similarity using a checksum rather than timestamp. --
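The checksum alternative discussed above can be sketched as follows. This is a hedged illustration, not YARN's actual FSDownload API: it compares content digests instead of modification times, so two copies of spark-assembly.jar with different mtimes but identical bytes would pass, at the cost of reading each file fully (the overhead Steve notes).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChecksumCheck {
    // Hypothetical replacement for the timestamp comparison: two files are
    // considered the same resource iff their content digests match.
    static boolean sameContent(Path a, Path b) throws IOException {
        return MessageDigest.isEqual(digest(a), digest(b));
    }

    private static byte[] digest(Path p) throws IOException {
        try {
            // Reads the whole file -- this is the per-container cost that
            // makes the checksum approach expensive for large assembly jars.
            return MessageDigest.getInstance("SHA-256")
                .digest(Files.readAllBytes(p));
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 is a mandatory JCA algorithm", e);
        }
    }
}
```

A middle ground would be to checksum once at upload time and record the digest alongside the resource, so localisation only re-reads the local copy.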
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538216#comment-14538216 ] Wangda Tan commented on YARN-3434: -- Jenkins doesn't get back, sent a mail to hadoop-dev for help. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538232#comment-14538232 ] Hadoop QA commented on YARN-3434: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731239/YARN-3434-branch2.7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / b9cebfc | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7864/console | This message was automatically generated. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3170: --- Attachment: YARN-3170-005.patch > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538242#comment-14538242 ] Brahma Reddy Battula commented on YARN-3170: [~aw] Thanks for taking a look at this issue. I updated the patch based on your comments; kindly review. Let me know if any other rework is needed in the second paragraph (mainly the first line). > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3616) determine how to generate YARN container events
[ https://issues.apache.org/jira/browse/YARN-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R reassigned YARN-3616: --- Assignee: Naganarasimha G R > determine how to generate YARN container events > --- > > Key: YARN-3616 > URL: https://issues.apache.org/jira/browse/YARN-3616 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > > The initial design called for the node manager to write YARN container events > to take advantage of the distributed writes. RM acting as a sole writer of > all YARN container events would have significant scalability problems. > Still, there are some types of events that are not captured by the NM. The > current implementation has both: RM writing container events and NM writing > container events. > We need to sort this out, and decide how we can write all needed container > events in a scalable manner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538252#comment-14538252 ] Thomas Graves commented on YARN-3434: - What's your question exactly? For branch patches, Jenkins has never been hooked up. We generally download the patch, build, possibly run the tests that apply, and commit. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3170) YARN architecture document needs updating
[ https://issues.apache.org/jira/browse/YARN-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538259#comment-14538259 ] Hadoop QA commented on YARN-3170: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 55s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 11s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731957/YARN-3170-005.patch | | Optional Tests | site | | git revision | trunk / b9cebfc | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7865/console | This message was automatically generated. > YARN architecture document needs updating > - > > Key: YARN-3170 > URL: https://issues.apache.org/jira/browse/YARN-3170 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Reporter: Allen Wittenauer >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3170-002.patch, YARN-3170-003.patch, > YARN-3170-004.patch, YARN-3170-005.patch, YARN-3170.patch > > > The marketing paragraph at the top, "NextGen MapReduce", etc are all > marketing rather than actual descriptions. It also needs some general > updates, esp given it reads as though 0.23 was just released yesterday. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538265#comment-14538265 ] Wangda Tan commented on YARN-3434: -- Just read [~aw]'s comment: https://issues.apache.org/jira/browse/HADOOP-11746?focusedCommentId=14499458&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14499458. Now it can only support branches after branch-2.7. So I will run all RM tests locally for YARN-3434 and commit. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3616) determine how to generate YARN container events
[ https://issues.apache.org/jira/browse/YARN-3616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538303#comment-14538303 ] Naganarasimha G R commented on YARN-3616: - I would like to continue working on this issue :). Also, to capture one important point from [~Vinodkv]'s review: bq. The missing dots occur when a container's life-cycle ends either on the RM or the AM. We can take a dual pronged approach here? That or we make the RM-publisher itself a distributed push. IMO the dual-pronged approach would be better: we can rely on NMs to post normal life-cycle events, and in the rare cases the NM can't handle, the RM can publish events directly to ATS. Distributed push might not work here because, in the cases Vinod mentioned, the NM might not be able to handle the publishing at all: a TimelineCollector might not have been created, since no container was created on the NM side for that app. Correct me if I am wrong. > determine how to generate YARN container events > --- > > Key: YARN-3616 > URL: https://issues.apache.org/jira/browse/YARN-3616 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > > The initial design called for the node manager to write YARN container events > to take advantage of the distributed writes. RM acting as a sole writer of > all YARN container events would have significant scalability problems. > Still, there are some types of events that are not captured by the NM. The > current implementation has both: RM writing container events and NM writing > container events. > We need to sort this out, and decide how we can write all needed container > events in a scalable manner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
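The dual-pronged dispatch discussed above can be sketched in miniature. This is hypothetical toy code, not the YARN-2928 implementation: events go through the NM's distributed write when a TimelineCollector exists for the app there; otherwise the RM publishes directly to ATS.

```java
import java.util.ArrayList;
import java.util.List;

class EventDispatchSketch {
    // Records which path handled each event, for illustration only.
    final List<String> sink = new ArrayList<>();

    // Hypothetical dual-pronged dispatch: normal life-cycle events take the
    // scalable NM path; events the NM cannot handle (e.g. no collector was
    // created because no container ran there for the app) fall back to the RM.
    void publish(String event, boolean nmHasCollector) {
        sink.add((nmHasCollector ? "NM:" : "RM:") + event);
    }
}
```

The key property is that the RM path stays rare, so the RM never becomes the sole writer with the scalability problems the issue description mentions.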
[jira] [Commented] (YARN-3595) Performance optimization using connection cache of Phoenix timeline writer
[ https://issues.apache.org/jira/browse/YARN-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538341#comment-14538341 ] Li Lu commented on YARN-3595: - Hi [~sjlee0], thanks for the suggestions. I think you're right that most complexities come from having a cache rather than a pool for those connections. I'll look into alternative solutions. > Performance optimization using connection cache of Phoenix timeline writer > -- > > Key: YARN-3595 > URL: https://issues.apache.org/jira/browse/YARN-3595 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > The story about the connection cache in Phoenix timeline storage is a little > bit long. In YARN-3033 we planned to have shared writer layer for all > collectors in the same collector manager. In this way we can better reuse the > same heavy-weight storage layer connection, therefore it's more friendly to > conventional storage layer connections which are typically heavy-weight. > Phoenix, on the other hand, implements its own connection interface layer to > be light-weight, thread-unsafe. To make these connections work with our > "multiple collector, single writer" model, we're adding a thread indexed > connection cache. However, many performance critical factors are yet to be > tested. > In this JIRA we're tracing performance optimization efforts using this > connection cache. Previously we had a draft, but there was one implementation > challenge on cache evictions: There may be races between Guava cache's > removal listener calls (which close the connection) and normal references to > the connection. We need to carefully define the way they synchronize. > Performance-wise, at the very beginning stage we may need to understand: > # If the current, thread-based indexing is an appropriate approach, or we can > use some better ways to index the connections. 
> # the best size of the cache, presumably as the proposed default value of a > configuration. > # how long we need to preserve a connection in the cache. > Please feel free to add to this list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
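The pool alternative implied above can be sketched as follows. This is a hedged illustration with hypothetical names (Phoenix's real connections are java.sql.Connection objects): writers borrow a connection and return it when done, so no eviction listener can close a connection while a writer still holds it, which is exactly the Guava removal-listener race described in the issue.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ConnPoolSketch<C> {
    private final BlockingQueue<C> idle;

    // Pre-created connections; in a real writer these would be Phoenix JDBC
    // connections opened when the storage service starts.
    ConnPoolSketch(Collection<C> conns) {
        idle = new ArrayBlockingQueue<>(conns.size(), true, conns);
    }

    // Blocks until a connection is free. A borrowed connection is never
    // concurrently closed, so no synchronization with an eviction listener
    // is needed.
    C borrow() throws InterruptedException {
        return idle.take();
    }

    void release(C conn) {
        idle.offer(conn);
    }
}
```

Compared with a thread-indexed cache, the pool bounds the number of live connections independently of the number of writer threads, which also answers the "best size" question with an explicit configuration knob.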
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538346#comment-14538346 ] Allen Wittenauer commented on YARN-3434: You can run test-patch.sh locally and specify the branch using --branch. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538367#comment-14538367 ] Wangda Tan commented on YARN-3434: -- Thanks Allen! Trying it. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538437#comment-14538437 ] Zhijie Shen commented on YARN-3044: --- Sorry to put my comments in at the last minute: 1. I'm still not sure why it is necessary to have RMContainerEntity. Whether the container entity comes from the RM or the NM, it's about the container's info. Any reason we want to differentiate the two? At the reader side, if I want to list all containers of an app, should I return RMContainerEntity or ContainerEntity? I'm inclined to having only ContainerEntity, but the RM and NM may put different info/events into it based on their knowledge. 2. Shouldn't the v1 and v2 publishers differ only at publishEvent? It seems that we duplicate more code than that. Perhaps defining and implementing SystemMetricsEvent.toTimelineEvent can further clean up the code. 3. I saw v2 is going to send the config, but where is the config coming from? Did we conclude who sends the config, and how? IAC, sending the config seems to be half done. Also, we can use {{entity.addConfigs(event.getConfig());}}; there is no need to iterate over the config collection and put each config one by one. 4. yarn.system-metrics-publisher.rm.publish.container-metrics -> yarn.rm.system-metrics-publisher.emit-container-events?
{code}
374   public static final String RM_PUBLISH_CONTAINER_METRICS_ENABLED = YARN_PREFIX
375       + "system-metrics-publisher.rm.publish.container-metrics";
376   public static final boolean DEFAULT_RM_PUBLISH_CONTAINER_METRICS_ENABLED =
377       false;
{code}
Moreover, I also think we should not have "yarn.system-metrics-publisher.enabled" either, and should reuse the existing config. This is not limited to the RM metrics publisher, but applies to all existing ATS services. IMHO, the better practice is to reuse the existing config, and we can have a global config (or env var) timeline-service.version to determine whether the service is enabled with the v1 or v2 implementation. Anyway, it's a separate problem; I'll file a separate jira for it. 5. Methods/inner classes in SystemMetricsPublisher don't need to be changed to "public". Default access should be enough to access them? > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538466#comment-14538466 ] Vinod Kumar Vavilapalli commented on YARN-3134: --- Tx folks, this is great progress! > [Storage implementation] Exploiting the option of using Phoenix to access > HBase backend > --- > > Key: YARN-3134 > URL: https://issues.apache.org/jira/browse/YARN-3134 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Li Lu > Fix For: YARN-2928 > > Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, > YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, > YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, > YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, > YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, > YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, > YARN-3134-YARN-2928.007.patch, YARN-3134DataSchema.pdf, > hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out > > > Quote the introduction on the Phoenix web page: > {code} > Apache Phoenix is a relational database layer over HBase delivered as a > client-embedded JDBC driver targeting low latency queries over HBase data. > Apache Phoenix takes your SQL query, compiles it into a series of HBase > scans, and orchestrates the running of those scans to produce regular JDBC > result sets. The table metadata is stored in an HBase table and versioned, > such that snapshot queries over prior versions will automatically use the > correct schema. Direct use of the HBase API, along with coprocessors and > custom filters, results in performance on the order of milliseconds for small > queries, or seconds for tens of millions of rows. > {code} > It may simplify our implementation to read/write data from/to HBase, and we can > easily build indexes and compose complex queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems
Georg Berendt created YARN-3618: --- Summary: Fix unused variable to get CPU frequency on Windows systems Key: YARN-3618 URL: https://issues.apache.org/jira/browse/YARN-3618 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Environment: Windows 7 x64 SP1 Reporter: Georg Berendt Priority: Minor In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, there is an unused variable for CPU frequency:
{code}
/** {@inheritDoc} */
@Override
public long getCpuFrequency() {
  refreshIfNeeded();
  return -1;
}
{code}
Please change '-1' to use 'cpuFrequencyKhz'. org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3617) Fix unused variable to get CPU frequency on Windows systems
Georg Berendt created YARN-3617: --- Summary: Fix unused variable to get CPU frequency on Windows systems Key: YARN-3617 URL: https://issues.apache.org/jira/browse/YARN-3617 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Environment: Windows 7 x64 SP1 Reporter: Georg Berendt Priority: Minor In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, there is an unused variable for CPU frequency:
{code}
/** {@inheritDoc} */
@Override
public long getCpuFrequency() {
  refreshIfNeeded();
  return -1;
}
{code}
Please change '-1' to use 'cpuFrequencyKhz'. org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
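The fix both tickets ask for can be sketched as below. This is a hypothetical stand-alone class, not the real WindowsResourceCalculatorPlugin: it assumes refreshIfNeeded() populates a cpuFrequencyKhz field, and the stub value is invented purely for illustration.

```java
class WindowsCpuPluginSketch {
    // Stand-in for the field the real plugin refreshes by parsing winutils
    // output; -1 means "not yet refreshed".
    private long cpuFrequencyKhz = -1;

    private void refreshIfNeeded() {
        // Stubbed: a made-up 2.6 GHz frequency, reported in kHz.
        if (cpuFrequencyKhz < 0) {
            cpuFrequencyKhz = 2_600_000L;
        }
    }

    /** Returns the CPU frequency in kHz after refreshing. */
    public long getCpuFrequency() {
        refreshIfNeeded();
        return cpuFrequencyKhz; // was: return -1, leaving the field unused
    }
}
```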
[jira] [Created] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
Jason Lowe created YARN-3619: Summary: ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException Key: YARN-3619 URL: https://issues.apache.org/jira/browse/YARN-3619 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.7.0 Reporter: Jason Lowe ContainerMetrics is able to unregister itself during the getMetrics method, but that method can be called by MetricsSystemImpl.sampleMetrics which is trying to iterate the sources. This leads to a ConcurrentModificationException log like this: {noformat} 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN impl.MetricsSystemImpl: java.util.ConcurrentModificationException {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
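The failure mode described above can be illustrated in isolation (this is not the NodeManager code, just plain collections): a callback invoked mid-iteration mutates the collection being iterated, the way a ContainerMetrics source unregisters itself while MetricsSystemImpl.sampleMetrics is walking the source list.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Minimal demonstration of the fail-fast iterator behavior behind the bug:
// mutating an ArrayList while a for-each loop is iterating it throws
// ConcurrentModificationException on the next iterator step.
class CmeSketch {
  public static void main(String[] args) {
    List<String> sources = new ArrayList<>();
    sources.add("containerMetrics-1");
    sources.add("containerMetrics-2");
    sources.add("containerMetrics-3");
    try {
      for (String s : sources) {
        // Simulates a source unregistering itself during getMetrics().
        sources.remove(s);
      }
      System.out.println("no exception");
    } catch (ConcurrentModificationException e) {
      System.out.println("ConcurrentModificationException");
    }
  }
}
```

Typical fixes are deferring the unregistration until after iteration, or iterating over a snapshot of the sources.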
[jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538581#comment-14538581 ] Jason Lowe commented on YARN-3619: -- This appears to have been caused by YARN-2984. [~kasha] would you mind taking a look? > ContainerMetrics unregisters during getMetrics and leads to > ConcurrentModificationException > --- > > Key: YARN-3619 > URL: https://issues.apache.org/jira/browse/YARN-3619 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.0 >Reporter: Jason Lowe > > ContainerMetrics is able to unregister itself during the getMetrics method, > but that method can be called by MetricsSystemImpl.sampleMetrics which is > trying to iterate the sources. This leads to a > ConcurrentModificationException log like this: > {noformat} > 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN > impl.MetricsSystemImpl: java.util.ConcurrentModificationException > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-3619: -- Assignee: Karthik Kambatla > ContainerMetrics unregisters during getMetrics and leads to > ConcurrentModificationException > --- > > Key: YARN-3619 > URL: https://issues.apache.org/jira/browse/YARN-3619 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.7.0 >Reporter: Jason Lowe >Assignee: Karthik Kambatla > > ContainerMetrics is able to unregister itself during the getMetrics method, > but that method can be called by MetricsSystemImpl.sampleMetrics which is > trying to iterate the sources. This leads to a > ConcurrentModificationException log like this: > {noformat} > 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN > impl.MetricsSystemImpl: java.util.ConcurrentModificationException > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3620) MetricsSystemImpl fails to show backtrace when an error occurs
Jason Lowe created YARN-3620: Summary: MetricsSystemImpl fails to show backtrace when an error occurs Key: YARN-3620 URL: https://issues.apache.org/jira/browse/YARN-3620 Project: Hadoop YARN Issue Type: Bug Reporter: Jason Lowe Assignee: Jason Lowe While investigating YARN-3619 it was frustrating that MetricsSystemImpl was logging a ConcurrentModificationException but without any backtrace. Logging a backtrace would be very beneficial to tracking down the cause of the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
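The difference the report is asking for can be shown with plain Java (this is not the actual MetricsSystemImpl code): logging only the exception's string representation loses the stack trace, while passing the Throwable itself preserves it.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Contrast of the two logging styles: exception.toString() has no backtrace,
// while printing the Throwable includes the full stack trace, which is what
// makes tracking down the source of the exception possible.
class BacktraceSketch {
  public static void main(String[] args) {
    Exception e = new IllegalStateException("boom");

    // What a call like LOG.warn(e.toString()) amounts to: no backtrace.
    String withoutTrace = e.toString();

    // What a call like LOG.warn("sampling error", e) amounts to:
    // message plus the full stack trace.
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    e.printStackTrace(new PrintStream(buf));
    String withTrace = buf.toString();

    System.out.println(withoutTrace.contains("BacktraceSketch.main"));
    System.out.println(withTrace.contains("BacktraceSketch.main"));
  }
}
```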
[jira] [Created] (YARN-3621) FairScheduler doesn't count AM vcores towards max-share
Karthik Kambatla created YARN-3621: -- Summary: FairScheduler doesn't count AM vcores towards max-share Key: YARN-3621 URL: https://issues.apache.org/jira/browse/YARN-3621 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.1 Reporter: Karthik Kambatla FairScheduler seems to not count AM vcores towards max-vcores. On a queue with maxVcores set to 1, I am able to run a sleep job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3622) Enable application client to communicate with new timeline service
Zhijie Shen created YARN-3622: - Summary: Enable application client to communicate with new timeline service Key: YARN-3622 URL: https://issues.apache.org/jira/browse/YARN-3622 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen A YARN application has a client and an AM. We have a story to make TimelineClient work inside the AM for v2, but not inside the client. The TimelineClient in the application client needs to be taken care of too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3623) Having the config to indicate the timeline service version
Zhijie Shen created YARN-3623: - Summary: Having the config to indicate the timeline service version Key: YARN-3623 URL: https://issues.apache.org/jira/browse/YARN-3623 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen So far RM, MR AM, and DA AM have added/changed configs to enable writing timeline data to the v2 server. It would be good to have a YARN timeline-service.version config, like timeline-service.enable, to indicate the version of the timeline service running in the given YARN cluster. This is beneficial for users moving more smoothly from v1 to v2, as they don't need to change the existing configs, just switch this config from v1 to v2, and each framework doesn't need its own v1/v2 config. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
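A yarn-site.xml fragment for the proposed switch might look like the following. The yarn.timeline-service.version property name is an assumption modeled on the existing yarn.timeline-service.enabled key; the issue only proposes the idea and does not fix a key name.

```xml
<!-- Hypothetical sketch of the proposed version switch. Only
     yarn.timeline-service.enabled is an existing key; the version
     property name below is an assumption, not a committed config. -->
<property>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.version</name>
  <value>2</value>
</property>
```

With such a switch, frameworks could pick the v1 or v2 client path from one cluster-wide setting instead of each defining its own v1/v2 config.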
[jira] [Updated] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3505: Attachment: YARN-3505.4.patch > Node's Log Aggregation Report with SUCCEED should not cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch > > > Per discussions in YARN-1402, we shouldn't cache all node's log aggregation > reports in RMApps for always, especially for those finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538716#comment-14538716 ] Xuan Gong commented on YARN-3505: - Upload a new patch to address all the comments > Node's Log Aggregation Report with SUCCEED should not cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch > > > Per discussions in YARN-1402, we shouldn't cache all node's log aggregation > reports in RMApps for always, especially for those finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets
Mit Desai created YARN-3624: --- Summary: ApplicationHistoryServer reverses the order of the filters it gets Key: YARN-3624 URL: https://issues.apache.org/jira/browse/YARN-3624 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.6.0 Reporter: Mit Desai Assignee: Mit Desai ApplicationHistoryServer should not alter the order in which it gets the filter chain. Additional filters should be added at the end of the chain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets
[ https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-3624: Attachment: YARN-3624.patch attaching the patch > ApplicationHistoryServer reverses the order of the filters it gets > -- > > Key: YARN-3624 > URL: https://issues.apache.org/jira/browse/YARN-3624 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-3624.patch > > > AppliactionHistoryServer should not alter the order in which it gets the > filter chain. Additional filters should be added at the end of the chain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538740#comment-14538740 ] Wangda Tan commented on YARN-3434: -- Ran it locally, all tests passed; committing. > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
Jonathan Eagles created YARN-3625: - Summary: RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put Key: YARN-3625 URL: https://issues.apache.org/jira/browse/YARN-3625 Project: Hadoop YARN Issue Type: Bug Reporter: Jonathan Eagles Assignee: Jonathan Eagles -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
[ https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-3625: -- Attachment: YARN-3625.1.patch > RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put > -- > > Key: YARN-3625 > URL: https://issues.apache.org/jira/browse/YARN-3625 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3625.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
[ https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-3625: -- Description: RollingLevelDBTimelineStore batches all entities in the same put to improve performance. This causes an error when relating to an entity in the same put however. > RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put > -- > > Key: YARN-3625 > URL: https://issues.apache.org/jira/browse/YARN-3625 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3625.1.patch > > > RollingLevelDBTimelineStore batches all entities in the same put to improve > performance. This causes an error when relating to an entity in the same put > however. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3529) Add miniHBase cluster and Phoenix support to ATS v2 unit tests
[ https://issues.apache.org/jira/browse/YARN-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538789#comment-14538789 ] Zhijie Shen commented on YARN-3529: --- Thanks for the patch, Li! Some comments: 1. You should define the dependency version under {{./hadoop-project/pom.xml}}. Then you can remove the version info like {{$\{phoenix.version\}}}. 2. Do we need to make all of the following configurable, not only in the unit test? At least, for the POC, do we need to configure connString to point to a real HBase cluster? {code} @VisibleForTesting static String connString = "jdbc:phoenix:localhost:2181:/hbase"; @VisibleForTesting static Properties connProperties = new Properties(); {code} 3. In TestPhoenixTimelineWriterImpl, shall we tear down the HBase cluster as well after dropping the tables? > Add miniHBase cluster and Phoenix support to ATS v2 unit tests > -- > > Key: YARN-3529 > URL: https://issues.apache.org/jira/browse/YARN-3529 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: AbstractMiniHBaseClusterTest.java, > YARN-3529-YARN-2928.000.patch, output_minicluster2.txt > > > After we have our HBase and Phoenix writer implementations, we may want to > find a way to set up HBase and Phoenix in our unit tests. We need to do this > integration before the branch got merged back to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-1886) Exceptions in the RM log while cleaning up app attempt
[ https://issues.apache.org/jira/browse/YARN-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He resolved YARN-1886. --- Resolution: Duplicate > Exceptions in the RM log while cleaning up app attempt > -- > > Key: YARN-1886 > URL: https://issues.apache.org/jira/browse/YARN-1886 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Arpit Gupta > > Noticed exceptions in the RM log while HA tests were running where we killed > RM/AM/Namnode etc. > RM failed over and the new active RM tried to kill the old app attempt and > ran into this exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1886) Exceptions in the RM log while cleaning up app attempt
[ https://issues.apache.org/jira/browse/YARN-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538800#comment-14538800 ] Jian He commented on YARN-1886: --- YARN-1885 fixed this problem; closing this. > Exceptions in the RM log while cleaning up app attempt > -- > > Key: YARN-1886 > URL: https://issues.apache.org/jira/browse/YARN-1886 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.4.0 >Reporter: Arpit Gupta > > Noticed exceptions in the RM log while HA tests were running where we killed > RM/AM/Namnode etc. > RM failed over and the new active RM tried to kill the old app attempt and > ran into this exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3434) Interaction between reservations and userlimit can result in significant ULF violation
[ https://issues.apache.org/jira/browse/YARN-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538803#comment-14538803 ] Hudson commented on YARN-3434: -- FAILURE: Integrated in Hadoop-trunk-Commit #7799 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7799/]) Moved YARN-3434. (Interaction between reservations and userlimit can result in significant ULF violation.) From 2.8.0 to 2.7.1 (wangda: rev 1952f9395870e7b631d43418e075e774b9d2) * hadoop-yarn-project/CHANGES.txt > Interaction between reservations and userlimit can result in significant ULF > violation > -- > > Key: YARN-3434 > URL: https://issues.apache.org/jira/browse/YARN-3434 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Thomas Graves >Assignee: Thomas Graves > Fix For: 2.8.0, 2.7.1 > > Attachments: YARN-3434-branch2.7.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, YARN-3434.patch, > YARN-3434.patch, YARN-3434.patch > > > ULF was set to 1.0 > User was able to consume 1.4X queue capacity. > It looks like when this application launched, it reserved about 1000 > containers, each 8G each, within about 5 seconds. I think this allowed the > logic in assignToUser() to allow the userlimit to be surpassed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2000) Fix ordering of starting services inside the RM
[ https://issues.apache.org/jira/browse/YARN-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538822#comment-14538822 ] Jian He commented on YARN-2000: --- bq. Probably we can have state-store stop last so that all the other services are stopped first and won't accept more requests and send events to state-store. Even if state-store stops first, the API calls such as submitApplication won't return true until the state-store operation completes. Nothing to be done, close. > Fix ordering of starting services inside the RM > --- > > Key: YARN-2000 > URL: https://issues.apache.org/jira/browse/YARN-2000 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Jian He >Assignee: Jian He > > The order of starting services in RM would be: > - Recovery of the app/attempts > - Start the scheduler and add scheduler app/attempts > - Start ResourceTrackerService and re-populate the containers in scheduler > based on the containers info from NMs > - ApplicationMasterService either don’t start or start but block until all > the previous NMs registers. > Other than these, there are other services like ClientRMService, Webapps > which we need to think about the order too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2000) Fix ordering of starting services inside the RM
[ https://issues.apache.org/jira/browse/YARN-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He resolved YARN-2000. --- Resolution: Invalid > Fix ordering of starting services inside the RM > --- > > Key: YARN-2000 > URL: https://issues.apache.org/jira/browse/YARN-2000 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Jian He >Assignee: Jian He > > The order of starting services in RM would be: > - Recovery of the app/attempts > - Start the scheduler and add scheduler app/attempts > - Start ResourceTrackerService and re-populate the containers in scheduler > based on the containers info from NMs > - ApplicationMasterService either don’t start or start but block until all > the previous NMs registers. > Other than these, there are other services like ClientRMService, Webapps > which we need to think about the order too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538836#comment-14538836 ] Wangda Tan commented on YARN-3362: -- Hi Naga, Thanks for updating. 1) To your questions: https://issues.apache.org/jira/browse/YARN-3362?focusedCommentId=14537181&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14537181, you can refer to YARN-2824 for more information about why the default capacity of labeled resources is set to zero. The default max-capacity is 100 because a queue can use such resources without configuring it. Let me know if you have more questions. 2) About showing resources of partitions: I think it's very helpful. You can include the used resources of each partition as well; you can file a separate ticket if this is hard to add with this ticket. 3) About "Hide Hierarchy": I think it's good for queue capacity comparison, but admins may get confused after checking "Hide Hierarchy"; it would be better to add it somewhere else instead of modifying the queue UI itself.
> Add node label usage in RM CapacityScheduler web UI > --- > > Key: YARN-3362 > URL: https://issues.apache.org/jira/browse/YARN-3362 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager, webapp >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue > Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, > 2015.05.10_3362_Queue_Hierarchy.png, CSWithLabelsView.png, > No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 > at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, > YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, > YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, > YARN-3362.20150511-1.patch, capacity-scheduler.xml > > > We don't have node label usage in the RM CapacityScheduler web UI now; without > this, users will find it hard to understand what happened to nodes that have > labels assigned to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems
[ https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula resolved YARN-3618. Resolution: Duplicate > Fix unused variable to get CPU frequency on Windows systems > --- > > Key: YARN-3618 > URL: https://issues.apache.org/jira/browse/YARN-3618 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.0 > Environment: Windows 7 x64 SP1 >Reporter: Georg Berendt >Priority: Minor > Labels: easyfix > Original Estimate: 1h > Remaining Estimate: 1h > > In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, > there is an unused variable for CPU frequency. > " /** {@inheritDoc} */ > @Override > public long getCpuFrequency() { > refreshIfNeeded(); > return -1; > }" > Please change '-1' to use 'cpuFrequencyKhz'. > org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
Craig Welch created YARN-3626: - Summary: On Windows localized resources are not moved to the front of the classpath when they should be Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch In response to the mapreduce.job.user.classpath.first setting, the classpath is ordered differently so that localized resources will appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that, localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538849#comment-14538849 ] Craig Welch commented on YARN-3626: --- To resolve this, the situation should be detected and, when applicable, localized resources should be put at the beginning of the classpath rather than the end. > On Windows localized resources are not moved to the front of the classpath > when they should be > -- > > Key: YARN-3626 > URL: https://issues.apache.org/jira/browse/YARN-3626 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Environment: Windows >Reporter: Craig Welch >Assignee: Craig Welch > > In response to the mapreduce.job.user.classpath.first setting the classpath > is ordered differently so that localized resources will appear before system > classpath resources when tasks execute. On Windows this does not work > because the localized resources are not linked into their final location when > the classpath jar is created. To compensate for that localized jar resources > are added directly to the classpath generated for the jar rather than being > discovered from the localized directories. Unfortunately, they are always > appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
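The proposed resolution can be sketched in miniature (this is illustrative, not the actual YARN classpath-jar generation code; the method and class names here are invented): when the user-classpath-first condition is detected, localized jar resources are prepended to the generated classpath instead of appended.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the ordering fix: localized resources go first when
// userClasspathFirst is set, instead of always being appended last.
class ClasspathOrderSketch {
  static String buildClasspath(List<String> system, List<String> localized,
                               boolean userClasspathFirst) {
    List<String> entries = new ArrayList<>();
    if (userClasspathFirst) {
      entries.addAll(localized); // preferred over system resources
      entries.addAll(system);
    } else {
      entries.addAll(system);
      entries.addAll(localized); // current behavior: always appended
    }
    return String.join(";", entries); // ';' is the Windows path separator
  }

  public static void main(String[] args) {
    List<String> system = List.of("hadoop-common.jar");
    List<String> localized = List.of("user-app.jar");
    System.out.println(buildClasspath(system, localized, false));
    System.out.println(buildClasspath(system, localized, true));
  }
}
```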
[jira] [Commented] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems
[ https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538844#comment-14538844 ] Brahma Reddy Battula commented on YARN-3618: Resolved as duplicate of YARN-3617, as both are the same. > Fix unused variable to get CPU frequency on Windows systems > --- > > Key: YARN-3618 > URL: https://issues.apache.org/jira/browse/YARN-3618 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.0 > Environment: Windows 7 x64 SP1 >Reporter: Georg Berendt >Priority: Minor > Labels: easyfix > Original Estimate: 1h > Remaining Estimate: 1h > > In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, > there is an unused variable for CPU frequency. > " /** {@inheritDoc} */ > @Override > public long getCpuFrequency() { > refreshIfNeeded(); > return -1; > }" > Please change '-1' to use 'cpuFrequencyKhz'. > org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-1297: -- Attachment: YARN-1297.4.patch Updating patch to fix the test failure * Had missed accounting for app container recovery during scheduler recovery. > Miscellaneous Fair Scheduler speedups > - > > Key: YARN-1297 > URL: https://issues.apache.org/jira/browse/YARN-1297 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Sandy Ryza >Assignee: Arun Suresh > Labels: BB2015-05-TBR > Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, > YARN-1297.4.patch, YARN-1297.patch, YARN-1297.patch > > > I ran the Fair Scheduler's core scheduling loop through a profiler tool and > identified a bunch of minimally invasive changes that can shave off a few > milliseconds. > The main one is demoting a couple INFO log messages to DEBUG, which brought > my benchmark down from 16000 ms to 6000. > A few others (which had way less of an impact) were > * Most of the time in comparisons was being spent in Math.signum. I switched > this to direct ifs and elses and it halved the percent of time spent in > comparisons. > * I removed some unnecessary instantiations of Resource objects > * I made it so that queues' usage wasn't calculated from the applications up > each time getResourceUsage was called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
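The Math.signum change described in the quoted description can be sketched as follows (illustrative, not the actual FairScheduler comparator code): replacing signum over a floating-point difference with direct comparisons preserves the ordering semantics while avoiding the double conversion that profiling flagged.

```java
// Sketch of the comparator micro-optimization: both methods return the same
// -1/0/1 ordering, but the direct version skips the int -> double round trip.
class SignumSketch {
  // Before: relies on Math.signum over a double difference.
  static int compareSignum(long a, long b) {
    return (int) Math.signum((double) a - b);
  }

  // After: direct ifs and elses, same ordering semantics.
  static int compareDirect(long a, long b) {
    if (a < b) {
      return -1;
    } else if (a > b) {
      return 1;
    }
    return 0;
  }

  public static void main(String[] args) {
    long[][] cases = { {1, 2}, {2, 1}, {3, 3} };
    for (long[] c : cases) {
      System.out.println(compareSignum(c[0], c[1]) == compareDirect(c[0], c[1]));
    }
  }
}
```

Note the double-difference version is also subtly unsafe for extreme long values that lose precision as doubles; the direct comparison has no such edge case.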
[jira] [Updated] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3626: -- Attachment: YARN-3626.0.patch The attached patch propagates the conditional as a yarn configuration option and moves localized resources to the front of the classpath when appropriate > On Windows localized resources are not moved to the front of the classpath > when they should be > -- > > Key: YARN-3626 > URL: https://issues.apache.org/jira/browse/YARN-3626 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn > Environment: Windows >Reporter: Craig Welch >Assignee: Craig Welch > Attachments: YARN-3626.0.patch > > > In response to the mapreduce.job.user.classpath.first setting the classpath > is ordered differently so that localized resources will appear before system > classpath resources when tasks execute. On Windows this does not work > because the localized resources are not linked into their final location when > the classpath jar is created. To compensate for that localized jar resources > are added directly to the classpath generated for the jar rather than being > discovered from the localized directories. Unfortunately, they are always > appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3618) Fix unused variable to get CPU frequency on Windows systems
[ https://issues.apache.org/jira/browse/YARN-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538855#comment-14538855 ] Georg Berendt commented on YARN-3618: - Sorry, the posting dialog must have sent two POSTs. > Fix unused variable to get CPU frequency on Windows systems > --- > > Key: YARN-3618 > URL: https://issues.apache.org/jira/browse/YARN-3618 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.0 > Environment: Windows 7 x64 SP1 >Reporter: Georg Berendt >Priority: Minor > Labels: easyfix > Original Estimate: 1h > Remaining Estimate: 1h > > In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, > there is an unused variable for CPU frequency. > " /** {@inheritDoc} */ > @Override > public long getCpuFrequency() { > refreshIfNeeded(); > return -1; > }" > Please change '-1' to use 'cpuFrequencyKhz'. > org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated YARN-221: - Attachment: YARN-221-trunk-v5.patch Here is the new patch with updated unit tests. > NM should provide a way for AM to tell it not to aggregate logs. > > > Key: YARN-221 > URL: https://issues.apache.org/jira/browse/YARN-221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Robert Joseph Evans >Assignee: Ming Ma > Labels: BB2015-05-TBR > Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, > YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch > > > The NodeManager should provide a way for an AM to tell it that either the > logs should not be aggregated, that they should be aggregated with a high > priority, or that they should be aggregated but with a lower priority. The > AM should be able to do this in the ContainerLaunch context to provide a > default value, but should also be able to update the value when the container > is released. > This would allow for the NM to not aggregate logs in some cases, and avoid > connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538878#comment-14538878 ] Wangda Tan commented on YARN-3362: -- The latest patch LGTM. > Add node label usage in RM CapacityScheduler web UI > --- > > Key: YARN-3362 > URL: https://issues.apache.org/jira/browse/YARN-3362 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager, webapp >Reporter: Wangda Tan >Assignee: Naganarasimha G R > Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue > Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, > 2015.05.10_3362_Queue_Hierarchy.png, CSWithLabelsView.png, > No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 > at 11.42.17 AM.png, YARN-3362.20150428-3-modified.patch, > YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, > YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, > YARN-3362.20150511-1.patch, capacity-scheduler.xml > > > We don't have node label usage in the RM CapacityScheduler web UI now; without > this, users will find it hard to understand what happened to nodes that have > labels assigned to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538890#comment-14538890 ] Wangda Tan commented on YARN-3521: -- Thanks for updating, [~sunilg]. Latest patch LGTM, +1. > Support return structured NodeLabel objects in REST API when call > getClusterNodeLabels > -- > > Key: YARN-3521 > URL: https://issues.apache.org/jira/browse/YARN-3521 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: 0001-YARN-3521.patch, 0002-YARN-3521.patch, > 0003-YARN-3521.patch, 0004-YARN-3521.patch, 0005-YARN-3521.patch, > 0006-YARN-3521.patch, 0007-YARN-3521.patch > > > In YARN-3413, the yarn cluster CLI returns NodeLabel instead of String; we should > make the same change on the REST API side to keep them consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538897#comment-14538897 ] Hadoop QA commented on YARN-3505: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 12s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 50s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 7s | The applied patch generated 1 new checkstyle issues (total was 1, now 2). | | {color:red}-1{color} | checkstyle | 2m 22s | The applied patch generated 2 new checkstyle issues (total was 70, now 63). | | {color:green}+1{color} | whitespace | 0m 21s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 35s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 21s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-server-common. | | {color:green}+1{color} | yarn tests | 6m 10s | Tests passed in hadoop-yarn-server-nodemanager. | | {color:red}-1{color} | yarn tests | 51m 55s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 102m 7s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732030/YARN-3505.4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ea11590 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7866/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7866/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7866/console | This message was automatically generated. 
> Node's Log Aggregation Report with SUCCEED should not cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch, YARN-3505.4.patch > > > Per discussions in YARN-1402, we shouldn't cache all node's log aggregation > reports in RMApps for always, especially for those finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3624) ApplicationHistoryServer reverses the order of the filters it gets
[ https://issues.apache.org/jira/browse/YARN-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538898#comment-14538898 ] Hadoop QA commented on YARN-3624: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 27s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 49s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 3m 3s | Tests passed in hadoop-yarn-server-applicationhistoryservice. 
| | | | 38m 59s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732032/YARN-3624.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 444836b | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/7867/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7867/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7867/console | This message was automatically generated. > ApplicationHistoryServer reverses the order of the filters it gets > -- > > Key: YARN-3624 > URL: https://issues.apache.org/jira/browse/YARN-3624 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-3624.patch > > > AppliactionHistoryServer should not alter the order in which it gets the > filter chain. Additional filters should be added at the end of the chain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3545) Investigate the concurrency issue with the map of timeline collector
[ https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3545: Attachment: YARN-3545-YARN-2928.000.patch In this patch I'm using a concurrent hash map to replace the synchronized hash map. After removing the global lock, we need to consider two cases: concurrent putIfAbsent calls, and a putIfAbsent call concurrent with a get call. The case of a putIfAbsent call concurrent with a get call is addressed by an initialization barrier, since the contention is low. With this solution, in the best case each read costs only one volatile variable read, instead of acquiring the lock inside the synchronized map. The case of multiple concurrent putIfAbsent calls is addressed by speculatively allocating a collector and trying to putIfAbsent it into the hash map. If the putIfAbsent call succeeds (returns null), we then call postPut and publish this new collector to all readers. If the putIfAbsent call fails, someone else has already allocated a collector and we need to use that one. To speed up this case, I used a "fast path" such that the method only tries to allocate a collector if there was no collector for it at the beginning of the method. I'd appreciate comments since I may have missed something here... > Investigate the concurrency issue with the map of timeline collector > > > Key: YARN-3545 > URL: https://issues.apache.org/jira/browse/YARN-3545 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Li Lu > Attachments: YARN-3545-YARN-2928.000.patch > > > See the discussion in YARN-3390 for details. Let's continue the discussion > here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
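The flow described above can be sketched roughly as follows. This is a minimal, hypothetical sketch only: the real patch operates on a map of ApplicationId to TimelineCollector and runs postPut() after a successful publish, neither of which is reproduced here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical sketch of the pattern described above: a "fast path" that
// avoids speculative allocation when a collector already exists, and
// putIfAbsent to resolve races between concurrent writers.
class CollectorMapSketch<K, V> {
    private final ConcurrentMap<K, V> collectors = new ConcurrentHashMap<>();

    V getOrCreate(K key, Supplier<V> factory) {
        // Fast path: most calls find an existing collector and allocate nothing.
        V existing = collectors.get(key);
        if (existing != null) {
            return existing;
        }
        // Slow path: speculatively allocate, then try to publish it.
        V candidate = factory.get();
        V raced = collectors.putIfAbsent(key, candidate);
        if (raced == null) {
            // putIfAbsent returned null: our candidate won the race and is now
            // visible to readers; the real patch would run postPut() here.
            return candidate;
        }
        // Another thread published first; discard ours and use theirs.
        return raced;
    }
}
```

The key property is that losers of the putIfAbsent race throw away their speculative collector and adopt the winner's, so all readers observe a single published instance.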
[jira] [Commented] (YARN-3625) RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put
[ https://issues.apache.org/jira/browse/YARN-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538905#comment-14538905 ] Hadoop QA commented on YARN-3625: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 58s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 27s | The applied patch generated 1 new checkstyle issues (total was 6, now 6). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 49s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 3m 12s | Tests passed in hadoop-yarn-server-applicationhistoryservice. 
| | | | 39m 25s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732038/YARN-3625.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 444836b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7868/artifact/patchprocess/diffcheckstylehadoop-yarn-server-applicationhistoryservice.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/7868/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7868/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7868/console | This message was automatically generated. > RollingLevelDBTimelineStore Incorrectly Forbids Related Entity in Same Put > -- > > Key: YARN-3625 > URL: https://issues.apache.org/jira/browse/YARN-3625 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Attachments: YARN-3625.1.patch > > > RollingLevelDBTimelineStore batches all entities in the same put to improve > performance. This causes an error when relating to an entity in the same put > however. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)
[ https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538911#comment-14538911 ] Zhijie Shen commented on YARN-2900: --- [~mitdesai], have you got the chance to fix {{java.lang.IllegalStateException: STREAM}}? > Application (Attempt and Container) Not Found in AHS results in Internal > Server Error (500) > --- > > Key: YARN-2900 > URL: https://issues.apache.org/jira/browse/YARN-2900 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Jonathan Eagles >Assignee: Mit Desai > Attachments: YARN-2900-b2.patch, YARN-2900.patch, YARN-2900.patch, > YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, > YARN-2900.patch, YARN-2900.patch > > > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128) > at > org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118) > at > org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222) > at > org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at > org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218) > ... 59 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3539) Compatibility doc to state that ATS v1 is a stable REST API
[ https://issues.apache.org/jira/browse/YARN-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538954#comment-14538954 ] Zhijie Shen commented on YARN-3539: --- [~ste...@apache.org], did you have a chance to look at my last comment? The doc still seems to have some minor issues. > Compatibility doc to state that ATS v1 is a stable REST API > --- > > Key: YARN-3539 > URL: https://issues.apache.org/jira/browse/YARN-3539 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.7.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Labels: BB2015-05-TBR > Attachments: HADOOP-11826-001.patch, HADOOP-11826-002.patch, > TimelineServer.html, YARN-3539-003.patch, YARN-3539-004.patch, > YARN-3539-005.patch, YARN-3539-006.patch, YARN-3539-007.patch, > YARN-3539-008.patch, timeline_get_api_examples.txt > > > The ATS v2 discussion and YARN-2423 have raised the question: "how stable are > the ATSv1 APIs"? > The existing compatibility document actually states that the History Server > is [a stable REST > API|http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Compatibility.html#REST_APIs], > which effectively means that ATSv1 has already been declared as a stable API. > Clarify this by patching the compatibility document appropriately -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538959#comment-14538959 ] Hadoop QA commented on YARN-221: \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 48s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 14s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 49s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 45s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 7m 56s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 51m 46s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732060/YARN-221-trunk-v5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 444836b | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7869/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7869/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7869/console | This message was automatically generated. > NM should provide a way for AM to tell it not to aggregate logs. > > > Key: YARN-221 > URL: https://issues.apache.org/jira/browse/YARN-221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Robert Joseph Evans >Assignee: Ming Ma > Labels: BB2015-05-TBR > Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, > YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch > > > The NodeManager should provide a way for an AM to tell it that either the > logs should not be aggregated, that they should be aggregated with a high > priority, or that they should be aggregated but with a lower priority. The > AM should be able to do this in the ContainerLaunch context to provide a > default value, but should also be able to update the value when the container > is released. 
> This would allow for the NM to not aggregate logs in some cases, and avoid > connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2921) MockRM#waitForState methods can be too slow and flaky
[ https://issues.apache.org/jira/browse/YARN-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538956#comment-14538956 ] Wangda Tan commented on YARN-2921: -- Hi [~ozawa], Some comments: - In MockAM.waitForState, I don't quite understand the change: 1. Why is minWaitMSec needed? 2. Why fail the method if {{waitedMsecs >= timeoutMsecs}} is true? I think it should check the current state against the expected state. - In the two MockRM.waitForState methods, I think we should also check app.getState() instead of the elapsed time, correct? - In TestRMRestart, you can use GenericTestUtils.waitFor instead. > MockRM#waitForState methods can be too slow and flaky > - > > Key: YARN-2921 > URL: https://issues.apache.org/jira/browse/YARN-2921 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 2.6.0, 2.7.0 >Reporter: Karthik Kambatla >Assignee: Tsuyoshi Ozawa > Attachments: YARN-2921.001.patch, YARN-2921.002.patch, > YARN-2921.003.patch, YARN-2921.004.patch, YARN-2921.005.patch, > YARN-2921.006.patch, YARN-2921.007.patch > > > MockRM#waitForState methods currently sleep for too long (2 seconds and 1 > second). This leads to slow tests and sometimes failures if the > App/AppAttempt moves to another state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
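The review suggestion — poll the actual state with a short interval and fail only after the timeout elapses without the state being reached — can be sketched as a generic helper. Names here are hypothetical; Hadoop's GenericTestUtils.waitFor plays a similar role in the real test code.

```java
import java.util.function.Supplier;

// Hypothetical sketch of a waitForState that checks the current state against
// the expected state on every iteration, rather than failing on elapsed time
// alone.
class WaitSketch {
    static <T> boolean waitForState(Supplier<T> current, T expected,
                                    long timeoutMs, long pollMs) {
        long waited = 0;
        while (!expected.equals(current.get())) {
            if (waited >= timeoutMs) {
                return false; // timed out without reaching the expected state
            }
            try {
                Thread.sleep(pollMs); // short poll instead of one long sleep
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return false;
            }
            waited += pollMs;
        }
        return true; // state reached, possibly without sleeping at all
    }
}
```

Because the state is re-checked on every iteration, a state reached early returns immediately, which addresses both the slowness (long fixed sleeps) and the flakiness (state moving on before the check) described in the issue.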
[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once
[ https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14538961#comment-14538961 ] Wangda Tan commented on YARN-3489: -- Committing. > RMServerUtils.validateResourceRequests should only obtain queue info once > - > > Key: YARN-3489 > URL: https://issues.apache.org/jira/browse/YARN-3489 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Labels: BB2015-05-RFC > Attachments: YARN-3489.01.patch, YARN-3489.02.patch, > YARN-3489.03.patch > > > Since the label support was added we now get the queue info for each request > being validated in SchedulerUtils.validateResourceRequest. If > validateResourceRequests needs to validate a lot of requests at a time (e.g.: > large cluster with lots of varied locality in the requests) then it will get > the queue info for each request. Since we build the queue info this > generates a lot of unnecessary garbage, as the queue isn't changing between > requests. We should grab the queue info once and pass it down rather than > building it again for each request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
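The refactor described in the issue — obtain the queue info once and pass it down, instead of rebuilding it per request — can be sketched with hypothetical stand-in types (the real code lives in RMServerUtils.validateResourceRequests and SchedulerUtils.validateResourceRequest and is not reproduced here):

```java
import java.util.List;

// Hypothetical sketch of the "fetch queue info once" refactor.
class ValidatorSketch {
    static class QueueInfo { final String name; QueueInfo(String n) { name = n; } }
    static class ResourceRequest { }

    // Counts lookups so the savings are observable in this sketch.
    static int queueLookups = 0;

    // Stand-in for the scheduler call that builds QueueInfo (the expensive,
    // garbage-generating step the issue describes).
    static QueueInfo getQueueInfo(String queueName) {
        queueLookups++;
        return new QueueInfo(queueName);
    }

    // Before the fix, each request triggered its own queue lookup; after it,
    // the queue is looked up once and passed down to every per-request check.
    static void validateResourceRequests(List<ResourceRequest> asks, String queueName) {
        QueueInfo queueInfo = getQueueInfo(queueName); // single lookup
        for (ResourceRequest ask : asks) {
            validateResourceRequest(ask, queueInfo);   // queue info passed down
        }
    }

    static void validateResourceRequest(ResourceRequest ask, QueueInfo queueInfo) {
        // ... per-request checks against queueInfo would go here ...
    }
}
```

Since the queue cannot change between requests within one validation pass, hoisting the lookup out of the loop is safe and removes the per-request garbage.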
[jira] [Commented] (YARN-3545) Investigate the concurrency issue with the map of timeline collector
[ https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539001#comment-14539001 ] Hadoop QA commented on YARN-3545: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 11s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 43s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 38s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 0m 40s | The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-timelineservice. 
| | | | 37m 3s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-timelineservice | | | Spinning on TimelineCollector.initialized in org.apache.hadoop.yarn.server.timelineservice.collector.TimelineCollectorManager.initializationBarrier(TimelineCollector) At TimelineCollectorManager.java: At TimelineCollectorManager.java:[line 161] | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732071/YARN-3545-YARN-2928.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / b3b791b | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/7870/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-timelineservice.html | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/7870/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7870/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7870/console | This message was automatically generated. > Investigate the concurrency issue with the map of timeline collector > > > Key: YARN-3545 > URL: https://issues.apache.org/jira/browse/YARN-3545 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Li Lu > Attachments: YARN-3545-YARN-2928.000.patch > > > See the discussion in YARN-3390 for details. Let's continue the discussion > here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once
[ https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539007#comment-14539007 ] Hudson commented on YARN-3489: -- FAILURE: Integrated in Hadoop-trunk-Commit #7800 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7800/]) YARN-3489. RMServerUtils.validateResourceRequests should only obtain queue info once. (Varun Saxena via wangda) (wangda: rev d6f6741296639a73f5306e3ebefec84a40ca03e5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java > RMServerUtils.validateResourceRequests should only obtain queue info once > - > > Key: YARN-3489 > URL: https://issues.apache.org/jira/browse/YARN-3489 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Labels: BB2015-05-RFC > Attachments: YARN-3489.01.patch, YARN-3489.02.patch, > YARN-3489.03.patch > > > Since the label support was added we now get the queue info for each request > being validated in SchedulerUtils.validateResourceRequest. If > validateResourceRequests needs to validate a lot of requests at a time (e.g.: > large cluster with lots of varied locality in the requests) then it will get > the queue info for each request. Since we build the queue info this > generates a lot of unnecessary garbage, as the queue isn't changing between > requests. We should grab the queue info once and pass it down rather than > building it again for each request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3617) Fix unused variable to get CPU frequency on Windows systems
[ https://issues.apache.org/jira/browse/YARN-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] J.Andreina reassigned YARN-3617: Assignee: J.Andreina > Fix unused variable to get CPU frequency on Windows systems > --- > > Key: YARN-3617 > URL: https://issues.apache.org/jira/browse/YARN-3617 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.0 > Environment: Windows 7 x64 SP1 >Reporter: Georg Berendt >Assignee: J.Andreina >Priority: Minor > Original Estimate: 1h > Remaining Estimate: 1h > > In the class 'WindowsResourceCalculatorPlugin.java' of the YARN project, > there is an unused variable for CPU frequency. > " /** {@inheritDoc} */ > @Override > public long getCpuFrequency() { > refreshIfNeeded(); > return -1; > }" > Please change '-1' to use 'cpuFrequencyKhz'. > org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
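The suggested fix can be sketched with a hypothetical stand-in class. The real plugin queries Windows for the frequency inside refreshIfNeeded() and caches it in the cpuFrequencyKhz field named by the report; the 2.5 GHz value below is illustrative only.

```java
// Hypothetical stand-in for WindowsResourceCalculatorPlugin, showing the fix:
// return the cached cpuFrequencyKhz instead of a hard-coded -1.
class ResourceCalculatorSketch {
    private long cpuFrequencyKhz = -1;

    // Stand-in for the real refresh logic, which would query the OS.
    private void refreshIfNeeded() {
        if (cpuFrequencyKhz < 0) {
            cpuFrequencyKhz = 2_500_000; // e.g. 2.5 GHz, reported in kHz
        }
    }

    /** {@inheritDoc} */
    public long getCpuFrequency() {
        refreshIfNeeded();
        return cpuFrequencyKhz; // previously: return -1;
    }
}
```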