[jira] [Created] (YARN-5255) Move ResourceRequest expansion for Delay Scheduling to the Scheduler
Arun Suresh created YARN-5255: - Summary: Move ResourceRequest expansion for Delay Scheduling to the Scheduler Key: YARN-5255 URL: https://issues.apache.org/jira/browse/YARN-5255 Project: Hadoop YARN Issue Type: Bug Reporter: Arun Suresh YARN-4879 proposes to enhance the Allocate Request by introducing a way to explicitly identify a ResourceRequest. Currently, if relaxLocality == true, a Node specific request is expanded to rack and *ANY* requests by the {{AMRMClient}} before being sent to the Scheduler. This requires 3 copies to perform locality specific delay scheduling. It would be better to perform the expansion in the Scheduler itself, rather than in the client, since: # The expansion is not actually specified in the ApplicationMasterProtocol, so a non-Java client would have to duplicate the expansion logic. # It would allow refactoring out a lot of unnecessary code in {{AMRMClientImpl}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
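The client-side expansion this issue describes can be sketched as follows. This is a simplified illustrative model, not the actual {{AMRMClientImpl}} code; the class and method names here are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the expansion described above: when relaxLocality is
// true, a node-specific request is duplicated at the rack level and at "*"
// (ANY) before being sent to the scheduler. All names are illustrative, not
// the real YARN classes.
public class RequestExpansion {
    record ResourceRequest(String resourceName, int containers, boolean relaxLocality) {}

    static List<ResourceRequest> expand(String node, String rack, int containers,
                                        boolean relaxLocality) {
        List<ResourceRequest> out = new ArrayList<>();
        out.add(new ResourceRequest(node, containers, relaxLocality));
        if (relaxLocality) {
            // The two extra copies that locality-delay scheduling relies on today.
            out.add(new ResourceRequest(rack, containers, true));
            out.add(new ResourceRequest("*", containers, true));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("host1", "/rack1", 2, true).size());  // 3 copies
        System.out.println(expand("host1", "/rack1", 2, false).size()); // 1 copy
    }
}
```

Moving this logic server-side would make the single node-local request the wire-level contract, which is the point of the issue: a non-Java client would no longer need to replicate the three-copy convention.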
[jira] [Assigned] (YARN-5255) Move ResourceRequest expansion for Delay Scheduling to the Scheduler
[ https://issues.apache.org/jira/browse/YARN-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-5255: - Assignee: Arun Suresh > Move ResourceRequest expansion for Delay Scheduling to the Scheduler > > > Key: YARN-5255 > URL: https://issues.apache.org/jira/browse/YARN-5255 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > > YARN-4879 proposes to enhance the Allocate Request by introducing a way to > explicitly identify a ResourceRequest. > Currently, if relaxLocality == true, a Node specific request is expanded to > rack and *ANY* requests by the {{AMRMClient}} before being sent to the > Scheduler. This requires 3 copies to perform locality specific delay > scheduling. > It would be better to perform the expansion in the Scheduler itself, rather > than in the client, since: > # The expansion is not actually specified in the ApplicationMasterProtocol, so a > non-Java client would have to duplicate the expansion logic. > # It would allow refactoring out a lot of unnecessary code in {{AMRMClientImpl}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331133#comment-15331133 ] Varun Saxena commented on YARN-5243: Ohh. I submitted one more build. Will cancel it. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5243: --- Assignee: Sangjin Lee (was: Varun Saxena) > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5243: --- Assignee: Varun Saxena (was: Sangjin Lee) > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5254) capacity scheduler could only allocate a container with 1 vcore if using DefautResourceCalculator
[ https://issues.apache.org/jira/browse/YARN-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331125#comment-15331125 ] sandflee commented on YARN-5254: The vcore info is dropped by DefaultResourceCalculator while normalizing the resource request in allocate(). We can either: 1. change DefaultResourceCalculator#normalize(), or 2. normalize the resource request the way the FairScheduler does. I went with option 2 and updated a patch. > capacity scheduler could only allocate a container with 1 vcore if using > DefautResourceCalculator > --- > > Key: YARN-5254 > URL: https://issues.apache.org/jira/browse/YARN-5254 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee >Assignee: sandflee > Attachments: YARN-5254.01.patch > > > 1. capacity scheduler uses DefaultResourceCalculator > 2. start distributedshell with args -container_vcores 2 > 3. the container is allocated with only 1 vcore -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
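A toy model of the bug being discussed, assuming (as the comment says) that a memory-only normalization rebuilds the resource and drops the requested vcores. The class, record, and method names below are illustrative, not the real Hadoop API:

```java
// Toy model of the normalization issue: a memory-only calculator rounds the
// memory ask but rebuilds the Resource with a fixed 1 vcore, losing the
// requested vcore count. Illustrative names only, not the Hadoop classes.
public class NormalizeSketch {
    record Resource(int memoryMb, int vcores) {}

    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    // Memory-only normalization: vcores from the request are ignored.
    static Resource normalizeMemoryOnly(Resource ask, int minMemoryMb) {
        return new Resource(roundUp(ask.memoryMb(), minMemoryMb), 1);
    }

    // Per-dimension normalization (closer to what the comment proposes):
    // round each dimension independently and keep the requested vcores.
    static Resource normalizePerDimension(Resource ask, int minMemoryMb, int minVcores) {
        return new Resource(roundUp(ask.memoryMb(), minMemoryMb),
                            Math.max(minVcores, ask.vcores()));
    }

    public static void main(String[] args) {
        Resource ask = new Resource(1500, 2); // e.g. -container_vcores 2
        System.out.println(normalizeMemoryOnly(ask, 1024));      // vcores dropped to 1
        System.out.println(normalizePerDimension(ask, 1024, 1)); // vcores kept at 2
    }
}
```

This makes the repro steps concrete: with memory-only normalization a `-container_vcores 2` ask comes back as a 1-vcore container, while per-dimension normalization preserves it.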
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331122#comment-15331122 ] Sangjin Lee commented on YARN-5243: --- You don't have to. The above report is for a previous version (v.2). The latest run for v.3 is still going: https://builds.apache.org/job/PreCommit-YARN-Build/12020/ > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-1773) ShuffleHeader should have a format that can inform about errors
[ https://issues.apache.org/jira/browse/YARN-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329895#comment-15329895 ] Varun Saxena edited comment on YARN-1773 at 6/15/16 4:23 AM: - [~djp], sorry, this fell off my radar. I don't have the bandwidth to fix it in the short term. I have unassigned it in case somebody else wants to take it up, and will assign it back to myself once I have bandwidth, if nobody else has taken it up. was (Author: varun_saxena): [~djp], sorry this fell off my radar. Do not have bandwidth to fix it, in short term. You can take it up, if you want. > ShuffleHeader should have a format that can inform about errors > --- > > Key: YARN-1773 > URL: https://issues.apache.org/jira/browse/YARN-1773 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.3.0, 2.4.0 >Reporter: Bikas Saha >Priority: Critical > > Currently, the ShuffleHeader (which is a Writable) simply tries to read the > successful header (mapid, reduceid etc). If there is an error then the input > will have an error message instead of (mapid, reduceid etc). Thus parsing > the ShuffleHeader fails and since we don't know where the error message ends, > we cannot consume the remaining input stream which may have good data from > the remaining map outputs. Being able to encode the error in the > ShuffleHeader will let us parse out the error correctly and move on to the > remaining data. > The shuffle handler response should say which maps are in error and which are > fine, and what the error was for the erroneous maps. These will help report > diagnostics for easier upstream reporting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
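One way to get the self-describing format the issue asks for is to lead every record with an explicit status field, so both the success and the error branches are fully framed. The sketch below is hypothetical (plain DataStreams, not the actual ShuffleHeader wire format); the field layout and names are assumptions for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical error-aware header: a status byte first, then either the
// normal fields or a length-prefixed error message. Because both branches
// are fully framed, a reader can always skip past a bad record and keep
// parsing the remaining map outputs.
public class ErrorAwareHeader {
    static final byte OK = 0, ERROR = 1;

    static byte[] writeOk(String mapId, long length) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(OK);
        out.writeUTF(mapId);
        out.writeLong(length); // length of the map output that follows
        return bos.toByteArray();
    }

    static byte[] writeError(String mapId, String message) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeByte(ERROR);
        out.writeUTF(mapId);   // which map failed
        out.writeUTF(message); // framed error text: the reader knows where it ends
        return bos.toByteArray();
    }

    static String read(DataInputStream in) throws IOException {
        byte status = in.readByte();
        String mapId = in.readUTF();
        if (status == OK) {
            return mapId + ":len=" + in.readLong();
        }
        return mapId + ":error=" + in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream all = new ByteArrayOutputStream();
        all.write(writeError("map_01", "disk read failed"));
        all.write(writeOk("map_02", 4096)); // good data after the bad record
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(all.toByteArray()));
        System.out.println(read(in)); // map_01:error=disk read failed
        System.out.println(read(in)); // map_02:len=4096
    }
}
```

The key property is in the main method: the error record no longer poisons the stream, so the good output from map_02 is still consumable.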
[jira] [Updated] (YARN-1773) ShuffleHeader should have a format that can inform about errors
[ https://issues.apache.org/jira/browse/YARN-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-1773: --- Assignee: (was: Varun Saxena) > ShuffleHeader should have a format that can inform about errors > --- > > Key: YARN-1773 > URL: https://issues.apache.org/jira/browse/YARN-1773 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.3.0, 2.4.0 >Reporter: Bikas Saha >Priority: Critical > > Currently, the ShuffleHeader (which is a Writable) simply tries to read the > successful header (mapid, reduceid etc). If there is an error then the input > will have an error message instead of (mapid, reducedid etc). Thus parsing > the ShuffleHeader fails and since we dont know where the error message ends, > we cannot consume the remaining input stream which may have good data from > the remaining map outputs. Being able to encode the error in the > ShuffleHeader will let us parse out the error correctly and move on to the > remaining data. > The shuffle handler response should say which maps are in error and which are > fine, what the error was for the erroneous maps. These will help report > diagnostics for easier upstream reporting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331117#comment-15331117 ] Varun Saxena commented on YARN-5243: There seems to be something wrong with this build report: many of the failed test cases are shown twice. Will invoke it again. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5254) capacity scheduler could only allocate a container with 1 vcore if using DefautResourceCalculator
[ https://issues.apache.org/jira/browse/YARN-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-5254: --- Attachment: YARN-5254.01.patch > capacity scheduler could only allocate a container with 1 vcore if using > DefautResourceCalculator > --- > > Key: YARN-5254 > URL: https://issues.apache.org/jira/browse/YARN-5254 > Project: Hadoop YARN > Issue Type: Bug >Reporter: sandflee >Assignee: sandflee > Attachments: YARN-5254.01.patch > > > 1, capacity scheduler use DefaultResourceCalculator > 2, start distributeshell with args -container_vcores 2 > 3, could only get the container with vcores 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5254) capacity scheduler could only allocate a container with 1 vcore if using DefautResourceCalculator
sandflee created YARN-5254: -- Summary: capacity scheduler could only allocate a container with 1 vcore if using DefautResourceCalculator Key: YARN-5254 URL: https://issues.apache.org/jira/browse/YARN-5254 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee Assignee: sandflee 1. capacity scheduler uses DefaultResourceCalculator 2. start distributedshell with args -container_vcores 2 3. the container is allocated with only 1 vcore -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5241) FairScheduler fails to release container it is just allocated
[ https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1533#comment-1533 ] ChenFolin commented on YARN-5241: - Thanks, Karthik Kambatla. > FairScheduler fails to release container it is just allocated > - > > Key: YARN-5241 > URL: https://issues.apache.org/jira/browse/YARN-5241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2 >Reporter: ChenFolin > Attachments: YARN-5241-001.patch, YARN-5241-002.patch, > YARN-5241-003.patch, repeatContainerCompleted.log > > > A NodeManager heartbeat NODE_UPDATE event and an ApplicationMaster allocate > operation may both complete the same container, which can lead to problems. > Node#releaseContainer prevents a repeated release operation, > like: > public synchronized void releaseContainer(Container container) { > if (!isValidContainer(container.getId())) { > LOG.error("Invalid container released " + container); > return; > } > but FSAppAttempt#containerCompleted does not prevent a repeated container-completed > operation. > Detailed logs are in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
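The guard quoted in the issue description can be generalized into an idempotent completed-handler: track live container ids and ignore a second completion of the same id. This is a simplified standalone sketch of the pattern, not the FSAppAttempt code; the class and method names are illustrative:

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of an idempotent container-completed handler: the second
// completion of the same container id is ignored instead of corrupting
// accounting, mirroring the isValidContainer guard quoted above.
public class IdempotentRelease {
    private final Set<String> liveContainers = new HashSet<>();
    private int releasedCount = 0;

    void allocate(String containerId) {
        liveContainers.add(containerId);
    }

    // Returns true only the first time a given container is completed.
    synchronized boolean containerCompleted(String containerId) {
        if (!liveContainers.remove(containerId)) {
            // Duplicate or unknown completion: log and bail out.
            System.err.println("Invalid container released " + containerId);
            return false;
        }
        releasedCount++;
        return true;
    }

    int releasedCount() { return releasedCount; }
}
```

With this shape, the race the report describes (a NODE_UPDATE heartbeat and an AM allocate call both completing the same container) releases the container exactly once.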
[jira] [Updated] (YARN-5241) FairScheduler fails to release container it is just allocated
[ https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ChenFolin updated YARN-5241: Attachment: YARN-5241-003.patch Updated the patch. > FairScheduler fails to release container it is just allocated > - > > Key: YARN-5241 > URL: https://issues.apache.org/jira/browse/YARN-5241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2 >Reporter: ChenFolin > Attachments: YARN-5241-001.patch, YARN-5241-002.patch, > YARN-5241-003.patch, repeatContainerCompleted.log > > > A NodeManager heartbeat NODE_UPDATE event and an ApplicationMaster allocate > operation may both complete the same container, which can lead to problems. > Node#releaseContainer prevents a repeated release operation, > like: > public synchronized void releaseContainer(Container container) { > if (!isValidContainer(container.getId())) { > LOG.error("Invalid container released " + container); > return; > } > but FSAppAttempt#containerCompleted does not prevent a repeated container-completed > operation. > Detailed logs are in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331102#comment-15331102 ] Carlo Curino commented on YARN-5215: Regarding preemptable vs OPPORTUNISTIC, we had this conversation with [~kkaranasos] and [~asuresh], where container types could be {{NON-PREEMPTABLE}}, {{PREEMPTABLE}}, {{OPPORTUNISTIC}}, describing basically increasing levels of how likely a task is to be interrupted. For {{NON-PREEMPTABLE}} the system does everything it can not to interrupt (bar physical machine failures), {{PREEMPTABLE}} are containers with a low, but non-null chance of being interrupted by the scheduler (e.g., allocations above a queue capacity, but on dedicated resources), and {{OPPORTUNISTIC}} are tasks that have a high(er) chance of being interrupted as they run on the left-over capacity from other containers, on overcommitted resources, or other risky forms of resource harvesting (as in this JIRA). BTW [~elgoiri] I would suggest turning this into an umbrella JIRA, and separating out the sub-steps you list [in the comment above | https://issues.apache.org/jira/browse/YARN-5215?focusedCommentId=15330649=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15330649]. Some are likely less controversial and can be settled/committed earlier, while other parts are too "interesting" to go in easily :-) > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Inigo Goiri > Attachments: YARN-5215.000.patch, YARN-5215.001.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
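The three-level taxonomy discussed in the comment above could be expressed as an ordered enum whose ordinal reflects increasing interruption risk. This is purely illustrative: the names come from the discussion, and the enum and its method are not a committed YARN API.

```java
// Ordered from least to most likely to be interrupted, per the discussion:
// NON_PREEMPTABLE is only lost to machine failure, PREEMPTABLE has a low but
// non-null chance of scheduler preemption, and OPPORTUNISTIC runs on left-over
// or overcommitted capacity and is interrupted most readily.
public enum ContainerDurability {
    NON_PREEMPTABLE,
    PREEMPTABLE,
    OPPORTUNISTIC;

    // Enum ordering doubles as a preemption priority: when resources must be
    // reclaimed, pick victims with the higher ordinal first.
    public boolean evictBefore(ContainerDurability other) {
        return this.ordinal() > other.ordinal();
    }
}
```

Encoding the ordering in one place keeps scheduler policies (victim selection, overcommit accounting) consistent as more container types are added.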
[jira] [Commented] (YARN-5171) Extend DistributedSchedulerProtocol to notify RM of containers allocated by the Node
[ https://issues.apache.org/jira/browse/YARN-5171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331020#comment-15331020 ] Konstantinos Karanasos commented on YARN-5171: -- Just went over the patch. Looks good. Some comments: # Not sure if we should introduce in this JIRA the methods addRMContainer and removeRMContainer in {{SchedulerApplicationAttempt}}, since they are not used anywhere. If we decide to pass this information around (including in the queues of the scheduler), then we can add them. Also naming should reveal that they have to do with opportunistic scheduling, in case we decide to keep them. # Let's put imports in single lines (there is wrapping in several places). # nit: In {{DistSchedAllocateRequestPBImpl}} and {{DistSchedAllocateRequest}}, let's put first the methods about getting/setting the AllocateRequest, to follow the same order as the protobuf too. Other than the above and the addition of some additional use cases, patch looks ok to me. > Extend DistributedSchedulerProtocol to notify RM of containers allocated by > the Node > > > Key: YARN-5171 > URL: https://issues.apache.org/jira/browse/YARN-5171 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Inigo Goiri > Attachments: YARN-5171.000.patch, YARN-5171.001.patch, > YARN-5171.002.patch, YARN-5171.003.patch, YARN-5171.004.patch, > YARN-5171.005.patch > > > Currently, the RM does not know about Containers allocated by the > OpportunisticContainerAllocator on the NM. This JIRA proposes to extend the > Distributed Scheduler request interceptor and the protocol to notify the RM > of new containers as and when they are allocated at the NM. The > {{RMContainer}} should also be extended to expose the {{ExecutionType}} of > the container. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331004#comment-15331004 ] Hadoop QA commented on YARN-5243: - -1 overall. Vote summary (the original message is truncated after the unit-test row):

| Vote | Subsystem | Runtime | Comment |
| 0 | reexec | 0m 23s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
| 0 | mvndep | 0m 28s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 14s | YARN-2928 passed |
| +1 | compile | 6m 1s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | compile | 7m 15s | YARN-2928 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 23s | YARN-2928 passed |
| +1 | mvnsite | 7m 0s | YARN-2928 passed |
| +1 | mvneclipse | 2m 34s | YARN-2928 passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn |
| -1 | findbugs | 1m 6s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in YARN-2928 has 1 extant Findbugs warning |
| -1 | findbugs | 0m 44s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in YARN-2928 has 3 extant Findbugs warnings |
| +1 | javadoc | 4m 31s | YARN-2928 passed with JDK v1.8.0_91 |
| +1 | javadoc | 7m 56s | YARN-2928 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 55s | the patch passed |
| +1 | compile | 6m 31s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 31s | the patch passed |
| +1 | compile | 7m 5s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 7m 5s | the patch passed |
| -1 | checkstyle | 1m 25s | root: The patch generated 3 new + 1120 unchanged - 5 fixed = 1123 total (was 1125) |
| +1 | mvnsite | 7m 2s | the patch passed |
| +1 | mvneclipse | 2m 38s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn |
| +1 | findbugs | 9m 1s | the patch passed |
| +1 | javadoc | 5m 7s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 8m 6s | the patch passed with JDK v1.7.0_101 |
| -1 | unit | 63m 9s | |
[jira] [Commented] (YARN-5207) Web-Proxy should support multi-homed systems.
[ https://issues.apache.org/jira/browse/YARN-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330987#comment-15330987 ] Zephyr Guo commented on YARN-5207: -- Found an easy way to solve this issue without modifying any code, so I am closing it. > Web-Proxy should support multi-homed systems. > - > > Key: YARN-5207 > URL: https://issues.apache.org/jira/browse/YARN-5207 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Affects Versions: 3.0.0-alpha1 >Reporter: Zephyr Guo > Fix For: 3.0.0-alpha1 > > Attachments: YARN-5207-v1.patch > > > {{WebAppProxyServlet}} uses {{HttpClient}} to download URLs for the AppMaster. > {{HttpClient}} binds the local address according to {{yarn.web-proxy.address}}. {{HttpClient}} should use an independent address in the following situation: > There are two network cards: {{CardA}} for users to access the cluster, {{CardB}} > for cluster-internal communication. Users can only access the {{Web-Proxy}} via {{CardA}}, and > {{CardA}} can't ping {{CardB}}. > So we have to bind the {{Web-Proxy}} to {{CardA}}, and then make > {{HttpClient}} use {{CardB}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
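The direction the description sketches, binding the proxy's outbound client to the inner NIC, comes down to choosing a local address before connecting. A minimal java.net sketch, where "127.0.0.1" is a hypothetical stand-in for CardB's address (the real patch would do the equivalent inside the proxy's HttpClient setup):

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Binding an outbound socket to a chosen local interface before connecting.
// "127.0.0.1" stands in for CardB's address; in the multi-homed setup the
// proxy would listen on CardA but bind its outbound client sockets like this.
public class LocalBindSketch {
    static Socket boundClientSocket(String localAddr) throws Exception {
        Socket s = new Socket();
        // Port 0 lets the OS pick an ephemeral source port on that NIC.
        s.bind(new InetSocketAddress(InetAddress.getByName(localAddr), 0));
        return s; // the caller would then s.connect(...) to the AM's tracking URL
    }

    public static void main(String[] args) throws Exception {
        Socket s = boundClientSocket("127.0.0.1");
        System.out.println(s.getLocalAddress().getHostAddress()); // 127.0.0.1
        s.close();
    }
}
```

Apache HttpClient exposes the same idea through its request configuration (a local-address setting), which is what the attached patch presumably wires up to a separate config key.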
[jira] [Resolved] (YARN-5207) Web-Proxy should support multi-homed systems.
[ https://issues.apache.org/jira/browse/YARN-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zephyr Guo resolved YARN-5207. -- Resolution: Pending Closed > Web-Proxy should support multi-homed systems. > - > > Key: YARN-5207 > URL: https://issues.apache.org/jira/browse/YARN-5207 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Affects Versions: 3.0.0-alpha1 >Reporter: Zephyr Guo > Fix For: 3.0.0-alpha1 > > Attachments: YARN-5207-v1.patch > > > {{WebAppProxyServlet}} uses {{HttpClient}} to download URLs for the AppMaster. > {{HttpClient}} binds the local address according to {{yarn.web-proxy.address}}. {{HttpClient}} should use an independent address in the following situation: > There are two network cards: {{CardA}} for users to access the cluster, {{CardB}} > for cluster-internal communication. Users can only access the {{Web-Proxy}} via {{CardA}}, and > {{CardA}} can't ping {{CardB}}. > So we have to bind the {{Web-Proxy}} to {{CardA}}, and then make > {{HttpClient}} use {{CardB}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-5207) Web-Proxy should support multi-homed systems.
[ https://issues.apache.org/jira/browse/YARN-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zephyr Guo reopened YARN-5207: -- > Web-Proxy should support multi-homed systems. > - > > Key: YARN-5207 > URL: https://issues.apache.org/jira/browse/YARN-5207 > Project: Hadoop YARN > Issue Type: New Feature > Components: webapp >Affects Versions: 3.0.0-alpha1 >Reporter: Zephyr Guo > Fix For: 3.0.0-alpha1 > > Attachments: YARN-5207-v1.patch > > > {{WebAppProxyServlet}} uses {{HttpClient}} to download URLs for the AppMaster. > {{HttpClient}} binds the local address according to {{yarn.web-proxy.address}}. {{HttpClient}} should use an independent address in the following situation: > There are two network cards: {{CardA}} for users to access the cluster, {{CardB}} > for cluster-internal communication. Users can only access the {{Web-Proxy}} via {{CardA}}, and > {{CardA}} can't ping {{CardB}}. > So we have to bind the {{Web-Proxy}} to {{CardA}}, and then make > {{HttpClient}} use {{CardB}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5083) YARN CLI for AM logs does not give any error message if entered invalid am value
[ https://issues.apache.org/jira/browse/YARN-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330977#comment-15330977 ] Junping Du commented on YARN-5083: -- Thanks [~jianhe] for updating the patch. 2nd patch looks good. +1 pending on Jenkins result. > YARN CLI for AM logs does not give any error message if entered invalid am > value > > > Key: YARN-5083 > URL: https://issues.apache.org/jira/browse/YARN-5083 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Sumana Sathish >Assignee: Jian He > Attachments: YARN-5083.1.patch, YARN-5083.1.patch, YARN-5083.2.patch > > > Entering invalid value for am in yarn logs CLI does not give any error message > {code:title= there is no amattempt 30 for the application} > yarn logs -applicationId -am 30 > impl.TimelineClientImpl: Timeline service address: > INFO client.RMProxy: Connecting to ResourceManager at > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5237) Not all logs get aggregated with rolling log aggregation.
[ https://issues.apache.org/jira/browse/YARN-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330976#comment-15330976 ] Junping Du commented on YARN-5237: -- The checkstyle and whitespace issues reported by Jenkins are not valid. The v4 patch LGTM. +1. Will commit the patch shortly if there are no further comments from others. > Not all logs get aggregated with rolling log aggregation. > - > > Key: YARN-5237 > URL: https://issues.apache.org/jira/browse/YARN-5237 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5237.1.patch, YARN-5237.2.patch, YARN-5237.3.patch, > YARN-5237.4.patch > > > Steps to reproduce: > 1) enable RM recovery > 2) Run a sleep job > 3) restart RM > 4) kill the application > We cannot find the logs for the first attempt.
[jira] [Updated] (YARN-5083) YARN CLI for AM logs does not give any error message if entered invalid am value
[ https://issues.apache.org/jira/browse/YARN-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-5083: -- Attachment: YARN-5083.2.patch New patch addresses the comment. > YARN CLI for AM logs does not give any error message if entered invalid am > value > > > Key: YARN-5083 > URL: https://issues.apache.org/jira/browse/YARN-5083 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Sumana Sathish >Assignee: Jian He > Attachments: YARN-5083.1.patch, YARN-5083.1.patch, YARN-5083.2.patch > > > Entering an invalid value for am in the yarn logs CLI does not give any error message > {code:title=There is no AM attempt 30 for the application} > yarn logs -applicationId -am 30 > impl.TimelineClientImpl: Timeline service address: > INFO client.RMProxy: Connecting to ResourceManager at > {code}
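The kind of validation the issue calls for can be sketched as below. This is hypothetical: the real option handling lives in the yarn logs CLI, and the method name, the treatment of {{-1}} as "latest attempt", and the messages here are illustrative only, not the actual patch.

```java
public class AmOptionCheck {

    /**
     * Validate the value passed to "yarn logs ... -am". Returns an error
     * message for invalid input, or null if the value is acceptable.
     * Illustrative sketch only; not the actual LogsCLI logic.
     */
    public static String check(String amValue, int numAttempts) {
        int attempt;
        try {
            attempt = Integer.parseInt(amValue.trim());
        } catch (NumberFormatException e) {
            // Previously a bare NumberFormatException leaked to the user;
            // here it becomes a readable message instead.
            return "Invalid AM attempt value: " + amValue;
        }
        // Assume -1 means "latest attempt"; anything else must be in range.
        if (attempt != -1 && (attempt < 1 || attempt > numAttempts)) {
            return "There is no AM attempt " + attempt + " for the application";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(check("30", 2));   // out-of-range attempt
        System.out.println(check("abc", 2));  // not a number
        System.out.println(check("1", 2));    // valid -> null
    }
}
```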
[jira] [Commented] (YARN-5233) Support for specifying a path for ATS plugin jars
[ https://issues.apache.org/jira/browse/YARN-5233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330958#comment-15330958 ] Li Lu commented on YARN-5233: - Looking through the discussions in YARN-4577, it looks like we want something similar. > Support for specifying a path for ATS plugin jars > - > > Key: YARN-5233 > URL: https://issues.apache.org/jira/browse/YARN-5233 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.8.0 >Reporter: Li Lu >Assignee: Li Lu > > Third-party plugins need to add their jars to ATS. Most of the time, > isolation is not needed. However, there needs to be a way to specify the > path. For now, the jars on that path can be added to the default classloader.
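Adding every jar on a configured path to the default classloader can be sketched as follows. The class and method names are hypothetical, not the actual ATS implementation; the point is only that, without isolation, a {{URLClassLoader}} parented on the current loader is enough.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class PluginJarPath {

    // Collect file:// URLs for every *.jar directly under the given directory.
    public static URL[] jarUrls(File dir) throws MalformedURLException {
        List<URL> urls = new ArrayList<URL>();
        File[] jars = dir.listFiles(new FilenameFilter() {
            public boolean accept(File d, String name) {
                return name.endsWith(".jar");
            }
        });
        if (jars != null) {
            for (File jar : jars) {
                urls.add(jar.toURI().toURL());
            }
        }
        return urls.toArray(new URL[0]);
    }

    // Without isolation, the plugin jars can simply be appended to a loader
    // whose parent is the current (default) classloader, so plugin classes
    // resolve against the ATS classpath as well.
    public static ClassLoader loaderFor(File dir) throws MalformedURLException {
        return new URLClassLoader(jarUrls(dir), PluginJarPath.class.getClassLoader());
    }
}
```

If isolation were needed later, the same path could instead feed a filtering classloader, which is roughly what the YARN-4577 discussion is about.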
[jira] [Commented] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330937#comment-15330937 ] Sangjin Lee commented on YARN-5070: --- The latest patch LGTM. I'll wait for the Jenkins result. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch, > YARN-5070-YARN-2928.06.patch, YARN-5070-YARN-2928.07.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it.
[jira] [Commented] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330932#comment-15330932 ] Sidharta Seethana commented on YARN-3611: - hi [~templedf], please go ahead and file a JIRA for adding docs. Thanks! > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. 
> * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor
[jira] [Updated] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5070: - Attachment: YARN-5070-YARN-2928.07.patch Thanks [~sjlee0], uploading v7 now. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch, > YARN-5070-YARN-2928.06.patch, YARN-5070-YARN-2928.07.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it.
[jira] [Commented] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330915#comment-15330915 ] Sangjin Lee commented on YARN-5070: --- Thanks for updating the patch. It appears that the following lines don't compile: {code} 239 + FlowRunRowKey 240 .parseRowKey((CellUtil.cloneRow(cells.get(0))).toString())); {code} A parenthesis is misplaced. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch, > YARN-5070-YARN-2928.06.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it.
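Separately from the compile error noted above, calling {{toString()}} on the {{byte[]}} returned by {{CellUtil.cloneRow}} would not decode the row key even if it compiled. A self-contained illustration of that pitfall (the row-key text below is made up; HBase code would typically use {{Bytes.toString(row)}} for the decode):

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayToStringPitfall {
    public static void main(String[] args) {
        byte[] row = "cluster!user!flow!123".getBytes(StandardCharsets.UTF_8);

        // byte[] inherits Object.toString(): it prints the type descriptor
        // and identity hash, e.g. "[B@6d06d69c" -- not the row contents.
        String wrong = row.toString();

        // Decoding the bytes explicitly recovers the row key text.
        String right = new String(row, StandardCharsets.UTF_8);

        System.out.println(wrong + " vs " + right);
    }
}
```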
[jira] [Commented] (YARN-5251) Yarn CLI to obtain App logs for last 'n' bytes fails with 'java.io.IOException' and for 'n' bytes fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330881#comment-15330881 ] Hadoop QA commented on YARN-5251: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 8s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 57s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 38s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 38s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 41s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810665/YARN-5251.1.patch | | JIRA Issue | YARN-5251 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f122051032a2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / c77a109 | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12021/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
[jira] [Comment Edited] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330864#comment-15330864 ] Daniel Templeton edited comment on YARN-3611 at 6/14/16 11:30 PM: -- Has the LinuxContainerExecutor Docker support been documented yet, or should I file a JIRA to add docs? I didn't see anything. was (Author: templedf): Has this JIRA been documented yet, or should I file a JIRA to add docs? I didn't see anything. > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. 
> * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor
[jira] [Commented] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330864#comment-15330864 ] Daniel Templeton commented on YARN-3611: Has this JIRA been documented yet, or should I file a JIRA to add docs? I didn't see anything. > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. 
> * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor
[jira] [Commented] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330856#comment-15330856 ] Hadoop QA commented on YARN-4844: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 4m 34s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 48s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 40s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 34s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_74 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 34s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 33s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s {color} | {color:green} root: The patch generated 0 new + 118 unchanged - 1 fixed = 118 total (was 119) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s {color} | {color:green} hadoop-mapreduce-client-common in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 104m 28s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed with JDK v1.8.0_74. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 31s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_95. {color} |
[jira] [Commented] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330849#comment-15330849 ] Hadoop QA commented on YARN-4844: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 36s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 4m 32s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 3s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 16s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 53s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 30s {color} | {color:green} branch-2.8 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} | {color:green} branch-2.8 passed with JDK v1.8.0_91 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s {color} | {color:green} branch-2.8 passed with JDK v1.7.0_101 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 7s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 44s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s {color} | {color:green} root: The patch generated 0 new + 118 unchanged - 1 fixed = 118 total (was 119) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed with JDK v1.8.0_91 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 35s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 11s {color} | {color:green} hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 45s {color} | {color:green} hadoop-mapreduce-client-common in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 101m 38s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed with JDK v1.8.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_101.
[jira] [Updated] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5070: - Attachment: YARN-5070-YARN-2928.06.patch Attaching patch v6 addressing [~jrottinghuis]'s and [~sjlee0]'s suggestions. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch, > YARN-5070-YARN-2928.06.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it.
[jira] [Commented] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330840#comment-15330840 ] Hadoop QA commented on YARN-4844: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 10m 49s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 4m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 8s | branch-2.8 passed |
| +1 | compile | 5m 59s | branch-2.8 passed with JDK v1.8.0_91 |
| +1 | compile | 6m 41s | branch-2.8 passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 11s | branch-2.8 passed |
| +1 | mvnsite | 1m 58s | branch-2.8 passed |
| +1 | mvneclipse | 0m 57s | branch-2.8 passed |
| +1 | findbugs | 3m 42s | branch-2.8 passed |
| +1 | javadoc | 1m 18s | branch-2.8 passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 27s | branch-2.8 passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 46s | the patch passed |
| +1 | compile | 6m 47s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 47s | the patch passed |
| +1 | compile | 7m 17s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 7m 17s | the patch passed |
| +1 | checkstyle | 1m 9s | root: The patch generated 0 new + 118 unchanged - 1 fixed = 118 total (was 119) |
| +1 | mvnsite | 1m 56s | the patch passed |
| +1 | mvneclipse | 0m 52s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 39s | the patch passed |
| +1 | javadoc | 1m 15s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 1m 34s | the patch passed with JDK v1.7.0_101 |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 2m 3s | hadoop-mapreduce-client-core in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 44s | hadoop-mapreduce-client-common in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 106m 17s | hadoop-mapreduce-client-jobclient in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 0m 33s | hadoop-yarn-api in the patch passed with JDK v1.7.0_101.
[jira] [Commented] (YARN-4876) [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
[ https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330818#comment-15330818 ] Hadoop QA commented on YARN-4876: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | YARN-4876 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805024/YARN-4876.002.patch |
| JIRA Issue | YARN-4876 |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12022/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.

> [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
> --
>
> Key: YARN-4876
> URL: https://issues.apache.org/jira/browse/YARN-4876
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Arun Suresh
> Assignee: Marco Rabozzi
> Attachments: YARN-4876-design-doc.pdf, YARN-4876.002.patch, YARN-4876.01.patch
>
> Introduce *initialize* and *destroy* container API into the *ContainerManagementProtocol* and decouple the actual start of a container from the initialization. This will allow AMs to re-start a container without having to lose the allocation. 
> Additionally, if the localization of the container is associated with the initialize (and the cleanup with the destroy), this can also be used by applications to upgrade a Container by *re-initializing* with a new *ContainerLaunchContext* -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330759#comment-15330759 ] Inigo Goiri commented on YARN-5215: --- My initial proposal was to add generic support for external resources. However, we could also follow the node-level agent approach, which could even show up as an unmanaged fake container. That solution is also OK with me. Going a little bit deeper into the example, in our most extreme scenario, we would set the guaranteed capacity to 0 GB and the opportunistic to 16 GB. In any case, if we go into preemption, then we should leverage what we are doing in YARN-1011. Regarding YARN-5202, they use the concept of preemptable, which is pretty much the same as the OPPORTUNISTIC one. Actually, in our internal deployment right now, we just assume that everything running on YARN is preemptable and we preempt the youngest container. > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Inigo Goiri > Attachments: YARN-5215.000.patch, YARN-5215.001.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330740#comment-15330740 ] Vrushali C commented on YARN-5070: -- Thanks [~jrottinghuis] and [~sjlee0] for the reviews, I will look into making the changes. [~sjlee0] : - will make the constructor related changes - No, actually setting batch size to -1 means effectively no limit for the loop in the nextInternal function at line 231. Let me see if I can build a scanner context without setting the batch limit. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-1942) Deprecate toString/fromString methods from ConverterUtils and move them to records classes like ContainerId/ApplicationId, etc.
[ https://issues.apache.org/jira/browse/YARN-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-1942: - Summary: Deprecate toString/fromString methods from ConverterUtils and move them to records classes like ContainerId/ApplicationId, etc. (was: Many of ConverterUtils methods need to have public interfaces) > Deprecate toString/fromString methods from ConverterUtils and move them to > records classes like ContainerId/ApplicationId, etc. > --- > > Key: YARN-1942 > URL: https://issues.apache.org/jira/browse/YARN-1942 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Affects Versions: 2.4.0 >Reporter: Thomas Graves >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-1942-branch-2.0012.patch, > YARN-1942-branch-2.0013.patch, YARN-1942-branch-2.8.0013.patch, > YARN-1942-branch-2.8.0013_.patch, YARN-1942.1.patch, YARN-1942.10.patch, > YARN-1942.11.patch, YARN-1942.12.patch, YARN-1942.13.patch, > YARN-1942.2.patch, YARN-1942.3.patch, YARN-1942.4.patch, YARN-1942.5.patch, > YARN-1942.6.patch, YARN-1942.8.patch, YARN-1942.9.patch > > > ConverterUtils has a bunch of functions that are useful to application > masters. It should either be made public or we make some of the utilities > in it public or we provide other external apis for application masters to > use. Note that distributedshell and MR are both using these interfaces. > For instance the main use case I see right now is for getting the application > attempt id within the appmaster: > String containerIdStr = > System.getenv(Environment.CONTAINER_ID.name()); > ContainerId containerId = ConverterUtils.toContainerId(containerIdStr); > ApplicationAttemptId applicationAttemptId = > containerId.getApplicationAttemptId(); > I don't see any other way for the application master to get this information. > If there is please let me know. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
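The quoted YARN-1942 example above pulls a container ID string out of the environment and converts it into its typed pieces via {{ConverterUtils.toContainerId}}. As a standalone illustration of what that string encodes, the following sketch parses the `container_[e<epoch>_]<clusterTimestamp>_<appId>_<attemptId>_<containerId>` format by hand. The class and field names here are hypothetical; this is not the real `o.a.h.yarn.api.records.ContainerId` implementation.

```java
// Illustrative parser for YARN container ID strings such as
// "container_1465421211793_0004_01_000001" or, with an epoch,
// "container_e07_1465421211793_0004_01_000001". Names are hypothetical.
class ContainerIdParts {
    final long epoch;            // 0 when absent; "e07" -> 7
    final long clusterTimestamp; // RM start time
    final int appId;             // application sequence number
    final int attemptId;         // application attempt number
    final long containerId;      // container sequence number

    ContainerIdParts(long epoch, long ts, int app, int attempt, long cid) {
        this.epoch = epoch;
        this.clusterTimestamp = ts;
        this.appId = app;
        this.attemptId = attempt;
        this.containerId = cid;
    }

    static ContainerIdParts parse(String s) {
        String[] p = s.split("_");
        if ((p.length != 5 && p.length != 6) || !"container".equals(p[0])) {
            throw new IllegalArgumentException("not a container ID: " + s);
        }
        int i = 1;
        long epoch = 0;
        if (p.length == 6) {
            // Epoch component, present after RM restarts: "e07" etc.
            epoch = Long.parseLong(p[1].substring(1));
            i = 2;
        }
        return new ContainerIdParts(epoch,
            Long.parseLong(p[i]),
            Integer.parseInt(p[i + 1]),
            Integer.parseInt(p[i + 2]),
            Long.parseLong(p[i + 3]));
    }
}
```

In the real API, `containerId.getApplicationAttemptId()` then yields the attempt the AM is running as, which is exactly the use case the description motivates.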
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330719#comment-15330719 ] Karthik Kambatla commented on YARN-5215: We happen to have a similar low-latency framework running alongside (and occasionally on) YARN. So, I am quite sympathetic to the problem. In the past, I have wondered if it makes sense to have a separate node-level agent that these other (white-listed) services could register with to get updates on each others' usage. That way, each framework is aware of others running on the cluster and the resources can be handed off more gracefully. If we are indeed looking to steal resources from these other services, I would think those resources should be allocated only to OPPORTUNISTIC containers and likely better handled through YARN-1011. For instance, in your earlier example, we would actually set yarn.nodemanager.resource.memory-mb to 14 GB which is allocated to GUARANTEED containers and YARN would also allocate OPPORTUNISTIC containers upto 2 GB based on how much of it is used by other frameworks. And, as Jason was mentioning earlier (IIUC), YARN-5202 provides this without the support for special OPPORTUNISTIC containers. Am I missing something? > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Inigo Goiri > Attachments: YARN-5215.000.patch, YARN-5215.001.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1942) Many of ConverterUtils methods need to have public interfaces
[ https://issues.apache.org/jira/browse/YARN-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330711#comment-15330711 ] Jian He commented on YARN-1942: --- lgtm, thanks ! > Many of ConverterUtils methods need to have public interfaces > - > > Key: YARN-1942 > URL: https://issues.apache.org/jira/browse/YARN-1942 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Affects Versions: 2.4.0 >Reporter: Thomas Graves >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-1942-branch-2.0012.patch, > YARN-1942-branch-2.0013.patch, YARN-1942-branch-2.8.0013.patch, > YARN-1942-branch-2.8.0013_.patch, YARN-1942.1.patch, YARN-1942.10.patch, > YARN-1942.11.patch, YARN-1942.12.patch, YARN-1942.13.patch, > YARN-1942.2.patch, YARN-1942.3.patch, YARN-1942.4.patch, YARN-1942.5.patch, > YARN-1942.6.patch, YARN-1942.8.patch, YARN-1942.9.patch > > > ConverterUtils has a bunch of functions that are useful to application > masters. It should either be made public or we make some of the utilities > in it public or we provide other external apis for application masters to > use. Note that distributedshell and MR are both using these interfaces. > For instance the main use case I see right now is for getting the application > attempt id within the appmaster: > String containerIdStr = > System.getenv(Environment.CONTAINER_ID.name()); > ConverterUtils.toContainerId > ContainerId containerId = ConverterUtils.toContainerId(containerIdStr); > ApplicationAttemptId applicationAttemptId = > containerId.getApplicationAttemptId(); > I don't see any other way for the application master to get this information. > If there is please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330682#comment-15330682 ] Sangjin Lee commented on YARN-5070: --- Thanks [~vrushalic] for the updated patch! The patch looks good for the most part. I only have a couple of minor points. (FlowScanner.java) - The 2 constructors are essentially duplicates except for {{batchSize}}. How about having one constructor call the other to eliminate the duplication? For example, {code} FlowScanner(RegionCoprocessorEnvironment env, InternalScanner internalScanner, FlowScannerOperation action) { this(env, null, internalScanner, action); } FlowScanner(RegionCoprocessorEnvironment env, Scan incomingScan, InternalScanner internalScanner, FlowScannerOperation action) { this.batchSize = incomingScan == null ? -1 : incomingScan.getBatch(); ... } {code} - l.148, 160: it appears for {{ScannerContext}} that batch size of -1 would mean immediately reaching the limit? I'm looking at {{ScannerContext.checkBatchLimit()}}. We learned that these methods are not really exercised, but perhaps we can simply not set any batch limit to be on the safe side? > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330649#comment-15330649 ] Inigo Goiri commented on YARN-5215: --- [~kasha], in our use case we are targeting co-locating with latency sensitive workloads and they have diurnal patterns. For this type of workload, we need to be fairly reactive. Actually, preempting containers at the NM following the {{ContainersMonitor}} loop would be ideal. The improvements in utilization are significant as right now we are just reserving for the peak of the latency sensitive workloads (around ~50%) of the machine. We tried at some point to have a separate service to periodically change the resources of the NMs but it's harder to operate. In any case, in this first patch, we are just preventing scheduling containers and not adding preemption. I can add the following changes to the current patch: # UI improvements # History in the utilization to take decisions # Preempting containers from the RM # Preempting containers from the NM The problem with preemption is that we would go into what to preempt and that might have some dependencies in the opportunistic stuff in YARN-1011. > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Inigo Goiri > Attachments: YARN-5215.000.patch, YARN-5215.001.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
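The core estimation the YARN-5215 discussion keeps returning to can be stated in a few lines: whatever the node reports as used beyond what YARN's own containers account for is treated as external load, and is subtracted from the configured capacity before scheduling. The following is a minimal sketch of that arithmetic, not code from the attached patches; the method and parameter names are illustrative.

```java
// Sketch of external-load-aware capacity estimation (hypothetical names,
// not actual YARN-5215 patch code).
class ExternalLoadEstimator {
    /**
     * @param configuredMB     configured NM capacity (yarn.nodemanager.resource.memory-mb)
     * @param nodeUsedMB       memory in use on the node by ALL processes
     * @param containersUsedMB memory in use by YARN-managed containers only
     * @return memory the scheduler should still consider allocatable
     */
    static long schedulableMemoryMB(long configuredMB, long nodeUsedMB,
                                    long containersUsedMB) {
        // Anything used on the node that YARN containers don't account for
        // is attributed to external processes.
        long externalMB = Math.max(0, nodeUsedMB - containersUsedMB);
        return Math.max(0, configuredMB - externalMB);
    }
}
```

For example, a 16 GB node where containers use 8 GB but the node reports 10 GB in use has 2 GB of external load, leaving 14 GB schedulable rather than the full 16 GB.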
[jira] [Resolved] (YARN-5252) EventDispatcher$EventProcessor.run() throws a findbugs error
[ https://issues.apache.org/jira/browse/YARN-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee resolved YARN-5252. --- Resolution: Duplicate Thanks [~asuresh]! Hadn't noticed that one. > EventDispatcher$EventProcessor.run() throws a findbugs error > > > Key: YARN-5252 > URL: https://issues.apache.org/jira/browse/YARN-5252 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Sangjin Lee >Priority: Minor > > Findbugs complains {{EventDispatcher$EventProcessor.run()}} invokes > {{System.exit()}}. This comes up every time yarn-common is touched. We should > either address it or make it an exception if there is a good reason for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Resolved] (YARN-5253) NodeStatusPBImpl throws a bunch of synchronization findbugs warnings
[ https://issues.apache.org/jira/browse/YARN-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee resolved YARN-5253. --- Resolution: Duplicate Fixed by YARN-5075. > NodeStatusPBImpl throws a bunch of synchronization findbugs warnings > > > Key: YARN-5253 > URL: https://issues.apache.org/jira/browse/YARN-5253 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Sangjin Lee >Priority: Minor > > There are several IS2_INCONSISTENT_SYNC findbugs warnings on > {{NodeStatusPBImpl}}. This should be addressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
[ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330625#comment-15330625 ] Robert Kanter commented on YARN-4676: - {quote} 2. I am not very sure the context of this point. The "earlier comments" link led to comment No 22, which is about a separate timer for the poll, which was addressed by the previous patch.{quote} Sorry, it looks like my link didn't work right. I was referring to a different comment that [~vvasudev] made that was not numbered. I'll quote it this time: {quote} Robert Kanter, Karthik Kambatla, Junping Du - instead of storing the timeouts in a state store, we could also modify the RM-NM protocol to support a delayed shutdown. That way when the node is decommissioned gracefully, we tell the NM to shutdown after the specified timeout. There'll have to be some logic to cancel a shutdown for handling re-commissioned nodes but we won't need to worry about updating the RM state store with timeouts/timestamps. It also avoids the clock skew issue that Karthik mentioned above. Like Karthik and Robert mentioned, I'm fine with handling this in a follow up JIRA as long as the command exits without doing anything if graceful decommission is specified and the cluster is setup with work preserving restart.{quote} I think this should simplify things because we wouldn't have to do anything special for HA and the RM doesn't have to keep track of anything. I know that's a bit different than what you've been working on so far, but what do you think about this idea? 
> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> --
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.8.0
> Reporter: Daniel Zhi
> Assignee: Daniel Zhi
> Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch, YARN-4676.014.patch, YARN-4676.015.patch, YARN-4676.016.patch
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to gracefully decommission YARN nodes. After the user issues the refreshNodes request, the ResourceManager automatically evaluates the status of all affected nodes to kick off decommission or recommission actions. The RM asynchronously tracks container and application status related to DECOMMISSIONING nodes so as to decommission the nodes as soon as they are ready to be decommissioned. Decommissioning timeouts at individual-node granularity are supported and can be dynamically updated. The mechanism naturally supports multiple independent graceful decommissioning "sessions", where each one involves different sets of nodes with different timeout settings. Such support is ideal and necessary for graceful decommission requests issued by external cluster management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks DECOMMISSIONING node status automatically and asynchronously after the client/admin makes the graceful decommission request. It tracks DECOMMISSIONING node status to decide when, after all running containers on the node have completed, the node will be transitioned into DECOMMISSIONED state.
> NodesListManager detects and handles include and exclude list changes to kick off decommission or recommission as necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
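The tracking logic the YARN-4676 description outlines — a DECOMMISSIONING node moves to DECOMMISSIONED once its running containers drain or its per-node timeout (which may be updated dynamically) expires — can be sketched as below. This is a toy model with hypothetical names, not the actual DecommissioningNodeWatcher code.

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch of per-node graceful-decommission tracking (hypothetical names).
class DecommissioningNode {
    enum State { DECOMMISSIONING, DECOMMISSIONED, RECOMMISSIONED }

    private State state = State.DECOMMISSIONING;
    private final Set<String> runningContainers = new HashSet<>();
    private long deadlineMillis; // per-node timeout, dynamically updatable

    DecommissioningNode(long deadlineMillis) {
        this.deadlineMillis = deadlineMillis;
    }

    void containerStarted(String id) { runningContainers.add(id); }

    void containerFinished(String id, long nowMillis) {
        runningContainers.remove(id);
        maybeDecommission(nowMillis);
    }

    // Timeouts at individual-node granularity can be updated at any time.
    void updateTimeout(long newDeadlineMillis) { deadlineMillis = newDeadlineMillis; }

    // Periodic check, driven by the watcher's poll loop.
    void tick(long nowMillis) { maybeDecommission(nowMillis); }

    // Node reappeared on the include list.
    void recommission() { state = State.RECOMMISSIONED; }

    private void maybeDecommission(long nowMillis) {
        if (state == State.DECOMMISSIONING
                && (runningContainers.isEmpty() || nowMillis >= deadlineMillis)) {
            state = State.DECOMMISSIONED;
        }
    }

    State state() { return state; }
}
```

A node with no remaining containers decommissions immediately; one that still has containers is forced out only when its own deadline passes, which is what allows independent "sessions" with different timeouts to coexist.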
[jira] [Updated] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-5243: -- Attachment: YARN-5243-YARN-2928.03.patch Posted patch v.3. Fixed the unused imports on {{ResourceManager.java}}. Also addressed one of the checkstyle issues. The other two are about the lengths of methods that are existing problems. Not sure what can be done in this branch. I filed 2 upstream JIRAs for the findbugs issues as they are trunk issues (YARN-5252 and YARN-5253). > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch, YARN-5243-YARN-2928.03.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5253) NodeStatusPBImpl throws a bunch of synchronization findbugs warnings
Sangjin Lee created YARN-5253: - Summary: NodeStatusPBImpl throws a bunch of synchronization findbugs warnings Key: YARN-5253 URL: https://issues.apache.org/jira/browse/YARN-5253 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Sangjin Lee Priority: Minor There are several IS2_INCONSISTENT_SYNC findbugs warnings on {{NodeStatusPBImpl}}. This should be addressed. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5252) EventDispatcher$EventProcessor.run() throws a findbugs error
[ https://issues.apache.org/jira/browse/YARN-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330599#comment-15330599 ] Arun Suresh commented on YARN-5252: --- [~sjlee0], YARN-5075 fixes this... > EventDispatcher$EventProcessor.run() throws a findbugs error > > > Key: YARN-5252 > URL: https://issues.apache.org/jira/browse/YARN-5252 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Sangjin Lee >Priority: Minor > > Findbugs complains {{EventDispatcher$EventProcessor.run()}} invokes > {{System.exit()}}. This comes up every time yarn-common is touched. We should > either address it or make it an exception if there is a good reason for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5252) EventDispatcher$EventProcessor.run() throws a findbugs error
Sangjin Lee created YARN-5252: - Summary: EventDispatcher$EventProcessor.run() throws a findbugs error Key: YARN-5252 URL: https://issues.apache.org/jira/browse/YARN-5252 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Sangjin Lee Priority: Minor Findbugs complains {{EventDispatcher$EventProcessor.run()}} invokes {{System.exit()}}. This comes up every time yarn-common is touched. We should either address it or make it an exception if there is a good reason for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5215) Scheduling containers based on external load in the servers
[ https://issues.apache.org/jira/browse/YARN-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330588#comment-15330588 ] Karthik Kambatla commented on YARN-5215: I am generally supportive of this. Few questions to clarify the usecase and approach: # How dynamic does this need to be? # And, what range of utilization improvements are we targeting here? 60 - 80, 75 - 80? # What are the characteristics of other workload running on these nodes? The reason I ask is to see if other approaches would suffice. For instance, would it be enough to gracefully increase/decrease the resources for Yarn on each node? i.e., {{yarn.nodemanager.resource.*}}. By graceful, I mean the decrease succeeds only after the tasks using those resources finish. > Scheduling containers based on external load in the servers > --- > > Key: YARN-5215 > URL: https://issues.apache.org/jira/browse/YARN-5215 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Inigo Goiri > Attachments: YARN-5215.000.patch, YARN-5215.001.patch > > > Currently YARN runs containers in the servers assuming that they own all the > resources. The proposal is to use the utilization information in the node and > the containers to estimate how much is consumed by external processes and > schedule based on this estimation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
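Karthik's alternative — gracefully shrinking `yarn.nodemanager.resource.*`, where a decrease succeeds only after the tasks using those resources finish — can be sketched as follows. This is a hypothetical model to make the semantics concrete, not anything in the attached patches.

```java
// Sketch of a "graceful decrease" of node capacity (hypothetical names):
// a shrink request never kills containers; it takes full effect only as
// the containers holding those resources finish.
class GracefulCapacity {
    private long capacityMB; // currently advertised capacity
    private long targetMB;   // requested capacity (may be below current usage)
    private long usedMB;     // memory held by running containers

    GracefulCapacity(long capacityMB) {
        this.capacityMB = capacityMB;
        this.targetMB = capacityMB;
    }

    void requestCapacity(long newTargetMB) {
        targetMB = newTargetMB;
        apply();
    }

    boolean tryStartContainer(long mb) {
        if (usedMB + mb > capacityMB) {
            return false; // no headroom under the (possibly shrinking) capacity
        }
        usedMB += mb;
        return true;
    }

    void containerFinished(long releasedMB) {
        usedMB -= releasedMB;
        apply(); // a pending decrease may now complete
    }

    private void apply() {
        // Increases apply immediately; decreases never go below current usage.
        capacityMB = Math.max(targetMB, usedMB);
    }

    long capacityMB() { return capacityMB; }
}
```

For instance, shrinking a 16 GB node to 4 GB while 8 GB is in use first lands at 8 GB (no new containers fit), and only reaches 4 GB once those containers finish — the graceful behavior described above.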
[jira] [Comment Edited] (YARN-4876) [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
[ https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330496#comment-15330496 ] Arun Suresh edited comment on YARN-4876 at 6/14/16 8:56 PM: Aggregating and posting some design points on the patch based on offline discussions with [~marco.rabozzi] : h4. ContainerImpl state machine In the current patch, containers that are initialized using the new initializeContainers APIs keep waiting for startContainers requests within the LOCALIZED state after resource localization. When the START_CONTAINER event is generated upon request from the application master, the container transits to a new LAUNCHING state waiting for a CONTAINER_LAUNCHED event (this is fired asynchronously by ContainerLaunch when the container process is being started). Upon receiving the CONTAINER_LAUNCHED event, the container state is updated to RUNNING. For containers that do not allow multi-start (i.e. those that are initialized and started using the standard startContainers API), the START_CONTAINER event is automatically sent after localization. The role of the new “LAUNCHING” state is to make a clear distinction between the following two situations: # The container has been localized and is waiting for a start request (LOCALIZED state) # The container has received a start request and it is being started (LAUNCHING state) In this fashion, we can allow a start (or a restart) of an idle container only if the container is in the LOCALIZED state and if it allows multi-start. >From a first analysis, it seems that the new LAUNCHING state and the already >present RELAUNCHING state could by merged into a single LAUNCHING state to >reduce the state machine complexity. The destroyContainers API is equivalent to stopContainers if the specified containers do not allow multi-start. 
On the other hand, in case of a container that allows multi-start, the stopContainers API kills the container process and reverts the container state machine to “LOCALIZED”. However, in order to properly catch the termination of a container process for which a stop request has been issued, an additional “STOPPING” state has been inserted. If the container is in RUNNING state and it allows multi-start, the application master can issue a stopContainers request upon which the container state is updated to STOPPING and an asynchronous request to kill the container process is sent. Within the stopping state, similarly to the KILLING state, the container termination events (CONTAINER_EXITED_WITH_SUCCESS, CONTAINER_KILLED_ON_REQUEST, CONTAINER_EXITED_WITH_FAILURE) are considered as a successful container stop, upon which the container state reverts to LOCALIZED. h4. Working directory cleanup When a container is in the LOCALIZED state and multi-start is enabled, the application master can issue the following 3 new types of requests: # StartContainers (ContainerLaunchContext == NULL) # InitializeContainers # StartContainers (ContainerLaunchContext != NULL) In case 1) the container is simply started using the ContainerLaunchContext issued in the previous InitializeContainers request (the state machine transitions for this case are the ones described in the previous section). Case 2) and 3) both perform reinitialization and relocalization of container resources, the only difference between 2) and 3) is that in 3) the container is also started after relocalization. Currently, when the container is reinitialized, the container working directory is deleted to ensure a clean state for the subsequent container starts. Actually, we could relax this behavior and allow the application master to specify a deletion policy for container reinitialization. Depending on the requirements we might want to address this aspect here or in a follow up JIRA. h4. 
Log handling Currently, there is no special handling of logs for a restarted container. The application master can decide either to append the new logs to the old ones or to overwrite the old logs. This can be achieved simply by changing the launch command (e.g. in Linux, use “>>” to append and “>” to overwrite). h4. Token expiration Both the InitializeContainers and the StartContainers APIs require a container token to authorize the request. For long-running containers, the token might expire and the application master won’t be able to request a restart or a reinitialization of a container. This limitation also currently holds for the IncreaseContainerResource API. We might need to address container token renewal in a separate JIRA. h4. Recovery for containers that allow multi-start The current patch does not fully support recovery of containers that allow multi-start. Indeed, after a restart of the NodeManager, if the container is not running, the NodeManager cannot distinguish between a stopped container waiting for a start request and a container that completed
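The multi-start transitions described in the comment above can be sketched as a small transition table. This is an illustrative simplification, not the actual ContainerImpl state machine; only the state and event names mentioned in the comment are used, and the `next` helper is hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the multi-start subset of the transitions
// described above; NOT the real ContainerImpl state machine.
public class MultiStartSketch {
    enum State { LOCALIZED, LAUNCHING, RUNNING, STOPPING }
    enum Event { START_CONTAINER, CONTAINER_LAUNCHED, STOP_CONTAINER,
                 CONTAINER_EXITED_WITH_SUCCESS, CONTAINER_KILLED_ON_REQUEST,
                 CONTAINER_EXITED_WITH_FAILURE }

    private static final Map<State, Map<Event, State>> TRANSITIONS = new HashMap<>();
    static {
        // A localized container waits in LOCALIZED until a start request arrives.
        TRANSITIONS.put(State.LOCALIZED,
            Map.of(Event.START_CONTAINER, State.LAUNCHING));
        // CONTAINER_LAUNCHED is fired asynchronously once the process starts.
        TRANSITIONS.put(State.LAUNCHING,
            Map.of(Event.CONTAINER_LAUNCHED, State.RUNNING));
        // A running multi-start container can be stopped...
        TRANSITIONS.put(State.RUNNING,
            Map.of(Event.STOP_CONTAINER, State.STOPPING));
        // ...and any termination event while STOPPING reverts to LOCALIZED,
        // so the container can be started again without re-localization.
        TRANSITIONS.put(State.STOPPING,
            Map.of(Event.CONTAINER_EXITED_WITH_SUCCESS, State.LOCALIZED,
                   Event.CONTAINER_KILLED_ON_REQUEST, State.LOCALIZED,
                   Event.CONTAINER_EXITED_WITH_FAILURE, State.LOCALIZED));
    }

    static State next(State s, Event e) {
        State n = TRANSITIONS.getOrDefault(s, Map.of()).get(e);
        if (n == null) {
            throw new IllegalStateException("invalid transition: " + s + " on " + e);
        }
        return n;
    }
}
```

Note how the table makes the restart cycle explicit: LOCALIZED → LAUNCHING → RUNNING → STOPPING → LOCALIZED, with no path back to LAUNCHING except through a fresh START_CONTAINER.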
[jira] [Updated] (YARN-5251) Yarn CLI to obtain App logs for last 'n' bytes fails with 'java.io.IOException' and for 'n' bytes fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5251: Attachment: YARN-5251.1.patch > Yarn CLI to obtain App logs for last 'n' bytes fails with > 'java.io.IOException' and for 'n' bytes fails with NumberFormatException > -- > > Key: YARN-5251 > URL: https://issues.apache.org/jira/browse/YARN-5251 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sumana Sathish >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-5251.1.patch > > > {code} > yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 > on finished application > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens > and #1 secret keys for NM use for launching container > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertok" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Long.parseLong(Long.java:589) > at java.lang.Long.parseLong(Long.java:631) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} > {code} > yarn logs -applicationId application_1465421211793_0004 -containerId > container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000 > Exception in thread "main" java.io.IOException: The bytes were 
skipped are > different from the caller requested > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
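For context on what the `-size` option is meant to do (a positive value fetches the first n bytes, a negative value the last n bytes), here is a minimal sketch of those semantics over an in-memory log. The `LogSizeSemantics` class and `slice` helper are purely illustrative, not the AggregatedLogFormat code being fixed:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: "-size n" -> first n bytes, "-size -n" -> last n bytes.
public class LogSizeSemantics {
    static String slice(byte[] log, long size) {
        // Clamp the requested count to the log length so we never over-read.
        int count = (int) Math.min(Math.abs(size), (long) log.length);
        // Positive sizes read from the head, negative sizes from the tail.
        int start = size >= 0 ? 0 : log.length - count;
        return new String(Arrays.copyOfRange(log, start, start + count),
            StandardCharsets.UTF_8);
    }
}
```

The IOException in the second trace suggests the skip-ahead path does not handle the negative (tail) case against the actual file length, which is exactly the clamping shown above.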
[jira] [Commented] (YARN-5241) FairScheduler fails to release container it is just allocated
[ https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330540#comment-15330540 ] Karthik Kambatla commented on YARN-5241: Thanks for updating the patch, [~chenfolin]. I understand the issue now and the approach in the patch seems reasonable. For improved readability, can we do the following: # Set needRelease if either newlyAllocatedContainers or liveContainers has it. # Also, it would be nice to remove from both collections at the same time. The code would likely look like: {code} boolean needRelease = newlyAllocatedContainers.contains(rmContainer) || liveContainers.containsKey(rmContainer.getContainerId()); newlyAllocatedContainers.remove(rmContainer); liveContainers.remove(rmContainer.getContainerId()); {code} # The indentation of the if check for whether debug logging is enabled is incorrect. Mind updating that? > FairScheduler fails to release container it is just allocated > - > > Key: YARN-5241 > URL: https://issues.apache.org/jira/browse/YARN-5241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2 >Reporter: ChenFolin > Attachments: YARN-5241-001.patch, YARN-5241-002.patch, > repeatContainerCompleted.log > > > The NodeManager heartbeat event NODE_UPDATE and an ApplicationMaster allocate > operation may cause a repeated container-completed event, which can lead to > problems. Node#releaseContainer can prevent a repeated release operation, > like: > public synchronized void releaseContainer(Container container) { > if (!isValidContainer(container.getId())) { > LOG.error("Invalid container released " + container); > return; > } > FSAppAttempt#containerCompleted does not prevent a repeated container-completed > operation. > Detailed logs are in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
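Karthik's suggested refactoring above can be sketched end-to-end. The `ReleaseGuard` class below is a hypothetical, simplified stand-in for FSAppAttempt's bookkeeping (plain strings instead of RMContainer), not the actual patch; it only illustrates the "check both collections, then remove from both" pattern that makes a second release of the same container a no-op:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of FSAppAttempt's newlyAllocatedContainers
// and liveContainers; shows why removing from both collections at once
// makes a repeated release idempotent.
public class ReleaseGuard {
    private final Set<String> newlyAllocatedContainers = new HashSet<>();
    private final Map<String, String> liveContainers = new HashMap<>();

    void allocate(String containerId) {
        newlyAllocatedContainers.add(containerId);
        liveContainers.put(containerId, containerId);
    }

    // Returns true only the first time a given container is released.
    boolean release(String containerId) {
        boolean needRelease = newlyAllocatedContainers.contains(containerId)
            || liveContainers.containsKey(containerId);
        // Remove from both collections at the same time, as suggested;
        // a second call finds neither collection holding the container.
        newlyAllocatedContainers.remove(containerId);
        liveContainers.remove(containerId);
        return needRelease;
    }
}
```

A duplicate NODE_UPDATE/allocate race then simply produces a second `release` call that returns false instead of double-completing the container.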
[jira] [Commented] (YARN-5241) FairScheduler fails to release container it is just allocated
[ https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330542#comment-15330542 ] Karthik Kambatla commented on YARN-5241: Oh, and it would be good to add a unit test. > FairScheduler fails to release container it is just allocated > - > > Key: YARN-5241 > URL: https://issues.apache.org/jira/browse/YARN-5241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2 >Reporter: ChenFolin > Attachments: YARN-5241-001.patch, YARN-5241-002.patch, > repeatContainerCompleted.log > > > The NodeManager heartbeat event NODE_UPDATE and an ApplicationMaster allocate > operation may cause a repeated container-completed event, which can lead to > problems. Node#releaseContainer can prevent a repeated release operation, > like: > public synchronized void releaseContainer(Container container) { > if (!isValidContainer(container.getId())) { > LOG.error("Invalid container released " + container); > return; > } > FSAppAttempt#containerCompleted does not prevent a repeated container-completed > operation. > Detailed logs are in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5241) FairScheduler fails to release container it is just allocated
[ https://issues.apache.org/jira/browse/YARN-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-5241: --- Summary: FairScheduler fails to release container it is just allocated (was: FairScheduler repeat container completed) > FairScheduler fails to release container it is just allocated > - > > Key: YARN-5241 > URL: https://issues.apache.org/jira/browse/YARN-5241 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.5.0, 2.6.1, 2.8.0, 2.7.2 >Reporter: ChenFolin > Attachments: YARN-5241-001.patch, YARN-5241-002.patch, > repeatContainerCompleted.log > > > The NodeManager heartbeat event NODE_UPDATE and an ApplicationMaster allocate > operation may cause a repeated container-completed event, which can lead to > problems. Node#releaseContainer can prevent a repeated release operation, > like: > public synchronized void releaseContainer(Container container) { > if (!isValidContainer(container.getId())) { > LOG.error("Invalid container released " + container); > return; > } > FSAppAttempt#containerCompleted does not prevent a repeated container-completed > operation. > Detailed logs are in the attached file. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4876) [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
[ https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330501#comment-15330501 ] Arun Suresh commented on YARN-4876: --- [~vinodkv], [~jianhe], [~vvasudev] .. would like to hear your thoughts on the above.. > [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop > -- > > Key: YARN-4876 > URL: https://issues.apache.org/jira/browse/YARN-4876 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Marco Rabozzi > Attachments: YARN-4876-design-doc.pdf, YARN-4876.002.patch, > YARN-4876.01.patch > > > Introduce *initialize* and *destroy* container API into the > *ContainerManagementProtocol* and decouple the actual start of a container > from the initialization. This will allow AMs to re-start a container without > having to lose the allocation. > Additionally, if the localization of the container is associated to the > initialize (and the cleanup with the destroy), This can also be used by > applications to upgrade a Container by *re-initializing* with a new > *ContainerLaunchContext* -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4876) [Phase 1] Decoupled Init / Destroy of Containers from Start / Stop
[ https://issues.apache.org/jira/browse/YARN-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330496#comment-15330496 ] Arun Suresh commented on YARN-4876: --- Aggregating and posting some design points on the patch based on offline discussions with [~marco.rabozzi] : h4. ContainerImpl state machine In the current patch, containers that are initialized using the new initializeContainers API keep waiting for startContainers requests within the LOCALIZED state after resource localization. When the START_CONTAINER event is generated upon request from the application master, the container transitions to a new LAUNCHING state, waiting for a CONTAINER_LAUNCHED event (fired asynchronously by ContainerLaunch when the container process is being started). Upon receiving the CONTAINER_LAUNCHED event, the container state is updated to RUNNING. For containers that do not allow multi-start (i.e. those that are initialized and started using the standard startContainers API), the START_CONTAINER event is automatically sent after localization. The role of the new “LAUNCHING” state is to make a clear distinction between the following two situations: # The container has been localized and is waiting for a start request (LOCALIZED state) # The container has received a start request and is being started (LAUNCHING state) In this fashion, we can allow a start (or a restart) of an idle container only if the container is in the LOCALIZED state and it allows multi-start. From a first analysis, it seems that the new LAUNCHING state and the already present RELAUNCHING state could be merged into a single LAUNCHING state to reduce the state machine complexity. The destroyContainers API is equivalent to stopContainers if the specified containers do not allow multi-start. On the other hand, in the case of a container that allows multi-start, the stopContainers API kills the container process and reverts the container state machine to “LOCALIZED”. 
However, in order to properly catch the termination of a container process for which a stop request has been issued, an additional “STOPPING” state has been introduced. If the container is in the RUNNING state and it allows multi-start, the application master can issue a stopContainers request, upon which the container state is updated to STOPPING and an asynchronous request to kill the container process is sent. Within the STOPPING state, similarly to the KILLING state, the container termination events (CONTAINER_EXITED_WITH_SUCCESS, CONTAINER_KILLED_ON_REQUEST, CONTAINER_EXITED_WITH_FAILURE) are considered a successful container stop, upon which the container state reverts to LOCALIZED. h4. Working directory cleanup When a container is in the LOCALIZED state and multi-start is enabled, the application master can issue the following 3 new types of requests: # StartContainers (ContainerLaunchContext == NULL) # InitializeContainers # StartContainers (ContainerLaunchContext != NULL) In case 1) the container is simply started using the ContainerLaunchContext issued in the previous InitializeContainers request (the state machine transitions for this case are the ones described in the previous section). Cases 2) and 3) both perform reinitialization and relocalization of container resources; the only difference is that in 3) the container is also started after relocalization. Currently, when the container is reinitialized, the container working directory is deleted to ensure a clean state for subsequent container starts. We could relax this behavior and allow the application master to specify a deletion policy for container reinitialization. Depending on the requirements, we might want to address this aspect here or in a follow-up JIRA. h4. Log handling Currently, there is no special handling of logs for a restarted container. The application master can decide either to append the new logs to the old ones or to overwrite the old logs. 
This can be achieved simply by changing the launch command (e.g. in Linux, use “>>” to append and “>” to overwrite). h4. Token expiration Both the InitializeContainers and the StartContainers APIs require a container token to authorize the request. For long-running containers, the token might expire and the application master won’t be able to request a restart or a reinitialization of a container. This limitation also currently holds for the IncreaseContainerResource API. We might need to address container token renewal in a separate JIRA. h4. Recovery for containers that allow multi-start The current patch does not fully support recovery of containers that allow multi-start. Indeed, after a restart of the NodeManager, if the container is not running, the NodeManager cannot distinguish between a stopped container waiting for a start request and a container that completed its execution successfully. Additional
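The append-vs-overwrite choice described above is plain shell redirection baked into the container launch command. The helper and file names below are illustrative (this is not a YARN API), but they show the only difference between the two policies:

```java
// Illustrative helper: builds a Linux launch command that either appends
// to or overwrites the previous run's log file. Not part of any YARN API.
public class LaunchCommand {
    static String withRedirect(String cmd, String logFile, boolean append) {
        // ">>" appends across restarts; ">" truncates on each start.
        return cmd + (append ? " >> " : " > ") + logFile + " 2>&1";
    }
}
```

An application master that wants the restarted container's output in the same file would simply set the appending form of the command in the ContainerLaunchContext it submits.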
[jira] [Updated] (YARN-5251) Yarn CLI to obtain App logs for last 'n' bytes fails with 'java.io.IOException' and for 'n' bytes fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5251: Priority: Blocker (was: Major) > Yarn CLI to obtain App logs for last 'n' bytes fails with > 'java.io.IOException' and for 'n' bytes fails with NumberFormatException > -- > > Key: YARN-5251 > URL: https://issues.apache.org/jira/browse/YARN-5251 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Sumana Sathish >Assignee: Xuan Gong >Priority: Blocker > > {code} > yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 > on finished application > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens > and #1 secret keys for NM use for launching container > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertok" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Long.parseLong(Long.java:589) > at java.lang.Long.parseLong(Long.java:631) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} > {code} > yarn logs -applicationId application_1465421211793_0004 -containerId > container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000 > Exception in thread "main" java.io.IOException: The bytes were skipped are > different from the caller 
requested > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5249) app logs with 'n' bytes via CLI fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330273#comment-15330273 ] Xuan Gong commented on YARN-5249: - Close this as a duplicate. We will fix this in YARN-5251 > app logs with 'n' bytes via CLI fails with NumberFormatException > > > Key: YARN-5249 > URL: https://issues.apache.org/jira/browse/YARN-5249 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.0 >Reporter: Sumana Sathish >Assignee: Xuan Gong > Attachments: YARN-5249.1.patch > > > app logs with 'n' bytes via CLI fails with NumberFormatException for finished > application > {code} > yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 > on finished application > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens > and #1 secret keys for NM use for launching container > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertok" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Long.parseLong(Long.java:589) > at java.lang.Long.parseLong(Long.java:631) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For 
additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5231) obtaining app logs for last 'n' bytes using CLI gives 'java.io.IOException'
[ https://issues.apache.org/jira/browse/YARN-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330244#comment-15330244 ] Xuan Gong commented on YARN-5231: - Close this as duplicate. We will fix this in YARN-5251 > obtaining app logs for last 'n' bytes using CLI gives 'java.io.IOException' > --- > > Key: YARN-5231 > URL: https://issues.apache.org/jira/browse/YARN-5231 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.9.0 >Reporter: Sumana Sathish >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-5231.1.patch > > > Obtaining logs for last 'n' bytes gives the following exception > {code} > yarn logs -applicationId application_1465421211793_0004 -containerId > container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000 > Exception in thread "main" java.io.IOException: The bytes were skipped are > different from the caller requested > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5251) Yarn CLI to obtain App logs for last 'n' bytes fails with 'java.io.IOException' and for 'n' bytes fails with NumberFormatException
Xuan Gong created YARN-5251: --- Summary: Yarn CLI to obtain App logs for last 'n' bytes fails with 'java.io.IOException' and for 'n' bytes fails with NumberFormatException Key: YARN-5251 URL: https://issues.apache.org/jira/browse/YARN-5251 Project: Hadoop YARN Issue Type: Sub-task Reporter: Sumana Sathish Assignee: Xuan Gong {code} yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 on finished application 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens and #1 secret keys for NM use for launching container 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertok" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:589) at java.lang.Long.parseLong(Long.java:631) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) {code} {code} yarn logs -applicationId application_1465421211793_0004 -containerId container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000 Exception in thread "main" java.io.IOException: The bytes were skipped are different from the caller requested at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838) at 
org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300) at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224) at org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447) at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782) at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228) at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330212#comment-15330212 ] Varun Saxena commented on YARN-5243: [~sjlee0], thanks for the patch. The patch looks good. However, there are a bunch of unused/duplicate imports too in {{ResourceManager.java}}. Probably introduced during rebase. This can be fixed here as well. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5250) YARN queues are case-sensitive
[ https://issues.apache.org/jira/browse/YARN-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Tangri updated YARN-5250: Description: Hi, capacity scheduler on YARN give error if users trying to use queue name in case in-sensitive manner. for eg if a queue name is 'abc'. If user submited to queue 'Abc', it does not let user submit the job. This is correct behavior from current implementation but should queues not be case insensitive ? Most production deployments will not have queues with name as 'abc' and 'Abc'. Thanks, Anurag Tangri was: Hi, capacity scheduler on YARN give error if users trying to use queue name in case in-sensitive manner. for eg if a queue name is 'abc'. If user submited to queue 'Abc', it does not let user submit the job. This is correct behavior from current implementation but should queues not be case insensitive ? Most production deployments will not have queeus with name as 'abc' and 'Abc'. Thanks, Anurag Tangri > YARN queues are case-sensitive > -- > > Key: YARN-5250 > URL: https://issues.apache.org/jira/browse/YARN-5250 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.6.0 >Reporter: Anurag Tangri >Priority: Minor > > Hi, > the capacity scheduler on YARN gives an error if users try to use a queue name > in a case-insensitive manner. > For example, if a queue name is 'abc' and a user submits to queue 'Abc', it > does not let the user submit the job. > This is correct behavior in the current implementation, but should queues not > be case-insensitive? > Most production deployments will not have queues named both 'abc' and 'Abc'. > Thanks, > Anurag Tangri -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
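If queues were made case-insensitive as the reporter suggests, submissions would need to normalize queue names against the configured set. The `QueueRegistry` class below is a hypothetical sketch of that lookup, not the actual CapacityScheduler code:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of case-insensitive queue lookup; not the actual
// CapacityScheduler code. Names are folded to lower case once at
// registration and again at lookup, so 'Abc' resolves to queue 'abc'.
public class QueueRegistry {
    private final Map<String, String> queues = new HashMap<>();

    void register(String name) {
        queues.put(name.toLowerCase(Locale.ROOT), name);
    }

    // Returns the canonical (configured) queue name, or null if unknown.
    String resolve(String submitted) {
        return queues.get(submitted.toLowerCase(Locale.ROOT));
    }
}
```

Using Locale.ROOT for the fold avoids locale-dependent surprises (e.g. the Turkish dotless i) when the same configuration is read on differently configured hosts.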
[jira] [Commented] (YARN-5249) app logs with 'n' bytes via CLI fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330186#comment-15330186 ] Hadoop QA commented on YARN-5249: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 2s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 
1m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 9s {color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 48s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 36s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 39s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 50s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.logaggregation.TestAggregatedLogFormat | | | hadoop.yarn.client.cli.TestLogsCLI | | | hadoop.yarn.client.api.impl.TestYarnClient | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810575/YARN-5249.1.patch | | JIRA Issue | YARN-5249 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux e17de02c52ae 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8e8cb4c | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit |
[jira] [Updated] (YARN-5250) YARN queues are case-sensitive
[ https://issues.apache.org/jira/browse/YARN-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Tangri updated YARN-5250: Description: Hi, capacity scheduler on YARN give error if users trying to use queue name in case in-sensitive manner. for eg if a queue name is 'abc'. If user submited to queue 'Abc', it does not let user submit the job. This is correct behavior from current implementation but should queues not be case insensitive ? Most production deployments will not have queeus with name as 'abc' and 'Abc'. Thanks, Anurag Tangri was: Hi, capacity scheduler on YARN give error if users trying to use queue name in case in-sensitive manner. for eg if a queue name is 'abc'. If user submited to queue 'Abc', it does not let uer submit the job. This is correct behavior from current implementation but should queues not be case insensitive ? Most production deployments will not have queeus with name as 'abc' and 'Abc'. Thanks, Anurag Tangri > YARN queues are case-sensitive > -- > > Key: YARN-5250 > URL: https://issues.apache.org/jira/browse/YARN-5250 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.6.0 >Reporter: Anurag Tangri >Priority: Minor > > Hi, > capacity scheduler on YARN give error if users trying to use queue name in > case in-sensitive manner. > for eg if a queue name is 'abc'. > If user submited to queue 'Abc', it does not let user submit the job. > This is correct behavior from current implementation but should queues not be > case insensitive ? > Most production deployments will not have queeus with name as 'abc' and 'Abc'. > Thanks, > Anurag Tangri -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5231) obtaining app logs for last 'n' bytes using CLI gives 'java.io.IOException'
[ https://issues.apache.org/jira/browse/YARN-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330174#comment-15330174 ] Xuan Gong commented on YARN-5231: - [~djp] We already have the test cases in TestLogsCLI to cover this > obtaining app logs for last 'n' bytes using CLI gives 'java.io.IOException' > --- > > Key: YARN-5231 > URL: https://issues.apache.org/jira/browse/YARN-5231 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.9.0 >Reporter: Sumana Sathish >Assignee: Xuan Gong >Priority: Blocker > Attachments: YARN-5231.1.patch > > > Obtaining logs for last 'n' bytes gives the following exception > {code} > yarn logs -applicationId application_1465421211793_0004 -containerId > container_e07_1465421211793_0004_01_01 -logFiles syslog -size -1000 > Exception in thread "main" java.io.IOException: The bytes were skipped are > different from the caller requested > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogsForALogType(AggregatedLogFormat.java:838) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainerLogsForALogType(LogCLIHelpers.java:300) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAContainersLogsForALogTypeWithoutNodeId(LogCLIHelpers.java:224) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.printContainerLogsForFinishedApplicationWithoutNodeId(LogsCLI.java:447) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:782) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:228) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
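The IOException above ("The bytes were skipped are different from the caller requested") comes from the skip-based implementation of the "last n bytes" option. The arithmetic involved can be sketched in plain Java; names here are illustrative only, this is not the actual {{AggregatedLogFormat$LogReader}} code:

```java
// Sketch of a correct "last |size| bytes" computation for a log stream of
// known length. Hypothetical helper, not Hadoop code.
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class LogTail {

    /** How many bytes to skip so that at most |size| bytes remain.
     *  A negative size means "the last |size| bytes", per the CLI contract. */
    public static long bytesToSkip(long totalLen, long size) {
        if (size >= 0) {
            return 0; // non-negative size: read from the start
        }
        long wanted = -size;
        return Math.max(0, totalLen - wanted); // clamp: short logs are read whole
    }

    /** Read the last |size| bytes (or the whole stream if it is shorter). */
    public static byte[] tail(InputStream in, long totalLen, long size) {
        try {
            long toSkip = bytesToSkip(totalLen, size);
            long skipped = 0;
            while (skipped < toSkip) { // skip() may skip fewer bytes than asked
                long n = in.skip(toSkip - skipped);
                if (n <= 0) {
                    throw new IOException("could not skip " + toSkip + " bytes");
                }
                skipped += n;
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key point the bug exercises: the number of bytes actually skipped must be compared against the clamped target, not against the raw requested size.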
[jira] [Commented] (YARN-5224) Logs for a completed container are not available in the yarn logs output for a live application
[ https://issues.apache.org/jira/browse/YARN-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330172#comment-15330172 ] Vinod Kumar Vavilapalli commented on YARN-5224: --- bq. We could still get the logs if the container logs still exist in NM Local log directory even if we remove container from NMContext. bq. To solve this issue, we need to create a separate NMWebService to get log name of the container. Instead of a completely new path you added - {{"/containerlogs/container/$containerid$/logfiles"}}, the following would be more consistent - Existing {{"/containers/$containerid"}} returns container-info (and the list of containerLogFiles inside the info) - New {{"/containers/$containerid/logs"}} returns names of the log-files, and may be simple metadata like size of log-file etc - Existing logs-webservice should have been {{"/containerlogs/$containerid/$filename"}} -> {{"/containers/$containerid$/logs/$filename"}} returns contents. Obviously we cannot remove the old API, as it is public. The nice thing is this is the same pattern that the UI flow also follows. /cc [~vvasudev] > Logs for a completed container are not available in the yarn logs output for > a live application > --- > > Key: YARN-5224 > URL: https://issues.apache.org/jira/browse/YARN-5224 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.0 >Reporter: Siddharth Seth >Assignee: Xuan Gong > Attachments: YARN-5224.1.patch, YARN-5224.2.patch, YARN-5224.3.patch > > > This affects 'short' jobs like MapReduce and Tez more than long running apps. > Related: YARN-5193 (but that only covers long running apps) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
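The URL layout proposed in the comment above can be summarized with a tiny path-building helper. This only encodes the patterns named in the comment; it is not NM web-service code:

```java
// Illustrative route builder for the old and proposed container-log paths.
// The method names are invented here; only the URL shapes come from the comment.
public class LogRoutes {

    /** Old, public API: kept for compatibility. */
    public static String oldLogFileRoute(String containerId, String file) {
        return "/containerlogs/" + containerId + "/" + file;
    }

    /** Proposed: names of log files (plus simple metadata like size). */
    public static String logsListRoute(String containerId) {
        return "/containers/" + containerId + "/logs";
    }

    /** Proposed: contents of one log file, consistent with /containers/{id}. */
    public static String logFileRoute(String containerId, String file) {
        return "/containers/" + containerId + "/logs/" + file;
    }
}
```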
[jira] [Commented] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330131#comment-15330131 ] Hadoop QA commented on YARN-4143: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 41s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 51s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810564/YARN-4143.002.patch | | JIRA Issue | YARN-4143 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux cce0395aff7c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8e8cb4c | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12014/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/12014/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12014/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12014/console |
[jira] [Commented] (YARN-5085) Add support for change of container ExecutionType
[ https://issues.apache.org/jira/browse/YARN-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330116#comment-15330116 ] Roni Burd commented on YARN-5085: - Regarding: I see the need for promotion, but why would we want to demote? A couple of scenarios come to mind: 1) long-running processes - a user may temporarily upgrade some tokens to catch up after a live-site incident or to deal with an important skew in an event. 2) A distributed scheduler or RM may race some containers to make a deadline or to optimize latency in the face of limited containers in its queue. Demotion is basically a way to flip one container from OP -> GUA and another from GUA -> OP while preserving the queue limit. > Add support for change of container ExecutionType > - > > Key: YARN-5085 > URL: https://issues.apache.org/jira/browse/YARN-5085 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Arun Suresh > > YARN-2882 introduced the concept of {{ExecutionType}} for containers and it > also introduced the concept of OPPORTUNISTIC ExecutionType. > YARN-4335 introduced changes to the ResourceRequest so that AMs may request > that the Container allocated against the ResourceRequest is of a particular > {{ExecutionType}}. > This JIRA proposes to provide support for the AM to change the ExecutionType > of a previously requested Container. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
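The OP -> GUA / GUA -> OP flip described in the comment can be sketched as a paired type swap that leaves the number of GUARANTEED containers, and hence the queue limit, unchanged. The Map-based bookkeeping and method names below are hypothetical, not the YARN-5085 design:

```java
// Sketch of a promotion/demotion swap that preserves the GUARANTEED count.
import java.util.Map;

public class ExecTypeSwap {

    public enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    /** Promote one container and demote another in a single step, so the
     *  count of GUARANTEED containers (the queue's limit) is preserved.
     *  Returns false if the pair does not have the expected opposite types. */
    public static boolean swap(Map<String, ExecutionType> containers,
                               String promote, String demote) {
        if (containers.get(promote) != ExecutionType.OPPORTUNISTIC
            || containers.get(demote) != ExecutionType.GUARANTEED) {
            return false; // nothing changed: invalid pairing
        }
        containers.put(promote, ExecutionType.GUARANTEED);
        containers.put(demote, ExecutionType.OPPORTUNISTIC);
        return true;
    }
}
```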
[jira] [Created] (YARN-5250) YARN queues are case-sensitive
Anurag Tangri created YARN-5250: --- Summary: YARN queues are case-sensitive Key: YARN-5250 URL: https://issues.apache.org/jira/browse/YARN-5250 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Affects Versions: 2.6.0 Reporter: Anurag Tangri Priority: Minor Hi, the capacity scheduler in YARN gives an error if a user tries to use a queue name in a case-insensitive manner. For example, if a queue is named 'abc' and a user submits to queue 'Abc', the submission is rejected. This is the correct behavior under the current implementation, but should queues not be case-insensitive? Most production deployments will not have queues named both 'abc' and 'Abc'. Thanks, Anurag Tangri -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
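One way to get the case-insensitive behavior the report asks for is to key the queue table on a lower-cased name while preserving the configured spelling. A minimal sketch (hypothetical helper, not CapacityScheduler code):

```java
// Case-insensitive queue lookup: 'abc' and 'Abc' resolve to the same queue.
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class QueueLookup {

    // Maps lower-cased name -> configured name.
    private final Map<String, String> byLowerCase = new HashMap<>();

    public void addQueue(String name) {
        // Locale.ROOT avoids surprises under locales like Turkish ('I'/'i').
        byLowerCase.put(name.toLowerCase(Locale.ROOT), name);
    }

    /** Returns the configured queue name, or null if no match exists. */
    public String resolve(String submitted) {
        return byLowerCase.get(submitted.toLowerCase(Locale.ROOT));
    }
}
```

Note that a deployment that really did define both 'abc' and 'Abc' would collide under this scheme, which is why the report argues such configurations are unlikely in practice.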
[jira] [Updated] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4844: - Attachment: YARN-4844-branch-2.8.addendum.2.patch Attached patch for branch-2.8 as well. > Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource > - > > Key: YARN-4844 > URL: https://issues.apache.org/jira/browse/YARN-4844 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 2.8.0 > > Attachments: YARN-4844-branch-2.8.0016_.patch, > YARN-4844-branch-2.8.addendum.2.patch, YARN-4844-branch-2.addendum.1_.patch, > YARN-4844-branch-2.addendum.2.patch, YARN-4844.1.patch, YARN-4844.10.patch, > YARN-4844.11.patch, YARN-4844.12.patch, YARN-4844.13.patch, > YARN-4844.14.patch, YARN-4844.15.patch, YARN-4844.16.branch-2.patch, > YARN-4844.16.patch, YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch, > YARN-4844.5.patch, YARN-4844.6.patch, YARN-4844.7.patch, > YARN-4844.8.branch-2.patch, YARN-4844.8.patch, YARN-4844.9.branch, > YARN-4844.9.branch-2.patch > > > We use int32 for memory now, if a cluster has 10k nodes, each node has 210G > memory, we will get a negative total cluster memory. > And another case that easier overflows int32 is: we added all pending > resources of running apps to cluster's total pending resources. If a > problematic app requires too much resources (let's say 1M+ containers, each > of them has 3G containers), int32 will be not enough. > Even if we can cap each app's pending request, we cannot handle the case that > there're many running apps, each of them has capped but still significant > numbers of pending resources. > So we may possibly need to add getMemoryLong/getVirtualCoreLong to > o.a.h.y.api.records.Resource. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5249) app logs with 'n' bytes via CLI fails with NumberFormatException
[ https://issues.apache.org/jira/browse/YARN-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5249: Attachment: YARN-5249.1.patch > app logs with 'n' bytes via CLI fails with NumberFormatException > > > Key: YARN-5249 > URL: https://issues.apache.org/jira/browse/YARN-5249 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.0 >Reporter: Sumana Sathish >Assignee: Xuan Gong > Attachments: YARN-5249.1.patch > > > app logs with 'n' bytes via CLI fails with NumberFormatException for a > finished application > {code} > yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 > on finished application > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens > and #1 secret keys for NM use for launching container > 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of > containertok" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Long.parseLong(Long.java:589) > at java.lang.Long.parseLong(Long.java:631) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) > at > org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) > at > org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) > at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) > at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5249) app logs with 'n' bytes via CLI fails with NumberFormatException
Xuan Gong created YARN-5249: --- Summary: app logs with 'n' bytes via CLI fails with NumberFormatException Key: YARN-5249 URL: https://issues.apache.org/jira/browse/YARN-5249 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.9.0 Reporter: Sumana Sathish Assignee: Xuan Gong app logs with 'n' bytes via CLI fails with NumberFormatException for a finished application {code} yarn logs -applicationId application_1465421211793_0017 -size 1024 >> appLog1 on finished application 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens and #1 secret keys for NM use for launching container 2016-06-13 18:44:25,989 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Size of containertok" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:589) at java.lang.Long.parseLong(Long.java:631) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readContainerLogs(AggregatedLogFormat.java:691) at org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.readAContainerLogsForALogType(AggregatedLogFormat.java:767) at org.apache.hadoop.yarn.logaggregation.LogCLIHelpers.dumpAllContainersLogs(LogCLIHelpers.java:354) at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchApplicationLogs(LogsCLI.java:830) at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:231) at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:264) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
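The stack trace shows Long.parseLong failing on a corrupted length field read out of the aggregated log (the string "containertok\"" is log content, not a number). A defensive parse that degrades gracefully instead of aborting the whole dump might look like this; the recovery policy (returning -1) is hypothetical, not the fix in the attached patch:

```java
// Defensive parsing of a per-file length field from an aggregated log.
public class LogLengthParser {

    /** Parse a length field; returns -1 instead of throwing when the field
     *  is corrupt or negative, so the caller can skip the bad entry. */
    public static long parseLength(String field) {
        if (field == null) {
            return -1;
        }
        try {
            long len = Long.parseLong(field.trim());
            return len < 0 ? -1 : len; // a negative length is also corrupt
        } catch (NumberFormatException e) {
            return -1; // e.g. log content bled into the length field
        }
    }
}
```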
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330029#comment-15330029 ] Varun Saxena commented on YARN-5243: I have narrowed down the fix to YARN-5118. It's an existing trunk issue. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan reopened YARN-4844: -- Reopening the issue to address the following: 1) In Resource.java, add default implementations for the new methods, to avoid breaking downstream projects that use this mocked object in tests. 2) Revert the mapreduce.JobStatus changes (from int to long); even though this is an Evolving API, the change could break downstream projects that use a mocked JobStatus in tests. > Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource > - > > Key: YARN-4844 > URL: https://issues.apache.org/jira/browse/YARN-4844 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 2.8.0 > > Attachments: YARN-4844-branch-2.8.0016_.patch, > YARN-4844-branch-2.addendum.1_.patch, YARN-4844-branch-2.addendum.2.patch, > YARN-4844.1.patch, YARN-4844.10.patch, YARN-4844.11.patch, > YARN-4844.12.patch, YARN-4844.13.patch, YARN-4844.14.patch, > YARN-4844.15.patch, YARN-4844.16.branch-2.patch, YARN-4844.16.patch, > YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch, YARN-4844.5.patch, > YARN-4844.6.patch, YARN-4844.7.patch, YARN-4844.8.branch-2.patch, > YARN-4844.8.patch, YARN-4844.9.branch, YARN-4844.9.branch-2.patch > > > We use int32 for memory now, if a cluster has 10k nodes, each node has 210G > memory, we will get a negative total cluster memory. > And another case that easier overflows int32 is: we added all pending > resources of running apps to cluster's total pending resources. If a > problematic app requires too much resources (let's say 1M+ containers, each > of them has 3G containers), int32 will be not enough. > Even if we can cap each app's pending request, we cannot handle the case that > there're many running apps, each of them has capped but still significant > numbers of pending resources. 
> So we may possibly need to add getMemoryLong/getVirtualCoreLong to > o.a.h.y.api.records.Resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
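The int32 overflow described in the issue is easy to reproduce: 10,000 nodes at 210 GB each is 2,150,400,000 MB, which exceeds Integer.MAX_VALUE (2,147,483,647) and wraps negative. A minimal demonstration of why a long-returning getter is needed (the class and method names here are illustrative, not the YARN-4844 API):

```java
// Demonstrates the int32 overflow from the issue description:
// 10k nodes x 210 GB (in MB) exceeds Integer.MAX_VALUE and wraps negative.
public class ClusterMemory {

    static final int NODES = 10_000;
    static final int NODE_MEM_MB = 210 * 1024; // 210 GB per node, in MB

    /** int arithmetic: overflows and yields a negative total. */
    public static int totalMemInt() {
        return NODES * NODE_MEM_MB;
    }

    /** long arithmetic: the widening cast must happen before the multiply. */
    public static long totalMemLong() {
        return (long) NODES * NODE_MEM_MB;
    }
}
```

Note the cast placement: `(long) (NODES * NODE_MEM_MB)` would overflow first and then widen the already-wrong result.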
[jira] [Updated] (YARN-4844) Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource
[ https://issues.apache.org/jira/browse/YARN-4844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4844: - Attachment: YARN-4844-branch-2.addendum.2.patch > Add getMemorySize/getVirtualCoresSize to o.a.h.y.api.records.Resource > - > > Key: YARN-4844 > URL: https://issues.apache.org/jira/browse/YARN-4844 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan >Priority: Blocker > Fix For: 2.8.0 > > Attachments: YARN-4844-branch-2.8.0016_.patch, > YARN-4844-branch-2.addendum.1_.patch, YARN-4844-branch-2.addendum.2.patch, > YARN-4844.1.patch, YARN-4844.10.patch, YARN-4844.11.patch, > YARN-4844.12.patch, YARN-4844.13.patch, YARN-4844.14.patch, > YARN-4844.15.patch, YARN-4844.16.branch-2.patch, YARN-4844.16.patch, > YARN-4844.2.patch, YARN-4844.3.patch, YARN-4844.4.patch, YARN-4844.5.patch, > YARN-4844.6.patch, YARN-4844.7.patch, YARN-4844.8.branch-2.patch, > YARN-4844.8.patch, YARN-4844.9.branch, YARN-4844.9.branch-2.patch > > > We use int32 for memory now, if a cluster has 10k nodes, each node has 210G > memory, we will get a negative total cluster memory. > And another case that easier overflows int32 is: we added all pending > resources of running apps to cluster's total pending resources. If a > problematic app requires too much resources (let's say 1M+ containers, each > of them has 3G containers), int32 will be not enough. > Even if we can cap each app's pending request, we cannot handle the case that > there're many running apps, each of them has capped but still significant > numbers of pending resources. > So we may possibly need to add getMemoryLong/getVirtualCoreLong to > o.a.h.y.api.records.Resource. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4143) Optimize the check for AMContainer allocation needed by blacklisting and ContainerType
[ https://issues.apache.org/jira/browse/YARN-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4143: --- Attachment: YARN-4143.002.patch Since [~adhoot] doesn't seem to be working on this one anymore, I just rebased it and posted a new patch. This patch turns out to be considerably simpler than its predecessor because some of the cleanup that was in the original patch made it in already. [~sunilg], wanna have a look? > Optimize the check for AMContainer allocation needed by blacklisting and > ContainerType > -- > > Key: YARN-4143 > URL: https://issues.apache.org/jira/browse/YARN-4143 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Anubhav Dhoot >Assignee: Daniel Templeton > Attachments: YARN-4143.001.patch, YARN-4143.002.patch > > > In YARN-2005 there are checks made to determine if the allocation is for an > AM container. This happens in every allocate call and should be optimized > away since it changes only once per SchedulerApplicationAttempt -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-1773) ShuffleHeader should have a format that can inform about errors
[ https://issues.apache.org/jira/browse/YARN-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329895#comment-15329895 ] Varun Saxena commented on YARN-1773: [~djp], sorry this fell off my radar. Do not have bandwidth to fix it, in short term. You can take it up, if you want. > ShuffleHeader should have a format that can inform about errors > --- > > Key: YARN-1773 > URL: https://issues.apache.org/jira/browse/YARN-1773 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.3.0, 2.4.0 >Reporter: Bikas Saha >Assignee: Varun Saxena >Priority: Critical > > Currently, the ShuffleHeader (which is a Writable) simply tries to read the > successful header (mapid, reduceid etc). If there is an error then the input > will have an error message instead of (mapid, reducedid etc). Thus parsing > the ShuffleHeader fails and since we dont know where the error message ends, > we cannot consume the remaining input stream which may have good data from > the remaining map outputs. Being able to encode the error in the > ShuffleHeader will let us parse out the error correctly and move on to the > remaining data. > The shuffle handler response should say which maps are in error and which are > fine, what the error was for the erroneous maps. These will help report > diagnostics for easier upstream reporting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5070) upgrade HBase version for first merge
[ https://issues.apache.org/jira/browse/YARN-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329892#comment-15329892 ] Joep Rottinghuis commented on YARN-5070: Thanks [~vrushalic] patch 05 looks good to me. There are a few items that we could improve upon, but those aren't introduced in this patch, so we should tackle them in a separate jira: * Line 261 " + FlowRunRowKey.parseRowKey(cells.get(0).getRow()).toString());" uses a deprecated method. * Line 208 has @SuppressWarnings("deprecation"). Is that really needed? * you already left a TODO, which we can tackle separately as well. > upgrade HBase version for first merge > - > > Key: YARN-5070 > URL: https://issues.apache.org/jira/browse/YARN-5070 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-5070-YARN-2928.01.patch, > YARN-5070-YARN-2928.02.patch, YARN-5070-YARN-2928.03.patch, > YARN-5070-YARN-2928.04.patch, YARN-5070-YARN-2928.05.patch > > > Currently we set the HBase version for the timeline service storage to 1.0.1. > This is a fairly old version, and there are reasons to upgrade to a newer > version. We should upgrade it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5214) Pending on synchronized method DirectoryCollection#checkDirs can hang NM's NodeStatusUpdater
[ https://issues.apache.org/jira/browse/YARN-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329884#comment-15329884 ] Hadoop QA commented on YARN-5214: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | 
{color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 7 unchanged - 2 fixed = 9 total (was 9) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 57s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 26m 3s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810462/YARN-5214.patch | | JIRA Issue | YARN-5214 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 7110c16259f9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8e8cb4c | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12013/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/12013/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/12013/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Pending on synchronized method DirectoryCollection#checkDirs can hang NM's >
[jira] [Commented] (YARN-5224) Logs for a completed container are not available in the yarn logs output for a live application
[ https://issues.apache.org/jira/browse/YARN-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329878#comment-15329878 ] Xuan Gong commented on YARN-5224: - [~djp] The unit test failure is not related to this patch. Created a separate jira: https://issues.apache.org/jira/browse/YARN-5248 to track it > Logs for a completed container are not available in the yarn logs output for > a live application > --- > > Key: YARN-5224 > URL: https://issues.apache.org/jira/browse/YARN-5224 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.0 >Reporter: Siddharth Seth >Assignee: Xuan Gong > Attachments: YARN-5224.1.patch, YARN-5224.2.patch, YARN-5224.3.patch > > > This affects 'short' jobs like MapReduce and Tez more than long running apps. > Related: YARN-5193 (but that only covers long running apps)
[jira] [Created] (YARN-5248) TestLogsCLI#testFetchApplictionLogs fails in trunk/branch-2
Xuan Gong created YARN-5248: --- Summary: TestLogsCLI#testFetchApplictionLogs fails in trunk/branch-2 Key: YARN-5248 URL: https://issues.apache.org/jira/browse/YARN-5248 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong
[jira] [Commented] (YARN-1773) ShuffleHeader should have a format that can inform about errors
[ https://issues.apache.org/jira/browse/YARN-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329865#comment-15329865 ] Junping Du commented on YARN-1773: -- Hi [~varun_saxena], this JIRA has been pending for a while. Do you still plan to fix it? > ShuffleHeader should have a format that can inform about errors > --- > > Key: YARN-1773 > URL: https://issues.apache.org/jira/browse/YARN-1773 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.3.0, 2.4.0 >Reporter: Bikas Saha >Assignee: Varun Saxena >Priority: Critical > > Currently, the ShuffleHeader (which is a Writable) simply tries to read the > successful header (mapid, reduceid etc). If there is an error then the input > will have an error message instead of (mapid, reduceid etc). Thus parsing > the ShuffleHeader fails, and since we don't know where the error message ends, > we cannot consume the remaining input stream, which may have good data from > the remaining map outputs. Being able to encode the error in the > ShuffleHeader will let us parse out the error correctly and move on to the > remaining data. > The shuffle handler response should say which maps are in error and which are > fine, and what the error was for the erroneous maps. These will help report > diagnostics for easier upstream reporting.
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329835#comment-15329835 ] Sangjin Lee commented on YARN-5243: --- I now think that it might be an existing trunk issue. YARN-5117 seems to be what we're seeing. The distributed scheduling feature has been actively merged into trunk via individual JIRAs, and we may need to rebase to the latest to pick up a more complete feature set and the bugfixes along the way. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled.
[jira] [Updated] (YARN-5214) Pending on synchronized method DirectoryCollection#checkDirs can hang NM's NodeStatusUpdater
[ https://issues.apache.org/jira/browse/YARN-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-5214: - Attachment: YARN-5214.patch Thanks [~nroberts]. Attaching a patch that replaces the wholly synchronized methods with a fine-grained read/write lock on the resources (localDirs, errorDirs, fullDirs, numFailures, etc.) shared between threads. The key idea here is to release the lock around testDirs(), which is a heavy IO operation that can get blocked under heavy IO load. Please note that this is not the finest-grained locking possible, as we could aggressively make access to different shared resources wait on different locks, but that would make some of the logic here a bit tricky and could bring a risk of deadlock. > Pending on synchronized method DirectoryCollection#checkDirs can hang NM's > NodeStatusUpdater > > > Key: YARN-5214 > URL: https://issues.apache.org/jira/browse/YARN-5214 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-5214.patch > > > In one cluster, we noticed that the NM's heartbeat to the RM suddenly stopped, and after a > while the NM was marked LOST by the RM. From the log, the NM daemon is still running, > but jstack hints that the NM's NodeStatusUpdater thread is blocked: > 1. 
Node Status Updater thread get blocked by 0x8065eae8 > {noformat} > "Node Status Updater" #191 prio=5 os_prio=0 tid=0x7f0354194000 nid=0x26fa > waiting for monitor entry [0x7f035945a000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.getFailedDirs(DirectoryCollection.java:170) > - waiting to lock <0x8065eae8> (a > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection) > at > org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.getDisksHealthReport(LocalDirsHandlerService.java:287) > at > org.apache.hadoop.yarn.server.nodemanager.NodeHealthCheckerService.getHealthReport(NodeHealthCheckerService.java:58) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.getNodeStatus(NodeStatusUpdaterImpl.java:389) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.access$300(NodeStatusUpdaterImpl.java:83) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:643) > at java.lang.Thread.run(Thread.java:745) > {noformat} > 2. 
The actual holder of this lock is DiskHealthMonitor: > {noformat} > "DiskHealthMonitor-Timer" #132 daemon prio=5 os_prio=0 tid=0x7f0397393000 > nid=0x26bd runnable [0x7f035e511000] >java.lang.Thread.State: RUNNABLE > at java.io.UnixFileSystem.createDirectory(Native Method) > at java.io.File.mkdir(File.java:1316) > at > org.apache.hadoop.util.DiskChecker.mkdirsWithExistsCheck(DiskChecker.java:67) > at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:104) > at > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.verifyDirUsingMkdir(DirectoryCollection.java:340) > at > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.testDirs(DirectoryCollection.java:312) > at > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.checkDirs(DirectoryCollection.java:231) > - locked <0x8065eae8> (a > org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection) > at > org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.checkDirs(LocalDirsHandlerService.java:389) > at > org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.access$400(LocalDirsHandlerService.java:50) > at > org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService$MonitoringTimerTask.run(LocalDirsHandlerService.java:122) > at java.util.TimerThread.mainLoop(Timer.java:555) > at java.util.TimerThread.run(Timer.java:505) > {noformat} > This disk operation can take longer than expected, especially under high IO > throughput, and we should have a fine-grained lock for the related > operations here. > The same issue on HDFS was raised and fixed in HDFS-7489, and we probably > should have a similar fix here.
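The locking change discussed in YARN-5214 can be sketched as follows. This is an illustrative toy, not the actual DirectoryCollection patch; the class and method names mirror the discussion but are assumptions. The point is that the slow disk probing runs with no lock held, and the write lock is taken only for the brief moment the shared lists are swapped, so a reader such as the NodeStatusUpdater blocks for microseconds rather than for the duration of disk IO:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of fine-grained locking for a directory health checker.
public class DirHealth {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private List<String> localDirs = new ArrayList<>();
    private List<String> errorDirs = new ArrayList<>();

    public DirHealth(List<String> dirs) { localDirs.addAll(dirs); }

    // Heartbeat path: a cheap read lock instead of a monitor held across disk IO.
    public List<String> getFailedDirs() {
        lock.readLock().lock();
        try { return new ArrayList<>(errorDirs); }
        finally { lock.readLock().unlock(); }
    }

    // Monitor path: snapshot under read lock, probe with NO lock, publish under write lock.
    public void checkDirs() {
        List<String> dirsToCheck;
        lock.readLock().lock();
        try { dirsToCheck = new ArrayList<>(localDirs); }
        finally { lock.readLock().unlock(); }

        List<String> good = new ArrayList<>(), bad = new ArrayList<>();
        for (String d : dirsToCheck) {          // the heavy IO happens here, unlocked
            if (new File(d).isDirectory()) good.add(d); else bad.add(d);
        }

        lock.writeLock().lock();                // held only for the cheap swap
        try { localDirs = good; errorDirs = bad; }
        finally { lock.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        DirHealth h = new DirHealth(java.util.Arrays.asList(
                System.getProperty("java.io.tmpdir"), "/no/such/dir"));
        h.checkDirs();
        System.out.println("failed: " + h.getFailedDirs());
    }
}
```

The trade-off the comment mentions is visible here: one lock guards both lists, which is coarser than per-resource locks but avoids lock-ordering hazards between them.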
[jira] [Updated] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-5243: -- Attachment: YARN-5243-YARN-2928.02.patch Posted patch v.2. Fixed the unit test failure (the {{ResourceManager}} change was corrected so that the timeline collector manager is initialized before the system metrics publisher). I haven't tracked down the {{TestQueuingContainerManager}} failure yet which appears to be an existing issue (either in our branch or in trunk). > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch, > YARN-5243-YARN-2928.02.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329699#comment-15329699 ] Sangjin Lee commented on YARN-5243: --- Hmm, I tried the base trunk commit on which we rebased our branch ( {{e61d431}} ), and the test passes on that commit. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329689#comment-15329689 ] Varun Saxena commented on YARN-5243: TestQueuingContainerManager must have been brought over from trunk as its related to distributed scheduling work. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329684#comment-15329684 ] Varun Saxena edited comment on YARN-5243 at 6/14/16 3:42 PM: - Thanks [~sjlee0] for the patch. TestQueuingContainerManager must be because of trunk code as its related to distributed scheduling. I could not find a directly mapping JIRA fixing this though. TestDistributedShell is failing due to order of starting of collector manager and SMP. Collector manager should be started before SMP but in the patch its the other way round. was (Author: varun_saxena): Thanks [~sjlee0] for the patch. TestQueuingContainerManager must be because of trunk code as its related to distributed scheduling. I could not find a directly mapping JIRA though. TestDistributedShell is failing due to order of starting of collector manager and SMP. Collector manager should be started before SMP but in the patch its the other way round. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5243: --- Comment: was deleted (was: Thanks [~sjlee0] for the patch. TestQueuingContainerManager must be because of trunk code as its related to distributed scheduling. I could not find a directly mapping JIRA fixing this though. TestDistributedShell is failing due to order of starting of collector manager and SMP. Collector manager should be started before SMP but in the patch its the other way round.) > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329684#comment-15329684 ] Varun Saxena commented on YARN-5243: Thanks [~sjlee0] for the patch. TestQueuingContainerManager must be failing because of trunk code, as it's related to distributed scheduling. I could not find a JIRA that maps directly to it, though. TestDistributedShell is failing due to the order of starting the collector manager and the SMP. The collector manager should be started before the SMP, but in the patch it's the other way round. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled.
[jira] [Commented] (YARN-5243) fix several rebase and other miscellaneous issues before merge
[ https://issues.apache.org/jira/browse/YARN-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329680#comment-15329680 ] Sangjin Lee commented on YARN-5243: --- The {{TestDistributedShell}} failure is introduced with the patch. However, it appears {{TestQueuingContainerManager}} has been failing on our branch for some time. I can reproduce it as far back as commit {{2c93006}}. I'm looking into both. > fix several rebase and other miscellaneous issues before merge > -- > > Key: YARN-5243 > URL: https://issues.apache.org/jira/browse/YARN-5243 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Sangjin Lee >Assignee: Sangjin Lee > Labels: yarn-2928-1st-milestone > Attachments: YARN-5243-YARN-2928.01.patch > > > I have come across a couple of miscellaneous issues while inspecting the > diffs against the trunk. > We also need to review one last time (probably after the final rebase) to > ensure the timeline services v.2 leaves no impact when disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3388) Allocation in LeafQueue could get stuck because DRF calculator isn't well supported when computing user-limit
[ https://issues.apache.org/jira/browse/YARN-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated YARN-3388: - Attachment: YARN-3388-v3.patch [~leftnoteasy], [~eepayne]. Ok, "soon" was extremely relative;) Sorry about that. I think I addressed Wangda's comments but I need label partition experts to take a look. Any ideas why people don't hit this more often? We find it's very easy to get stuck at queueCapacity even though userLimitFactor and maxCapacity say the system should allocate further. Do you think people aren't using DRF and are mostly just using memory as the resource? > Allocation in LeafQueue could get stuck because DRF calculator isn't well > supported when computing user-limit > - > > Key: YARN-3388 > URL: https://issues.apache.org/jira/browse/YARN-3388 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Affects Versions: 2.6.0 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-3388-v0.patch, YARN-3388-v1.patch, > YARN-3388-v2.patch, YARN-3388-v3.patch > > > When there are multiple active users in a queue, it should be possible for > those users to make use of capacity up-to max_capacity (or close). The > resources should be fairly distributed among the active users in the queue. > This works pretty well when there is a single resource being scheduled. > However, when there are multiple resources the situation gets more complex > and the current algorithm tends to get stuck at Capacity. > Example illustrated in subsequent comment. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
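For readers unfamiliar with DRF, the crux of YARN-3388 is that a user's share is judged by its dominant resource, so the multi-resource case behaves differently from the memory-only case Nathan asks about. A toy calculation — not YARN's actual DominantResourceCalculator, just an illustration of the dominant-share definition — showing how a vcore-heavy user can look "full" while most of the memory is still idle:

```java
// Toy illustration of the dominant-share idea behind DRF user limits.
public class DrfDemo {
    // Dominant share = the max of the user's per-resource utilization fractions.
    static double dominantShare(long usedMem, long usedVcores,
                                long totalMem, long totalVcores) {
        return Math.max((double) usedMem / totalMem,
                        (double) usedVcores / totalVcores);
    }

    public static void main(String[] args) {
        // Cluster: 16 GB, 4 vcores. The user holds 4 GB and 2 vcores.
        // A memory-only view says 25% used; the DRF view says 50% (vcores dominate).
        System.out.println(dominantShare(4, 2, 16, 4)); // prints 0.5
    }
}
```

If the user-limit check compares this dominant share against a limit derived mainly from queue capacity, the user above is judged at 50% despite leaving 75% of memory free, which is consistent with the "stuck at queue capacity" behavior described in the issue.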
[jira] [Commented] (YARN-5224) Logs for a completed container are not available in the yarn logs output for a live application
[ https://issues.apache.org/jira/browse/YARN-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329557#comment-15329557 ] Junping Du commented on YARN-5224: -- The unit test failure seems to be related. [~xgong], can you fix it? Thx! > Logs for a completed container are not available in the yarn logs output for > a live application > --- > > Key: YARN-5224 > URL: https://issues.apache.org/jira/browse/YARN-5224 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.9.0 >Reporter: Siddharth Seth >Assignee: Xuan Gong > Attachments: YARN-5224.1.patch, YARN-5224.2.patch, YARN-5224.3.patch > > > This affects 'short' jobs like MapReduce and Tez more than long running apps. > Related: YARN-5193 (but that only covers long running apps)
[jira] [Commented] (YARN-5224) Logs for a completed container are not available in the yarn logs output for a live application
[ https://issues.apache.org/jira/browse/YARN-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329539#comment-15329539 ] Hadoop QA commented on YARN-5224: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 
0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 52s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 11 unchanged - 1 fixed = 13 total (was 12) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 50s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 34s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 39m 17s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.client.cli.TestLogsCLI | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:2c91fd8 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12810182/YARN-5224.3.patch | | JIRA Issue | YARN-5224 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d61d8fa21da0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 8e8cb4c | | Default Java | 1.8.0_91 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/12010/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/12010/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt | | unit test logs |
[jira] [Commented] (YARN-5130) Mark ContainerStatus and NodeReport as evolving
[ https://issues.apache.org/jira/browse/YARN-5130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329527#comment-15329527 ]

Sunil G commented on YARN-5130:
---

Yes, I agree with your point. We should be strict about changes to a Stable interface, especially one exposed to clients and other modules. We could add test cases to detect such changes, but a hack in the test cases could take us back to the original problem.

> Mark ContainerStatus and NodeReport as evolving
> ---
>
> Key: YARN-5130
> URL: https://issues.apache.org/jira/browse/YARN-5130
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Gergely Novák
> Priority: Minor
> Attachments: YARN-5130.001.patch
>
> It turns out that slider won't build as the {{ContainerStatus}} and {{NodeReport}} classes have added more abstract methods, so breaking the mock objects.
> While it is everyone's freedom to change things, these classes are both tagged
> {code}
> @Public
> @Stable
> {code}
> Given they aren't stable, can someone mark them as {{@Evolving}}? That way when downstream code breaks, we can be less disappointed

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
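The distinction being debated above can be sketched in code. This is a self-contained illustration, not Hadoop source: the annotations below are local stand-ins mirroring Hadoop's real org.apache.hadoop.classification.InterfaceAudience.Public and InterfaceStability.Stable/Evolving markers, and ContainerStatusLike is a hypothetical class standing in for {{ContainerStatus}}.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-ins for Hadoop's audience/stability annotations, defined locally
// so this sketch compiles without the hadoop-annotations jar.
@Retention(RetentionPolicy.RUNTIME) @interface Public {}
@Retention(RetentionPolicy.RUNTIME) @interface Stable {}
@Retention(RetentionPolicy.RUNTIME) @interface Evolving {}

// @Stable promises no breaking changes for downstream code; the request in
// YARN-5130 is to relabel classes like ContainerStatus as @Evolving, which
// permits incompatible changes (such as new abstract methods) with notice.
@Public
@Evolving
abstract class ContainerStatusLike {
    public abstract String getDiagnostics();
    // Adding another abstract method here is source-incompatible for
    // downstream mocks/subclasses -- allowed under @Evolving, not @Stable.
}

public class StabilityDemo {
    public static void main(String[] args) {
        // Inspect the stability markers at runtime.
        for (Annotation a : ContainerStatusLike.class.getAnnotations()) {
            System.out.println(a.annotationType().getSimpleName());
        }
    }
}
```

Downstream projects (like Slider here) read these markers as the API contract: mocking an @Evolving class is done at one's own risk, whereas mocking a @Stable class should be safe across minor releases.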
[jira] [Commented] (YARN-5224) Logs for a completed container are not available in the yarn logs output for a live application
[ https://issues.apache.org/jira/browse/YARN-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329485#comment-15329485 ]

Junping Du commented on YARN-5224:
--

Thanks [~xgong] for updating the patch. Jenkins seems to be on strike again; I kicked it off manually. +1 pending the Jenkins result.

> Logs for a completed container are not available in the yarn logs output for a live application
> ---
>
> Key: YARN-5224
> URL: https://issues.apache.org/jira/browse/YARN-5224
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 2.9.0
> Reporter: Siddharth Seth
> Assignee: Xuan Gong
> Attachments: YARN-5224.1.patch, YARN-5224.2.patch, YARN-5224.3.patch
>
> This affects 'short' jobs like MapReduce and Tez more than long running apps.
> Related: YARN-5193 (but that only covers long running apps)
[jira] [Comment Edited] (YARN-4764) Application submission fails when submitted queue is not available in scheduler xml
[ https://issues.apache.org/jira/browse/YARN-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329398#comment-15329398 ]

Brahma Reddy Battula edited comment on YARN-4764 at 6/14/16 1:01 PM:
-

Just linking the broken jira.

was (Author: brahmareddy): Just linking broken to jira.

> Application submission fails when submitted queue is not available in scheduler xml
> ---
>
> Key: YARN-4764
> URL: https://issues.apache.org/jira/browse/YARN-4764
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Fix For: 2.9.0
>
> Attachments: 0001-YARN-4764.patch, 0002-YARN-4764.patch
>
> Available queues in capacity scheduler
> -root
> --queue1
> --queue2
> Submit application with queue3
> {noformat}
> 16/03/04 16:40:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1457077554812_1901
> 16/03/04 16:40:08 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hacluster, Ident: (HDFS_DELEGATION_TOKEN token 3938 for mapred with renewer yarn)
> 16/03/04 16:40:08 WARN retry.RetryInvocationHandler: Exception while invoking class org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication over rm2. Not retrying because try once and fail.
> java.lang.NullPointerException: java.lang.NullPointerException
> at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:366)
> at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:289)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:618)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:252)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:483)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2305)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2301)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2301)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateRuntimeException(RPCUtil.java:85)
> at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:122)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.submitApplication(ApplicationClientProtocolPBClientImpl.java:272)
> {noformat}
> The error should instead indicate that the queue does not exist.
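The fix asked for above is to fail fast with a descriptive error rather than let a missing queue surface as a NullPointerException inside createAndPopulateNewRMApp. A minimal, self-contained sketch of that idea follows; it is not the actual RMAppManager patch, and QueueValidation/validateQueue are hypothetical names introduced here for illustration.

```java
import java.util.Set;

// Hedged sketch of the YARN-4764 idea: validate the requested queue against
// the scheduler's configured queues before accepting the submission, so the
// client gets an actionable message instead of an NPE from deep inside the RM.
public class QueueValidation {

    // Hypothetical helper; the real fix lives in the ResourceManager code path.
    public static String validateQueue(Set<String> configuredQueues, String requested) {
        if (requested == null || !configuredQueues.contains(requested)) {
            // Fail fast with a clear, user-facing reason.
            throw new IllegalArgumentException(
                "Application submission failed: queue '" + requested
                + "' does not exist in the scheduler configuration");
        }
        return requested;
    }

    public static void main(String[] args) {
        // Mirrors the scenario in this report: root, queue1, queue2 configured;
        // the application is submitted to the nonexistent queue3.
        Set<String> queues = Set.of("root", "queue1", "queue2");
        System.out.println(validateQueue(queues, "queue1"));
        try {
            validateQueue(queues, "queue3");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design point is simply that validation happens at the submission boundary, where the queue name is still known and a meaningful exception can be propagated back over the ApplicationClientProtocol.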